11 results on '"Frazer KA"'
Search Results
2. Accurate detection and genotyping of SNPs utilizing population sequencing data.
- Author
Bansal V, Harismendy O, Tewhey R, Murray SS, Schork NJ, Topol EJ, and Frazer KA
- Subjects
- Base Sequence, Chromosomes, Human, Pair 9 genetics, Data Collection methods, Data Collection standards, False Positive Reactions, Gene Expression Profiling, Genome, Human genetics, Genome-Wide Association Study, Genotype, Humans, Information Storage and Retrieval methods, Information Storage and Retrieval standards, Meta-Analysis as Topic, Molecular Sequence Data, Oligonucleotide Array Sequence Analysis, Reproducibility of Results, Sequence Analysis, DNA instrumentation, Validation Studies as Topic, Genetics, Population instrumentation, Genetics, Population methods, Genetics, Population standards, Genetics, Population statistics & numerical data, Polymorphism, Single Nucleotide genetics, Sequence Analysis, DNA methods
- Abstract
Next-generation sequencing technologies have made it possible to sequence targeted regions of the human genome in hundreds of individuals. Deep sequencing represents a powerful approach for the discovery of the complete spectrum of DNA sequence variants in functionally important genomic intervals. Current methods for single nucleotide polymorphism (SNP) detection are designed to detect SNPs from single individual sequence data sets. Here, we describe a novel method SNIP-Seq (single nucleotide polymorphism identification from population sequence data) that leverages sequence data from a population of individuals to detect SNPs and assign genotypes to individuals. To evaluate our method, we utilized sequence data from a 200-kilobase (kb) region on chromosome 9p21 of the human genome. This region was sequenced in 48 individuals (five sequenced in duplicate) using the Illumina GA platform. Using this data set, we demonstrate that our method is highly accurate for detecting variants and can filter out false SNPs that are attributable to sequencing errors. The concordance of sequencing-based genotype assignments between duplicate samples was 98.8%. The 200-kb region was independently sequenced to a high depth of coverage using two sequence pools containing the 48 individuals. Many of the novel SNPs identified by SNIP-Seq from the individual sequencing were validated by the pooled sequencing data and were subsequently confirmed by Sanger sequencing. We estimate that SNIP-Seq achieves a low false-positive rate of approximately 2%, improving upon the higher false-positive rate for existing methods that do not utilize population sequence data. Collectively, these results suggest that analysis of population sequencing data is a powerful approach for the accurate detection of SNPs and the assignment of genotypes to individual samples.
- Published
- 2010
3. Analysis of allelic differential expression in human white blood cells.
- Author
Pant PV, Tao H, Beilharz EJ, Ballinger DG, Cox DR, and Frazer KA
- Subjects
- Base Sequence, Exons, Genomic Imprinting, Heterozygote, Humans, Linkage Disequilibrium, Models, Genetic, Molecular Sequence Data, Oligonucleotide Array Sequence Analysis methods, Polymorphism, Single Nucleotide, Reproducibility of Results, Alleles, Gene Expression, Leukocytes metabolism
- Abstract
Allelic variation of gene expression is common in humans, and is of interest because of its potential contribution to variation in heritable traits. To identify human genes with allelic expression differences, we genotype DNA and examine mRNA isolated from the white blood cells of 12 unrelated individuals using oligonucleotide arrays containing 8406 exonic SNPs. Of the exonic SNPs, 1983, located in 1389 genes, are both expressed in the white blood cells and heterozygous in at least one of the 12 individuals, and thus can be examined for differential allelic expression. Of the 1389 genes, 731 (53%) show allele expression differences in at least one individual. To gain insight into the regulatory mechanisms governing allelic expression differences, we analyze a set of 60 genes containing exonic SNPs that are heterozygous in three or more samples, and for which all heterozygotes display differential expression. We find three patterns of allelic expression, suggesting different underlying regulatory mechanisms. Exonic SNPs in three of the 60 genes are monoallelically expressed in the human white blood cells, and when examined in families show expression of only the maternal copy, consistent with regulation by imprinting. Approximately one-third of the genes have the same allele expressed more highly in all heterozygotes, suggesting that their regulation is predominantly influenced by cis-elements in strong linkage disequilibrium with the assayed exonic SNP. The remaining two-thirds of the genes have different alleles expressed more highly in different heterozygotes, suggesting that their expression differences are influenced by factors not in strong linkage disequilibrium with the assayed exonic SNP.
- Published
- 2006
4. Segmental phylogenetic relationships of inbred mouse strains revealed by fine-scale analysis of sequence variation across 4.6 mb of mouse genome.
- Author
Frazer KA, Wade CM, Hinds DA, Patil N, Cox DR, and Daly MJ
- Subjects
- Animals, Genetic Variation, Haplotypes, Mice genetics, Mice, Inbred BALB C, Mice, Inbred C3H, Mice, Inbred C57BL, Molecular Sequence Data, Physical Chromosome Mapping, Polymorphism, Single Nucleotide, Base Sequence, Genome, Mice, Inbred Strains genetics, Oligonucleotide Array Sequence Analysis methods, Phylogeny
- Abstract
High-density SNP screening of panels of inbred mouse strains has been proposed as a method to accelerate the identification of genes associated with complex biomedical phenotypes. To evaluate the potential of these studies, a more detailed understanding of the fine structure of sequence variation across inbred mouse strains is needed. Here, we use high-density oligonucleotide arrays to discover an extremely dense set of SNPs in 13 classical and two wild-derived inbred strains in five genomic intervals totaling 4.6 Mb of DNA sequence, and then analyze the segmental haplotype structure defined by these high-density SNPs. This analysis reveals segments ranging from 12 to 608 kb in length within which the inbred strains have a simple and distinct phylogenetic relationship with typically two or three clades accounting for the 13 classical strains examined. The phylogenetic relationships among strains change abruptly and unpredictably from segment to segment, and are distinct in each of the five genomic regions examined. The data suggest that at least 12 strains would need to be resequenced for exhaustive SNP discovery in every region of the mouse genome, and that approximately 97% of the variation among inbred strains is ancestral (between clades) and approximately 3% private (within clades), and they provide critical insights into the proposed use of panels of inbred strains to identify genes underlying quantitative trait loci. (Copyright 2004 Cold Spring Harbor Laboratory Press)
- Published
- 2004
5. Noncoding sequences conserved in a limited number of mammals in the SIM2 interval are frequently functional.
- Author
Frazer KA, Tao H, Osoegawa K, de Jong PJ, Chen X, Doherty MF, and Cox DR
- Subjects
- Animals, Basic Helix-Loop-Helix Transcription Factors, Cats, Cattle, Chromosome Deletion, Chromosomes, Artificial, Bacterial genetics, Chromosomes, Human, Pair 21 genetics, Cloning, Molecular, Computational Biology methods, Conserved Sequence physiology, DNA, Intergenic classification, DNA, Intergenic physiology, Dogs, Evolution, Molecular, Horses genetics, Humans, Macaca mulatta genetics, Mice, Pan troglodytes genetics, Regulatory Sequences, Nucleic Acid, Sequence Homology, Nucleic Acid, Swine genetics, Transcription Factors classification, Transcription Factors physiology, Conserved Sequence genetics, DNA, Intergenic genetics, Transcription Factors genetics
- Abstract
Cross-species DNA sequence comparison is a fundamental method for identifying biologically important elements, because functional sequences are evolutionarily conserved, whereas nonfunctional sequences drift. A recent genome-wide comparison of human and mouse DNA discovered over 200,000 conserved noncoding sequences with unknown function. Multispecies DNA comparison has been proposed as a method to prioritize these conserved noncoding sequences for functional analysis based on the hypothesis that elements present in many species are more likely to be functional than elements present in limited numbers of species. Here, we perform a comparative analysis of the single-minded 2 (SIM2) gene interval on human chromosome 21 with horse, cow, pig, dog, cat, and mouse DNA. We classify conserved sequences based on the number of mammals in which they are present, and experimentally test sequences in each class for function. As hypothesized, conserved sequences present in many mammals are frequently functional. Additionally, we demonstrate that sequences conserved in a limited number of mammals are also frequently functional. Examination of genomic deletions in chimpanzee and rhesus macaque DNA showed that several putatively functional conserved noncoding human sequences were absent in these primates. These findings suggest that functional conserved noncoding human sequences can be missing in other mammals, even closely related primate species.
- Published
- 2004
6. Genomic DNA insertions and deletions occur frequently between humans and nonhuman primates.
- Author
Frazer KA, Chen X, Hinds DA, Pant PV, Patil N, and Cox DR
- Subjects
- Animals, Cebidae genetics, Chromosome Mapping methods, Chromosomes genetics, Chromosomes, Human, Pair 21 genetics, Gene Rearrangement genetics, Humans, Macaca mulatta genetics, Oligonucleotide Array Sequence Analysis methods, Phenotype, Polymerase Chain Reaction methods, Pongo pygmaeus genetics, Synteny genetics, Chromosome Deletion, DNA genetics, Gene Amplification genetics, Gene Frequency genetics, Pan troglodytes genetics
- Abstract
Comparative DNA sequence studies between humans and nonhuman primates will be important for understanding the genetic basis of the phenotypic differences between these species. Here we compare approximately 27 Mb of human chromosome 21 with chimpanzee DNA sequences identifying 57 genomic rearrangements (deletions and insertions ranging in size from 0.2 to 8.0 kb) between the two species. These rearrangements are distributed along the entire length of chromosome 21, with approximately 35% found in genomic intervals encoding genes (genic intervals), and have occurred in the genomes of both humans and chimpanzees. Comparison of approximately 9 Mb of human chromosome 21 with orangutan, rhesus macaque, and woolly monkey DNA sequences identified a combined total of 114 genomic rearrangements between humans and nonhuman primates. Analysis of these rearrangements revealed that they are randomly distributed with respect to genic and nongenic intervals and identified one deletion that has likely resulted in the inactivation of a gene (beta1,3-galactosyltransferase) in the woolly monkey. Our data show that genomic rearrangements have occurred frequently during primate genome evolution and significantly contribute to the DNA differences between these species. These DNA rearrangements are commonly found in genic intervals, and thus provide natural starting points for focused investigations of qualitative and quantitative gene expression differences between humans and other primates.
- Published
- 2003
7. Cross-species sequence comparisons: a review of methods and available resources.
- Author
Frazer KA, Elnitski L, Church DM, Dubchak I, and Hardison RC
- Subjects
- Animals, Computational Biology methods, DNA genetics, Humans, Molecular Sequence Data, Sequence Homology, Nucleic Acid, Species Specificity, Base Sequence genetics, Databases, Nucleic Acid supply & distribution, Sequence Alignment methods
- Abstract
With the availability of whole-genome sequences for an increasing number of species, we are now faced with the challenge of decoding the information contained within these DNA sequences. Comparative analysis of DNA sequences from multiple species at varying evolutionary distances is a powerful approach for identifying coding and functional noncoding sequences, as well as sequences that are unique for a given organism. In this review, we outline the strategy for choosing DNA sequences from different species for comparative analyses and describe the methods used and the resources publicly available for these studies.
- Published
- 2003
8. Evolutionarily conserved sequences on human chromosome 21.
- Author
Frazer KA, Sheehan JB, Stokowski RP, Chen X, Hosseini R, Cheng JF, Fodor SP, Cox DR, and Patil N
- Subjects
- Animals, Chromosomes, Artificial, Bacterial genetics, DNA genetics, Dogs, Genes, Overlapping genetics, Humans, Mice, Oligonucleotide Array Sequence Analysis methods, Sensitivity and Specificity, Synteny genetics, Chromosomes, Human, Pair 21 genetics, Conserved Sequence genetics, Evolution, Molecular
- Abstract
Comparison of human sequences with the DNA of other mammals is an excellent means of identifying functional elements in the human genome. Here we describe the utility of high-density oligonucleotide arrays as a rapid approach for comparing human sequences with the DNA of multiple species whose sequences are not presently available. High-density arrays representing approximately 22.5 Mb of nonrepetitive human chromosome 21 sequence were synthesized and then hybridized with mouse and dog DNA to identify sequences conserved between humans and mice (human-mouse elements) and between humans and dogs (human-dog elements). Our data show that sequence comparison of multiple species provides a powerful empiric method for identifying actively conserved elements in the human genome. A large fraction of these evolutionarily conserved elements are present in regions on chromosome 21 that do not encode known genes.
- Published
- 2001
9. Active conservation of noncoding sequences revealed by three-way species comparisons.
- Author
Dubchak I, Brudno M, Loots GG, Pachter L, Mayor C, Rubin EM, and Frazer KA
- Subjects
- Animals, Dogs, Humans, Mice, Molecular Sequence Data, Species Specificity, Untranslated Regions isolation & purification, Conserved Sequence genetics, Sequence Alignment, Untranslated Regions genetics
- Abstract
Human and mouse genomic sequence comparisons are being increasingly used to search for evolutionarily conserved gene regulatory elements. Large-scale human-mouse DNA comparison studies have discovered numerous conserved noncoding sequences of which only a fraction has been functionally investigated. A question therefore remains as to whether most of these noncoding sequences are conserved because of functional constraints or are the result of a lack of divergence time.
- Published
- 2000
10. PipMaker--a web server for aligning two genomic DNA sequences.
- Author
Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, and Miller W
- Subjects
- Animals, Base Sequence genetics, Caenorhabditis elegans genetics, Computational Biology, DNA genetics, Escherichia coli genetics, Genes, Bacterial, Genes, Helminth, Genes, Protozoan, Humans, Internet trends, Mice, Molecular Sequence Data, Salmonella typhimurium genetics, Sequence Alignment methods, Sequence Alignment trends, DNA chemistry, Internet statistics & numerical data, Sequence Alignment statistics & numerical data, Software
- Abstract
PipMaker (http://bio.cse.psu.edu) is a World-Wide Web site for comparing two long DNA sequences to identify conserved segments and for producing informative, high-resolution displays of the resulting alignments. One display is a percent identity plot (pip), which shows both the position in one sequence and the degree of similarity for each aligning segment between the two sequences in a compact and easily understandable form. Positions along the horizontal axis can be labeled with features such as exons of genes and repetitive elements, and colors can be used to clarify and enhance the display. The web site also provides a plot of the locations of those segments in both species (similar to a dot plot). PipMaker is appropriate for comparing genomic sequences from any two related species, although the types of information that can be inferred (e.g., protein-coding regions and cis-regulatory elements) depend on the level of conservation and the time and divergence rate since the separation of the species. Gene regulatory elements are often detectable as similar, noncoding sequences in species that diverged as much as 100-300 million years ago, such as humans and mice, Caenorhabditis elegans and C. briggsae, or Escherichia coli and Salmonella spp. PipMaker supports analysis of unfinished or "working draft" sequences by permitting one of the two sequences to be in unoriented and unordered contigs.
- Published
- 2000
11. Computational and biological analysis of 680 kb of DNA sequence from the human 5q31 cytokine gene cluster region.
- Author
Frazer KA, Ueda Y, Zhu Y, Gifford VR, Garofalo MR, Mohandas N, Martin CH, Palazzolo MJ, Cheng JF, and Rubin EM
- Subjects
- Animals, Blotting, Northern, Chromosome Mapping methods, Chromosomes, Artificial, Yeast, Humans, Interleukins genetics, Mice, Mice, Transgenic, Molecular Sequence Data, Polymerase Chain Reaction, Protein Biosynthesis, Proteins genetics, RNA genetics, Sequence Analysis, DNA methods, Sequence Homology, Nucleic Acid, Sequence Tagged Sites, Software, Transcription, Genetic, Chromosomes, Human, Pair 5, Computational Biology methods, Cytokines genetics, Multigene Family
- Abstract
With the human genome project advancing into what will be a 7- to 10-year DNA sequencing phase, we are presented with the challenge of developing strategies to convert genomic sequence data, as they become available, into biologically meaningful information. We have analyzed 680 kb of noncontiguous DNA sequence from a 1-Mb region of human chromosome 5q31, coupling computational analysis with gene expression studies of tissues isolated from humans as well as from mice containing human YAC transgenes. This genomic interval has been noted previously for containing the cytokine gene cluster and a quantitative trait locus associated with inflammatory diseases. Our analysis identified and verified expression of 16 new genes, as well as 7 previously known genes. Of the total of 23 genes in this region, 78% had similarity matches to sequences in protein databases and 83% had exact expressed sequence tag (EST) database matches. Comparative mapping studies of eight of the new human genes discovered in the 5q31 region revealed that all are located in the syntenic region of mouse chromosome 11q. Our analysis demonstrates an approach for examining human sequence as it is made available from large sequencing programs and has resulted in the discovery of several biomedically important genes, including a cyclin, a transcription factor that is homologous to an oncogene, a protein involved in DNA repair, and several new members of a family of transporter proteins.
- Published
- 1997