59 results for "Joseph K. Pickrell"
Search Results
2. Low-pass sequencing plus imputation using avidity sequencing displays comparable imputation accuracy to sequencing by synthesis while reducing duplicates
- Author
-
Jeremiah H. Li, Karrah Findley, Joseph K. Pickrell, Kelly Blease, Junhua Zhao, and Semyon Kruglyak
- Abstract
Low-pass sequencing with genotype imputation has been adopted as a cost-effective method for genotyping. The most widely used method of short-read sequencing uses sequencing by synthesis (SBS). Here we perform a study of a novel sequencing technology — avidity sequencing. In this short note, we compare the performance of imputation from low-pass libraries sequenced on an Element AVITI system (which utilizes avidity sequencing) to those sequenced on an Illumina NovaSeq 6000 (which utilizes SBS) with an SP flow cell for the same set of biological samples across a range of genetic ancestries. We observed dramatically lower duplication rates in the data deriving from the AVITI system compared to the NovaSeq 6000, resulting in higher effective coverage given a fixed number of sequenced bases, and comparable imputation accuracy performance between sequencing chemistries across ancestries. This study demonstrates that avidity sequencing is a viable alternative to the standard SBS chemistries for applications involving low-pass sequencing plus imputation.
- Published
- 2022
- Full Text
- View/download PDF
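The link that entry 2 draws between duplication rate and coverage can be illustrated with simple arithmetic. The sketch below is a first-order simplification, not the paper's definition of effective coverage: it assumes duplicate reads contribute no new information. The function name, the 3.1 Gb genome size, and the example duplication rates are all hypothetical.

```python
def deduplicated_coverage(total_bases, duplication_rate, genome_size=3.1e9):
    """Mean depth left after discarding duplicate reads (illustrative simplification)."""
    if not 0.0 <= duplication_rate < 1.0:
        raise ValueError("duplication_rate must be in [0, 1)")
    return total_bases * (1.0 - duplication_rate) / genome_size


# Same sequencing budget (hypothetical: 3.1e9 bases, a nominal 1x genome),
# compared at two hypothetical duplication rates.
budget = 3.1e9
for rate in (0.02, 0.20):
    print(f"duplication {rate:.0%}: {deduplicated_coverage(budget, rate):.2f}x")
```

Under these assumptions, a lower duplication rate leaves more usable depth from the same number of sequenced bases, which is the qualitative effect the abstract reports.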
3. Low-pass sequencing increases the power of GWAS and decreases measurement error of polygenic risk scores compared to genotyping arrays
- Author
-
Chase A. Mazur, Tomaz Berisa, Joseph K. Pickrell, and Jeremiah H. Li
- Subjects
Genotype; Concordance; Genome-wide association study; Computational biology; Biology; Genome; Statistical power; 03 medical and health sciences; 0302 clinical medicine; Risk Factors; Genetics; Humans; 1000 Genomes Project; Genotyping; Genetics (clinical); 030304 developmental biology; Genetic association; 0303 health sciences; Genome, Human; Research; Haplotype; High-Throughput Nucleotide Sequencing; Minor allele frequency; Haplotypes; 030217 neurology & neurosurgery; Imputation (genetics); Genome-Wide Association Study
- Abstract
Low-pass sequencing (sequencing a genome to an average depth less than 1x coverage) combined with genotype imputation has been proposed as an alternative to genotyping arrays for trait mapping and calculation of polygenic scores; however, the current literature is largely limited to simulation- and downsampling-based approaches. To empirically assess the relative performance of these technologies for different applications, we performed low-pass sequencing (targeting coverage levels of 0.5x and 1x) and array genotyping (using the Illumina Global Screening Array) on 120 DNA samples derived from African and European-ancestry individuals that are part of the 1000 Genomes Project. We then imputed both the sequencing data and the genotyping array data to the 1000 Genomes Phase 3 haplotype reference panel using a leave-one-out design. First, we evaluated overall imputation accuracy from these different assays as measured by genotype concordance; we introduce the concept of effective coverage that accounts for evenness of sequencing and show that this metric is a better predictor of imputation accuracy than nominal mapped coverage for low-pass sequencing data. Next, we evaluated overall power for genome-wide association studies (GWAS) as measured by the squared correlation between imputed and true genotypes. In the African individuals, at common variants (>5% minor allele frequency), imputation r² averaged 0.83 for the array data and ranged from 0.89 to 0.95 for the low-pass sequencing data, corresponding to an effective 7–15% increase in GWAS discovery power. For the same variants in the European individuals, imputation r² averaged 0.91 for the array data and ranged from 0.92 to 0.96 for the low-pass sequencing data, corresponding to an effective 1–6% increase in GWAS discovery power. Finally, we computed polygenic risk scores for breast cancer and coronary artery disease from the different assays.
We observed consistently lower measurement error for risk scores computed from low-pass sequencing data above an effective coverage of ∼ 0.5x. The mean squared error of the array-based estimates was three to four times that of the estimates from samples sequenced at an effective coverage of ∼ 1.2x for coronary artery disease, with qualitatively similar results for breast cancer. We conclude that low-pass sequencing plus imputation, in addition to providing a substantial increase in statistical power for genome wide association studies, provides increased accuracy for polygenic risk prediction at effective coverages of ∼ 0.5x and higher.
- Published
- 2021
- Full Text
- View/download PDF
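Entry 3 measures GWAS power by the squared correlation between imputed and true genotypes. The sketch below shows one way to compute that quantity and to turn two r² values into a relative change in effective sample size. It assumes the common rule of thumb that effective sample size scales linearly with imputation r²; the function names and the toy genotype data are hypothetical. Because the abstract reports rounded averages, the ratios only approximately reproduce its 7–15% and 1–6% figures.

```python
import numpy as np


def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    t = np.asarray(true_genotypes, dtype=float)
    d = np.asarray(imputed_dosages, dtype=float)
    r = np.corrcoef(t, d)[0, 1]
    return r * r


def relative_power_gain(r2_new, r2_baseline):
    """Fractional change in effective sample size, assuming it scales with r2."""
    return r2_new / r2_baseline - 1.0


# Toy data (hypothetical): true genotypes coded 0/1/2 and imputed dosages.
true_g = [0, 1, 2, 0, 1, 2, 0, 1]
dosage = [0.1, 0.9, 1.8, 0.0, 1.2, 2.0, 0.3, 0.7]
print(f"toy imputation r2: {imputation_r2(true_g, dosage):.3f}")

# r2 values quoted in the abstract: array baseline vs. low-pass range.
for label, array_r2, low, high in [("African", 0.83, 0.89, 0.95),
                                   ("European", 0.91, 0.92, 0.96)]:
    print(f"{label}: {relative_power_gain(low, array_r2):.1%} "
          f"to {relative_power_gain(high, array_r2):.1%}")
```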
4. Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations
- Author
-
NeuroGAP-Psychosis Study Team, Roxanne James, Carter P. Newman, Rocky E. Stroud, Abebaw Fekadu, Joseph Kyebuzibwa, Sheila Dodge, Elizabeth G. Atkinson, Lerato Majara, Dan J. Stein, Zukiswa Zingela, Wilfred E. Injera, Lori B. Chibnik, Mark J. Daly, Raj Ramesar, Karestan C. Koenen, Anne Stevenson, Tamrat Abebe, Solomon Teferra, Gabriel Kigen, Timothy DeSmet, Symon M. Kariuki, Joseph K. Pickrell, Stella Gichuru, Melkam Alemayehu, Fred K. Ashaba, Welelta Shiferaw, Lukoye Atwoli, Dickens Akena, Edith Kwobah, Charles R. Newton, Tera Bowers, Steven Ferriera, Rehema M. Mwema, Benjamin M. Neale, Bizu Gelaye, Alicia R. Martin, Henry Musinguzi, Celia van der Merwe, and Sinéad B. Chapman
- Subjects
Test data generation; Concordance; Sequencing data; DNA Mutational Analysis; Genomics; Genome-wide association study; Computational biology; Variation (game tree); Biology; Genome; European descent; Article; 03 medical and health sciences; 0302 clinical medicine; Genetics; Humans; Genotyping; Genetics (clinical); 030304 developmental biology; Whole genome sequencing; 0303 health sciences; Health Equity; Whole Genome Sequencing; Genome, Human; Microbiota; Genetic Variation; Genetics, Population; Africa; 030217 neurology & neurosurgery; Genome-Wide Association Study
- Abstract
Background: Genetic studies of biomedical phenotypes in underrepresented populations identify disproportionate numbers of novel associations. However, current genomics infrastructure (including most genotyping arrays and sequenced reference panels) best serves populations of European descent. A critical step for facilitating genetic studies in underrepresented populations is to ensure that genetic technologies accurately capture variation in all populations. Here, we quantify the accuracy of low-coverage sequencing in diverse African populations. Results: We sequenced the whole genomes of 91 individuals to high coverage (≥20X) from the Neuropsychiatric Genetics of African Population-Psychosis (NeuroGAP-Psychosis) study, in which participants were recruited from Ethiopia, Kenya, South Africa, and Uganda. We empirically tested two data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole genome sequencing data. We show that low-coverage sequencing at a depth of ≥4X captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1X) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation, with 4X sequencing detecting 45% of singletons and 95% of common variants identified in high-coverage African whole genomes. Conclusion: These results indicate that low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, including those that capture variation most common in Europeans and Africans. Low-coverage sequencing effectively identifies novel variation (particularly in underrepresented populations), and presents opportunities to enhance variant discovery at a similar cost to traditional approaches.
- Published
- 2020
5. Relative matching using low coverage sequencing
- Author
-
Daphna Weissglas-Volkov, Regev Schweiger, Joseph K. Pickrell, Jeremiah H. Li, Ella Petter, Tal Shor, Lior Almog, Tomaz Berisa, Yoav Naveh, Bar Shahino, Malka Aker, Yaniv Erlich, Shai Carmi, and Oron Navon
- Subjects
Matching (statistics); integumentary system; Computer science; business.industry; Genetic genealogy; Genomic data; Population genetics; Pattern recognition; chemistry.chemical_compound; chemistry; Genomic information; Artificial intelligence; business; Genotyping; DNA
- Abstract
Finding familial relatives using DNA has multiple applications in genetic genealogy, population genetics, and forensics. So far, most relative matching algorithms rely on detecting identity-by-descent (IBD) segments with high-quality genotype data. Recently, low coverage sequencing (LCS) has received growing attention as a promising cost-effective method to ascertain genomic information. However, with higher error rates, it is unclear whether existing IBD detection can work on LCS datasets. Here, we developed and tested a framework for relative matching using sequencing with 1× coverage (1×LCS). We started by exploring the error characteristics of this method compared to array data. Our results show that after some optimization 1×LCS can exhibit the same genotyping discordance rates as the discordance between two array platforms. Using this observation, we developed a hybrid framework for relative matching and tuned this framework with >2,700 pairs of confirmed genealogical relatives that were genotyped using heterogeneous datasets. We then obtained array and 1×LCS data on 19 samples and used our framework to find relatives in a database of over 3 million individuals. The total length of shared segments obtained by 1×LCS was virtually indistinguishable from that obtained by genotyping arrays for matches with a total sharing >200 cM (second cousins or closer). For more distant relatives, as long as those were detected by both technologies, the total length obtained by LCS and by genotyping arrays was highly correlated, with no evidence of over- or underestimation. Taken together, our results show that 1×LCS can be a valid alternative to arrays for relative matching, opening the possibility for further democratization of genomic data.
- Published
- 2020
- Full Text
- View/download PDF
6. Population genetics of the coral Acropora millepora : Toward genomic prediction of bleaching
- Author
-
Luke A. Morris, Peter Andolfatto, Luke Sarre, Mikhail V. Matz, Joseph K. Pickrell, Neal E. Cantin, Yi Liao, Molly Przeworski, Zachary L. Fuller, Julie Peng, Line K. Bay, Jihanne Shepherd, and Veronique J. L. Mocellin
- Subjects
geography; education.field_of_study; Multidisciplinary; geography.geographical_feature_category; biology; Coral; Population; Population genetics; Genomics; Coral reef; biology.organism_classification; Acropora millepora; Evolutionary biology; education; Reef; Local adaptation
- Abstract
Conservation help from genomics. Corals worldwide are under threat from rising sea temperatures and pollution. One response to heat stress is coral bleaching, the loss of photosynthetic endosymbionts that provide energy for the coral. Fuller et al. present a high-resolution genome of the coral Acropora millepora (see the Perspective by Bay and Guerrero). They were able to perform population genetic analyses with samples sequenced at lower coverage and conduct genome-wide association studies. These data were combined to generate a polygenic risk score for bleaching that can be used in coral conservation. Science, this issue p. eaba4674; see also p. 249.
- Published
- 2020
- Full Text
- View/download PDF
7. Detecting Polygenic Adaptation in Admixture Graphs
- Author
-
Fernando Racimo, Jeremy J. Berg, and Joseph K. Pickrell
- Subjects
0301 basic medicine; Multifactorial Inheritance; Population; Adaptation, Biological; Genome-wide association study; Investigations; Biology; Polymorphism, Single Nucleotide; 03 medical and health sciences; symbols.namesake; Genetics; Humans; Computer Simulation; Selection, Genetic; education; Allele frequency; Selection (genetic algorithm); Interpretability; education.field_of_study; Models, Genetic; Genome, Human; Markov chain Monte Carlo; Genomics; Markov Chains; Genetics, Population; 030104 developmental biology; Evolutionary biology; symbols; Trait; Adaptation; Algorithms; Genome-Wide Association Study
- Abstract
Polygenic adaptation occurs when natural selection changes the average value of a complex trait in a population, via small shifts in allele frequencies at many loci. Here, Racimo, Berg, and Pickrell present a method... An open question in human evolution is the importance of polygenic adaptation: adaptive changes in the mean of a multifactorial trait due to shifts in allele frequencies across many loci. In recent years, several methods have been developed to detect polygenic adaptation using loci identified in genome-wide association studies (GWAS). Though powerful, these methods suffer from limited interpretability: they can detect which sets of populations have evidence for polygenic adaptation, but are unable to reveal where in the history of multiple populations these processes occurred. To address this, we created a method to detect polygenic adaptation in an admixture graph, which is a representation of the historical divergences and admixture events relating different populations through time. We developed a Markov chain Monte Carlo (MCMC) algorithm to infer branch-specific parameters reflecting the strength of selection in each branch of a graph. Additionally, we developed a set of summary statistics that are fast to compute and can indicate which branches are most likely to have experienced polygenic adaptation. We show via simulations that this method—which we call PolyGraph—has good power to detect polygenic adaptation, and applied it to human population genomic data from around the world. We also provide evidence that variants associated with several traits, including height, educational attainment, and self-reported unibrow, have been influenced by polygenic adaptation in different populations during human evolution.
- Published
- 2018
- Full Text
- View/download PDF
8. DNA.Land is a framework to collect genomes and phenomes in the era of abundant genetic information
- Author
-
Jie Yuan, Dina Zielinski, Assaf Gordon, Daniel Speyer, Joseph K. Pickrell, Richard Aufrichtig, and Yaniv Erlich
- Subjects
0301 basic medicine; Datasets as Topic; Genomics; Biology; Phenome; Personal autonomy; Genome; User-Computer Interface; 03 medical and health sciences; 0302 clinical medicine; Patient Portals; Databases, Genetic; Genetics; medicine; Humans; Genetic Testing; Precision Medicine; Genetic Association Studies; Genetic testing; medicine.diagnostic_test; Commerce; DNA; Sequence Analysis, DNA; Biobank; High-Throughput Screening Assays; Phenotype; ComputingMethodologies_PATTERNRECOGNITION; 030104 developmental biology; Evolutionary biology; 030220 oncology & carcinogenesis; Personal Autonomy; Crowdsourcing
- Abstract
Creating large genome/phenome collections can require consortium-scale resources. DNA.Land is a digital biobank that collects genetic data from individuals tested by consumer genomic companies using a fraction of the resources of traditional studies.
- Published
- 2018
- Full Text
- View/download PDF
9. Contrasting Determinants of Mutation Rates in Germline and Soma
- Author
-
Hongjian Qi, Molly Przeworski, Yufeng Shen, Chen Chen, and Joseph K. Pickrell
- Subjects
0301 basic medicine; Mutation rate; mutation rate; Somatic cell; Germline mosaicism; Investigations; Biology; Germline; Histones; 03 medical and health sciences; 0302 clinical medicine; Germline mutation; Neoplasms; Genetics; Humans; human; Population and Evolutionary Genetics; Germ-Line Mutation; Base Composition; Replication timing; Models, Genetic; Point mutation; Molecular biology; strand asymmetry; Gene Expression Regulation, Neoplastic; 030104 developmental biology; Histone; DNA methylation; biology.protein; germline mutations; somatic mutations; 030217 neurology & neurosurgery
- Abstract
A number of genomic features influence regional mutation rates in germline and soma. To examine if some factors behave differently in the two tissue... Recent studies of somatic and germline mutations have led to the identification of a number of factors that influence point mutation rates, including CpG methylation, expression levels, replication timing, and GC content. Intriguingly, some of the effects appear to differ between soma and germline: in particular, whereas mutation rates have been reported to decrease with expression levels in tumors, no clear effect has been detected in the germline. Distinct approaches were taken to analyze the data, however, so it is hard to know whether these apparent differences are real. To enable a cleaner comparison, we considered a statistical model in which the mutation rate of a coding region is predicted by GC content, expression levels, replication timing, and two histone repressive marks. We applied this model to both a set of germline mutations identified in exomes and to exonic somatic mutations in four types of tumors. Most determinants of mutations are shared: notably, we detected an effect of expression levels on both germline and somatic mutation rates. Moreover, in all tissues considered, higher expression levels are associated with greater strand asymmetry of mutations. However, mutation rates increase with expression levels in testis (and, more tentatively, in ovary), whereas they decrease with expression levels in somatic tissues. This contrast points to differences in damage or repair rates during transcription in soma and germline.
- Published
- 2017
- Full Text
- View/download PDF
10. Population genetics of the coral Acropora millepora: Towards a genomic predictor of bleaching
- Author
-
Neal E. Cantin, Line K. Bay, Yi Liao, Joseph K. Pickrell, Molly Przeworski, Luke A. Morris, Zachary L. Fuller, Peter Andolfatto, Julie Peng, Veronique J. L. Mocellin, Luke Sarre, Jihanne Shepherd, and Mikhail V. Matz
- Subjects
0106 biological sciences; 0303 health sciences; geography; geography.geographical_feature_category; biology; Coral; Population genetics; Genomics; biology.organism_classification; Balancing selection; 010603 evolutionary biology; 01 natural sciences; Genome; 03 medical and health sciences; Acropora millepora; Evolutionary biology; 14. Life underwater; Reef; 030304 developmental biology; Local adaptation
- Abstract
Although reef-building corals are rapidly declining worldwide, responses to bleaching vary both within and among species. Because these inter-individual differences are partly heritable, they should in principle be predictable from genomic data. Towards that goal, we generated a chromosome-scale genome assembly for the coral Acropora millepora. We then obtained whole genome sequences for 237 phenotyped samples collected at 12 reefs distributed along the Great Barrier Reef, among which we inferred very little population structure. Scanning the genome for evidence of local adaptation, we detected signatures of long-term balancing selection in the heat-shock co-chaperone sacsin. We further used 213 of the samples to conduct a genome-wide association study of visual bleaching score, incorporating the polygenic score derived from it into a predictive model for bleaching in the wild. These results set the stage for the use of genomics-based approaches in conservation strategies.
- Published
- 2019
- Full Text
- View/download PDF
11. Comparing low-pass sequencing and genotyping for trait mapping in pharmacogenetics
- Author
-
Charles J. Cox, Joseph K. Pickrell, Dana J. Fraser, Karen King, Kaja A. Wasik, Jeremiah H. Li, and Tomaz Berisa
- Subjects
Trait mapping; Genotype; Genotyping Techniques; lcsh:QH426-470; 040301 veterinary sciences; lcsh:Biotechnology; Concordance; Single-nucleotide polymorphism; Computational biology; Biology; Polymorphism, Single Nucleotide; Genome; 0403 veterinary science; 03 medical and health sciences; lcsh:TP248.13-248.65; Genetics; Humans; Genotyping; 030304 developmental biology; Genetic association; 0303 health sciences; Low-pass sequencing; Genotype imputation; High-Throughput Nucleotide Sequencing; 04 agricultural and veterinary sciences; lcsh:Genetics; Pharmacogenetics; Sample size determination; Imputation (genetics); Research Article; Genome-Wide Association Study; Biotechnology
- Abstract
Background: Low pass sequencing has been proposed as a cost-effective alternative to genotyping arrays to identify genetic variants that influence multifactorial traits in humans. For common diseases this typically has required both large sample sizes and comprehensive variant discovery. Genotyping arrays are also routinely used to perform pharmacogenetic (PGx) experiments where sample sizes are likely to be significantly smaller, but clinically relevant effect sizes likely to be larger. Results: To assess how low pass sequencing would compare to array based genotyping for PGx we compared a low-pass assay (in which 1x coverage or less of a target genome is sequenced) along with software for genotype imputation to standard approaches. We sequenced 79 individuals to 1x genome coverage and genotyped the same samples on the Affymetrix Axiom Biobank Precision Medicine Research Array (PMRA). We then down-sampled the sequencing data to 0.8x, 0.6x, and 0.4x coverage, and performed imputation. Both the genotype data and the sequencing data were further used to impute human leukocyte antigen (HLA) genotypes for all samples. We compared the sequencing data and the genotyping array data in terms of four metrics: overall concordance, concordance at single nucleotide polymorphisms in pharmacogenetics-related genes, concordance in imputed HLA genotypes, and imputation r². Overall concordance between the two assays ranged from 98.2% (for 0.4x coverage sequencing) to 99.2% (for 1x coverage sequencing), with qualitatively similar numbers for the subsets of variants most important in pharmacogenetics. At common single nucleotide polymorphisms (SNPs), the mean imputation r² from the genotyping array was 0.90, which was comparable to the imputation r² from 0.4x coverage sequencing, while the mean imputation r² from 1x sequencing data was 0.96. Conclusions: These results indicate that low-pass sequencing to a depth above 0.4x coverage attains higher power for association studies when compared to the PMRA and should be considered as a competitive alternative to genotyping arrays for trait mapping in pharmacogenetics.
- Published
- 2019
- Full Text
- View/download PDF
12. Assessment of Imputation from Low-Pass Sequencing to Predict Merit of Beef Steers
- Author
-
Larry A. Kuehn, Warren M. Snelling, Joseph K. Pickrell, Brittney N Keel, Jeremiah H. Li, Amanda K. Lindholm-Perry, and Jesse L. Hoff
- Subjects
Male; 0301 basic medicine; Genotype; lcsh:QH426-470; Marbled meat; imputation; Computational biology; Breeding; Beef cattle; Biology; Polymorphism, Single Nucleotide; Article; 03 medical and health sciences; beef cattle; Genetics; Animals; Animal Husbandry; genomic prediction; Genetics (clinical); Molecular breeding; 0402 animal and dairy science; Genomics; Sequence Analysis, DNA; 04 agricultural and veterinary sciences; sequence; 040201 dairy & animal science; United States; Red Meat; lcsh:Genetics; Phenotype; 030104 developmental biology; Cattle; Imputation (genetics); SNP array
- Abstract
Decreasing costs are making low coverage sequencing with imputation to a comprehensive reference panel an attractive alternative to obtain functional variant genotypes that can increase the accuracy of genomic prediction. To assess the potential of low-pass sequencing, genomic sequence of 77 steers sequenced to >10X coverage was downsampled to 1X and imputed to a reference of 946 cattle representing multiple Bos taurus and Bos indicus-influenced breeds. Genotypes for nearly 60 million variants detected in the reference were imputed from the downsampled sequence. The imputed genotypes strongly agreed with the SNP array genotypes (r̄ = 0.99) and the genotypes called from the transcript sequence (r̄ = 0.97). Effects of BovineSNP50 and GGP-F250 variants on birth weight, postweaning gain, and marbling were solved without the steers' phenotypes and genotypes, then applied to their genotypes, to predict the molecular breeding values (MBV). The steers' MBV were similar when using imputed and array genotypes. Replacing array variants with functional sequence variants might allow more robust MBV. Imputation from low coverage sequence offers a viable, low-cost approach to obtain functional variant genotypes that could improve genomic prediction.
- Published
- 2020
- Full Text
- View/download PDF
13. DNA.Land: A digital biobank using a massive crowdsourcing approach
- Author
-
Yaniv Erlich, Daniel Speyer, Assaf Gordon, Richard Aufrichtig, Jie Yuan, Joseph K. Pickrell, and Dina Zielinski
- Subjects
business.industry; Computer science; Human subject research; Precision medicine; Crowdsourcing; Genome; Data science; Biobank; Task (project management); World Wide Web; chemistry.chemical_compound; chemistry; Scale (social sciences); business; DNA
- Abstract
Precision medicine necessitates large-scale collections of genomes and phenomes. Despite decreases in the costs of genomic technologies, collecting these types of information at scale is still a daunting task that poses logistical challenges and requires consortium-scale resources. Here, we describe DNA.Land, a digital biobank to collect genomes and phenomes with a fraction of the resources of traditional studies at the same scale. Our approach relies on crowd-sourcing data from the rapidly growing number of individuals who have access to their own genomic datasets through Direct-to-Consumer (DTC) companies. To recruit participants, we developed a series of automatic return-of-results features in DNA.Land that increase users’ engagement while stratifying human subject research protection. So far, DNA.Land has collected over 43,000 genomes in 20 months of operation, orders of magnitude higher than previous digital attempts by academic groups. We report lessons learned in running a digital biobank, our technical framework, and our approach regarding ethical, legal, and social implications.
- Published
- 2017
- Full Text
- View/download PDF
14. Toward a new history and geography of human genes informed by ancient DNA
- Author
-
Joseph K. Pickrell and David Reich
- Subjects
Population; Globe; Article; Evolution, Molecular; Genetics; medicine; Humans; Selection, Genetic; education; History, Ancient; Selection (genetic algorithm); education.field_of_study; Natural selection; Geography; Genome, Human; Historical Article; DNA; Genetics, Population; Phenotype; Ancient DNA; medicine.anatomical_structure; Evolutionary biology; Africa; Genetic structure; Adaptation
- Abstract
Genetic information contains a record of the history of our species, and technological advances have transformed our ability to access this record. Many studies have used genome-wide data from populations today to learn about the peopling of the globe and subsequent adaptation to local conditions. Implicit in this research is the assumption that the geographic locations of people today are informative about the geographic locations of their ancestors in the distant past. However, it is now clear that long-range migration, admixture and population replacement subsequent to the initial out-of-Africa expansion have altered the genetic structure of most of the world’s human populations. In light of this, we argue that it is time to critically re-evaluate current models of the peopling of the globe, as well as the importance of natural selection in determining the geographic distribution of phenotypes. We specifically highlight the transformative potential of ancient DNA. By accessing the genetic make-up of populations living at archaeologically-known times and places, ancient DNA makes it possible to directly track migrations and responses to natural selection.
- Published
- 2014
- Full Text
- View/download PDF
15. Approximately independent linkage disequilibrium blocks in human populations
- Author
-
Tomaz Berisa and Joseph K. Pickrell
- Subjects
Genetic Markers; Statistics and Probability; Linkage disequilibrium; Computer science; Genomics; Genome-wide association study; Computational biology; Biology; 01 natural sciences; Biochemistry; Genome; Linkage Disequilibrium; 010104 statistics & probability; 03 medical and health sciences; 0302 clinical medicine; Humans; natural sciences; 0101 mathematics; Association mapping; Molecular Biology; 030304 developmental biology; Genetic association; Genetics; 0303 health sciences; Genome, Human; Chromosome Mapping; Tag SNP; Applications Notes; 3. Good health; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Genetic marker; Human genome; Algorithms; Software; 030217 neurology & neurosurgery; Genome-Wide Association Study
- Abstract
Summary: We present a method to identify approximately independent blocks of linkage disequilibrium in the human genome. These blocks enable automated analysis of multiple genome-wide association studies. Availability and implementation: code: http://bitbucket.org/nygcresearch/ldetect; data: http://bitbucket.org/nygcresearch/ldetect-data. Contact: tberisa@nygenome.org Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2015
- Full Text
- View/download PDF
16. Ancient DNA Reveals Key Stages in the Formation of Central European Mitochondrial Genetic Diversity
- Author
-
Guido Brandt, Wolfgang Haak, Christina J. Adler, Christina Roth, Anna Szécsényi-Nagy, Sarah Karimnia, Sabine Möller-Rieker, Harald Meller, Robert Ganslmeier, Susanne Friederich, Veit Dresely, Nicole Nicklisch, Joseph K. Pickrell, Frank Sirocko, David Reich, Alan Cooper, Kurt W. Alt, and Janet S. Ziegle
- Subjects
Mitochondrial DNA; media_common.quotation_subject; Molecular Sequence Data; Population; Biology; DNA, Mitochondrial; Article; Genetic drift; Bronze Age; Genetic variation; Humans; education; History, Ancient; media_common; Transients and Migrants; Genetics; education.field_of_study; Genetic diversity; Multidisciplinary; Base Sequence; Genetic Drift; Genetic Variation; Agriculture; Europe; Ancient DNA; Evolutionary biology; Diversity (politics)
- Abstract
The Origins of Europeans. To investigate the genetic origins of modern Europeans, Brandt et al. (p. 257) examined ancient mitochondrial DNA (mtDNA) and were able to identify genetic differences in 364 Central Europeans spanning the early Neolithic to the Early Bronze Age. Observed changes in mitochondrial haplotypes corresponded with hypothesized human migration across Eurasia and revealed the complexity of the demographic changes and evidence of a Late Neolithic origin for the European mtDNA gene pool. This transect through time reveals four key population events associated with well-known archaeological cultures, which involved genetic influx into Central Europe from various directions at various times.
- Published
- 2013
- Full Text
- View/download PDF
17. Inferring Admixture Histories of Human Populations Using Linkage Disequilibrium
- Author
-
Priya Moorjani, Po-Ru Loh, Mark Lipson, David Reich, Nick Patterson, Bonnie Berger, and Joseph K. Pickrell (contributing institutions: Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology, Department of Mathematics)
- Subjects
Linkage disequilibrium ,q-bio.PE ,Human Migration ,Population ,Population genetics ,Human genetic variation ,Biology ,Investigations ,Linkage Disequilibrium ,03 medical and health sciences ,0302 clinical medicine ,Genetic ,Japan ,Population Groups ,Models ,Genetics ,Humans ,Africa, Central ,Quantitative Biology - Populations and Evolution ,education ,Central ,Allele frequency ,Population and Evolutionary Genetics ,Phylogeny ,030304 developmental biology ,0303 health sciences ,education.field_of_study ,Phylogenetic tree ,Models, Genetic ,Populations and Evolution (q-bio.PE) ,Robustness (evolution) ,3. Good health ,Genetic distance ,Italy ,FOS: Biological sciences ,Africa ,admixture ,030217 neurology & neurosurgery ,Software ,Developmental Biology - Abstract
Author manuscript dated February 9, 2013. Long-range migrations and the resulting admixtures between populations have been important forces shaping human genetic diversity. Most existing methods for detecting and reconstructing historical admixture events are based on allele frequency divergences or patterns of ancestry segments in chromosomes of admixed individuals. An emerging new approach harnesses the exponential decay of admixture-induced linkage disequilibrium (LD) as a function of genetic distance. Here, we comprehensively develop LD-based inference into a versatile tool for investigating admixture. We present a new weighted LD statistic that can be used to infer mixture proportions as well as dates with fewer constraints on reference populations than previous methods. We define an LD-based three-population test for admixture and identify scenarios in which it can detect admixture events that previous formal tests cannot. We further show that we can uncover phylogenetic relationships among populations by comparing weighted LD curves obtained using a suite of references. Finally, we describe several improvements to the computation and fitting of weighted LD curves that greatly increase the robustness and speed of the calculations. We implement all of these advances in a software package, ALDER, which we validate in simulations and apply to test for admixture among all populations from the Human Genome Diversity Project (HGDP), highlighting insights into the admixture history of Central African Pygmies, Sardinians, and Japanese. Sponsors: National Science Foundation (U.S.), Graduate Research Fellowship Program; National Institutes of Health (U.S.) (Training Grant 5T32HG004947-04); Simons Foundation.
- Published
- 2013
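The abstract of result 17 rests on the exponential decay of admixture-induced LD with genetic distance. As a rough illustration only (this is not the ALDER implementation), the sketch below assumes a noise-free decay curve of the form a(d) = A * exp(-n * d), with d in Morgans and n the number of generations since admixture, and recovers n with a log-linear least-squares fit. The function name, the decay form without an affine offset, and all synthetic values are assumptions made for the example.

```python
import math

def fit_decay_rate(distances_morgans, ld_values):
    """Estimate (amplitude, rate) of a(d) = A * exp(-rate * d).

    Hypothetical helper for illustration: ordinary least squares on log(a),
    so every value in ld_values must be positive.
    """
    xs = list(distances_morgans)
    ys = [math.log(v) for v in ld_values]
    n_pts = len(xs)
    mean_x = sum(xs) / n_pts
    mean_y = sum(ys) / n_pts
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic, noise-free curve (assumed values): 50 generations since
# admixture, distances from 0.5 cM to 20 cM expressed in Morgans.
true_generations = 50.0
true_amplitude = 0.02
distances = [0.005 * k for k in range(1, 41)]
curve = [true_amplitude * math.exp(-true_generations * d) for d in distances]

amplitude_hat, generations_hat = fit_decay_rate(distances, curve)
print(generations_hat, amplitude_hat)
```

On real data the curve is noisy and may carry an additive offset, in which case a nonlinear fit would likely be more appropriate than this log-linear shortcut.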
18. The population genetics of human disease: the case of recessive, lethal mutations
- Author
-
Carlos Eduardo G. Amorim, Zachary Baker, Ziyue Gao, Joseph K. Pickrell, Jose Francisco Diesel, Imran S. Haque, Yuval B. Simons, and Molly Przeworski
- Subjects
0301 basic medicine ,Cancer Research ,Mutation rate ,lcsh:QH426-470 ,Population genetics ,Population ,Biology ,Balancing selection ,Compound heterozygosity ,symbols.namesake ,03 medical and health sciences ,Genetics ,Lethal allele ,Mutation frequency ,education ,Molecular Biology ,Allele frequency ,Genetics (clinical) ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,education.field_of_study ,0303 health sciences ,Medical genetics ,030305 genetics & heredity ,Mutation (Biology) ,lcsh:Genetics ,030104 developmental biology ,Mutation (genetic algorithm) ,Mendelian inheritance ,symbols - Abstract
Do the frequencies of disease mutations in human populations reflect a simple balance between mutation and purifying selection? What other factors shape the prevalence of disease mutations? To begin to answer these questions, we focused on one of the simplest cases: recessive mutations that alone cause lethal diseases or complete sterility. To this end, we generated a hand-curated set of 417 Mendelian mutations in 32 genes, reported to cause a recessive, lethal Mendelian disease. We then considered analytic models of mutation-selection balance in infinite and finite populations of constant sizes and simulations of purifying selection in a more realistic demographic setting, and tested how well these models fit allele frequencies estimated from 33,370 individuals of European ancestry. In doing so, we distinguished between CpG transitions, which occur at a substantially elevated rate, and three other mutation types. The observed frequency for CpG transitions is slightly higher than expectation but close, whereas the frequencies observed for the three other mutation types are an order of magnitude higher than expected. This discrepancy is even larger when subtle fitness effects in heterozygotes or lethal compound heterozygotes are taken into account. In principle, higher than expected frequencies of disease mutations could be due to widespread errors in reporting causal variants, compensation by other mutations, or balancing selection. It is unclear why these factors would have a greater impact on variants with lower mutation rates, however. We argue instead that the unexpectedly high frequency of disease mutations and the relationship to the mutation rate likely reflect an ascertainment bias: of all the mutations that cause recessive lethal diseases, those that by chance have reached higher frequencies are more likely to have been identified and thus to have been included in this study. 
Beyond the specific application, this study highlights the parameters likely to be important in shaping the frequencies of Mendelian disease alleles. Author Summary: What determines the frequencies of disease mutations in human populations? To begin to answer this question, we focus on one of the simplest cases: mutations that cause completely recessive, lethal Mendelian diseases. We first review theory about what to expect from mutation and selection in a population of finite size and further generate predictions based on simulations using a realistic demographic scenario of human evolution. For a highly mutable type of mutations, such as transitions at CpG sites, we find that the predictions are close to the observed frequencies of recessive lethal disease mutations. For less mutable types, however, predictions substantially under-estimate the observed frequency. We discuss possible explanations for the discrepancy and point to a complication that, to our knowledge, is not widely appreciated: that there exists ascertainment bias in disease mutation discovery. Specifically, we suggest that alleles that have been identified to date are likely the ones that by chance have reached higher frequencies and are thus more likely to have been mapped. More generally, our study highlights the factors that influence the frequencies of Mendelian disease alleles.
- Published
- 2016
- Full Text
- View/download PDF
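The mutation-selection balance models mentioned in the abstract of result 18 can be illustrated in the simplest deterministic case. For a fully recessive lethal allele in an infinite, randomly mating population, the standard result is an equilibrium allele frequency q = sqrt(u), where u is the per-generation mutation rate to the lethal allele. The sketch below iterates the one-generation recursion and compares the result with that closed form. The mutation rate, the starting frequency, and the function names are illustrative assumptions, not values or methods taken from the study.

```python
import math

def next_frequency(q, u):
    """One generation of selection, then mutation, for a recessive lethal allele.

    Selection removes every aa homozygote, so q becomes q / (1 + q).
    Mutation then converts a fraction u of the remaining normal alleles.
    """
    q_after_selection = q / (1.0 + q)
    return q_after_selection + u * (1.0 - q_after_selection)

def equilibrium_by_iteration(u, q0=0.01, generations=20000):
    """Iterate the recursion from an assumed starting frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, u)
    return q

u = 1e-6  # assumed mutation rate, for illustration only
q_iter = equilibrium_by_iteration(u)
q_closed = math.sqrt(u)
print(q_iter, q_closed)
```

With this assumed rate, both values are close to 0.001. The study's comparison additionally involves finite population sizes and a realistic demographic history, which this infinite-population sketch ignores.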
19. Genetic variants linked to education predict longevity
- Author
-
Chris Power, Gail Davies, Ilaria Gandin, Panagiotis Deloukas, Jennifer E. Huffman, Pascal Timshel, Albert V. Smith, A. Kong, Paul Lichtenstein, Joseph K. Pickrell, Philipp Koellinger, P. L. De Jager, Reedik Mägi, G. B. Chen, Neil Pendleton, B. V. Halldórsson, George Dedoussis, Antti-Pekka Sarin, Natalia Pervjakova, Veikko Salomaa, Simona Vaccargiu, Ozren Polasek, K. H. Jöckel, Elisabeth Steinhagen-Thiessen, Y. Milaneschi, Jessica D. Faul, Patricia A. Boyle, Patrik K. E. Magnusson, Igor Rudan, Christopher P. Nelson, Vilmundur Gudnason, John Attia, Jürgen Wellmann, Kristi Läll, Konstantin Strauch, Stuart J. Ritchie, Markus Perola, Nicola Pirastu, Klaus Bønnelykke, Robert Karlsson, R. de Vlaming, Liisa Keltigangas-Jarvinen, Thomas Meitinger, Riccardo E. Marioni, Anu Loukola, Barbera Franke, Reinhold Schmidt, Maël Lebreton, Sven Oskarsson, E. Mihailov, Harm-Jan Westra, David R. Weir, Aldi T. Kraja, Niek Verweij, Peter M. Visscher, Hans-Jörgen Grabe, Johannes H. Brandsma, Mark Adams, R. J. Scott, G. Thorleifsson, Tõnu Esko, Mika Kähönen, Saskia P. Hagenaars, Patrick Turley, Johannes Waage, Peter Lichtner, Dragana Vuckovic, Antonietta Robino, Henry Völzke, Lydia Quaye, C. de Leeuw, Marika Kaakinen, Wei Zhao, Abdel Abdellaoui, Reka Nagy, Pedro Marques-Vidal, Johan G. Eriksson, Alan F. Wright, Andres Metspalu, Lavinia Paternoster, Momoko Horikoshi, Jan A. Staessen, Tarunveer S. Ahluwalia, Tian Liu, Martin Kroh, Aldo Rustichini, Giorgia Girotto, Cristina Venturini, Lili Milani, Jennifer A. Smith, Ginevra Biino, Tessel E. Galesloot, Michael A. Horan, Gerardus A. Meddens, James F. Wilson, Francesco Cucca, Peter Vollenweider, Erika Salvi, P. J. van der Most, Jari Lahti, Campbell A, David Laibson, Andrew Bakshi, Wolfgang Hoffmann, Tomi Mäki-Opas, Andreas J. Forstner, C M van Duijn, Nicholas G. Martin, Jonathan Marten, Ute Bültmann, Olli T. Raitakari, David A. Bennett, A.G. Uitterlinden, J. E. De Neve, Ingrid B. 
Borecki, WD Hill, Bo Jacobsson, Antti Latvala, Katri Räikkönen, Michael B. Miller, Jonathan P. Beauchamp, S. J. van der Lee, Ilja Demuth, Stavroula Kanoni, Veronique Vitart, Elina Hyppönen, N. Eklund, Francesco P. Cappuccio, Robert F. Krueger, Maria Pina Concas, Jaime Derringer, F. J.A. Van Rooij, Helena Schmidt, Patrick J. F. Groenen, Valur Emilsson, Rico Rueedi, Aysu Okbay, Georg Homuth, Edith Hofer, W. E. R. Ollier, Hannah Campbell, Paolo Gasparini, Mark Alan Fontana, Magnus Johannesson, Seppo Koskinen, Christopher F. Chabris, Jouke-Jan Hottenga, Christine Meisinger, Kari Stefansson, Jun Ding, Tia Sorensen, Brenda W.J.H. Penninx, Michelle N. Meyer, James J. Lee, Diego Vozzi, Gonneke Willemsen, K. Petrovic, Sarah E. Medland, Mary F. Feitosa, Henning Tiemeier, L. J. Launer, William G. Iacono, Massimo Mangino, Tune H. Pers, S. E. Baumeister, Christopher Oldmeadow, Grant W. Montgomery, Marjo-Riitta Järvelin, Jaakko Kaprio, Catharine R. Gale, S.F.W. Meddens, Kevin Thom, Klaus Berger, Pablo V. Gejman, Lude Franke, Gyda Bjornsdottir, Daniel J. Benjamin, Steven F. Lehrer, Krista Fischer, Alan R. Sanders, S. Ulivi, Katharina E. Schraut, Tim D. Spector, Amy Hofman, Matt McGue, Terho Lehtimäki, D. C. Liewald, Hans Bisgaard, L. Eisele, Astanand Jugessur, George Davey Smith, T.B. Harris, A.R. Thurik, Cornelius A. Rietveld, David Schlessinger, Z. Kutalik, David J. Porteous, Lynne J. Hocking, N J Timpson, A. Palotie, Lambertus A. Kiemeney, Ian J. Deary, Sharon L.R. Kardia, Peter K. Joshi, Nilesh J. Samani, Michael A. Province, Börge Schmidt, Richa Gupta, Carmen Amador, Erin B. Ware, Joyce Y. Tung, Ioanna-Panagiota Kalafati, Lars Bertram, Caroline Hayward, P. van der Harst, Penelope A. Lind, Kadri Kaasik, N.A. Furlotte, Sarah E. Harris, B. St Pourcain, Susan M. Ring, Zhihong Zhu, Alexander Teumer, Behrooz Z. Alizadeh, Judith M. Vonk, Blair H. Smith, A Payton, Wouter J. Peyrot, Jacob Gratten, Douglas F. Levinson, C Gieger, Leanne M. 
Hall, Andrew Heath, Mario Pirastu, Peter Eibich, Nancy L. Pedersen, Ronny Myhre, Antonio Terracciano, David M. Evans, Raymond A. Poot, Uwe Völker, Dorret I. Boomsma, Clemens Baumbach, Unnur Thorsteinsdottir, Ivana Kolcic, Jia-Shu Yang, Dalton Conley, A. A. Vinkhuyzen, Danielle Posthuma, Karl-Oskar Lindgren, Olga Rostapshova, Jonas Bacelis, Daniele Cusi, Yong Qian, Bjarni Gunnarsson, George McMahon, Elizabeth G. Holliday, Pamela A. F. Madden, David A. Hinds, David Cesarini, Jianxin Shi, Najaf Amin, Dale R. Nyholt, Applied Economics, Epidemiology, Real World Studies in PharmacoEpidemiology, -Genetics, -Economics and -Therapy (PEGET), Groningen Institute for Gastro Intestinal Genetics and Immunology (3GI), Groningen Research Institute for Asthma and COPD (GRIAC), Aletta Jacobs School of Public Health, Public Health Research (PHR), Stem Cell Aging Leukemia and Lymphoma (SALL), Cardiovascular Centre (CVC), Amsterdam Neuroscience - Complex Trait Genetics, Psychiatry, Amsterdam Neuroscience - Mood, Anxiety, Psychosis, Stress & Sleep, EMGO - Mental health, Complex Trait Genetics, Biological Psychology, Marioni, RE, Ritchie, SJ, Joshi, PK, Hagenaars, SP, Hypponen, E, Benjamin, DJ, Social Science Genetic Association Consortium, Marioni, Re, Ritchie, Sj, Joshi, Pk, Hagenaars, Sp, Okbay, A, Fischer, K, Adams, Mj, Hill, Wd, Davies, G, Nagy, R, Amador, C, Läll, K, Metspalu, A, Liewald, Dc, Campbell, A, Wilson, Jf, Hayward, C, Esko, T, Porteous, Dj, Gale, Cr, Deary, Ij, Beauchamp, Jp, Fontana, Ma, Lee, Jj, Pers, Th, Rietveld, Ca, Turley, P, Chen, Gb, Emilsson, V, Meddens, Sf, Oskarsson, S, Pickrell, Jk, Thom, K, Timshel, P, de Vlaming, R, Abdellaoui, A, Ahluwalia, T, Bacelis, J, Baumbach, C, Bjornsdottir, G, Brandsma, Jh, Concas, MARIA PINA, Derringer, J, Furlotte, Na, Galesloot, Te, Girotto, Giorgia, Gupta, R, Hall, Lm, Harris, Se, Hofer, E, Horikoshi, M, Huffman, Je, Kaasik, K, Kalafati, Ip, Karlsson, R, Kong, A, Lahti, J, van der Lee, Sj, de Leeuw, C, Lind, Pa, Lindgren, Ko, 
Liu, T, Mangino, M, Marten, J, Mihailov, E, Miller, Mb, van der Most, Pj, Oldmeadow, C, Payton, A, Pervjakova, N, Peyrot, Wj, Qian, Y, Raitakari, O, Rueedi, R, Salvi, E, Schmidt, B, Schraut, Ke, Shi, J, Smith, Av, Poot, Ra, St Pourcain, B, Teumer, A, Thorleifsson, G, Verweij, N, Vuckovic, Dragana, Wellmann, J, Westra, Hj, Yang, J, Zhao, W, Zhu, Z, Alizadeh, Bz, Amin, N, Bakshi, A, Baumeister, Se, Biino, G, Bønnelykke, K, Boyle, Pa, Campbell, H, Cappuccio, Fp, De Neve, Je, Deloukas, P, Demuth, I, Ding, J, Eibich, P, Eisele, L, Eklund, N, Evans, Dm, Faul, Jd, Feitosa, Mf, Forstner, Aj, Gandin, Ilaria, Gunnarsson, B, Halldórsson, Bv, Harris, Tb, Heath, Ac, Hocking, Lj, Holliday, Eg, Homuth, G, Horan, Ma, Hottenga, Jj, de Jager, Pl, Jugessur, A, Kaakinen, Ma, Kähönen, M, Kanoni, S, Keltigangas Järvinen, L, Kiemeney, La, Kolcic, I, Koskinen, S, Kraja, At, Kroh, M, Kutalik, Z, Latvala, A, Launer, Lj, Lebreton, Mp, Levinson, Df, Lichtenstein, P, Lichtner, P, Loukola, A, Madden, Pa, Mägi, R, Mäki Opas, T, Marques Vidal, P, Meddens, Ga, Mcmahon, G, Meisinger, C, Meitinger, T, Milaneschi, Y, Milani, L, Montgomery, Gw, Myhre, R, Nelson, Cp, Nyholt, Dr, Ollier, We, Palotie, A, Paternoster, L, Pedersen, Nl, Petrovic, Ke, Räikkönen, K, Ring, Sm, Robino, Antonietta, Rostapshova, O, Rudan, I, Rustichini, A, Salomaa, V, Sanders, Ar, Sarin, Ap, Schmidt, H, Scott, Rj, Smith, Bh, Smith, Ja, Staessen, Ja, Steinhagen Thiessen, E, Strauch, K, Terracciano, A, Tobin, Md, Ulivi, Sheila, Vaccargiu, S, Quaye, L, van Rooij, Fj, Venturini, C, Vinkhuyzen, Aa, Völker, U, Völzke, H, Vonk, Jm, Vozzi, Diego, Waage, J, Ware, Eb, Willemsen, G, Attia, Jr, Bennett, Da, Berger, K, Bertram, L, Bisgaard, H, Boomsma, Di, Borecki, Ib, Bultmann, U, Chabris, Cf, Cucca, F, Cusi, D, Dedoussis, Gv, van Duijn, Cm, Eriksson, Jg, Franke, B, Franke, L, Gasparini, Paolo, Gejman, Pv, Gieger, C, Grabe, Hj, Gratten, J, Groenen, Pj, Gudnason, V, van der Harst, P, Hinds, Da, Hoffmann, W, Iacono, Wg, Jacobsson, B, Järvelin, 
Mr, Jöckel, Kh, Kaprio, J, Kardia, Sl, Lehtimäki, T, Lehrer, Sf, Magnusson, Pk, Martin, Ng, Mcgue, M, Pendleton, N, Penninx, Bw, Perola, M, Pirastu, Nicola, Pirastu, M, Polasek, O, Posthuma, D, Power, C, Province, Ma, Samani, Nj, Schlessinger, D, Schmidt, R, Sørensen, Ti, Spector, Td, Stefansson, K, Thorsteinsdottir, U, Thurik, Ar, Timpson, Nj, Tiemeier, H, Tung, Jy, Uitterlinden, Ag, Vitart, V, Vollenweider, P, Weir, Dr, Wright, Af, Conley, Dc, Krueger, Rf, Smith, Gd, Hofman, A, Laibson, Di, Medland, Se, Meyer, Mn, Johannesson, M, Visscher, Pm, Koellinger, Pd, Cesarini, D, and Benjamin, Dj
- Subjects
Netherlands Twin Register (NTR) ,0301 basic medicine ,Male ,Parents ,education: longevity: prediction: polygenic score [genetics] ,Multifactorial Inheritance ,polygenic ,Lebenserwartung ,Cohort Studies ,0302 clinical medicine ,Databases, Genetic ,Medicine ,genetics ,polygenic score ,longevity, education, gene ,Soziales und Gesundheit ,media_common ,Aged, 80 and over ,education ,Multidisciplinary ,Longevity ,Middle Aged ,Biobank ,humanities ,3. Good health ,Urological cancers Radboud Institute for Health Sciences [Radboudumc 15] ,Cohort ,Educational Status ,Female ,Cohort study ,Estonia ,education, longevity, polygenic ,Offspring ,media_common.quotation_subject ,Kultursektor ,Prognose ,Lernen ,Lower risk ,Education ,03 medical and health sciences ,longevity ,SDG 3 - Good Health and Well-being ,Commentaries ,Polygenic score ,Journal Article ,Genetics ,Humans ,Non-Profit-Sektor ,Genetic Association Studies ,Aged ,Neurodevelopmental disorders Donders Center for Medical Neuroscience [Radboudumc 7] ,business.industry ,ta1184 ,Genetic Variation ,prediction ,Educational attainment ,United Kingdom ,Gesundheitsstatistik ,030104 developmental biology ,Genetic epidemiology ,Scotland ,Gesundheitszustand ,Genetische Forschung ,business ,Prediction ,Bildung ,030217 neurology & neurosurgery ,Demography - Abstract
Educational attainment is associated with many health outcomes, including longevity. It is also known to be substantially heritable. Here, we used data from three large genetic epidemiology cohort studies (Generation Scotland, n = ∼17,000; UK Biobank, n = ∼115,000; and the Estonian Biobank, n = ∼6,000) to test whether education-linked genetic variants can predict lifespan length. We did so by using cohort members' polygenic profile score for education to predict their parents' longevity. Across the three cohorts, meta-analysis showed that a 1 SD higher polygenic education score was associated with ∼2.7% lower mortality risk for both mothers (total n deaths = 79,702) and ∼2.4% lower risk for fathers (total n deaths = 97,630). On average, the parents of offspring in the upper third of the polygenic score distribution lived 0.55 y longer compared with those of offspring in the lower third. Overall, these results indicate that the genetic contributions to educational attainment are useful in the prediction of human longevity. Citation: Marioni RE, Ritchie SJ, Joshi PK, Hagenaars SP, Okbay A, Fischer K, Adams MJ, Hill WD, Davies G, Social Science Genetic Association Consortium, Nagy R, Amador C, Läll K, Metspalu A, Liewald DC, Campbell A, Wilson JF, Hayward C, Esko T, Porteous DJ. Proceedings of the National Academy of Sciences of the United States of America, 2016, vol. 113, no. 47, pp. 13366-13371. Refereed/Peer-reviewed.
- Published
- 2016
- Full Text
- View/download PDF
20. Correction: DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines
- Author
-
Jonathan K. Pritchard, Joseph K. Pickrell, Jordana T. Bell, Athma A. Pai, Roger Pique-Regi, Yoav Gilad, Jacob F. Degner, and Daniel J. Gaffney
- Subjects
Genetics ,0303 health sciences ,030305 genetics & heredity ,Correction ,Methylation ,Biology ,03 medical and health sciences ,Genotype ,DNA methylation ,SNP ,1000 Genomes Project ,International HapMap Project ,Gene ,Genotyping ,030304 developmental biology - Abstract
Correction: We showed in our study [1] that SNP rs10876043 in the disco-interacting protein 2 homolog B gene (DIP2B) was associated with the first principal component of methylation. Although the analyses and result remain unchanged, it appears that this observation is likely due to a genotyping artifact. That is, the reported rs10876043 genotypes differ according to HapMap Phase (cell lines genotyped in Phase 1/2 have reported genotypes AG and GG, while Phase 3 cell lines have genotype AA). The 1000 Genomes data suggest the correct genotype is probably AA for all of these YRI individuals. These genotype differences between different phases of the HapMap Project, coupled with a small difference in mean methylation between Phase 1/2 vs 3 cell lines, appear to have produced an artifactual association. Other analyses in the paper controlled for the top principal components and should therefore be robust to this type of effect.
- Published
- 2016
21. Corrigendum: Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses
- Author
-
Aysu Okbay, Bart M. L. Baselmans, Jan-Emmanuel De Neve, Patrick Turley, Michel G. Nivard, Mark Alan Fontana, S. Fleur W. Meddens, Richard Karlsson Linnér, Cornelius A. Rietveld, Jaime Derringer, Jacob Gratten, James J. Lee, Jimmy Z. Liu, Ronald de Vlaming, Tarunveer S. Ahluwalia, Jadwiga Buchwald, Alana Cavadino, Alexis C. Frazier-Wood, Nicholas A. Furlotte, Victoria Garfield, Marie Henrike Geisel, Juan R. Gonzalez, Saskia Haitjema, Robert Karlsson, Sander W. van der Laan, Karl-Heinz Ladwig, Jari Lahti, Sven J. van der Lee, Penelope A. Lind, Tian Liu, Lindsay Matteson, Evelin Mihailov, Michael B. Miller, Camelia C. Minica, Ilja M. Nolte, Dennis Mook-Kanamori, Peter J. van der Most, Christopher Oldmeadow, Yong Qian, Olli Raitakari, Rajesh Rawal, Anu Realo, Rico Rueedi, Börge Schmidt, Albert V. Smith, Evie Stergiakouli, Toshiko Tanaka, Kent Taylor, Gudmar Thorleifsson, Juho Wedenoja, Juergen Wellmann, Harm-Jan Westra, Sara M. Willems, Wei Zhao, Najaf Amin, Andrew Bakshi, Sven Bergmann, Gyda Bjornsdottir, Patricia A. Boyle, Samantha Cherney, Simon R. Cox, Gail Davies, Oliver S. P. Davis, Jun Ding, Nese Direk, Peter Eibich, Rebecca T. Emeny, Ghazaleh Fatemifar, Jessica D. Faul, Luigi Ferrucci, Andreas J. Forstner, Christian Gieger, Richa Gupta, Tamara B. Harris, Juliette M. Harris, Elizabeth G. Holliday, Jouke-Jan Hottenga, Philip L. De Jager, Marika A. Kaakinen, Eero Kajantie, Ville Karhunen, Ivana Kolcic, Meena Kumari, Lenore J. Launer, Lude Franke, Ruifang Li-Gao, David C. Liewald, Marisa Koini, Anu Loukola, Pedro Marques-Vidal, Grant W. Montgomery, Miriam A. Mosing, Lavinia Paternoster, Alison Pattie, Katja E. Petrovic, Laura Pulkki-Råback, Lydia Quaye, Katri Räikkönen, Igor Rudan, Rodney J. Scott, Jennifer A. Smith, Angelina R. Sutin, Maciej Trzaskowski, Anna E. Vinkhuyzen, Lei Yu, Delilah Zabaneh, John R. Attia, David A. Bennett, Klaus Berger, Lars Bertram, Dorret I. Boomsma, Harold Snieder,
Shun-Chiao Chang, Francesco Cucca, Ian J. Deary, Cornelia M. van Duijn, Johan G. Eriksson, Ute Bültmann, Eco J. C. de Geus, Patrick J. F. Groenen, Vilmundur Gudnason, Torben Hansen, Catharine A. Hartman, Claire M. A. Haworth, Caroline Hayward, Andrew C. Heath, David A. Hinds, Elina Hyppönen, William G. Iacono, Marjo-Riitta Järvelin, Karl-Heinz Jöckel, Jaakko Kaprio, Sharon L. R. Kardia, Liisa Keltikangas-Järvinen, Peter Kraft, Laura D. Kubzansky, Terho Lehtimäki, Patrik K. E. Magnusson, Nicholas G. Martin, Matt McGue, Andres Metspalu, Melinda Mills, Renée de Mutsert, Albertine J. Oldehinkel, Gerard Pasterkamp, Nancy L. Pedersen, Robert Plomin, Ozren Polasek, Christine Power, Stephen S. Rich, Frits R. Rosendaal, Hester M. den Ruijter, David Schlessinger, Helena Schmidt, Rauli Svento, Reinhold Schmidt, Behrooz Z. Alizadeh, Thorkild I. A. Sørensen, Tim D. Spector, John M. Starr, Kari Stefansson, Andrew Steptoe, Antonio Terracciano, Unnur Thorsteinsdottir, A. Roy Thurik, Nicholas J. Timpson, Henning Tiemeier, André G. Uitterlinden, Peter Vollenweider, Gert G. Wagner, David R. Weir, Jian Yang, Dalton C. Conley, George Davey Smith, Albert Hofman, Magnus Johannesson, David I. Laibson, Sarah E. Medland, Michelle N. Meyer, Joseph K. Pickrell, Tõnu Esko, Robert F. Krueger, Jonathan P. Beauchamp, Philipp D. Koellinger, Daniel J. Benjamin, Meike Bartels, and David Cesarini
- Subjects
Journal Article ,Medizin ,Genetics ,Article - Abstract
We conducted genome-wide association studies of three phenotypes: subjective well-being (N = 298,420), depressive symptoms (N = 161,460), and neuroticism (N = 170,910). We identified three variants associated with subjective well-being, two with depressive symptoms, and eleven with neuroticism, including two inversion polymorphisms. The two depressive symptoms loci replicate in an independent depression sample. Joint analyses that exploit the high genetic correlations between the phenotypes (|ρ̂| ≈ 0.8) strengthen the overall credibility of the findings, and allow us to identify additional variants. Across our phenotypes, loci regulating expression in central nervous system and adrenal/pancreas tissues are strongly enriched for association.
- Published
- 2016
22. DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines
- Author
-
Joseph K. Pickrell, Jordana T. Bell, Yoav Gilad, Jacob F. Degner, Daniel J. Gaffney, Jonathan K. Pritchard, Roger Pique-Regi, and Athma A. Pai
- Subjects
Genotype ,Transcription, Genetic ,Quantitative Trait Loci ,HapMap Project ,Biology ,Polymorphism, Single Nucleotide ,Cell Line ,Epigenesis, Genetic ,Histones ,03 medical and health sciences ,0302 clinical medicine ,Epigenetics of physical exercise ,Humans ,Promoter Regions, Genetic ,Gene ,RNA-Directed DNA Methylation ,030304 developmental biology ,Epigenomics ,Regulation of gene expression ,Genetics ,0303 health sciences ,Genome, Human ,Research ,Methylation ,DNA Methylation ,Gene Expression Regulation ,DNA methylation ,Illumina Methylation Assay ,030217 neurology & neurosurgery ,Genome-Wide Association Study - Abstract
BACKGROUND: DNA methylation is an essential epigenetic mechanism involved in gene regulation and disease, but little is known about the mechanisms underlying inter-individual variation in methylation profiles. Here we measured methylation levels at 22,290 CpG dinucleotides in lymphoblastoid cell lines from 77 HapMap Yoruba individuals, for which genome-wide gene expression and genotype data were also available. RESULTS: Association analyses of methylation levels with more than three million common single nucleotide polymorphisms (SNPs) identified 180 CpG-sites in 173 genes that were associated with nearby SNPs (putatively in cis, usually within 5 kb) at a false discovery rate of 10%. The most intriguing trans signal was obtained for SNP rs10876043 in the disco-interacting protein 2 homolog B gene (DIP2B, previously postulated to play a role in DNA methylation), that had a genome-wide significant association with the first principal component of patterns of methylation; however, we found only modest signal of trans-acting associations overall. As expected, we found significant negative correlations between promoter methylation and gene expression levels measured by RNA-sequencing across genes. Finally, there was a significant overlap of SNPs that were associated with both methylation and gene expression levels. CONCLUSIONS: Our results demonstrate a strong genetic component to inter-individual variation in DNA methylation profiles. Furthermore, there was an enrichment of SNPs that affect both methylation and gene expression, providing evidence for shared mechanisms in a fraction of genes.
- Published
- 2016
23. Detection and interpretation of shared genetic influences on 42 human traits
- Author
-
David A. Hinds, Laure Ségurel, Joyce Y. Tung, Joseph K. Pickrell, Jimmy Z. Liu, and Tomaz Berisa; Génétique épidémiologique et structures des populations humaines (Inserm U535), Epidémiologie, sciences sociales, santé publique (IFR 69), Université Paris 1 Panthéon-Sorbonne (UP1)-Université Paris-Sud - Paris 11 (UP11)-École des hautes études en sciences sociales (EHESS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Université Paris Descartes - Paris 5 (UPD5)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Université Paris 1 Panthéon-Sorbonne (UP1)-Université Paris-Sud - Paris 11 (UP11)-École des hautes études en sciences sociales (EHESS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Université Paris Descartes - Paris 5 (UPD5)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM), Eco-Anthropologie et Ethnobiologie (EAE), Muséum national d'Histoire naturelle (MNHN)-Université Paris Diderot - Paris 7 (UPD7)-Centre National de la Recherche Scientifique (CNRS), 23andMe Inc.
- Subjects
0301 basic medicine ,Multifactorial Inheritance ,Genome-wide association study ,macromolecular substances ,[SDV.GEN] Life Sciences [q-bio]/Genetics ,Biology ,Polymorphism, Single Nucleotide ,Body Mass Index ,03 medical and health sciences ,Genetics ,Genetic Pleiotropy ,Humans ,Genetic Predisposition to Disease ,Triglycerides ,[SDV.GEN]Life Sciences [q-bio]/Genetics ,Extramural ,Interpretation (philosophy) ,Inflammatory Bowel Diseases ,Multiple traits ,Parkinson Disease ,Phenotype ,3. Good health ,030104 developmental biology ,Schizophrenia ,Genome-Wide Association Study - Abstract
We performed a scan for genetic variants associated with multiple phenotypes by comparing large genome-wide association studies (GWAS) of 42 traits or diseases. We identified 341 loci (at a false discovery rate of 10%) associated with multiple traits. Several loci are associated with multiple phenotypes; for example, a nonsynonymous variant in the zinc transporter SLC39A8 influences seven of the traits, including risk of schizophrenia (rs13107325: log-transformed odds ratio (log OR) = 0.15, P = 2 × 10⁻¹²) and Parkinson disease (log OR = -0.15, P = 1.6 × 10⁻⁷), among others. Second, we used these loci to identify traits that have multiple genetic causes in common. For example, variants associated with increased risk of schizophrenia also tended to be associated with increased risk of inflammatory bowel disease. Finally, we developed a method to identify pairs of traits that show evidence of a causal relationship. For example, we show evidence that increased body mass index causally increases triglyceride levels.
- Published
- 2016
- Full Text
- View/download PDF
24. Case-control association mapping without cases
- Author
-
Joseph K. Pickrell, Yaniv Erlich, and Jimmy Z. Liu
- Subjects
0303 health sciences ,Genome-wide association study ,Disease ,Biology ,medicine.disease ,Bioinformatics ,Biobank ,Coronary artery disease ,03 medical and health sciences ,0302 clinical medicine ,Cohort ,medicine ,Family history ,Association mapping ,030217 neurology & neurosurgery ,030304 developmental biology ,Genetic association - Abstract
The case-control association study is a powerful method for identifying genetic variants that influence disease risk. However, the collection of cases can be time-consuming and expensive; if a disease occurs late in life or is rapidly lethal, it may be more practical to identify family members of cases. Here, we show that replacing cases with their first-degree relatives enables genome-wide association studies by proxy (GWAX). In randomly-ascertained cohorts, this approach enables previously infeasible studies of diseases that are absent (or nearly absent) in the cohort. As an illustration, we performed GWAX of 12 common diseases in 116,196 individuals from the UK Biobank. By combining these results with published GWAS summary statistics in a meta-analysis, we replicated established risk loci and identified 17 newly associated risk loci: four in Alzheimer’s disease, eight in coronary artery disease, and five in type 2 diabetes. In addition to informing disease biology, our results demonstrate the utility of association mapping using family history of disease as a phenotype to be mapped. We anticipate that this approach will prove useful in future genetic studies of complex traits in large population cohorts.
- Published
- 2016
- Full Text
- View/download PDF
25. Case-control association mapping by proxy using family history of disease
- Author
-
Jimmy Z. Liu, Yaniv Erlich, and Joseph K. Pickrell
- Subjects
0301 basic medicine ,Risk ,Genome, Human ,Genome-wide association study ,Disease ,Coronary Artery Disease ,Biology ,Bioinformatics ,Biobank ,03 medical and health sciences ,030104 developmental biology ,Diabetes Mellitus, Type 2 ,Alzheimer Disease ,Meta-analysis ,Case-Control Studies ,Cohort ,Genetics ,Humans ,Genetic Predisposition to Disease ,Family history ,Association mapping ,Demography ,Genetic association ,Genome-Wide Association Study - Abstract
Collecting cases for case-control genetic association studies can be time-consuming and expensive. In some situations (such as studies of late-onset or rapidly lethal diseases), it may be more practical to identify family members of cases. In randomly ascertained cohorts, replacing cases with their first-degree relatives enables studies of diseases that are absent (or nearly absent) in the cohort. We refer to this approach as genome-wide association study by proxy (GWAX) and apply it to 12 common diseases in 116,196 individuals from the UK Biobank. Meta-analysis with published genome-wide association study summary statistics replicated established risk loci and yielded four newly associated loci for Alzheimer's disease, eight for coronary artery disease and five for type 2 diabetes. In addition to informing disease biology, our results demonstrate the utility of association mapping without directly observing cases. We anticipate that GWAX will prove useful in future genetic studies of complex traits in large population cohorts.
- Published
- 2016
26. DNase I sensitivity QTLs are a major determinant of human expression variation
- Author
-
Sherryl De Leon, Roger Pique-Regi, Yoav Gilad, Jean Baptiste Veyrieras, Joseph K. Pickrell, Jacob F. Degner, Matthew Stephens, Gregory E. Crawford, Jonathan K. Pritchard, Athma A. Pai, Daniel J. Gaffney, Katelyn Michelini, and Noah Lewellen
- Subjects
Quantitative Trait Loci; DNA Footprinting; Biology; Quantitative trait locus; Polymorphism, Single Nucleotide; Article; DNase-Seq; 03 medical and health sciences; 0302 clinical medicine; Genetic variation; Deoxyribonuclease I; Humans; Gene; 030304 developmental biology; Genetics; 0303 health sciences; Multidisciplinary; Genome, Human; Gene Expression Profiling; Genetic Variation; Sequence Analysis, DNA; Chromatin; DNA binding site; Gene expression profiling; Phenotype; Gene Expression Regulation; Expression quantitative trait loci; 030217 neurology & neurosurgery; Transcription Factors
- Abstract
In human lymphoblastoid cell lines, 8,902 loci were identified at which genetic variation is significantly associated with local DNase I sensitivity; these variants are responsible for a large fraction of expression quantitative trait loci. Expression quantitative trait loci (eQTLs) are stretches of DNA that regulate gene transcription and expression and contribute to a particular phenotypic trait. eQTL mapping is an important tool for linking genetic variation to changes in gene regulation, but identifying the causal variants underlying eQTLs and the regulatory mechanisms involved remains a challenge. Degner et al. used DNase I sequencing to measure genome-wide chromatin accessibility in 70 Yoruba lymphoblastoid cell lines to produce genome-wide maps of chromatin accessibility for each individual. They identify variants that they call DNase I sensitivity quantitative trait loci (dsQTLs). The implication is that changes in chromatin accessibility or transcription-factor binding occur at many gene loci and are likely to be important contributors to phenotypic variation. The mapping of expression quantitative trait loci (eQTLs) has emerged as an important tool for linking genetic variation to changes in gene regulation [1-5]. However, it remains difficult to identify the causal variants underlying eQTLs, and little is known about the regulatory mechanisms by which they act. Here we show that genetic variants that modify chromatin accessibility and transcription factor binding are a major mechanism through which genetic variation leads to gene expression differences among humans. We used DNase I sequencing to measure chromatin accessibility in 70 Yoruba lymphoblastoid cell lines, for which genome-wide genotypes and estimates of gene expression levels are also available [6-8]. We obtained a total of 2.7 billion uniquely mapped DNase I-sequencing (DNase-seq) reads, which allowed us to produce genome-wide maps of chromatin accessibility for each individual.
We identified 8,902 locations at which the DNase-seq read depth correlated significantly with genotype at a nearby single nucleotide polymorphism or insertion/deletion (false discovery rate = 10%). We call such variants ‘DNase I sensitivity quantitative trait loci’ (dsQTLs). We found that dsQTLs are strongly enriched within inferred transcription factor binding sites and are frequently associated with allele-specific changes in transcription factor binding. A substantial fraction (16%) of dsQTLs are also associated with variation in the expression levels of nearby genes (that is, these loci are also classified as eQTLs). Conversely, we estimate that as many as 55% of eQTL single nucleotide polymorphisms are also dsQTLs. Our observations indicate that dsQTLs are highly abundant in the human genome and are likely to be important contributors to phenotypic variation.
- Published
- 2012
27. A systematic survey of loss-of-function variants in human protein-coding genes
- Author
-
Klaudia Walter, Yali Xue, Jeffrey C. Barrett, Jennifer Harrow, Catherine E. Snow, Mark Gerstein, Ni Huang, Steven A. McCarroll, Jonathan K. Pritchard, Jeffrey A. Rosenfeld, Zhengdong D. Zhang, Hancheng Zheng, Menachem Fromer, Lukas Habegger, Yingrui Li, Mark A. DePristo, If H. A. Barnes, Bryndis Yngvadottir, James Morris, Alexandra Bignell, David Neil Cooper, Gerton Lunter, Ekta Khurana, Stephen B. Montgomery, Richard A. Gibbs, Donald F. Conrad, Emmanouil T. Dermitzakis, Daniel G. MacArthur, Suzannah Bumpstead, Gary Saunders, Kai Ye, Clara Amid, Marie-Marthe Suner, M. Kay, Joseph K. Pickrell, Adam Frankish, Robert E. Handsaker, Suganthi Balasubramanian, Eric Banks, Toby Hunt, Irene Gallego Romero, Cornelis A. Albers, Chris Tyler-Smith, Qasim Ayub, Denise Carvalho-Silva, Matthew E. Hurles, Min Hu, Luke Jostins, Jun Wang, Mike Jin, and Xinmeng Jasmine Mu
- Subjects
Candidate gene; Gene Expression; Biology; Genome; Polymorphism, Single Nucleotide; Article; Genomic disorders and inherited multi-system disorders DCN MP - Plasticity and memory [IGMD 3]; Gene Frequency; Genetic variation; Humans; ddc:576.5; Disease; Allele; Selection, Genetic; Gene; Loss function; Genetics; Multidisciplinary; Genome, Human; Genetic Variation; Proteins; Phenotype; Disease/genetics; Proteins/genetics; Human genome
- Abstract
Defective Gene Detective: Identifying genes that give rise to diseases is one of the major goals of sequencing human genomes. However, putative loss-of-function genes, which are often some of the first identified targets of genome and exome sequencing, have often turned out to be sequencing errors rather than true genetic variants. In order to identify the true scope of loss-of-function genes within the human genome, MacArthur et al. (p. 823; see the Perspective by Quintana-Murci) extensively validated the genomes from the 1000 Genomes Project, as well as an additional European individual, and found that the average person has about 100 true loss-of-function alleles of which approximately 20 have two copies within an individual. Because many known disease-causing genes were identified in “normal” individuals, the process of clinical sequencing needs to reassess how to identify likely causative alleles.
- Published
- 2012
28. The population genetics of human disease: The case of recessive, lethal mutations
- Author
-
Yuval B. Simons, Molly Przeworski, Carlos Eduardo G. Amorim, Joseph K. Pickrell, Zachary Baker, Ziyue Gao, Jose Francisco Diesel, and Imran S. Haque
- Subjects
0301 basic medicine; Cancer Research; Population genetics; Biochemistry; Geographical Locations; Database and Informatics Methods; Human disease; Gene Frequency; Effective population size; Genetics (clinical); Genetics; DNA methylation; Simulation and Modeling; Chromatin; Europe; Nucleic acids; Deletion Mutation; Mutation (genetic algorithm); Epigenetics; DNA modification; Chromatin modification; Research Article; Chromosome biology; Heterozygote; Cell biology; lcsh:QH426-470; Population Size; Genes, Recessive; Biology; Research and Analysis Methods; 03 medical and health sciences; Population Metrics; Humans; Selection, Genetic; Allele; Molecular Biology; Alleles; Ecology, Evolution, Behavior and Systematics; Models, Genetic; Population Biology; Genetic Diseases, Inborn; Correction; Biology and Life Sciences; DNA; lcsh:Genetics; Genetics, Population; Biological Databases; 030104 developmental biology; Genetic Loci; Mutation; People and Places; Mutation Databases; Genes, Lethal; Gene expression
- Abstract
Do the frequencies of disease mutations in human populations reflect a simple balance between mutation and purifying selection? What other factors shape the prevalence of disease mutations? To begin to answer these questions, we focused on one of the simplest cases: recessive mutations that alone cause lethal diseases or complete sterility. To this end, we generated a hand-curated set of 417 Mendelian mutations in 32 genes reported to cause a recessive, lethal Mendelian disease. We then considered analytic models of mutation-selection balance in infinite and finite populations of constant sizes and simulations of purifying selection in a more realistic demographic setting, and tested how well these models fit allele frequencies estimated from 33,370 individuals of European ancestry. In doing so, we distinguished between CpG transitions, which occur at a substantially elevated rate, and three other mutation types. Intriguingly, the observed frequency for CpG transitions is slightly higher than expectation but close, whereas the frequencies observed for the three other mutation types are an order of magnitude higher than expected, with a bigger deviation from expectation seen for less mutable types. This discrepancy is even larger when subtle fitness effects in heterozygotes or lethal compound heterozygotes are taken into account. In principle, higher than expected frequencies of disease mutations could be due to widespread errors in reporting causal variants, compensation by other mutations, or balancing selection. It is unclear why these factors would have a greater impact on disease mutations that occur at lower rates, however. 
We argue instead that the unexpectedly high frequency of disease mutations and the relationship to the mutation rate likely reflect an ascertainment bias: of all the mutations that cause recessive lethal diseases, those that by chance have reached higher frequencies are more likely to have been identified and thus to have been included in this study. Beyond the specific application, this study highlights the parameters likely to be important in shaping the frequencies of Mendelian disease alleles. Author summary: What determines the frequencies of disease mutations in human populations? To begin to answer this question, we focus on one of the simplest cases: mutations that cause completely recessive, lethal Mendelian diseases. We first review theory about what to expect from mutation and selection in a population of finite size and generate predictions based on simulations using a plausible demographic scenario of recent human evolution. For a highly mutable type of mutation, transitions at CpG sites, we find that the predictions are close to the observed frequencies of recessive lethal disease mutations. For less mutable types, however, predictions substantially under-estimate the observed frequency. We discuss possible explanations for the discrepancy and point to a complication that, to our knowledge, is not widely appreciated: that there exists ascertainment bias in disease mutation discovery. Specifically, we suggest that alleles that have been identified to date are likely the ones that by chance have reached higher frequencies and are thus more likely to have been mapped. More generally, our study highlights the factors that influence the frequencies of Mendelian disease alleles.
- Published
- 2018
29. A rod cell marker of nocturnal ancestry
- Author
-
Joseph K. Pickrell and George H. Perry
- Subjects
Opsin; genetic structures; Zoology; Nocturnal; Biology; Article; medicine.anatomical_structure; Evolutionary biology; Anthropology; Behavioral ecology; medicine; Cathemerality; Diurnality; Circadian rhythm; Rod cell; Retinal Rod Photoreceptor Cells; Ecology, Evolution, Behavior and Systematics
- Abstract
In a recent Cell article, Solovei et al. (2009) have shown that the rod cell nuclei of nocturnal and diurnal mammals (including primates) are organized in distinct patterns, and that the nocturnal-associated pattern likely facilitates efficient photon capture by the photoreceptors. Their research underscores the exceptional selective pressures placed on the visual system in low light environments and provides a new marker of nocturnal ancestry. This marker can be used to advance our understanding of activity pattern evolution, potentially including the behavioral ecology of ancestral primates.
- Published
- 2010
30. The Genetics of Human Adaptation: Hard Sweeps, Soft Sweeps, and Polygenic Adaptation
- Author
-
Joseph K. Pickrell, Jonathan K. Pritchard, and Graham Coop
- Subjects
Genetics; Multifactorial Inheritance; education.field_of_study; Natural selection; Agricultural and Biological Sciences(all); Biochemistry, Genetics and Molecular Biology(all); Positive selection; Population; Adaptation, Biological; Biology; Article; General Biochemistry, Genetics and Molecular Biology; Fixation (population genetics); Genetics, Population; Evolutionary biology; Humans; Selection, Genetic; General Agricultural and Biological Sciences; education
- Abstract
There has long been interest in understanding the genetic basis of human adaptation. To what extent are phenotypic differences among human populations driven by natural selection? With the recent arrival of large genome-wide data sets on human variation, there is now unprecedented opportunity for progress on this type of question. Several lines of evidence argue for an important role of positive selection in shaping human variation and differences among populations. These include studies of comparative morphology and physiology, as well as population genetic studies of candidate loci and genome-wide data. However, the data also suggest that it is unusual for strong selection to drive new mutations rapidly to fixation in particular populations (the ‘hard sweep’ model). We argue, instead, for alternatives to the hard sweep model: in particular, polygenic adaptation could allow rapid adaptation while not producing classical signatures of selective sweeps. We close by discussing some of the likely opportunities for progress in the field.
- Published
- 2010
31. Effect of read-mapping biases on detecting allele-specific expression from RNA-sequencing data
- Author
-
Jacob F. Degner, Joseph K. Pickrell, Yoav Gilad, Jonathan K. Pritchard, John C. Marioni, Athma A. Pai, and Everlyne Nkadori
- Subjects
Statistics and Probability; Sequence analysis; Single-nucleotide polymorphism; Biology; Polymorphism, Single Nucleotide; Biochemistry; Genome; Humans; Allele; International HapMap Project; Molecular Biology; Alleles; Whole genome sequencing; Genetics; Base Sequence; Genome, Human; Sequence Analysis, RNA; Gene Expression Profiling; Computational Biology; Genome Analysis; Original Papers; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Human genome; Software; Reference genome
- Abstract
Motivation: Next-generation sequencing has become an important tool for genome-wide quantification of DNA and RNA. However, a major technical hurdle lies in the need to map short sequence reads back to their correct locations in a reference genome. Here, we investigate the impact of SNP variation on the reliability of read-mapping in the context of detecting allele-specific expression (ASE). Results: We generated 16 million 35 bp reads from mRNA of each of two HapMap Yoruba individuals. When we mapped these reads to the human genome we found that, at heterozygous SNPs, there was a significant bias toward higher mapping rates of the allele in the reference sequence, compared with the alternative allele. Masking known SNP positions in the genome sequence eliminated the reference bias but, surprisingly, did not lead to more reliable results overall. We find that even after masking, ∼5–10% of SNPs still have an inherent bias toward more effective mapping of one allele. Filtering out inherently biased SNPs removes 40% of the top signals of ASE. The remaining SNPs showing ASE are enriched in genes previously known to harbor cis-regulatory variation or known to show uniparental imprinting. Our results have implications for a variety of applications involving detection of alternate alleles from short-read sequence data. Availability: Scripts, written in Perl and R, for simulating short reads, masking SNP variation in a reference genome and analyzing the simulation output are available upon request from JFD. Raw short read data were deposited in GEO (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE18156. Contact: jdegner@uchicago.edu; marioni@uchicago.edu; gilad@uchicago.edu; pritch@uchicago.edu Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2009
32. Genome-wide association study identifies 74 loci associated with educational attainment
- Author
-
K. Petrovic, Massimo Mangino, Daniele Cusi, Ozren Polasek, Rodney J. Scott, Yong Qian, Aysu Okbay, Jari Lahti, Bjarni Gunnarsson, George McMahon, Elizabeth G. Holliday, Thomas Meitinger, Frank J. A. van Rooij, Mika Kähönen, Martin Kroh, Ian J. Deary, Neil Pendleton, Pamela A. F. Madden, David J. Porteous, Lambertus A. Kiemeney, Sven Oskarsson, Edith Hofer, Robert F. Krueger, Olga Rostapshova, Georg Homuth, Paolo Gasparini, Aldo Rustichini, Sarah E. Medland, Christian Gieger, Veronique Vitart, Nicholas J. Timpson, George Dedoussis, Joseph K. Pickrell, Christopher Oldmeadow, Aldi T. Kraja, Johan G. Eriksson, Lydia Quaye, William G. Iacono, Danielle Posthuma, George Davey Smith, Karl-Oskar Lindgren, David C. Liewald, Pim van der Harst, Börge Schmidt, Christine Power, Francesco P. Cappuccio, Francesco Cucca, Simona Vaccargiu, Joyce Y. Tung, Aarno Palotie, Natalia Pervjakova, Jonas Bacelis, Jouke-Jan Hottenga, Helena Schmidt, Kari Stefansson, Tamara B. Harris, Momoko Horikoshi, Lude Franke, Wolfgang Hoffmann, Ingrid B. Borecki, William E R Ollier, Johannes Waage, Andreas J. Forstner, Caroline Hayward, Penelope A. Lind, Patricia A. Boyle, Kadri Kaasik, Jian Yang, Gerardus A. Meddens, Antti Latvala, John Attia, Pascal Timshel, Vilmundur Gudnason, Maël Lebreton, Valur Emilsson, James F. Wilson, Jonathan Marten, Ute Bültmann, Erika Salvi, Olli T. Raitakari, Peter M. Visscher, Niek Verweij, Elisabeth Steinhagen-Thiessen, Cristina Venturini, Lili Milani, Tessel E. Galesloot, Kevin Thom, Klaus Berger, Paul Lichtenstein, Tian Liu, Philipp Koellinger, Riccardo E. Marioni, Marjo-Riitta Järvelin, Clemens Baumbach, Unnur Thorsteinsdottir, Magnus Johannesson, Susan M. Ring, David A. Bennett, Anu Loukola, Hans-Jörgen Grabe, Jan A. Staessen, Igor Rudan, Ginevra Biino, Nicholas G. Martin, Jingyun Yang, Anna A. E. Vinkhuyzen, Katri Räikkönen, Zhihong Zhu, Gudmar Thorleifsson, Mary F. Feitosa, Ivana Kolcic, Alexander Teumer, Jaakko Kaprio, David Schlessinger, Katharina E. 
Schraut, Konstantin Strauch, Ilja Demuth, Albert V. Smith, Juergen Wellmann, Jennifer E. Huffman, Panos Deloukas, Mario Pirastu, Reedik Mägi, Maria Pina Concas, Jaime Derringer, Patrick J. F. Groenen, Henry Völzke, Wei Zhao, Abdel Abdellaoui, Andres Metspalu, Nicholas A. Furlotte, Christopher P. Nelson, Barbara Franke, Steven F. Lehrer, Patrick Turley, Tõnu Esko, Jun Ding, Pedro Marques-Vidal, S. Fleur W. Meddens, Zoltán Kutalik, Gonneke Willemsen, Andrew C. Heath, Michelle N. Meyer, James J. Lee, Roy Thurik, Antonietta Robino, Henning Tiemeier, Grant W. Montgomery, C. deLeeuw, Astanand Jugessur, Antti-Pekka Sarin, Veikko Salomaa, Dalton Conley, Tim D. Spector, Sebastian E. Baumeister, Gyda Bjornsdottir, Lavinia Paternoster, Tune H. Pers, Jacob Gratten, Martin D. Tobin, Daniel J. Benjamin, Douglas F. Levinson, Stavroula Kanoni, Elina Hyppönen, David R. Weir, Peter J. van der Most, Terho Lehtimäki, David A. Hinds, Pablo V. Gejman, Uwe Völker, Cornelia M. van Duijn, Karl-Heinz Jöckel, Bjarni V. Halldorsson, Markus Perola, Nicola Pirastu, Klaus Bønnelykke, Robert Karlsson, David Cesarini, Michael A. Province, Jianxin Shi, Najaf Amin, Dale R. Nyholt, Lenore J. Launer, Nilesh J. Samani, Sven J. van der Lee, Dorret I. Boomsma, Harry Campbell, Peter Vollenweider, Liisa Keltigangas-Jarvinen, David Laibson, Ronald de Vlaming, Lynne J. Hocking, Christopher F. Chabris, Blair H. Smith, Gail Davies, Niina Eklund, Ioanna P. Kalafati, Bo Jacobsson, Sheila Ulivi, Alan F. Wright, Sarah E. Harris, Mark Alan Fontana, Diego Vozzi, Tomi Mäki-Opas, Albert Hofman, Hans Bisgaard, Andrew Bakshi, Marika Kaakinen, Johannes H. Brandsma, Christa Meisinger, Ilaria Gandin, Tarunveer S. Ahluwalia, Jennifer A. Smith, Beate St Pourcain, Rico Rueedi, Lewin Eisele, Michael B. Miller, Brenda W.J.H. Penninx, Alan R. Sanders, Thorkild I. A. Sørensen, André G. Uitterlinden, Cornelius A. Rietveld, Peter Lichtner, Dragana Vuckovic, Giorgia Girotto, Behrooz Z. Alizadeh, Reinhold Schmidt, Raymond A. 
Poot, Judith M. Vonk, Antony Payton, Wouter J. Peyrot, Augustine Kong, Y. Milaneschi, Jessica D. Faul, Patrik K. E. Magnusson, Antonio Terracciano, David M. Evans, Sharon L.R. Kardia, Peter K. Joshi, Michael A. Horan, Matt McGue, Richa Gupta, Jonathan P. Beauchamp, Peter Eibich, Erin B. Ware, Lars Bertram, Philip L. De Jager, Nancy L. Pedersen, Ronny Myhre, Guo-Bo Chen, Harm-Jan Westra, Jan-Emmanuel De Neve, Evelin Mihailov, Leanne M. Hall, Seppo Koskinen, Rush University Medical Center [Chicago], Department of Epidemiology [Rotterdam], Erasmus University Medical Center [Rotterdam] (Erasmus MC), Helmholtz-Zentrum München (HZM), University of Queensland [Brisbane], Erasmus University Rotterdam, Universidad de Navarra [Pamplona] (UNAV), National Institute for Health and Welfare [Helsinki], Institute for Molecular Medicine Finland [Helsinki] (FIMM), Helsinki Institute of Life Science (HiLIFE), Helsingin yliopisto = Helsingfors universitet = University of Helsinki-Helsingin yliopisto = Helsingfors universitet = University of Helsinki, Consiglio Nazionale delle Ricerche (CNR), Montpellier Research in Management (MRM), Université Paul-Valéry - Montpellier 3 (UPVM)-Université de Perpignan Via Domitia (UPVD)-Groupe Sup de Co Montpellier (GSCM) - Montpellier Business School-Université de Montpellier (UM), University of Bristol [Bristol], Queensland Institute of Medical Research, Massachusetts General Hospital [Boston], Medical University Graz, Institut des Sciences Moléculaires (ISM), Université Montesquieu - Bordeaux 4-Université Sciences et Technologies - Bordeaux 1-École Nationale Supérieure de Chimie et de Physique de Bordeaux (ENSCPB)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS), King‘s College London, Tampere University Hospital, University of Turku, AP-HP Hôpital universitaire Robert-Debré [Paris], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP), University of Edinburgh, Charité - UniversitätsMedizin = Charité - 
University Hospital [Berlin], Imperial College London, Reykjavík University, Donders Institute for Brain, Cognition and Behaviour, Radboud university [Nijmegen], Karolinska Institutet [Stockholm], Department of Health Sciences [Leicester], University of Leicester, Broad Institute of MIT and Harvard (BROAD INSTITUTE), Harvard Medical School [Boston] (HMS)-Massachusetts Institute of Technology (MIT)-Massachusetts General Hospital [Boston], Department of Medical Epidemiology and Biostatistics (MEB), Dpt of Pharmacology and Personalised Medicine [Maastricht], Maastricht University [Maastricht], Florida State University [Tallahassee] (FSU), IT University of Copenhagen, University of Helsinki-University of Helsinki, Université Montpellier 1 (UM1)-Groupe Sup de Co Montpellier (GSCM) - Montpellier Business School-Université Paul-Valéry - Montpellier 3 (UPVM)-Université de Montpellier (UM)-Université Montpellier 2 - Sciences et Techniques (UM2)-Université de Perpignan Via Domitia (UPVD), Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure de Chimie et de Physique de Bordeaux (ENSCPB)-Université Sciences et Technologies - Bordeaux 1-Université Montesquieu - Bordeaux 4-Institut de Chimie du CNRS (INC), Faculteit Economie en Bedrijfskunde, Microeconomics (ASE, FEB), LifeLines Cohort Study, Alizadeh, BZ., de Boer, RA., Boezen, HM., Bruinenberg, M., Franke, L., van der Harst, P., Hillege, HL., van der Klauw, MM., Navis, G., Ormel, J., Postma, DS., Rosmalen, JG., Slaets, JP., Snieder, H., Stolk, RP., Wolffenbuttel, BH., Wijmenga, C., Applied Economics, Cell biology, Epidemiology, Erasmus MC other, Econometrics, Child and Adolescent Psychiatry / Psychology, Psychiatry, Internal Medicine, EMGO+ - Mental Health, Complex Trait Genetics, Biological Psychology, Functional Genomics, Economics, Amsterdam Neuroscience - Complex Trait Genetics, Groningen Institute for Gastro Intestinal Genetics and Immunology (3GI), Groningen Research Institute for Asthma and 
COPD (GRIAC), Public Health Research (PHR), Cardiovascular Centre (CVC), Life Course Epidemiology (LCE), Stem Cell Aging Leukemia and Lymphoma (SALL), Real World Studies in PharmacoEpidemiology, -Genetics, -Economics and -Therapy (PEGET), EMGO - Mental health, IOO, and Human genetics
- Subjects
0301 basic medicine; Netherlands Twin Register (NTR); Candidate gene; Bipolar Disorder; Medicine; Genome-wide association study; Genome-wide association studies; 0302 clinical medicine; Cognition; Alzheimer Disease; Brain; Computational Biology; Fetus; Gene Expression Regulation; Gene-Environment Interaction; Great Britain; Humans; Molecular Sequence Annotation; Polymorphism, Single Nucleotide; Schizophrenia; Educational Status; Genome-Wide Association Study; Medicine (all); Multidisciplinary; tau; Gene–environment interaction; Social affairs and health; Genetics; [QFIN]Quantitative Finance [q-fin]; HERITABILITY; General Commentary; Alzheimer's disease; Biobank; Phenotype; Multidisciplinary Sciences; Urological cancers Radboud Institute for Health Sciences [Radboudumc 15]; educational attainment; Behavioural genetics; Science & Technology - Other Topics; Educational level; TRAITS; Human; General Science & Technology; Cultural sector; SNP; ta3111; Learning and memory; Alzheimer Disease/genetics; Bipolar Disorder/genetics; Brain/metabolism; Fetus/metabolism; Gene Expression Regulation/genetics; Polymorphism, Single Nucleotide/genetics; Schizophrenia/genetics; 03 medical and health sciences; ACHIEVEMENT; MD Multidisciplinary; Non-profit sector; QH426; Science & Technology; Neurodevelopmental disorders Donders Center for Medical Neuroscience [Radboudumc 7]; tauopathies; Data Science; gene; education; school; Heritability; Educational attainment; United Kingdom; 030104 developmental biology; IQ; Genetic research; Psychiatric disorders; Education; 030217 neurology & neurosurgery; LifeLines Cohort Study; Neuroscience
- Abstract
Educational attainment is strongly influenced by social and other environmental factors, but genetic factors are estimated to account for at least 20% of the variation across individuals. Here we report the results of a genome-wide association study (GWAS) for educational attainment that extends our earlier discovery sample of 101,069 individuals to 293,723 individuals, and a replication study in an independent sample of 111,349 individuals from the UK Biobank. We identify 74 genome-wide significant loci associated with the number of years of schooling completed. Single-nucleotide polymorphisms associated with educational attainment are disproportionately found in genomic regions regulating gene expression in the fetal brain. Candidate genes are preferentially expressed in neural tissue, especially during the prenatal period, and enriched for biological pathways involved in neural development. Our findings demonstrate that, even for a behavioural phenotype that is mostly environmentally determined, a well-powered GWAS identifies replicable associated genetic variants that suggest biologically relevant pathways. Because educational attainment is measured in large numbers of individuals, it will continue to be useful as a proxy phenotype in efforts to characterize the genetic influences of related phenotypes, including cognition and neuropsychiatric diseases.
- Published
- 2016
33. Genetic variants associated with subjective well-being, depressive symptoms, and neuroticism identified through genome-wide analyses
- Author
-
Sharon L.R. Kardia, Melinda Mills, Ian J. Deary, Sven Bergmann, Magnus Johannesson, Igor Rudan, Nicholas J. Timpson, Richa Gupta, Hester M. den Ruijter, Rico Rueedi, Lars Bertram, Juliette Harris, Jennifer A. Smith, Eero Kajantie, Jian Yang, David C. Liewald, Ronald de Vlaming, Anu Realo, Jessica D. Faul, Patrik K. E. Magnusson, Jadwiga Buchwald, Philip L. De Jager, Rajesh Rawal, Marisa Koini, K. Petrovic, Unnur Thorsteinsdottir, Pedro Marques-Vidal, Roy Thurik, Evie Stergiakouli, Ivana Kolcic, William G. Iacono, Laura D. Kubzansky, Jaakko Kaprio, Behrooz Z. Alizadeh, Samantha Cherney, Reinhold Schmidt, Vilmundur Gudnason, Nicholas A. Furlotte, Nicholas G. Martin, Delilah Zabaneh, Rebecca T. Emeny, Eco J. C. de Geus, Klaus Berger, Lude Franke, Sven J. van der Lee, S. Fleur W. Meddens, Bart M. L. Baselmans, Dalton Conley, Jouke-Jan Hottenga, David A. Bennett, Saskia Haitjema, Jun Ding, Elina Hyppönen, Kari Stefansson, Henning Tiemeier, Grant W. Montgomery, Michelle N. Meyer, James J. Lee, Aysu Okbay, Lei Yu, Gyda Bjornsdottir, Marie Henrike Geisel, Albertine J. Oldehinkel, Liisa Keltikangas-Järvinen, George Davey Smith, Harold Snieder, Juho Wedenoja, Frits R. Rosendaal, Michel G. Nivard, Simon R. Cox, Christine Power, Dorret I. Boomsma, Börge Schmidt, David A. Hinds, Philipp Koellinger, Harm-Jan Westra, Terho Lehtimäki, Alison Pattie, Jan-Emmanuel De Neve, Torben Hansen, John M. Starr, Albert V. Smith, Antonio Terracciano, Juergen Wellmann, Albert Hofman, Katri Räikkönen, Patrick Turley, Andrew Steptoe, Jimmy Z. Liu, Evelin Mihailov, Jari Lahti, Marika Kaakinen, David R. Weir, Cornelia M. van Duijn, Maciej Trzaskowski, David Cesarini, Najaf Amin, Lydia Quaye, Sara M. Willems, Ghazaleh Fatemifar, Christopher Oldmeadow, Ville Karhunen, Alana Cavadino, Juan R. González, Caroline Hayward, Penelope A. Lind, Tamara B. Harris, Jaime Derringer, Camelia C. Minică, Jacob Gratten, Patrick J. F. 
Groenen, Ozren Polasek, Anu Loukola, Peter Vollenweider, Christian Gieger, Rodney J. Scott, Lindsay K. Matteson, Meike Bartels, Andrew Bakshi, David Laibson, Andrew C. Heath, Daniel J. Benjamin, Stephen S. Rich, Claire M. A. Haworth, C.A. Hartman, Angelina R. Sutin, Tian Liu, Johan G. Eriksson, Robert Plomin, Francesco Cucca, John Attia, Tarunveer S. Ahluwalia, Andreas J. Forstner, Gudmar Thorleifsson, Rauli Svento, Ute Bültmann, Olli T. Raitakari, Karl-Heinz Jöckel, Mark Alan Fontana, Oliver S. P. Davis, Yong Qian, Anna A. E. Vinkhuyzen, Wei Zhao, Elizabeth G. Holliday, Andres Metspalu, Alexis C. Frazier-Wood, Meena Kumari, Nese Direk, Miriam A. Mosing, Ilja M. Nolte, Sander W. van der Laan, Matt McGue, Tim D. Spector, Marjo-Riitta Järvelin, Peter Eibich, Nancy L. Pedersen, David Schlessinger, Toshiko Tanaka, Dennis O. Mook-Kanamori, Luigi Ferrucci, Gerard Pasterkamp, Jonathan P. Beauchamp, Robert Karlsson, Lenore J. Launer, Helena Schmidt, Renée de Mutsert, Gert G. Wagner, Peter Kraft, Joseph K. Pickrell, Thorkild I. A. Sørensen, André G. Uitterlinden, Patricia A. Boyle, Cornelius A. Rietveld, Robert F. Krueger, Karl-Heinz Ladwig, Michael B. Miller, Sarah E. Medland, Tõnu Esko, Kent D. Taylor, Lavinia Paternoster, Peter J. 
van der Most, Richard Karlsson Linnér, Laura Pulkki-Råback, Gail Davies, Victoria Garfield, Shun-Chiao Chang, Ruifang Li-Gao, Applied Economics, Pathology, Epidemiology, Public Health, Child and Adolescent Psychiatry / Psychology, Psychiatry, Internal Medicine, Ophthalmology, Social Science Genetic Association Consortium (SSGAC), CHARGE Consortium, Okbay, Aysu, Baselmans, Bart ML, De Neve, Jan-Emmanuel, Turley, Patrick, Hyppönen, Elina, Cesarini, David, EMGO+ - Mental Health, Amsterdam Neuroscience - Mood, Anxiety, Psychosis, Stress & Sleep, Complex Trait Genetics, Biological Psychology, Functional Genomics, Economics, Amsterdam Neuroscience - Complex Trait Genetics, Queensland Institute of Medical Research, Massachusetts General Hospital [Boston], University of Queensland [Brisbane], Center for Research in Environmental Epidemiology (CREAL), Universitat Pompeu Fabra [Barcelona] (UPF)-Catalunya ministerio de salud, Universitat Pompeu Fabra [Barcelona] (UPF), CIBER de Epidemiología y Salud Pública (CIBERESP), Institut des Sciences Moléculaires (ISM), Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure de Chimie et de Physique de Bordeaux (ENSCPB)-Université Sciences et Technologies - Bordeaux 1-Université Montesquieu - Bordeaux 4-Institut de Chimie du CNRS (INC), Tampere University Hospital, University of Turku, King‘s College London, University of Helsinki, AP-HP Hôpital universitaire Robert-Debré [Paris], Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP), Université de Lausanne (UNIL), Birmingham City University (BCU), Helmholtz-Zentrum München (HZM), Department of Epidemiology and Public Health, University College of London [London] (UCL), Donders Institute for Brain, Cognition and Behaviour, Radboud university [Nijmegen], University of Edinburgh, Florida State University [Tallahassee] (FSU), Wuhan University [China], Rush University Medical Center [Chicago], Department of Epidemiology [Rotterdam], Erasmus University Medical 
Center [Rotterdam] (Erasmus MC), Erasmus University Rotterdam, National Institute for Health and Welfare [Helsinki], Department of Medical Epidemiology and Biostatistics (MEB), Karolinska Institutet [Stockholm], Dpt of Pharmacology and Personalised Medicine [Maastricht], Maastricht University [Maastricht], Montpellier Research in Management (MRM), Université Montpellier 1 (UM1)-Groupe Sup de Co Montpellier (GSCM) - Montpellier Business School-Université Paul-Valéry - Montpellier 3 (UPVM)-Université de Montpellier (UM)-Université Montpellier 2 - Sciences et Techniques (UM2)-Université de Perpignan Via Domitia (UPVD), Université Montesquieu - Bordeaux 4-Université Sciences et Technologies - Bordeaux 1-École Nationale Supérieure de Chimie et de Physique de Bordeaux (ENSCPB)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS), Helsingin yliopisto = Helsingfors universitet = University of Helsinki, Université Paul-Valéry - Montpellier 3 (UPVM)-Université de Perpignan Via Domitia (UPVD)-Groupe Sup de Co Montpellier (GSCM) - Montpellier Business School-Université de Montpellier (UM), Life Course Epidemiology (LCE), Groningen Institute for Gastro Intestinal Genetics and Immunology (3GI), Public Health Research (PHR), Sociology/ICS, Interdisciplinary Centre Psychopathology and Emotion regulation (ICPE), Stem Cell Aging Leukemia and Lymphoma (SALL), Real World Studies in PharmacoEpidemiology, -Genetics, -Economics and -Therapy (PEGET), and LifeLines Cohort Study
- Subjects
0301 basic medicine ,Netherlands Twin Register (NTR) ,Medizin ,Genome-wide association study ,Disease ,Genome-wide association studies ,DISEASE ,0302 clinical medicine ,well-being ,neuroticism ,Anxiety Disorders/genetics ,Bayes Theorem ,Depression/genetics ,Genome-Wide Association Study ,Humans ,Phenotype ,Polymorphism, Single Nucleotide ,Genetics & Heredity ,RISK ,Genetics ,PERSONALITY ,[QFIN]Quantitative Finance [q-fin] ,Depression ,HERITABILITY ,COMMON VARIANTS ,11 Medical And Health Sciences ,Anxiety Disorders ,Neuroticism ,depression ,Behavioural genetics ,Life Sciences & Biomedicine ,Biology ,personality ,genetic ,epidemiology ,ta3111 ,Article ,03 medical and health sciences ,SDG 3 - Good Health and Well-being ,RESOURCE ,Journal Article ,Human height ,Subjective well-being ,behavioural genetics ,METAANALYSIS ,Genetic association ,HAPPINESS ,Science & Technology ,MAJOR DEPRESSION ,06 Biological Sciences ,Heritability ,030104 developmental biology ,genome-wide association studies ,HUMAN HEIGHT ,030217 neurology & neurosurgery ,LifeLines Cohort Study ,Developmental Biology ,genome-wide analysis - Abstract
Very few genetic variants have been associated with depression and neuroticism, likely because of limitations on sample size in previous studies. Subjective well-being, a phenotype that is genetically correlated with both of these traits, has not yet been studied with genome-wide data. We conducted genome-wide association studies of three phenotypes: subjective well-being (n = 298,420), depressive symptoms (n = 161,460), and neuroticism (n = 170,911). We identify 3 variants associated with subjective well-being, 2 variants associated with depressive symptoms, and 11 variants associated with neuroticism, including 2 inversion polymorphisms. The two loci associated with depressive symptoms replicate in an independent depression sample. Joint analyses that exploit the high genetic correlations between the phenotypes (|ρ̂| ≈ 0.8) strengthen the overall credibility of the findings and allow us to identify additional variants. Across our phenotypes, loci regulating expression in central nervous system and adrenal or pancreas tissues are strongly enriched for association.
- Published
- 2016
34. Genetic Associations with Subjective Well-Being Also Implicate Depression and Neuroticism
- Author
-
Tõnu Esko, Richard Karlsson Linnér, David Laibson, Michelle N. Meyer, James J. Lee, Bart M. L. Baselmans, Michel G. Nivard, George Davey Smith, Daniel J. Benjamin, Jimmy Z. Liu, Mark Alan Fontana, Magnus Johannesson, Cornelius A. Rietveld, Jan-Emmanuel De Neve, Fleur Meddens, David Cesarini, Joseph K. Pickrell, Patrick Turley, Jonathan P. Beauchamp, Albert Hofman, Robert F. Krueger, Sarah E. Medland, Jacob Gratten, Ronald de Vlaming, Dalton Conley, Aysu Okbay, Meike Bartels, Philipp Koellinger, and Jaime Derringer
- Subjects
Genetics ,0303 health sciences ,Single-nucleotide polymorphism ,Biology ,Neuroticism ,Genetic correlation ,Phenotype ,03 medical and health sciences ,0302 clinical medicine ,Subjective well-being ,030217 neurology & neurosurgery ,Depression (differential diagnoses) ,Depressive symptoms ,030304 developmental biology - Abstract
We conducted a genome-wide association study of subjective well-being (SWB) in 298,420 individuals. We also performed auxiliary analyses of depressive symptoms ("DS"; N = 161,460) and neuroticism (N = 170,910), both of which have a substantial genetic correlation with SWB (ρ ≈ −0.8). We identify three SNPs associated with SWB at genome-wide significance. Two of them are significantly associated with DS in an independent sample. In our auxiliary analyses, we identify 13 additional genome-wide-significant associations: two with DS and eleven with neuroticism, including two inversion polymorphisms. Across our phenotypes, loci regulating expression in central nervous system and adrenal/pancreas tissues are enriched. The discovery of genetic loci associated with the three phenotypes we study has proven elusive; our findings illustrate the payoffs from studying them jointly.
- Published
- 2015
35. Detection and interpretation of shared genetic influences on 40 human traits
- Author
-
Laure Ségurel, Joseph K. Pickrell, Tomaz Berisa, Joyce Y. Tung, and David A. Hinds
- Subjects
2. Zero hunger ,Genetics ,0303 health sciences ,Ear infection ,Genome-wide association study ,Disease ,Type 2 diabetes ,Biology ,medicine.disease ,Phenotype ,3. Good health ,03 medical and health sciences ,0302 clinical medicine ,medicine ,Male-pattern baldness ,030212 general & internal medicine ,Body mass index ,030304 developmental biology ,Genetic association - Abstract
We performed a genome-wide scan for genetic variants that influence multiple human phenotypes by comparing large genome-wide association studies (GWAS) of 40 traits or diseases, including anthropometric traits (e.g. nose size and male pattern baldness), immune traits (e.g. susceptibility to childhood ear infections and Crohn's disease), metabolic phenotypes (e.g. type 2 diabetes and lipid levels), and psychiatric diseases (e.g. schizophrenia and Parkinson's disease). First, we identified 307 loci (at a false discovery rate of 10%) that influence multiple traits (excluding “trivial” phenotype pairs like type 2 diabetes and fasting glucose). Several loci influence a large number of phenotypes; for example, variants near the blood group gene ABO influence eleven of these traits, including risk of childhood ear infections (rs635634: log-odds ratio = 0.06, P = 1.4 × 10⁻⁸) and allergies (log-odds ratio = 0.05, P = 2.5 × 10⁻⁸), among others. Similarly, a nonsynonymous variant in the zinc transporter SLC39A8 influences seven of these traits, including risk of schizophrenia (rs13107325: log-odds ratio = 0.15, P = 2 × 10⁻¹²) and Parkinson’s disease (log-odds ratio = −0.15, P = 1.6 × 10⁻⁷), among others. Second, we used these loci to identify traits that share multiple genetic causes in common. For example, genetic variants that delay age of menarche in women also, on average, delay age of voice drop in men, decrease body mass index (BMI), increase adult height, and decrease risk of male pattern baldness. Finally, we identified four pairs of traits that show evidence of a causal relationship. For example, we show evidence that increased BMI causally increases triglyceride levels, and that increased liability to hypothyroidism causally decreases adult height.
- Published
- 2015
36. Eight thousand years of natural selection in Europe
- Author
-
Harald Meller, Bastien Llamas, Johannes Krause, José María Bermúdez de Castro, Jacob Roodenberg, David Reich, Fokke Gerritsen, Kristin Stewardson, Pavel Kuznetsov, Joseph K. Pickrell, Mario Novak, Eppie R. Jones, David W. Anthony, Manuel Ángel Rojo Guerra, Songül Alpaslan Roodenberg, Eadaoin Harney, Josep Maria Vergès, Iain Mathieson, Oleg Mochalov, Kurt W. Alt, Vyacheslav Moiseyev, Ron Pinhasi, Carles Lalueza-Fox, Eudald Carbonell, Marina Lozano, Nadin Rohland, Cristina Gamba, Dorcas Brown, Daniel Fernandes, Swapan Mallick, Wolfgang Haak, Kendra Sirak, Stanislav Dryomov, Aleksandr Khokhlov, Iosif Lazaridis, Alan Cooper, Juan Luis Arsuaga, and Nick Patterson
- Subjects
Genetics ,0303 health sciences ,education.field_of_study ,Natural selection ,Population ,Biology ,Indirect evidence ,03 medical and health sciences ,0302 clinical medicine ,Ancient DNA ,Evolutionary biology ,Genetic variation ,Adaptation ,education ,Social organization ,030217 neurology & neurosurgery ,Selection (genetic algorithm) ,030304 developmental biology - Abstract
The arrival of farming in Europe around 8,500 years ago necessitated adaptation to new environments, pathogens, diets, and social organizations. While indirect evidence of adaptation can be detected in patterns of genetic variation in present-day people, ancient DNA makes it possible to witness selection directly by analyzing samples from populations before, during and after adaptation events. Here we report the first genome-wide scan for selection using ancient DNA, capitalizing on the largest genome-wide dataset yet assembled: 230 West Eurasians dating to between 6500 and 1000 BCE, including 163 with newly reported data. The new samples include the first genome-wide data from the Anatolian Neolithic culture, who we show were members of the population that was the source of Europe's first farmers, and whose genetic material we extracted by focusing on the DNA-rich petrous bone. We identify genome-wide significant signatures of selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height.
- Published
- 2015
37. Genome-wide patterns of selection in 230 ancient Eurasians
- Author
-
Jacob Roodenberg, Swapan Mallick, Dorcas Brown, Ron Pinhasi, Juan Luis Arsuaga, Alan Cooper, Nick Patterson, Songül Alpaslan Roodenberg, Eadaoin Harney, Johannes Krause, Iain Mathieson, Manuel Ángel Rojo Guerra, Iosif Lazaridis, Oleg Mochalov, Wolfgang Haak, Eudald Carbonell, David W. Anthony, Cristina Gamba, Vyacheslav Moiseyev, Josep Maria Vergès, Marina Lozano, Carles Lalueza-Fox, José María Bermúdez de Castro, Kendra Sirak, Aleksandr Khokhlov, Joseph K. Pickrell, Fokke Gerritsen, Bastien Llamas, Kurt W. Alt, Mario Novak, Kristin Stewardson, Nadin Rohland, Harald Meller, Daniel Fernandes, Pavel Kuznetsov, Eppie R. Jones, Stanislav Dryomov, David Reich, Irish Research Council for the Humanities and Social Sciences, Netherlands Organization for Scientific Research, Russian Foundation for Basic Research, Ministry of Education and Science of the Russian Federation, German Research Foundation, European Research Council, Ministerio de Economía y Competitividad (España), Australian Research Council, National Science Foundation (US), National Institutes of Health (US), Howard Hughes Medical Institute, Art and Culture, History, Antiquity, and CLUE+
- Subjects
Male ,Multifactorial Inheritance ,Archaeogenetics ,Asia ,Population ,Biology ,Genome ,Article ,Bone and Bones ,03 medical and health sciences ,0302 clinical medicine ,Humans ,Selection, Genetic ,education ,History, Ancient ,Selection (genetic algorithm) ,030304 developmental biology ,Genetics ,0303 health sciences ,education.field_of_study ,Multidisciplinary ,Natural selection ,ancient DNA ,prehistory ,Eurasia ,natural selection ,Genome, Human ,Pigmentation ,Immunity ,Agriculture ,DNA ,Sequence Analysis, DNA ,15. Life on land ,Body Height ,Diet ,3. Good health ,Europe ,Genetics, Population ,Ancient DNA ,Haplotypes ,Evolutionary biology ,Human genome ,Adaptation ,030217 neurology & neurosurgery - Abstract
Ancient DNA makes it possible to observe natural selection directly by analysing samples from populations before, during and after adaptation events. Here we report a genome-wide scan for selection using ancient DNA, capitalizing on the largest ancient DNA data set yet assembled: 230 West Eurasians who lived between 6500 and 300 BC, including 163 with newly reported data. The new samples include, to our knowledge, the first genome-wide ancient DNA from Anatolian Neolithic farmers, whose genetic material we obtained by extracting from petrous bones, and who we show were members of the population that was the source of Europe’s first farmers. We also report a transect of the steppe region in Samara between 5600 and 300 BC, which allows us to identify admixture into the steppe from at least two external sources. We detect selection at loci associated with diet, pigmentation and immunity, and two independent episodes of selection on height. I.M. was supported by the Human Frontier Science Program LT001095/2014-L. C.G. was supported by the Irish Research Council for Humanities and Social Sciences (IRCHSS). F.G. was supported by a grant of the Netherlands Organization for Scientific Research, no. 380-62-005. A.K., P.K. and O.M. were supported by RFBR no. 15-06-01916 and RFH no. 15-11-63008 and O.M. by a state grant of the Ministry of Education and Science of the Russian Federation no. 33.1195.2014/k. J.K. was supported by ERC starting grant APGREID and DFG grant KR 4015/1-1. K.W.A. was supported by DFG grant AL 287 / 14-1. C.L.-F. was supported by a BFU2015-64699-P grant from the Spanish government. W.H. and B.L. were supported by Australian Research Council DP130102158. R.P. was supported by ERC starting grant ADNABIOARC (263441), and an Irish Research Council ERC support grant. D.R. was supported by US National Science Foundation HOMINID grant BCS-1032255, US National Institutes of Health grant GM100233, and the Howard Hughes Medical Institute.
- Published
- 2015
38. Identifying genetic variants that affect viability in large cohorts
- Author
-
Molly Przeworski, John R. B. Perry [0000-0001-6483-3771], Tomaz Berisa, Hakhamanesh Mostafavi, Joseph K. Pickrell, and Felix R. Day [0000-0003-3789-7651]
- Subjects
Male ,0301 basic medicine ,Heredity ,Physiology ,Epidemiology ,Vascular Medicine ,Cohort Studies ,Families ,Fathers ,Endocrinology ,0302 clinical medicine ,Gene Frequency ,APOE*4 Allele ,Medicine and Health Sciences ,Coronary Heart Disease ,Biology (General) ,Genetics ,2. Zero hunger ,0303 health sciences ,Natural selection ,General Neuroscience ,Biobank ,3. Good health ,Genetic Mapping ,Genetic Epidemiology ,Cohort ,Female ,General Agricultural and Biological Sciences ,Research Article ,Common disease-common variant ,QH301-705.5 ,Cardiology ,Mothers ,Variant Genotypes ,Single-nucleotide polymorphism ,Biology ,General Biochemistry, Genetics and Molecular Biology ,Evolution, Molecular ,03 medical and health sciences ,Genetic variation ,Humans ,Selection, Genetic ,Allele ,Allele frequency ,Alleles ,030304 developmental biology ,Endocrine Physiology ,Models, Genetic ,General Immunology and Microbiology ,Human evolutionary genetics ,Puberty ,Biology and Life Sciences ,Genetic Variation ,Genetic architecture ,Genetics, Population ,030104 developmental biology ,Genetic epidemiology ,Genetic Loci ,People and Places ,Population Groupings ,Genetic Fitness ,030217 neurology & neurosurgery ,Demography - Abstract
A number of open questions in human evolutionary genetics would become tractable if we were able to directly measure evolutionary fitness. As a step towards this goal, we developed a method to examine whether individual genetic variants, or sets of genetic variants, currently influence viability. The approach consists in testing whether the frequency of an allele varies across ages, accounting for variation in ancestry. We applied it to the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort and to the parents of participants in the UK Biobank. Across the genome, we found only a few common variants with large effects on age-specific mortality: tagging the APOE ε4 allele and near CHRNA3. These results suggest that when large, even late-onset effects are kept at low frequency by purifying selection. Testing viability effects of sets of genetic variants that jointly influence 1 of 42 traits, we detected a number of strong signals. In participants of the UK Biobank of British ancestry, we found that variants that delay puberty timing are associated with a longer parental life span (P~6.2 × 10⁻⁶ for fathers and P~2.0 × 10⁻³ for mothers), consistent with epidemiological studies. Similarly, variants associated with later age at first birth are associated with a longer maternal life span (P~1.4 × 10⁻³). Signals are also observed for variants influencing cholesterol levels, risk of coronary artery disease (CAD), body mass index, as well as risk of asthma. These signals exhibit consistent effects in the GERA cohort and among participants of the UK Biobank of non-British ancestry. We also found marked differences between males and females, most notably at the CHRNA3 locus, and variants associated with risk of CAD and cholesterol levels. Beyond our findings, the analysis serves as a proof of principle for how upcoming biomedical data sets can be used to learn about selection effects in contemporary humans.
Author summary: Our global understanding of adaptation in humans is limited to indirect statistical inferences from patterns of genetic variation, which are sensitive to past selection pressures. We introduced a method that allowed us to directly observe ongoing selection in humans by identifying genetic variants that affect survival to a given age (i.e., viability selection). We applied our approach to the GERA cohort and parents of the UK Biobank participants. We found viability effects of variants near the APOE and CHRNA3 genes, which are associated with the risk of Alzheimer disease and smoking behavior, respectively. We also tested for the joint effect of sets of genetic variants that influence quantitative traits. We uncovered an association between longer life span and genetic variants that delay puberty timing and age at first birth. We also detected detrimental effects of higher genetically predicted cholesterol levels, body mass index, risk of coronary artery disease (CAD), and risk of asthma on survival. Some of the observed effects differ between males and females, most notably those at the CHRNA3 gene and variants associated with risk of CAD and cholesterol levels. Beyond this application, our analysis shows how large biomedical data sets can be used to study natural selection in humans.
- Published
- 2017
39. False positive peaks in ChIP-seq and other sequencing-based functional assays caused by unannotated high copy number regions
- Author
-
Yoav Gilad, Joseph K. Pickrell, Jonathan K. Pritchard, and Daniel J. Gaffney
- Subjects
Statistics and Probability ,Chromatin Immunoprecipitation ,Gene Dosage ,Hybrid genome assembly ,Computational biology ,Biology ,Biochemistry ,Genome ,DNase-Seq ,Deep sequencing ,03 medical and health sciences ,0302 clinical medicine ,Humans ,natural sciences ,1000 Genomes Project ,Molecular Biology ,030304 developmental biology ,Genetics ,0303 health sciences ,Base Sequence ,Genome, Human ,Computational Biology ,High-Throughput Nucleotide Sequencing ,Genome project ,Genome Analysis ,Computer Science Applications ,Applications Note ,Computational Mathematics ,Computational Theory and Mathematics ,Human genome ,Sequence Analysis ,030217 neurology & neurosurgery ,Reference genome - Abstract
Motivation: Sequencing-based assays such as ChIP-seq, DNase-seq and MNase-seq have become important tools for genome annotation. In these assays, short sequence reads enriched for loci of interest are mapped to a reference genome to determine their origin. Here, we consider whether false positive peak calls can be caused by a particular type of error in the reference genome: multicopy sequences which have been incorrectly assembled and collapsed into a single copy. Results: Using sequencing data from the 1000 Genomes Project, we systematically scanned the human genome for regions of high sequencing depth. These regions are highly enriched for erroneously inferred transcription factor binding sites, positions of nucleosomes and regions of open chromatin. We suggest a simple masking procedure to remove these regions and reduce false positive calls. Availability: Files for masking out these regions are available at eqtl.uchicago.edu Contact: pickrell@uchicago.edu; dgaffney@uchicago.edu; gilad@uchicago.edu; pritch@uchicago.edu Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2011
40. Joint Analysis of Functional Genomic Data and Genome-wide Association Studies of 18 Human Traits
- Author
-
Joseph K. Pickrell
- Subjects
Genomics ,Single-nucleotide polymorphism ,Genome-wide association study ,Computational biology ,Biology ,Polymorphism, Single Nucleotide ,Genome ,Article ,03 medical and health sciences ,0302 clinical medicine ,Genetics ,Humans ,Quantitative Biology - Genomics ,Genetics(clinical) ,Gene ,Genetics (clinical) ,030304 developmental biology ,Genetic association ,Genomics (q-bio.GN) ,0303 health sciences ,Models, Genetic ,030305 genetics & heredity ,Bayes Theorem ,Phenotype ,FOS: Biological sciences ,Trait ,Erratum ,Functional genomics ,030217 neurology & neurosurgery ,Genome-Wide Association Study - Abstract
Annotations of gene structures and regulatory elements can inform genome-wide association studies (GWAS). However, choosing the relevant annotations for interpreting an association study of a given trait remains challenging. We describe a statistical model that uses association statistics computed across the genome to identify classes of genomic element that are enriched or depleted for loci that influence a trait. The model naturally incorporates multiple types of annotations. We applied the model to GWAS of 18 human traits, including red blood cell traits, platelet traits, glucose levels, lipid levels, height, BMI, and Crohn's disease. For each trait, we evaluated the relevance of 450 different genomic annotations, including protein-coding genes, enhancers, and DNase-I hypersensitive sites in over a hundred tissues and cell lines. We show that the fraction of phenotype-associated SNPs that influence protein sequence ranges from around 2% (for platelet volume) up to around 20% (for LDL cholesterol); that repressed chromatin is significantly depleted for SNPs associated with several traits; and that cell type-specific DNase-I hypersensitive sites are enriched for SNPs associated with several traits (for example, the spleen in platelet volume). Finally, by re-weighting each GWAS using information from functional genomics, we increase the number of loci with high-confidence associations by around 5%.
- Published
- 2014
41. Towards a new history and geography of human genes informed by ancient DNA
- Author
-
Joseph K. Pickrell and David Reich
- Subjects
0303 health sciences ,Natural selection ,Globe ,Population Replacement ,Genealogy ,Geographic distribution ,03 medical and health sciences ,0302 clinical medicine ,Transformative learning ,Geography ,Ancient DNA ,medicine.anatomical_structure ,medicine ,Human genome ,Adaptation ,030217 neurology & neurosurgery ,030304 developmental biology - Abstract
Genetic information contains a record of the history of our species, and technological advances have transformed our ability to access this record. Many studies have used genome-wide data from populations today to learn about the peopling of the globe and subsequent adaptation to local conditions. Implicit in this research is the assumption that the geographic locations of people today are informative about the geographic locations of their ancestors in the distant past. However, it is now clear that long-range migration, admixture and population replacement have been the rule rather than the exception in human history. In light of this, we argue that it is time to critically re-evaluate current views of the peopling of the globe and the importance of natural selection in determining the geographic distribution of phenotypes. We specifically highlight the transformative potential of ancient DNA. By accessing the genetic make-up of populations living at archaeologically-known times and places, ancient DNA makes it possible to directly track migrations and responses to natural selection.
- Published
- 2014
42. Ancient west Eurasian ancestry in southern and eastern Africa
- Author
-
Po-Ru Loh, Nick Patterson, Mark Lipson, Bonnie Berger, Joseph K. Pickrell, Brigitte Pakendorf, Mark Stoneking, David Reich, Broad Institute of MIT and Harvard (BROAD INSTITUTE), Harvard Medical School [Boston] (HMS)-Massachusetts Institute of Technology (MIT)-Massachusetts General Hospital [Boston], Department of Evolutionary Genetics, Evolutionary Genetics, Dynamique Du Langage (DDL), Université Lumière - Lyon 2 (UL2)-Centre National de la Recherche Scientifique (CNRS), Department of Genetics [Boston], Harvard Medical School [Boston] (HMS), and ANR-11-IDEX-0007,Avenir L.S.E.,Advanced Studies on Language Complexity(2011)
- Subjects
Gene Flow ,Kenya ,Genotype ,Range (biology) ,Population ,[SHS.ANTHRO-BIO]Humanities and Social Sciences/Biological anthropology ,Ethnic group ,Population genetics ,Africa, Southern ,Linkage Disequilibrium ,White People ,Indigenous ,Prehistory ,03 medical and health sciences ,0302 clinical medicine ,Gene Frequency ,parasitic diseases ,Ethnicity ,Humans ,Computer Simulation ,Quantitative Biology - Populations and Evolution ,10. No inequality ,education ,ComputingMilieux_MISCELLANEOUS ,Demography ,030304 developmental biology ,0303 health sciences ,education.field_of_study ,Multidisciplinary ,Middle East ,Models, Genetic ,Populations and Evolution (q-bio.PE) ,Africa, Eastern ,Emigration and Immigration ,Biological Sciences ,Europe ,Genetics, Population ,Geography ,FOS: Biological sciences ,Ethnology ,geographic locations ,030217 neurology & neurosurgery - Abstract
The history of southern Africa involved interactions between indigenous hunter-gatherers and a range of populations that moved into the region. Here we use genome-wide genetic data to show that there are at least two admixture events in the history of Khoisan populations (southern African hunter-gatherers and pastoralists who speak non-Bantu languages with click consonants). One involved populations related to Niger-Congo-speaking African populations, and the other introduced ancestry most closely related to west Eurasian (European or Middle Eastern) populations. We date this latter admixture event to approximately 900-1,800 years ago, and show that it had the largest demographic impact in Khoisan populations that speak Khoe-Kwadi languages. A similar signal of west Eurasian ancestry is present throughout eastern Africa. In particular, we also find evidence for two admixture events in the history of Kenyan, Tanzanian, and Ethiopian populations, the earlier of which involved populations related to west Eurasians and which we date to approximately 2,700-3,300 years ago. We reconstruct the allele frequencies of the putative west Eurasian population in eastern Africa, and show that this population is a good proxy for the west Eurasian ancestry in southern Africa. The most parsimonious explanation for these findings is that west Eurasian ancestry entered southern Africa indirectly through eastern Africa. Comment: Added additional simulations, some additional discussion.
- Published
- 2014
43. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment
- Author
-
Nick Patterson, Huwenbo Shi, Noah Zaitlen, Alexander Gusev, David P. Strachan, Joel N. Hirschhorn, Alkes L. Price, Joseph K. Pickrell, Gaurav Bhatia, and Bogdan Pasaniuc
- Subjects
FOS: Computer and information sciences ,Statistics and Probability ,Linkage disequilibrium ,Time Factors ,Genotype ,Computer science ,Gaussian ,Correlation and dependence ,Biostatistics ,computer.software_genre ,Polymorphism, Single Nucleotide ,Biochemistry ,Quantitative Biology - Quantitative Methods ,Statistics - Applications ,Linkage Disequilibrium ,Cohort Studies ,symbols.namesake ,Statistics ,Humans ,Applications (stat.AP) ,Quantitative Biology - Genomics ,Imputation (statistics) ,Hidden Markov model ,Quantitative Biology - Populations and Evolution ,Molecular Biology ,Quantitative Methods (q-bio.QM) ,Genetic association ,Genomics (q-bio.GN) ,Statistics::Applications ,Populations and Evolution (q-bio.PE) ,Quantitative Biology::Genomics ,Original Papers ,Summary statistics ,Computer Science Applications ,Computational Mathematics ,Phenotype ,Computational Theory and Mathematics ,Sample size determination ,Case-Control Studies ,FOS: Biological sciences ,symbols ,Data mining ,computer ,Algorithms ,Software ,Imputation (genetics) ,Genome-Wide Association Study - Abstract
Imputation using external reference panels is a widely used approach for increasing power in GWAS and meta-analysis. Existing HMM-based imputation approaches require individual-level genotypes. Here, we develop a new method for Gaussian imputation from summary association statistics, a type of data that is becoming widely available. In simulations using 1000 Genomes (1000G) data, this method recovers 84% (54%) of the effective sample size for common (>5%) and low-frequency (1-5%) variants (increasing to 87% (60%) when summary LD information is available from target samples) versus 89% (67%) for HMM-based imputation, which cannot be applied to summary statistics. Our approach accounts for the limited sample size of the reference panel, a crucial step to eliminate false-positive associations, and is computationally very fast. As an empirical demonstration, we apply our method to 7 case-control phenotypes from the WTCCC data and a study of height in the British 1958 birth cohort (1958BC). Gaussian imputation from summary statistics recovers 95% (105%) of the effective sample size (as quantified by the ratio of $\chi^2$ association statistics) compared to HMM-based imputation from individual-level genotypes at the 227 (176) published SNPs in the WTCCC (1958BC height) data. In addition, for publicly available summary statistics from large meta-analyses of 4 lipid traits, we publicly release imputed summary statistics at 1000G SNPs, which could not have been obtained using previously published methods, and demonstrate their accuracy by masking subsets of the data. We show that 1000G imputation using our approach increases the magnitude and statistical evidence of enrichment at genic vs. non-genic loci for these traits, as compared to an analysis without 1000G imputation. Thus, imputation of summary statistics will be a valuable tool in future functional enrichment analyses. 32 pages, 4 figures.
- Published
- 2013
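As an illustration of the Gaussian imputation idea in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes association z-scores are jointly Gaussian with covariance given by the LD correlation matrix, so that z-scores at untyped SNPs can be predicted by the conditional mean given the typed SNPs. The function name `impute_z`, its parameters, and the default ridge value of 0.1 (a simple stand-in for the abstract's adjustment for limited reference-panel size) are all hypothetical.

```python
import numpy as np

def impute_z(z_typed, ld_typed, ld_cross, ridge=0.1):
    """Conditional-mean imputation of z-scores at untyped SNPs (illustrative).

    z_typed  : (t,)   observed z-scores at typed SNPs
    ld_typed : (t, t) LD correlation among typed SNPs (from a reference panel)
    ld_cross : (u, t) LD correlation between untyped and typed SNPs
    ridge    : hypothetical regularization added to the diagonal of ld_typed
               to stabilize estimates from a small reference panel
    """
    z_typed = np.asarray(z_typed, dtype=float)
    ld_typed = np.asarray(ld_typed, dtype=float)
    ld_cross = np.asarray(ld_cross, dtype=float)

    # Regularized typed-SNP LD matrix (symmetric).
    reg = ld_typed + ridge * np.eye(ld_typed.shape[0])

    # Weights W = ld_cross @ inv(reg), computed with a linear solve.
    weights = np.linalg.solve(reg, ld_cross.T).T

    # Conditional mean of the untyped z-scores given the typed ones.
    z_imputed = weights @ z_typed

    # Variance of each imputed statistic under the model: diag(W ld_typed W^T).
    # Values well below 1 indicate poorly imputed SNPs.
    variance = np.einsum("ut,ts,us->u", weights, ld_typed, weights)
    return z_imputed, variance
```

For example, with two uncorrelated typed SNPs, one untyped SNP correlated at 0.5 with the first typed SNP, and `ridge=0.0`, the imputed z-score is half of the first typed z-score and its variance is 0.25.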
44. The complete genome sequence of a Neanderthal from the Altai Mountains
- Author
-
Cesare de Filippo, Joseph K. Pickrell, Michael Lachmann, Jacob O. Kitzman, Janet Kelso, Vladimir B. Doronichev, Priya Moorjani, Peter H. Sudmant, Michael V. Shunkov, Anja Heinze, Sriram Sankararaman, James C. Mullikin, Susanna Sawyer, A.P. Derevianko, Philip L. F. Johnson, Fernando Racimo, Hélène Blanché, Michael Dannemann, Montgomery Slatkin, Ed S. Lein, Swapan Mallick, Trygve E. Bakken, Richard E. Green, Gabriel Renaud, Heng Li, Matthias Meyer, Liubov V. Golovanova, Jay Shendure, Christoph Theunert, Martin Kircher, Kay Prüfer, David Reich, Svante Pääbo, Samuel H. Vohr, Nick Patterson, Martin Kuhlwilm, Flora Jay, Michael Siebauer, Qiaomei Fu, Matthias Ongyerth, Bence Viola, Howard M. Cann, Evan E. Eichler, Ines Hellmann, and Arti Tandon
- Subjects
Gene Flow ,Heterozygote ,Neanderthal ,DNA Copy Number Variations ,Population ,Archaic humans ,Neanderthal genome project ,Article ,Gene Frequency ,biology.animal ,Animals ,Humans ,Inbreeding ,education ,Denisovan ,Toe Phalanges ,Phylogeny ,Neanderthals ,Genetics ,Population Density ,education.field_of_study ,Multidisciplinary ,Genome ,biology ,Models, Genetic ,Human evolutionary genetics ,Fossils ,biology.organism_classification ,Siberia ,Caves ,Human evolution ,Evolutionary biology ,Anatomically modern human ,Africa ,Female - Abstract
We present a high-quality genome sequence of a Neanderthal woman from Siberia. We show that her parents were related at the level of half-siblings and that mating among close relatives was common among her recent ancestors. We also sequenced the genome of a Neanderthal from the Caucasus to low coverage. An analysis of the relationships and population history of available archaic genomes and 25 present-day human genomes shows that several gene flow events occurred among Neanderthals, Denisovans and early modern humans, possibly including gene flow into Denisovans from an unknown archaic group. Thus, interbreeding, albeit of low magnitude, occurred among many hominin groups in the Late Pleistocene. In addition, the high-quality Neanderthal genome allows us to establish a definitive list of substitutions that became fixed in modern humans after their separation from the ancestors of Neanderthals and Denisovans.
- Published
- 2013
45. Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data
- Author
-
Jonathan K. Pritchard and Joseph K. Pickrell
- Subjects
Gene Flow ,Cancer Research ,lcsh:QH426-470 ,Bioinformatics ,Population ,Population genetics ,Genome-wide association study ,Biology ,Breeding ,Genetics & Genomics ,Genome ,Polymorphism, Single Nucleotide ,Gene flow ,Dogs ,Genetic drift ,Gene Frequency ,Phylogenetics ,Genetics ,Animals ,Humans ,General Materials Science ,Domestication ,education ,Quantitative Biology - Populations and Evolution ,Molecular Biology ,Allele frequency ,Genetics (clinical) ,Ecology, Evolution, Behavior and Systematics ,Ancestor ,Evolutionary Biology ,education.field_of_study ,Wolves ,Population Biology ,Models, Genetic ,Genetic Drift ,Populations and Evolution (q-bio.PE) ,Statistical model ,lcsh:Genetics ,Evolutionary biology ,FOS: Biological sciences ,Graph (abstract data type) ,Population Genetics ,Algorithms ,Research Article ,Genome-Wide Association Study - Abstract
Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In this model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication, and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and "ancient" Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com. 28 pages, 6 figures in main text. Attached supplement is 22 pages, 15 figures. This is an updated version of the preprint available at http://precedings.nature.com/documents/6956/version/1
- Published
- 2012
46. Misuse of hierarchical linear models overstates the significance of a reported association between OXTR and prosociality
- Author
-
Daniel G. MacArthur, Luke Jostins, Joseph K. Pickrell, and Jeffrey C. Barrett
- Subjects
Genetics ,Multidisciplinary ,Statistical significance ,Multilevel model ,Trait ,p-value ,Allele ,Association (psychology) ,Psychology ,Oxytocin receptor ,Developmental psychology ,t-statistic - Abstract
Kogan et al. recently reported an association between an intronic variant in the oxytocin receptor gene, OXTR, and perceived prosociality (1). The significance of this association is described as P < 0.001, but the reported t statistic corresponds to P = 3.2 × 10⁻¹⁶. This level of significance is striking given the small number of genotyped individuals (N = 23) compared with other studies of complex trait genetics (2, 3). We show that this highly significant P value is a result of a misapplication of the statistical technique of hierarchical linear models (HLMs), and that the P value under an appropriate analysis is much larger: P = 0.027.
- Published
- 2012
47. The genetic prehistory of southern Africa
- Author
-
David Reich, Falko Berthold, Brenna M. Henn, Chiara Barbieri, Mark Lipson, Sarah A. Tishkoff, Hirosi Nakagawa, Joseph Lachance, Brigitte Pakendorf, Mark Stoneking, Sununguko Wata Mpoloka, Nick Patterson, Tom Güldemann, Linda Gerlach, Joseph K. Pickrell, Po-Ru Loh, Carlos Bustamante, Blesswell Kure, Bonnie Berger, Christfried Naumann, Joanna L. Mountain, Génétique épidémiologique et structures des populations humaines (Inserm U535), Epidémiologie, sciences sociales, santé publique (IFR 69), Université Paris 1 Panthéon-Sorbonne (UP1)-Université Paris-Sud - Paris 11 (UP11)-École des hautes études en sciences sociales (EHESS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Université Paris Descartes - Paris 5 (UPD5)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Université Paris 1 Panthéon-Sorbonne (UP1)-Université Paris-Sud - Paris 11 (UP11)-École des hautes études en sciences sociales (EHESS)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Université Paris Descartes - Paris 5 (UPD5)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM), Broad Institute of MIT and Harvard (BROAD INSTITUTE), Harvard Medical School [Boston] (HMS)-Massachusetts Institute of Technology (MIT)-Massachusetts General Hospital [Boston], Harvard School of Public Health, Affymetrix, Computer Science and Artificial Intelligence Laboratory [Cambridge] (CSAIL), Massachusetts Institute of Technology (MIT), University of Pennsylvania [Philadelphia], Evolutionary Genetics, Dynamique Du Langage (DDL), Université Lumière - Lyon 2 (UL2)-Centre National de la Recherche Scientifique (CNRS), Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Mathematics, Lipson, Mark, Loh, Po-Ru, and Berger, Bonnie
- Subjects
History ,Population ,Population Dynamics ,[SHS.ANTHRO-BIO]Humanities and Social Sciences/Biological anthropology ,Ethnic group ,General Physics and Astronomy ,Ethnic Groups ,Biology ,General Biochemistry, Genetics and Molecular Biology ,Article ,Africa, Southern ,Linkage Disequilibrium ,Ancient ,Prehistory ,03 medical and health sciences ,Databases ,0302 clinical medicine ,Genetic ,Models ,Databases, Genetic ,Ethnicity ,Genetics ,Humans ,Cluster Analysis ,education ,Quantitative Biology - Populations and Evolution ,Biological sciences ,Southern ,History, Ancient ,030304 developmental biology ,0303 health sciences ,education.field_of_study ,Multidisciplinary ,Models, Genetic ,Populations and Evolution (q-bio.PE) ,General Chemistry ,Gene Pool ,Genetics, Population ,Evolutionary biology ,FOS: Biological sciences ,Africa ,Ethnology ,Gene pool ,030217 neurology & neurosurgery - Abstract
Southern and eastern African populations that speak non-Bantu languages with click consonants are known to harbour some of the most ancient genetic lineages in humans, but their relationships are poorly understood. Here, we report data from 23 populations analyzed at over half a million single nucleotide polymorphisms, using a genome-wide array designed for studying human history. The southern African Khoisan fall into two genetic groups, loosely corresponding to the northwestern and southeastern Kalahari, which we show separated within the last 30,000 years. We find that all individuals derive at least a few percent of their genomes from admixture with non-Khoisan populations that began approximately 1,200 years ago. In addition, the east African Hadza and Sandawe derive a fraction of their ancestry from admixture with a population related to the Khoisan, supporting the hypothesis of an ancient link between southern and eastern Africa. To appear in Nature Communications.
- Published
- 2012
48. Noisy Splicing Drives mRNA Isoform Diversity in Human Cells
- Author
-
Jonathan K. Pritchard, Athma A. Pai, Joseph K. Pickrell, and Yoav Gilad
- Subjects
Cancer Research ,lcsh:QH426-470 ,Exonic splicing enhancer ,Biology ,Cell Line ,03 medical and health sciences ,Exon ,0302 clinical medicine ,Genetics ,Humans ,splice ,RNA, Messenger ,Evolutionary Biology/Genomics ,Molecular Biology ,Gene ,Genetics (clinical) ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,0303 health sciences ,Splice site mutation ,Sequence Analysis, RNA ,Alternative splicing ,Intron ,Genetic Variation ,Genetics and Genomics ,Exons ,Introns ,lcsh:Genetics ,Alternative Splicing ,RNA splicing ,RNA Splice Sites ,030217 neurology & neurosurgery ,Research Article - Abstract
While the majority of multiexonic human genes show some evidence of alternative splicing, it is unclear what fraction of observed splice forms is functionally relevant. In this study, we examine the extent of alternative splicing in human cells using deep RNA sequencing and de novo identification of splice junctions. We demonstrate the existence of a large class of low abundance isoforms, encompassing approximately 150,000 previously unannotated splice junctions in our data. Newly-identified splice sites show little evidence of evolutionary conservation, suggesting that the majority are due to erroneous splice site choice. We show that sequence motifs involved in the recognition of exons are enriched in the vicinity of unconserved splice sites. We estimate that the average intron has a splicing error rate of approximately 0.7% and show that introns in highly expressed genes are spliced more accurately, likely due to their shorter length. These results implicate noisy splicing as an important property of genome evolution. Author Summary: Most human genes are split into pieces, such that the protein-coding parts (exons) are separated in the genome by large tracts of non-coding DNA (introns) that must be transcribed and spliced out to create a functional transcript. Variation in splicing reactions can create multiple transcripts from the same gene, yet the function for many of these alternative transcripts is unknown. In this study, we show that many of these transcripts are due to splicing errors which are not preserved over evolutionary time. We estimate that the error rate in the splicing of an intron is about 0.7% and demonstrate that there are two major types of splicing error: errors in the recognition of exons and errors in the precise choice of splice site. These results raise the possibility that variation in levels of alternative splicing across species may in part be due to variation in splicing error rate.
- Published
- 2010
49. Understanding mechanisms underlying human gene expression variation with RNA sequencing
- Author
-
Matthew Stephens, John C. Marioni, Joseph K. Pickrell, Barbara E. Engelhardt, Jonathan K. Pritchard, Jean-Baptiste Veyrieras, Jacob F. Degner, Athma A. Pai, Yoav Gilad, and Everlyne Nkadori
- Subjects
DNA, Complementary ,Transcription, Genetic ,Quantitative Trait Loci ,Black People ,Nigeria ,Genomics ,Biology ,Quantitative trait locus ,Polymorphism, Single Nucleotide ,Article ,Exon ,Consensus Sequence ,Humans ,RNA, Messenger ,International HapMap Project ,Gene ,Alleles ,Genetics ,Multidisciplinary ,Sequence Analysis, RNA ,Gene Expression Profiling ,Genetic Variation ,Exons ,Gene expression profiling ,Gene Expression Regulation ,RNA splicing ,Expression quantitative trait loci ,RNA Splice Sites - Abstract
Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal (1). Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project (2). By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals.
- Published
- 2010
50. Evolutionary dynamics of human Toll-like receptors and their different contributions to host defense
- Author
-
Etienne Patin, Laurent Abel, Christiane Bouchier, Kenneth K. Kidd, Lluis Quintana-Murci, Brigitte Gicquel, Meriem Ben-Ali, Judith R. Kidd, Jean-Laurent Casanova, Guillaume Laval, Sandra Pellegrini, Hélène Quach, Josiane Ragimbeau, Olivier Neyrolles, Joseph K. Pickrell, Magali Tichit, Alexandre Alcaïs, and Luis B. Barreiro
- Subjects
Nonsynonymous substitution ,Cancer Research ,lcsh:QH426-470 ,Biology ,Infections ,Evolution, Molecular ,03 medical and health sciences ,Negative selection ,0302 clinical medicine ,Immunity ,Genetics and Genomics/Population Genetics ,Genetics ,Ethnicity ,Humans ,Selection, Genetic ,Evolutionary dynamics ,Molecular Biology ,Genetics (clinical) ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,0303 health sciences ,Natural selection ,Innate immune system ,Human evolutionary genetics ,Toll-Like Receptors ,Genetics and Genomics ,Sequence Analysis, DNA ,3. Good health ,lcsh:Genetics ,Kinetics ,Genetics, Population ,Infectious disease (medical specialty) ,Mutation ,Genetics and Genomics/Genetics of the Immune System ,030217 neurology & neurosurgery ,Research Article - Abstract
Infectious diseases have been paramount among the threats to health and survival throughout human evolutionary history. Natural selection is therefore expected to act strongly on host defense genes, particularly on innate immunity genes whose products mediate the direct interaction between the host and the microbial environment. In insects and mammals, the Toll-like receptors (TLRs) appear to play a major role in initiating innate immune responses against microbes. In humans, however, it has been speculated that the set of TLRs could be redundant for protective immunity. We investigated how natural selection has acted upon human TLRs, as an approach to assess their level of biological redundancy. We sequenced the ten human TLRs in a panel of 158 individuals from various populations worldwide and found that the intracellular TLRs—activated by nucleic acids and particularly specialized in viral recognition—have evolved under strong purifying selection, indicating their essential non-redundant role in host survival. Conversely, the selective constraints on the TLRs expressed on the cell surface—activated by compounds other than nucleic acids—have been much more relaxed, with higher rates of damaging nonsynonymous and stop mutations tolerated, suggesting their higher redundancy. Finally, we tested whether TLRs have experienced spatially-varying selection in human populations and found that the region encompassing TLR10-TLR1-TLR6 has been the target of recent positive selection among non-Africans. Our findings indicate that the different TLRs differ in their immunological redundancy, reflecting their distinct contributions to host defense. The insights gained in this study foster new hypotheses to be tested in clinical and epidemiological genetics of infectious disease. Author Summary: The detrimental effects of microbial infections have led to the evolution of a variety of host defense mechanisms. A vast array of host innate immunity receptors, critical sensors of viruses, bacteria, and fungi, exist to achieve permanent surveillance of intruding pathogens. The best characterized class of microbial sensors is the Toll-like receptor (TLR) family, which elicits inflammatory and antimicrobial responses after activation by microbial products. Here we investigated how microbes have exerted selective pressure on the human TLR family to gain insights on the extent to which they are functionally important in the immune system. By resequencing the ten TLRs in different worldwide populations, we show that intracellular TLRs—principally specialized in viral recognition—evolve under strong purifying selection, indicating their essential role in host survival, while the remaining TLRs display higher levels of immunological redundancy. However, for this latter group of genes, we also show that mutations altering immune responses have been in some cases beneficial for host survival, as attested by the signature of positive selection favoring a reduced TLR1-mediated response in Europeans. Our findings taken together indicate that the different human TLRs differ in their biological relevance and provide clues to be experimentally tested in clinical, immunological, and epidemiological studies.
- Published
- 2009