1. Analysis of protein-coding genetic variation in 60,706 humans
- Author
- Jack A. Kosmicki, Mark A. DePristo, Mark I. McCarthy, Patrick F. Sullivan, Laramie E. Duncan, Ryan Poplin, David Neil Cooper, Mitja I. Kurki, Aarno Palotie, Hong-Hee Won, Dermot P.B. McGovern, John Danesh (ORCID 0000-0003-1158-6791), Jose C. Florez, Grace Tiao, Anne H. O’Donnell-Luria, Timothy Fennell, Gad Getz, Douglas M. Ruderfer, Joanne Berghout, Mark J. Daly, Monkol Lek, Daniel P. Howrigan, Stacey Gabriel, Daniel P. Birnbaum, Ami Levy Moonshine, Michael Boehnke, Ben Weisburd, Ruth McPherson, Christine Stevens, Dongmei Yu, Sekar Kathiresan, Andrew J. Hill, James G. Wilson, James S. Ware, Hugh Watkins, Benjamin M. Neale, Khalid Shakir, David Altshuler, María Teresa Tusié-Luna, Lorena Orozco, James Zou, Samuel A. Rose, Menachem Fromer, Jeremiah M. Scharf, Daniel G. MacArthur, Namrata Gupta, Pamela Sklar, Eric Vallabh Minikel, Steven A. McCarroll, Jaakko Tuomilehto, Jackie Goldstein, Ming T. Tsuang, Stacey Donnelly, Konrad J. Karczewski, Fengmei Zhao, Stephen J. Glatt, Ron Do, Nicole A. Deflaux, Adam Kiezun, Emma Pierce-Hoffman, Markku Laakso, Beryl B. Cummings, Pradeep Natarajan, Danish Saleheen, Karol Estrada, Peter D. Stenson, Manuel A. Rivas, Diego Ardissino, Kaitlin E. Samocha, Gina M. Peloso, Laura D. Gauthier, Eric Banks, Brett Thomas, Shaun Purcell, Taru Tukiainen, Valentin Ruano-Rubio, Christina M. Hultman, Jason Flannick, and Roberto Elosua
- Affiliated groups, repositories, and funders listed with the record: Complex Trait Genetics, Amsterdam Neuroscience - Complex Trait Genetics, Institute for Molecular Medicine Finland, Aarno Palotie / Principal Investigator, Jaakko Tuomilehto Research Group, Department of Public Health, Clinicum, Genomics of Neurological and Neuropsychiatric Disorders, Apollo - University of Cambridge Repository, Wellcome Trust, and The Academy of Medical Sciences
- Subjects
- 0301 basic medicine, Proteome, DNA Mutational Analysis, Datasets as Topic, Human genetic variation, GUIDELINES, 0302 clinical medicine, Exome Aggregation Consortium, SEQUENCE VARIANTS, Coding region, 2.1 Biological and endogenous factors, Exome, Aetiology, MUTATION, Genetics, 0303 health sciences, Multidisciplinary, HUMAN-DISEASE, NETWORKS, Multidisciplinary Sciences, Phenotype, Mutation (genetic algorithm), Science & Technology - Other Topics, Biotechnology, General Science & Technology, Genomics, Computational biology, Biology, DNA sequencing, 03 medical and health sciences, Rare Diseases, Clinical Research, Genetic variation, Humans, Genetic Testing, Gene, 030304 developmental biology, Science & Technology, Human Genome, HUMAN-POPULATION HISTORY, Genetic Variation, FRAMEWORK, R1, EVOLUTION, 030104 developmental biology, DISCOVERY, Sample Size, Generic health relevance, 3111 Biomedicine, Genètica humana -- Variació, 030217 neurology & neurosurgery
- Abstract
- Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). The resulting catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We show that this catalogue can be used to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; we identify 3,230 genes with near-complete depletion of truncating variants, 72% of which have no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human “knockout” variants in protein-coding genes.
- Published
- 2016