51. Landscape of multi-nucleotide variants in 125,748 human exomes and 15,708 genomes
- Author
- Ruth Loos, Qingbo Wang, Anne O'Donnell-Luria, Benjamin Glaser, James Ware, John D. Rioux, Roberto Elosua, Kristian Cibulskis, Laurent Francioli, E Shyong Tai, Terho Lehtimäki, Daniel MacArthur, Irina Armean, Matthew Bown, Tiinamaija Tuomi, Jeanette Erdmann, Matthew Solomonson, Harry Sokol, Aarno Palotie, Martti Färkkilä, Ronald Ma, Olle Melander, Emilia Solinas, Tampere University, Clinical Medicine, Department of Clinical Chemistry, Centre of Excellence in Complex Disease Genetics, HUS Abdominal Center, Institute for Molecular Medicine Finland, Helsinki Institute of Life Science HiLIFE, Department of Medicine, Clinicum, Gastroenterologian yksikkö, Helsinki University Hospital Area, Research Programs Unit, Aarno Palotie / Principal Investigator, Genomics of Neurological and Neuropsychiatric Disorders, Department of Public Health, Samuli Olli Ripatti / Principal Investigator, Complex Disease Genetics, Biostatistics Helsinki, Biosciences, and Rosetrees Trust
- Subjects
0301 basic medicine; Mutation rate; Science; Nonsense mutation; General Physics and Astronomy; Computational biology; Genome; Article; General Biochemistry, Genetics and Molecular Biology; 03 medical and health sciences; Protein sequencing; 0302 clinical medicine; DNA Mutational Analysis; Genetic variation; ELEMENTS; lcsh:Science; Exome; SIGNATURES; Exome sequencing; Polymerase; 030304 developmental biology; 0303 health sciences; Multidisciplinary; biology; Haplotype; 1184 Genetics, developmental biology, physiology; Genome Aggregation Database Production Team; Genomics; General Chemistry; FRAMEWORK; EVOLUTION; SLIPPAGE; 030104 developmental biology; Haplotypes; CpG site; DE-NOVO MUTATIONS; Genome Aggregation Database Consortium; 030220 oncology & carcinogenesis; Mutation (genetic algorithm); WHOLE-GENOME; PATTERNS; biology.protein; DNA-POLYMERASE-ZETA; REPEATS; lcsh:Q; 3111 Biomedicine; 030217 neurology & neurosurgery
- Abstract
Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,792,248 MNVs across the genome with constituent variants falling within 2 bp of one another, including 18,756 variants with a novel combined effect on protein sequence. Finally, we estimate the relative impact of known mutational mechanisms (CpG deamination, replication error by polymerase zeta, and polymerase slippage at repeat junctions) on the generation of MNVs. Our results demonstrate the value of haplotype-aware variant annotation, and refine our understanding of genome-wide mutational mechanisms of MNVs.

Multi-nucleotide variants (MNVs) are genetic variants in close proximity to each other on the same haplotype whose functional impact is difficult to predict if they reside in the same codon. Here, Wang et al. use the gnomAD dataset to assemble a catalogue of MNVs and estimate their global mutation rate.
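To make the point about same-codon variants concrete, the following is a minimal Python sketch of why annotating each variant separately can mislead. It is not the authors' pipeline: the function names (`apply_snvs`, `annotate`), the zero-based codon positions, and the simple "novel" test (the combined amino acid differs from every single-variant result) are illustrative assumptions. It uses the standard genetic code and assumes both variants are known to lie on the same haplotype.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_snvs(codon, snvs):
    """Return the codon after applying (position, alt_base) substitutions."""
    bases = list(codon)
    for pos, alt in snvs:
        bases[pos] = alt
    return "".join(bases)


def annotate(ref_codon, snvs):
    """Compare per-variant and haplotype-aware (combined) translations.

    Hypothetical helper: assumes all SNVs sit on the same haplotype and
    within the same codon; positions are 0-based within the codon.
    """
    separate = [CODON_TABLE[apply_snvs(ref_codon, [snv])] for snv in snvs]
    combined = CODON_TABLE[apply_snvs(ref_codon, snvs)]
    return {
        "ref": CODON_TABLE[ref_codon],
        "separate": separate,
        "combined": combined,
        # Simplified notion of a "novel combined effect" (an assumption).
        "novel": combined not in separate,
    }


# CAG (Gln): C>T alone gives a stop codon, but together with A>G it gives Trp.
rescued = annotate("CAG", [(0, "T"), (1, "G")])
# TCC (Ser): neither variant alone creates a stop, but together they do.
gained = annotate("TCC", [(1, "A"), (2, "A")])
print(rescued)
print(gained)
```

In the first case a variant that looks like a nonsense mutation in isolation is a missense change on the actual haplotype; in the second, two individually mild variants combine into a stop codon. A real annotation workflow would additionally need phased genotypes and transcript coordinates, which this sketch leaves out.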
- Published
- 2020