1. Phased whole-genome genetic risk in a family quartet using a major allele reference sequence.
- Author
Dewey, Frederick E; Chen, Rong; Cordero, Sergio P; Ormond, Kelly E; Caleshu, Colleen; Karczewski, Konrad J; Whirl-Carrillo, Michelle; Wheeler, Matthew T; Dudley, Joel T; Byrnes, Jake K; Cornejo, Omar E; Knowles, Joshua W; Woon, Mark; Sangkuhl, Katrin; Gong, Li; Thorn, Caroline F; Hebert, Joan M; Capriotti, Emidio; David, Sean P; Pavlovic, Aleksandra; West, Anne; Thakuria, Joseph V; Ball, Madeleine P; Zaranek, Alexander W; Rehm, Heidi L; Church, George M; West, John S; Bustamante, Carlos D; Snyder, Michael; Altman, Russ B; Klein, Teri E; Butte, Atul J; and Ashley, Euan A
- Subjects
Humans, Thrombophilia, Genetic Predisposition to Disease, Risk Assessment, Pedigree, Sequence Alignment, Sequence Analysis, DNA, DNA Mutational Analysis, Base Sequence, Genotype, Haplotypes, Alleles, Genes, Synthetic, Genome, Human, Reference Standards, Female, Male, Genetic Variation, Genome-Wide Association Study, Biotechnology, Genetics, Human Genome, 2.1 Biological and endogenous factors, Generic health relevance, Developmental Biology
- Abstract
Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.
- Published
2011