264 results on "Bustamante CD"
Search Results
2. Analysis of the Genetic Basis of Disease in the Context of Worldwide Human Relationships and Migration
- Author
-
Corona, E, Chen, R, Sikora, M, Morgan, AA, Patel, CJ, Ramesh, A, Bustamante, CD, and Butte, AJ
- Abstract
Genetic diversity across different human populations can enhance understanding of the genetic basis of disease. We calculated the genetic risk of 102 diseases in 1,043 unrelated individuals across 51 populations of the Human Genome Diversity Panel. We foun
- Published
- 2013
3. Pan-cancer analysis of whole genomes
- Author
-
Campbell, PJ, Getz, G, Korbel, JO, Stuart, JM, Jennings, JL, Stein, LD, Perry, MD, Nahal-Bose, HK, Ouellette, BFF, Li, CH, Rheinbay, E, Nielsen, GP, Sgroi, DC, Wu, CL, Faquin, WC, Deshpande, V, Boutros, PC, Lazar, AJ, Hoadley, KA, Louis, DN, Dursi, LJ, Yung, CK, Bailey, MH, Saksena, G, Raine, KM, Buchhalter, I, Kleinheinz, K, Schlesner, M, Zhang, J, Wang, W, Wheeler, DA, Ding, L, Simpson, JT, O’Connor, BD, Yakneen, S, Ellrott, K, Miyoshi, N, Butler, AP, Royo, R, Shorser, SI, Vazquez, M, Rausch, T, Tiao, G, Waszak, SM, Rodriguez-Martin, B, Shringarpure, S, Wu, DY, Demidov, GM, Delaneau, O, Hayashi, S, Imoto, S, Habermann, N, Segre, AV, Garrison, E, Cafferkey, A, Alvarez, EG, Heredia-Genestar, JM, Muyas, F, Drechsel, O, Bruzos, AL, Temes, J, Zamora, J, Baez-Ortega, A, Kim, HL, Mashl, RJ, Ye, K, DiBiase, A, Huang, KL, Letunic, I, McLellan, MD, Newhouse, SJ, Shmaya, T, Kumar, S, Wedge, DC, Wright, MH, Yellapantula, VD, Gerstein, M, Khurana, E, Marques-Bonet, T, Navarro, A, Bustamante, CD, Siebert, R, Nakagawa, H, Easton, DF, Ossowski, S, Tubio, JMC, De La Vega, FM, Estivill, X, Yuen, D, Mihaiescu, GL, Omberg, L, Ferretti, V, Sabarinathan, R, Pich, O, Gonzalez-Perez, A, Taylor-Weiner, A, Fittall, MW, Demeulemeester, J, Tarabichi, M, and Roberts, ND
- Abstract
Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale [1–3]. Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4–5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter [4]; identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation [5,6]; analyses timings and patterns of tumour evolution [7]; describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity [8,9]; and evaluates a range of more-specialized features of cancer genomes [8,10–18].
- Published
- 2020
4. Spectrum and prevalence of genetic predisposition in medulloblastoma: a retrospective genetic study and prospective validation in a clinical trial cohort
- Author
-
Waszak, SM, Northcott, PA, Buchhalter, I, Robinson, GW, Sutter, C, Groebner, S, Grund, KB, Brugières, L, Jones, DTW, Pajtler, KW, Morrissy, AS, Kool, M, Sturm, D, Chavez, L, Ernst, A, Brabetz, S, Hain, M, Zichner, T, Segura-Wang, M, Weischenfeldt, J, Rausch, T, Mardin, BR, Zhou, X, Baciu, C, Lawerenz, C, Chan, JA, Varlet, P, Guerrini-Rousseau, L, Fults, DW, Grajkowska, W, Hauser, P, Jabado, N, Ra, YS, Zitterbart, K, Shringarpure, SS, De La Vega, FM, Bustamante, CD, Ng, HK, Perry, A, MacDonald, TJ, Hernáiz Driever, P, Bendel, AE, Bowers, DC, McCowage, G, Chintagumpala, MM, Cohn, R, Hassall, T, Fleischhack, G, Eggen, T, Wesenberg, F, Feychting, M, Lannering, B, Schüz, J, Johansen, C, Andersen, TV, Röösli, M, Kuehni, CE, Grotzer, M, Kjaerheim, K, Monoranu, CM, Archer, TC, Duke, E, Pomeroy, SL, Shelagh, R, Frank, S, Sumerauer, D, Scheurlen, W, Ryzhova, MV, Milde, T, Kratz, CP, Samuel, D, Zhang, J, Solomon, DA, Marra, M, Eils, R, Bartram, CR, von Hoff, K, Rutkowski, S, Ramaswamy, V, Gilbertson, RJ, Korshunov, A, Taylor, MD, Lichter, P, Malkin, D, Gajjar, A, Korbel, JO, and Pfister, SM
- Abstract
Background: Medulloblastoma is associated with rare hereditary cancer predisposition syndromes; however, consensus medulloblastoma predisposition genes have not been defined and screening guidelines for genetic counselling and testing for paediatric patients are not available. We aimed to assess and define these genes to provide evidence for future screening guidelines. Methods: In this international, multicentre study, we analysed patients with medulloblastoma from retrospective cohorts (International Cancer Genome Consortium [ICGC] PedBrain, Medulloblastoma Advanced Genomics International Consortium [MAGIC], and the CEFALO series) and from prospective cohorts from four clinical studies (SJMB03, SJMB12, SJYC07, and I-HIT-MED). Whole-genome sequences and exome sequences from blood and tumour samples were analysed for rare damaging germline mutations in cancer predisposition genes. DNA methylation profiling was done to determine consensus molecular subgroups: WNT (MB_WNT), SHH (MB_SHH), group 3 (MB_Group3), and group 4 (MB_Group4). Medulloblastoma predisposition genes were predicted on the basis of rare variant burden tests against controls without a cancer diagnosis from the Exome Aggregation Consortium (ExAC). Previously defined somatic mutational signatures were used to further classify medulloblastoma genomes into two groups, a clock-like group (signatures 1 and 5) and a homologous recombination repair deficiency-like group (signatures 3 and 8), and chromothripsis was investigated using previously established criteria. Progression-free survival and overall survival were modelled for patients with a genetic predisposition to medulloblastoma. Findings: We included a total of 1022 patients with medulloblastoma from the retrospective cohorts (n=673) and the four prospective studies (n=349), from whom blood samples (n=1022) and tumour samples (n=800) were analysed for germline mutations in 110 cancer predisposition genes. In our rare variant burden analysis, we co
- Published
- 2018
5. Human genetics. The genetics of Mexico recapitulates Native American substructure and affects biomedical traits
- Author
-
Moreno-Estrada A, Gignoux CR, Fernández-López JC, Zakharia F, Sikora M, Contreras AV, Acuña-Alonzo V, Sandoval K, Eng C, Romero-Hidalgo S, Ortiz-Tello P, Robles V, Kenny EE, Nuño-Arana I, Barquera-Lozano R, Macín-Pérez G, Granados-Arriola J, Huntsman S, Galanter JM, Via M, Ford JG, Chapela R, Rodriguez-Cintron W, Rodríguez-Santana JR, Romieu I, Sienra-Monge JJ, del Rio Navarro B, London SJ, Ruiz-Linares A, Garcia-Herrera R, Estrada K, Hidalgo-Miranda A, Jimenez-Sanchez G, Carnevale A, Soberón X, Canizales-Quinteros S, Rangel-Villalobos H, Silva-Zolezzi I, Burchard EG, and Bustamante CD
- Subjects
Genome, Human ,Mexican Americans ,Population ,Indians, North American ,Black People ,Genetic Variation ,Humans ,Mexico ,White People
- Abstract
Mexico harbors great cultural and ethnic diversity, yet fine-scale patterns of human genome-wide variation from this region remain largely uncharacterized. We studied genomic variation within Mexico from over 1000 individuals representing 20 indigenous and 11 mestizo populations. We found striking genetic stratification among indigenous populations within Mexico at varying degrees of geographic isolation. Some groups were as differentiated as Europeans are from East Asians. Pre-Columbian genetic substructure is recapitulated in the indigenous ancestry of admixed mestizo individuals across the country. Furthermore, two independently phenotyped cohorts of Mexicans and Mexican Americans showed a significant association between subcontinental ancestry and lung function. Thus, accounting for fine-scale ancestry patterns is critical for medical and population genetic studies within Mexico, in Mexican-descent populations, and likely in many other populations worldwide.
- Published
- 2014
6. An integrated map of genetic variation from 1,092 human genomes
- Author
-
Altshuler, DM, Durbin, RM, Abecasis, GR, Bentley, DR, Chakravarti, A, Clark, AG, Donnelly, P, Eichler, EE, Flicek, P, Gabriel, SB, Gibbs, RA, Green, ED, Hurles, ME, Knoppers, BM, Korbel, JO, Lander, ES, Lee, C, Lehrach, H, Mardis, ER, Marth, GT, McVean, GA, Nickerson, DA, Schmidt, JP, Sherry, ST, Wang, J, Wilson, RK, Dinh, H, Kovar, C, Lee, S, Lewis, L, Muzny, D, Reid, J, Wang, M, Fang, X, Guo, X, Jian, M, Jiang, H, Jin, X, Li, G, Li, J, Li, Y, Li, Z, Liu, X, Lu, Y, Ma, X, Su, Z, Tai, S, Tang, M, Wang, B, Wang, G, Wu, H, Wu, R, Yin, Y, Zhang, W, Zhao, J, Zhao, M, Zheng, X, Zhou, Y, Gupta, N, Clarke, L, Leinonen, R, Smith, RE, Zheng-Bradley, X, Grocock, R, Humphray, S, James, T, Kingsbury, Z, Sudbrak, R, Albrecht, MW, Amstislavskiy, VS, Borodina, TA, Lienhard, M, Mertes, F, Sultan, M, Timmermann, B, Yaspo, M-L, Fulton, L, Fulton, R, Weinstock, GM, Balasubramaniam, S, Burton, J, Danecek, P, Keane, TM, Kolb-Kokocinski, A, McCarthy, S, Stalker, J, Quail, M, Davies, CJ, Gollub, J, Webster, T, Wong, B, Zhan, Y, Auton, A, Yu, F, Bainbridge, M, Challis, D, Evani, US, Lu, J, Nagaswamy, U, Sabo, A, Wang, Y, Yu, J, Coin, LJM, Fang, L, Li, Q, Lin, H, Liu, B, Luo, R, Qin, N, Shao, H, Xie, Y, Ye, C, Yu, C, Zhang, F, Zheng, H, Zhu, H, Garrison, EP, Kural, D, Lee, W-P, Leong, WF, Ward, AN, Wu, J, Zhang, M, Griffin, L, Hsieh, C-H, Mills, RE, Shi, X, Von Grotthuss, M, Zhang, C, Daly, MJ, DePristo, MA, Banks, E, Bhatia, G, Carneiro, MO, Del Angel, G, Genovese, G, Handsaker, RE, Hartl, C, McCarroll, SA, Nemesh, JC, Poplin, RE, Schaffner, SF, Shakir, K, Yoon, SC, Lihm, J, Makarov, V, Jin, H, Kim, W, Kim, KC, Rausch, T, Beal, K, Cunningham, F, Herrero, J, McLaren, WM, Ritchie, GRS, Gottipati, S, Keinan, A, Rodriguez-Flores, JL, Sabeti, PC, Grossman, SR, Tabrizi, S, Tariyal, R, Cooper, DN, Ball, EV, Stenson, PD, Barnes, B, Bauer, M, Cheetham, RK, Cox, T, Eberle, M, Kahn, S, Murray, L, Peden, J, Shaw, R, Ye, K, Batzer, MA, Konkel, MK, Walker, JA, MacArthur, DG, Lek, M, Herwig, R, Shriver, 
MD, Bustamante, CD, Byrnes, JK, De la Vega, FM, Gravel, S, Kenny, EE, Kidd, JM, Lacroute, P, Maples, BK, Moreno-Estrada, A, Zakharia, F, Halperin, E, Baran, Y, Craig, DW, Christoforides, A, Homer, N, Izatt, T, Kurdoglu, AA, Sinari, SA, Squire, K, Xiao, C, Sebat, J, Bafna, V, Burchard, EG, Hernandez, RD, Gignoux, CR, Haussler, D, Katzman, SJ, Kent, WJ, Howie, B, Ruiz-Linares, A, Dermitzakis, ET, Lappalainen, T, Devine, SE, Maroo, A, Tallon, LJ, Rosenfeld, JA, Michelson, LP, Kang, HM, Anderson, P, Angius, A, Bigham, A, Blackwell, T, Busonero, F, Cucca, F, Fuchsberger, C, Jones, C, Jun, G, Lyons, R, Maschio, A, Porcu, E, Reinier, F, Sanna, S, Schlessinger, D, Sidore, C, Tan, A, Trost, MK, Awadalla, P, Hodgkinson, A, Lunter, G, Marchini, JL, Myers, S, Churchhouse, C, Delaneau, O, Gupta-Hinch, A, Iqbal, Z, Mathieson, I, Rimmer, A, Xifara, DK, Oleksyk, TK, Fu, Y, Xiong, M, Jorde, L, Witherspoon, D, Xing, J, Browning, BL, Alkan, C, Hajirasouliha, I, Hormozdiari, F, Ko, A, Sudmant, PH, Chen, K, Chinwalla, A, Ding, L, Dooling, D, Koboldt, DC, McLellan, MD, Wallis, JW, Wendl, MC, Zhang, Q, Tyler-Smith, C, Albers, CA, Ayub, Q, Chen, Y, Coffey, AJ, Colonna, V, Huang, N, Jostins, L, Li, H, Scally, A, Walter, K, Xue, Y, Zhang, Y, Gerstein, MB, Abyzov, A, Balasubramanian, S, Chen, J, Clarke, D, Habegger, L, Harmanci, AO, Jin, M, Khurana, E, Mu, XJ, Sisu, C, Degenhardt, J, Stuetz, AM, Church, D, Michaelson, JJ, Ben, B, Lindsay, SJ, Ning, Z, Frankish, A, Harrow, J, Fowler, G, Hale, W, Kalra, D, Barker, J, Kelman, G, Kulesha, E, Radhakrishnan, R, Roa, A, Smirnov, D, Streeter, I, Toneva, I, Vaughan, B, Ananiev, V, Belaia, Z, Beloslyudtsev, D, Bouk, N, Chen, C, Cohen, R, Cook, C, Garner, J, Hefferon, T, Kimelman, M, Liu, C, Lopez, J, Meric, P, O'Sullivan, C, Ostapchuk, Y, Phan, L, Ponomarov, S, Schneider, V, Shekhtman, E, Sirotkin, K, Slotta, D, Zhang, H, Barnes, KC, Beiswanger, C, Cai, H, Cao, H, Gharani, N, Henn, B, Jones, D, Kaye, JS, Kent, A, Kerasidou, A, Mathias, R, Ossorio, PN, 
Parker, M, Reich, D, Rotimi, CN, Royal, CD, Sandoval, K, Su, Y, Tian, Z, Tishkoff, S, Toji, LH, Via, M, Yang, H, Yang, L, Zhu, J, Bodmer, W, Bedoya, G, Ming, CZ, Yang, G, You, CJ, Peltonen, L, Garcia-Montero, A, Orfao, A, Dutil, J, Martinez-Cruzado, JC, Brooks, LD, Felsenfeld, AL, McEwen, JE, Clemm, NC, Duncanson, A, Dunn, M, Guyer, MS, Peterson, JL, 1000 Genomes Project Consortium, Dermitzakis, Emmanouil, Universitat de Barcelona, Massachusetts Institute of Technology. Department of Biology, Altshuler, David, and Lander, Eric S.
- Subjects
Natural selection ,LOCI ,Genome-wide association study ,Evolutionary biology ,Continental Population Groups/genetics ,Human genetic variation ,VARIANTS ,Genoma humà ,Binding Sites/genetics ,0302 clinical medicine ,RARE ,Sequence Deletion/genetics ,WIDE ASSOCIATION ,ddc:576.5 ,Copy-number variation ,MUTATION ,Exome sequencing ,transcription factor ,Conserved Sequence ,Human evolution ,Sequence Deletion ,Genetics ,RISK ,0303 health sciences ,Multidisciplinary ,Continental Population Groups ,1000 Genomes Project Consortium ,Genetic analysis ,Genomics ,Polymorphism, Single Nucleotide/genetics ,Research Highlight ,3. Good health ,Algorithm ,Multidisciplinary Sciences ,Genetic Variation/genetics ,Map ,Science & Technology - Other Topics ,Conserved Sequence/genetics ,Integrated approach ,General Science & Technology ,Genetics, Medical ,Haplotypes/genetics ,Biology ,Polymorphism, Single Nucleotide ,Evolution, Molecular ,03 medical and health sciences ,Genetic variation ,Humans ,Transcription Factors/metabolism ,POPULATION-STRUCTURE ,1000 Genomes Project ,Polymorphism ,Nucleotide Motifs ,Alleles ,030304 developmental biology ,COPY NUMBER VARIATION ,Science & Technology ,Binding Sites ,Human genome ,Genome, Human ,Racial Groups ,Genetic Variation ,Genetics, Population ,Haplotypes ,Genome, Human/genetics ,untranslated RNA ,030217 neurology & neurosurgery ,Transcription Factors ,Genome-Wide Association Study
- Abstract
By characterizing the geographic and functional spectrum of human genetic variation, the 1000 Genomes Project aims to build a resource to help to understand the genetic contribution to disease. Here we describe the genomes of 1,092 individuals from 14 populations, constructed using a combination of low-coverage whole-genome and exome sequencing. By developing methods to integrate information across several algorithms and diverse data sources, we provide a validated haplotype map of 38 million single nucleotide polymorphisms, 1.4 million short insertions and deletions, and more than 14,000 larger deletions. We show that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation, which is further increased by the action of purifying selection. We show that evolutionary conservation and coding consequence are key determinants of the strength of purifying selection, that rare-variant load varies substantially across biological pathways, and that each individual contains hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites. This resource, which captures up to 98% of accessible single nucleotide polymorphisms at a frequency of 1% in related populations, enables analysis of common and low-frequency variants in individuals from diverse, including admixed, populations. Funding: National Institutes of Health (U.S.) (Grant RC2HL102925), National Institutes of Health (U.S.) (Grant U54HG3067)
- Published
- 2012
7. Genetic structure and domestication history of the grape
- Author
-
Myles S, Boyko AR, Owens CL, Brown PJ, Grassi F, Aradhya MK, Prins B, Reynolds A, Chia JM, Ware D, Bustamante CD, and Buckler ES
- Abstract
The grape is one of the earliest domesticated fruit crops and, since antiquity, it has been widely cultivated and prized for its fruit and wine. Here, we characterize genome-wide patterns of genetic variation in over 1,000 samples of the domesticated grape, Vitis vinifera subsp. vinifera, and its wild relative, V. vinifera subsp. sylvestris from the US Department of Agriculture grape germ-plasm collection. We find support for a Near East origin of vinifera and present evidence of introgression from local sylvestris as the grape moved into Europe. High levels of genetic diversity and rapid linkage disequilibrium (LD) decay have been maintained in vinifera, which is consistent with a weak domestication bottleneck followed by thousands of years of widespread vegetative propagation. The considerable genetic diversity within vinifera, however, is contained within a complex network of close pedigree relationships that has been generated by crosses among elite cultivars. We show that first-degree relationships are rare between wine and table grapes and among grapes from geographically distant regions. Our results suggest that although substantial genetic diversity has been maintained in the grape subsequent to domestication, there has been a limited exploration of this diversity. We propose that the adoption of vegetative propagation was a double-edged sword: Although it provided a benefit by ensuring true breeding cultivars, it also discouraged the generation of unique cultivars through crosses. The grape currently faces severe pathogen pressures, and the long-term sustainability of the grape and wine industries will rely on the exploitation of the grape's tremendous natural genetic diversity.
- Published
- 2011
8. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds.
- Author
-
Bovine HapMap Consortium, Gibbs, RA, Taylor, JF, Van Tassel, CP, Barendse, W, Eversole, KA, Gill, CA, Green, RD, Hamernik, DL, Kappes, SM, Lien, S, Matukumalli, LK, Mcevan, JC, Mazareth, LV, Schnabel, RD, Weinstock, GM, Wheeler, DA, Ajmone Marsan, Paolo (ORCID:0000-0003-3165-4579), Boettcher, PJ, Caetano, AR, Garcia, JF, Hanotte, O, Mariani, P, Skow, LC, Sonstegard, T, Williams, JL, Diallo, B, Hailemariam, L, Martinez, ML, Morris, CA, Silva, LO, Spelman, RJ, Malatu, W, Zhao, K, Abbey, CA, Agaba, M, Araujo, FR, Bunch, RJ, Burton, J, Gorni, C, Olivier, H, Harrison, BE, Luff, B, Machado, MA, Mwakaya, J, Plastow, G, Sim, W, Smith, T, Thomas, MB, Valentini, A, Williams, P, Womack, J, Wolliams, JA, Liu, Y, Qin, X, Worley, KC, Gao, C, Jiang, H, Moore, S, Ren, Y, Song, XZ, Bustamante, CD, Hernandez, RD, Muzny, DM, Patil, S, San Lucas, A, Fu, Q, Kent, MP, Vega, R, Matukumalli, A, Mcwilliam, S, Sclep, G, Bryc, K, Choi, J, Gao, H, Grefenstette, JJ, Murdoch, B, Stella, A, Villa Angulo, R, Wright, M, Aerts, J, Jann, O, Negrini, Riccardo (ORCID:0000-0002-8735-0286), Goddard, ME, Hayes, BJ, Bradley, DG, Lau, LP, Liu, GE, Lynn, DJ, Panzitta, F, and Dodds, KG
- Published
- 2009
9. POPULATION GENETICS. Genomic evidence for the Pleistocene and recent population history of Native Americans
- Author
-
Raghavan M, Steinrücken M, Harris K, Schiffels S, Rasmussen S, DeGiorgio M, Albrechtsen A, Valdiosera C, Ávila-Arcos MC, Malaspinas AS, Eriksson A, Moltke I, Metspalu M, Homburger JR, Wall J, Cornejo OE, Moreno-Mayar JV, Korneliussen TS, Pierre T, Rasmussen M, Campos PF, Damgaard Pde B, Allentoft ME, Lindo J, Metspalu E, Rodríguez-Varela R, Mansilla J, Henrickson C, Seguin-Orlando A, Malmström H, Stafford T Jr, Shringarpure SS, Moreno-Estrada A, Karmin M, Tambets K, Bergström A, Xue Y, Warmuth V, Friend AD, Singarayer J, Valdes P, Balloux F, Leboreiro I, Vera JL, Rangel-Villalobos H, Pettener D, Luiselli D, Davis LG, Heyer E, Zollikofer CP, Ponce de León MS, Smith CI, Grimes V, Pike KA, Deal M, Fuller BT, Arriaza B, Standen V, Luz MF, Ricaut F, Guidon N, Osipova L, Voevoda MI, Posukh OL, Balanovsky O, Lavryashina M, Bogunov Y, Khusnutdinova E, Gubina M, Balanovska E, Fedorova S, Litvinov S, Malyarchuk B, Derenko M, Mosher MJ, Archer D, Cybulski J, Petzelt B, Mitchell J, Worl R, Norman PJ, Parham P, Kemp BM, Kivisild T, Tyler-Smith C, Sandhu MS, Crawford M, Villems R, Smith DG, Waters MR, Goebel T, Johnson JR, Malhi RS, Jakobsson M, Meltzer DJ, Manica A, Durbin R, Bustamante CD, Song YS, Nielsen R, and Willerslev E
- Subjects
Gene Flow ,Siberia ,Models, Genetic ,Athabascans and Amerindians ,Human Migration ,Genetic history of Native American ,Indians, North American ,Humans ,Genomics ,Americas ,Population genetic ,History, Ancient ,Article
- Abstract
How and when the Americas were populated remains contentious. Using ancient and modern genome-wide data, we find that the ancestors of all present-day Native Americans, including Athabascans and Amerindians, entered the Americas as a single migration wave from Siberia no earlier than 23 thousand years ago (KYA), and after no more than an 8,000-year isolation period in Beringia. Following their arrival in the Americas, ancestral Native Americans diversified into two basal genetic branches around 13 KYA, one that is now dispersed across North and South America and another that is restricted to North America. Subsequent gene flow resulted in some Native Americans sharing ancestry with present-day East Asians (including Siberians) and, more distantly, Australo-Melanesians. Putative ‘Paleoamerican’ relict populations, including the historical Mexican Pericúes and South American Fuego-Patagonians, are not directly related to modern Australo-Melanesians as suggested by the Paleoamerican Model.
- Published
- 2015
10. Pulling out the 1%: whole-genome capture for the targeted enrichment of ancient DNA sequencing libraries
- Author
-
Carpenter ML, Buenrostro JD, Valdiosera C, Schroeder H, Allentoft ME, Sikora M, Rasmussen M, Gravel S, Guillén S, Nekhrizov G, Leshtakov K, Dimitrova D, Theodossiev N, Pettener D, Luiselli D, Sandoval K, Moreno-Estrada A, Li Y, Wang J, Gilbert MT, Willerslev E, Greenleaf WJ, and Bustamante CD
- Subjects
Male ,0106 biological sciences ,Adolescent ,Genomics ,Computational biology ,Biology ,010603 evolutionary biology ,01 natural sciences ,Genome ,Bone and Bones ,Article ,Deep sequencing ,03 medical and health sciences ,Principal Component Analysi ,Neanderthal genome ,MTDNA ,Genetics ,Humans ,ANCIENT DNA ,Genetics(clinical) ,Genomic library ,Environmental DNA ,Child ,History, Ancient ,Genetics (clinical) ,Gene Library ,030304 developmental biology ,Principal Component Analysis ,0303 health sciences ,Fossils ,Shotgun sequencing ,whole genome sequence ,High-Throughput Nucleotide Sequencing ,Nucleic Acid Hybridization ,DNA ,Mummies ,Sequence Analysis, DNA ,aDNA sequencing librarie ,Ancient DNA ,RNA ,Human genome ,Tooth ,Hair
- Abstract
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain
- Published
- 2013
11. Polygenic risk score portability for common diseases across genetically diverse populations.
- Author
-
Moreno-Grau S, Vernekar M, Lopez-Pineda A, Mas-Montserrat D, Barrabés M, Quinto-Cortés CD, Moatamed B, Lee MTM, Yu Z, Numakura K, Matsuda Y, Wall JD, Ioannidis AG, Katsanis N, Takano T, and Bustamante CD
- Subjects
- Female, Humans, Asian People genetics, Genome-Wide Association Study, Models, Genetic, Polymorphism, Single Nucleotide, White People genetics, Black People genetics, Genetic Predisposition to Disease, Genetic Risk Score
- Abstract
Background: Polygenic risk scores (PRS) derived from European individuals have reduced portability across global populations, limiting their clinical implementation at worldwide scale. Here, we investigate the performance of a wide range of PRS models across four ancestry groups (Africans, Europeans, East Asians, and South Asians) for 14 conditions of high-medical interest. Methods: To select the best-performing model per trait, we first compared PRS performances for publicly available scores, and constructed new models using different methods (LDpred2, PRS-CSx and SNPnet). We used 285 K European individuals from the UK Biobank (UKBB) for training and 18 K, including diverse ancestries, for testing. We then evaluated PRS portability for the best models in Europeans and compared their accuracies with respect to the best PRS per ancestry. Finally, we validated the selected PRS models using an independent set of 8,417 individuals from Biobank of the Americas-Genomelink (BbofA-GL); and performed a PRS-Phewas. Results: We confirmed a decay in PRS performances relative to Europeans when the evaluation was conducted using the best-PRS model for Europeans (51.3% for South Asians, 46.6% for East Asians and 39.4% for Africans). We observed an improvement in the PRS performances when specifically selecting ancestry specific PRS models (phenotype variance increase: 1.62 for Africans, 1.40 for South Asians and 0.96 for East Asians). Additionally, when we selected the optimal model conditional on ancestry for CAD, HDL-C and LDL-C, hypertension, hypothyroidism and T2D, PRS performance for studied populations was more comparable to what was observed in Europeans. Finally, we were able to independently validate tested models for Europeans, and conducted a PRS-Phewas, identifying cross-trait interplay between cardiometabolic conditions, and between immune-mediated components. Conclusion: Our work comprehensively evaluated PRS accuracy across a wide range of phenotypes, reducing the uncertainty with respect to which PRS model to choose and in which ancestry group. This evaluation has let us identify specific conditions where implementing risk-prioritization strategies could have practical utility across diverse ancestral groups, contributing to democratizing the implementation of PRS. (© 2024. The Author(s).)
- Published
- 2024
12. Unappreciated subcontinental admixture in Europeans and European Americans and implications for genetic epidemiology studies.
- Author
-
Gouveia MH, Bentley AR, Leal TP, Tarazona-Santos E, Bustamante CD, Adeyemo AA, Rotimi CN, and Shriner D
- Subjects
- Humans, Molecular Epidemiology, European People genetics, Genetics, Population
- Abstract
European-ancestry populations are recognized as stratified but not as admixed, implying that residual confounding by locus-specific ancestry can affect studies of association, polygenic adaptation, and polygenic risk scores. We integrate individual-level genome-wide data from ~19,000 European-ancestry individuals across 79 European populations and five European American cohorts. We generate a new reference panel that captures ancestral diversity missed by both the 1000 Genomes and Human Genome Diversity Projects. Both Europeans and European Americans are admixed at the subcontinental level, with admixture dates differing among subgroups of European Americans. After adjustment for both genome-wide and locus-specific ancestry, associations between a highly differentiated variant in LCT (rs4988235) and height or LDL-cholesterol were confirmed to be false positives whereas the association between LCT and body mass index was genuine. We provide formal evidence of subcontinental admixture in individuals with European ancestry, which, if not properly accounted for, can produce spurious results in genetic epidemiology studies., (© 2023. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.)
- Published
- 2023
13. The genomic history of the indigenous people of the Canary Islands.
- Author
-
Serrano JG, Ordóñez AC, Santana J, Sánchez-Cañadillas E, Arnay M, Rodríguez-Rodríguez A, Morales J, Velasco-Vázquez J, Alberto-Barroso V, Delgado-Darias T, de Mercadal MCC, Hernández JC, Moreno-Benítez MA, Pais J, Ringbauer H, Sikora M, McColl H, Pino-Yanes M, Ferrer MH, Bustamante CD, and Fregel R
- Subjects
- Humans, Spain, Africa, Northern, Indigenous Peoples, Islands, Genetic Variation, Genetics, Population, Genetic Drift, Genomics
- Abstract
The indigenous population of the Canary Islands, which colonized the archipelago around the 3rd century CE, provides both a window into the past of North Africa and a unique model to explore the effects of insularity. We generate genome-wide data from 40 individuals from the seven islands, dated between the 3rd and 16th centuries CE. Along with components already present in Moroccan Neolithic populations, the Canarian natives show signatures related to Bronze Age expansions in Eurasia and trans-Saharan migrations. The lack of gene flow between islands and constant or decreasing effective population sizes suggest that populations were isolated. While some island populations maintained relatively high genetic diversity, with the only detected bottleneck coinciding with the colonization time, other islands with fewer natural resources show the effects of insularity and isolation. Finally, consistent genetic differentiation between eastern and western islands points to a more complex colonization process than previously thought., (© 2023. The Author(s).)
- Published
- 2023
14. Neural ADMIXTURE for rapid genomic clustering.
- Author
-
Mantes AD, Montserrat DM, Bustamante CD, Giró-I-Nieto X, and Ioannidis AG
- Abstract
Characterizing the genetic structure of large cohorts has become increasingly important as genetic studies extend to massive, increasingly diverse biobanks. Popular methods decompose individual genomes into fractional cluster assignments with each cluster representing a vector of DNA variant frequencies. However, with rapidly increasing biobank sizes, these methods have become computationally intractable. Here we present Neural ADMIXTURE, a neural network autoencoder that follows the same modeling assumptions as the current standard algorithm, ADMIXTURE, while reducing the compute time by orders of magnitude, surpassing even the fastest alternatives. One month of continuous compute using ADMIXTURE can be reduced to just hours with Neural ADMIXTURE. A multi-head approach allows Neural ADMIXTURE to offer even further acceleration by calculating multiple cluster numbers in a single run. Furthermore, the models can be stored, allowing cluster assignment to be performed on new data in linear time without needing to share the training samples., Competing Interests: C.D.B. is the CEO of Galatea Bio, and A.G.I. also holds shares. The remaining authors declare no competing interests.
- Published
- 2023
15. Session Introduction: Overcoming health disparities in precision medicine.
- Author
-
Barnes KC, De La Vega FM, Bustamante CD, Gignoux CR, Kenny E, Mathias RA, and Pasaniuc B
- Subjects
- Humans, Computational Biology, Precision Medicine
- Abstract
The following sections are included: Overview, Equitable risk prediction, Pharmacoequity, Race, genetic ancestry, and population structure, Conclusion, Acknowledgments, References.
- Published
- 2023
16. Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease.
- Author
-
Zhou W, Kanai M, Wu KH, Rasheed H, Tsuo K, Hirbo JB, Wang Y, Bhattacharya A, Zhao H, Namba S, Surakka I, Wolford BN, Lo Faro V, Lopera-Maya EA, Läll K, Favé MJ, Partanen JJ, Chapman SB, Karjalainen J, Kurki M, Maasha M, Brumpton BM, Chavan S, Chen TT, Daya M, Ding Y, Feng YA, Guare LA, Gignoux CR, Graham SE, Hornsby WE, Ingold N, Ismail SI, Johnson R, Laisk T, Lin K, Lv J, Millwood IY, Moreno-Grau S, Nam K, Palta P, Pandit A, Preuss MH, Saad C, Setia-Verma S, Thorsteinsdottir U, Uzunovic J, Verma A, Zawistowski M, Zhong X, Afifi N, Al-Dabhani KM, Al Thani A, Bradford Y, Campbell A, Crooks K, de Bock GH, Damrauer SM, Douville NJ, Finer S, Fritsche LG, Fthenou E, Gonzalez-Arroyo G, Griffiths CJ, Guo Y, Hunt KA, Ioannidis A, Jansonius NM, Konuma T, Lee MTM, Lopez-Pineda A, Matsuda Y, Marioni RE, Moatamed B, Nava-Aguilar MA, Numakura K, Patil S, Rafaels N, Richmond A, Rojas-Muñoz A, Shortt JA, Straub P, Tao R, Vanderwerff B, Vernekar M, Veturi Y, Barnes KC, Boezen M, Chen Z, Chen CY, Cho J, Smith GD, Finucane HK, Franke L, Gamazon ER, Ganna A, Gaunt TR, Ge T, Huang H, Huffman J, Katsanis N, Koskela JT, Lajonchere C, Law MH, Li L, Lindgren CM, Loos RJF, MacGregor S, Matsuda K, Olsen CM, Porteous DJ, Shavit JA, Snieder H, Takano T, Trembath RC, Vonk JM, Whiteman DC, Wicks SJ, Wijmenga C, Wright J, Zheng J, Zhou X, Awadalla P, Boehnke M, Bustamante CD, Cox NJ, Fatumo S, Geschwind DH, Hayward C, Hveem K, Kenny EE, Lee S, Lin YF, Mbarek H, Mägi R, Martin HC, Medland SE, Okada Y, Palotie AV, Pasaniuc B, Rader DJ, Ritchie MD, Sanna S, Smoller JW, Stefansson K, van Heel DA, Walters RG, Zöllner S, Martin AR, Willer CJ, Daly MJ, and Neale BM
- Abstract
Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits., Competing Interests: M.J.D. is a founder of Maze Therapeutics. B.M.N. is a member of the scientific advisory board at Deep Genomics and a consultant for Camp4 Therapeutics, Takeda Pharmaceutical, and Biogen. The spouse of C.J.W. works at Regeneron Pharmaceuticals. C.-Y.C. is employed by Biogen. C.R.G. owns stock in 23andMe, Inc. T.R.G. has received research funding from various pharmaceutical companies to support the application of Mendelian randomization to drug target prioritization. E.E.K. has received speaker fees from Regeneron, Illumina, and 23andMe and is a member of the advisory board for Galatea Bio. R.E.M. has received speaker fees from Illumina and is a scientific advisor to the Epigenetic Clock Development Foundation. G.D.S. has received research funding from various pharmaceutical companies to support the application of Mendelian randomization to drug target prioritization. K.S. and U.T. are employed by deCODE Genetics/Amgen, Inc. J.Z. has received research funding from various pharmaceutical companies to support the application of Mendelian randomization to drug target prioritization. S.M. is a co-founder of and holds stock in Seonix Bio., (© 2022.)
- Published
- 2022
17. Validating and automating learning of cardiometabolic polygenic risk scores from direct-to-consumer genetic and phenotypic data: implications for scaling precision health research.
- Author
-
Lopez-Pineda A, Vernekar M, Moreno-Grau S, Rojas-Muñoz A, Moatamed B, Lee MTM, Nava-Aguilar MA, Gonzalez-Arroyo G, Numakura K, Matsuda Y, Ioannidis A, Katsanis N, Takano T, and Bustamante CD
- Subjects
- Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Multifactorial Inheritance genetics, Phenotype, Precision Medicine, Risk Factors, Cardiovascular Diseases, Diabetes Mellitus, Type 2 genetics, Hypertension genetics
- Abstract
Introduction: A major challenge to enabling precision health at a global scale is the bias between those who enroll in state-sponsored genomic research and those suffering from chronic disease. More than 30 million people have been genotyped by direct-to-consumer (DTC) companies such as 23andMe, AncestryDNA, and MyHeritage, providing a potential mechanism for democratizing access to medical interventions and thus catalyzing improvements in patient outcomes as the cost of data acquisition drops. However, much of these data are sequestered in the initial provider network, without the ability for the scientific community to either access or validate. Here, we present a novel geno-pheno platform that integrates heterogeneous data sources and applies learnings to common chronic disease conditions including Type 2 diabetes (T2D) and hypertension., Methods: We collected genotyped data from a novel DTC platform where participants upload their genotype data files and were invited to answer general health questionnaires regarding cardiometabolic traits over a period of 6 months. Quality control, imputation, and genome-wide association studies were performed on this dataset, and polygenic risk scores were built in a case-control setting using the BASIL algorithm., Results: We collected data on N = 4,550 (389 cases / 4,161 controls) who reported being affected or previously affected for T2D and N = 4,528 (1,027 cases / 3,501 controls) for hypertension. We identified 164 out of 272 variants showing identical effect direction to previously reported genome-significant findings in Europeans. The performance metric of the PRS models was AUC = 0.68, which is comparable to previously published PRS models obtained with larger datasets including clinical biomarkers., Discussion: DTC platforms have the potential of inverting research models of genome sequencing and phenotypic data acquisition. Quality control (QC) mechanisms proved to successfully enable traditional GWAS and PRS analyses. The direct participation of individuals has shown the potential to generate rich datasets enabling the creation of PRS cardiometabolic models. More importantly, federated learning of PRS from reuse of DTC data provides a mechanism for scaling precision health care delivery beyond the small number of countries that can afford to finance these efforts directly., Conclusions: The genetics of T2D and hypertension have been studied extensively in controlled datasets, and various polygenic risk scores (PRS) have been developed. We developed predictive tools for both phenotypes trained with heterogeneous genotypic and phenotypic data generated outside of the clinical environment and show that our methods can recapitulate prior findings with fidelity. From these observations, we conclude that it is possible to leverage DTC genetic repositories to identify individuals at risk of debilitating diseases based on their unique genetic landscape so that informed, timely clinical interventions can be incorporated., (© 2022. The Author(s).)
- Published
- 2022
18. Deconvoluting complex correlates of COVID-19 severity with a multi-omic pandemic tracking strategy.
- Author
-
Parikh VN, Ioannidis AG, Jimenez-Morales D, Gorzynski JE, De Jong HN, Liu X, Roque J, Cepeda-Espinoza VP, Osoegawa K, Hughes C, Sutton SC, Youlton N, Joshi R, Amar D, Tanigawa Y, Russo D, Wong J, Lauzon JT, Edelson J, Mas Montserrat D, Kwon Y, Rubinacci S, Delaneau O, Cappello L, Kim J, Shoura MJ, Raja AN, Watson N, Hammond N, Spiteri E, Mallempati KC, Montero-Martín G, Christle J, Kim J, Kirillova A, Seo K, Huang Y, Zhao C, Moreno-Grau S, Hershman SG, Dalton KP, Zhen J, Kamm J, Bhatt KD, Isakova A, Morri M, Ranganath T, Blish CA, Rogers AJ, Nadeau K, Yang S, Blomkalns A, O'Hara R, Neff NF, DeBoever C, Szalma S, Wheeler MT, Gates CM, Farh K, Schroth GP, Febbo P, deSouza F, Cornejo OE, Fernandez-Vina M, Kistler A, Palacios JA, Pinsky BA, Bustamante CD, Rivas MA, and Ashley EA
- Subjects
- Genome, Viral, Genome-Wide Association Study, Humans, SARS-CoV-2 genetics, COVID-19 epidemiology, Pandemics
- Abstract
The SARS-CoV-2 pandemic has differentially impacted populations across race and ethnicity. A multi-omic approach represents a powerful tool to examine risk across multi-ancestry genomes. We leverage a pandemic tracking strategy in which we sequence viral and host genomes and transcriptomes from nasopharyngeal swabs of 1049 individuals (736 SARS-CoV-2 positive and 313 SARS-CoV-2 negative) and integrate them with digital phenotypes from electronic health records from a diverse catchment area in Northern California. Genome-wide association disaggregated by admixture mapping reveals novel COVID-19-severity-associated regions containing previously reported markers of neurologic, pulmonary and viral disease susceptibility. Phylodynamic tracking of consensus viral genomes reveals no association with disease severity or inferred ancestry. Summary data from multiomic investigation reveals metagenomic and HLA associations with severe COVID-19. The wealth of data available from residual nasopharyngeal swabs in combination with clinical data abstracted automatically at scale highlights a powerful strategy for pandemic tracking, and reveals distinct epidemiologic, genetic, and biological associations for those at the highest risk., (© 2022. The Author(s).)
- Published
- 2022
19. Archetypal Analysis for population genetics.
- Author
-
Gimbernat-Mayol J, Dominguez Mantes A, Bustamante CD, Mas Montserrat D, and Ioannidis AG
- Subjects
- Genetic Predisposition to Disease, Genetics, Population, Genome, Genomics methods, Humans, Genome-Wide Association Study, Polymorphism, Single Nucleotide genetics
- Abstract
The estimation of genetic clusters using genomic data has application from genome-wide association studies (GWAS) to demographic history to polygenic risk scores (PRS) and is expected to play an important role in the analyses of increasingly diverse, large-scale cohorts. However, existing methods are computationally-intensive, prohibitively so in the case of nationwide biobanks. Here we explore Archetypal Analysis as an efficient, unsupervised approach for identifying genetic clusters and for associating individuals with them. Such unsupervised approaches help avoid conflating socially constructed ethnic labels with genetic clusters by eliminating the need for exogenous training labels. We show that Archetypal Analysis yields similar cluster structure to existing unsupervised methods such as ADMIXTURE and provides interpretative advantages. More importantly, we show that since Archetypal Analysis can be used with lower-dimensional representations of genetic data, significant reductions in computational time and memory requirements are possible. When Archetypal Analysis is run in such a fashion, it takes several orders of magnitude less compute time than the current standard, ADMIXTURE. Finally, we demonstrate uses ranging across datasets from humans to canids., Competing Interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: CDB and AGI are co-founders of Galatea Bio Inc.
- Published
- 2022
20. Author Correction: Comparative and demographic analysis of orang-utan genomes.
- Author
-
Locke DP, Hillier LW, Warren WC, Worley KC, Nazareth LV, Muzny DM, Yang SP, Wang Z, Chinwalla AT, Minx P, Mitreva M, Cook L, Delehaunty KD, Fronick C, Schmidt H, Fulton LA, Fulton RS, Nelson JO, Magrini V, Pohl C, Graves TA, Markovic C, Cree A, Dinh HH, Hume J, Kovar CL, Fowler GR, Lunter G, Meader S, Heger A, Ponting CP, Marques-Bonet T, Alkan C, Chen L, Cheng Z, Kidd JM, Eichler EE, White S, Searle S, Vilella AJ, Chen Y, Flicek P, Ma J, Raney B, Suh B, Burhans R, Herrero J, Haussler D, Faria R, Fernando O, Darré F, Farré D, Gazave E, Oliva M, Navarro A, Roberto R, Capozzi O, Archidiacono N, Della Valle G, Purgato S, Rocchi M, Konkel MK, Walker JA, Ullmer B, Batzer MA, Smit AFA, Hubley R, Casola C, Schrider DR, Hahn MW, Quesada V, Puente XS, Ordoñez GR, López-Otín C, Vinar T, Brejova B, Ratan A, Harris RS, Miller W, Kosiol C, Lawson HA, Taliwal V, Martins AL, Siepel A, RoyChoudhury A, Ma X, Degenhardt J, Bustamante CD, Gutenkunst RN, Mailund T, Dutheil JY, Hobolth A, Schierup MH, Ryder OA, Yoshinaga Y, de Jong PJ, Weinstock GM, Rogers J, Mardis ER, Gibbs RA, and Wilson RK
- Published
- 2022
21. Ancient DNA reveals five streams of migration into Micronesia and matrilocality in early Pacific seafarers.
- Author
-
Liu YC, Hunter-Anderson R, Cheronet O, Eakin J, Camacho F, Pietrusewsky M, Rohland N, Ioannidis A, Athens JS, Douglas MT, Ikehara-Quebral RM, Bernardos R, Culleton BJ, Mah M, Adamski N, Broomandkhoshbacht N, Callan K, Lawson AM, Mandl K, Michel M, Oppenheimer J, Stewardson K, Zalzala F, Kidd K, Kidd J, Schurr TG, Auckland K, Hill AVS, Mentzer AJ, Quinto-Cortés CD, Robson K, Kennett DJ, Patterson N, Bustamante CD, Moreno-Estrada A, Spriggs M, Vilar M, Lipson M, Pinhasi R, and Reich D
- Subjects
- Asian People genetics, Child, Female, History, Ancient, Humans, Male, Micronesia, Oceania, DNA, Ancient, DNA, Mitochondrial genetics, Human Migration history
- Abstract
Micronesia began to be peopled earlier than other parts of Remote Oceania, but the origins of its inhabitants remain unclear. We generated genome-wide data from 164 ancient and 112 modern individuals. Analysis reveals five migratory streams into Micronesia. Three are East Asian related, one is Polynesian, and a fifth is a Papuan source related to mainland New Guineans that is different from the New Britain-related Papuan source for southwest Pacific populations but is similarly derived from male migrants ~2500 to 2000 years ago. People of the Mariana Archipelago may derive all of their precolonial ancestry from East Asian sources, making them the only Remote Oceanians without Papuan ancestry. Female-inherited mitochondrial DNA was highly differentiated across early Remote Oceanian communities but homogeneous within, implying matrilocal practices whereby women almost never raised their children in communities different from the ones in which they grew up.
- Published
- 2022
22. Clotting factor genes are associated with preeclampsia in high-altitude pregnant women in the Peruvian Andes.
- Author
-
Nieves-Colón MA, Badillo Rivera KM, Sandoval K, Villanueva Dávalos V, Enriquez Lencinas LE, Mendoza-Revilla J, Adhikari K, González-Buenfil R, Chen JW, Zhang ET, Sockell A, Ortiz-Tello P, Hurtado GM, Condori Salas R, Cebrecos R, Manzaneda Choque JC, Manzaneda Choque FP, Yábar Pilco GP, Rawls E, Eng C, Huntsman S, Burchard E, Ruiz-Linares A, González-José R, Bedoya G, Rothhammer F, Bortolini MC, Poletti G, Gallo C, Bustamante CD, Baker JC, Gignoux CR, Wojcik GL, and Moreno-Estrada A
- Subjects
- Altitude, Blood Coagulation Factors, Blood Proteins genetics, Case-Control Studies, Factor VII genetics, Factor X genetics, Female, Humans, Peru epidemiology, Placenta, Pregnancy, Pre-Eclampsia epidemiology, Pre-Eclampsia genetics
- Abstract
Preeclampsia is a multi-organ complication of pregnancy characterized by sudden hypertension and proteinuria that is among the leading causes of preterm delivery and maternal morbidity and mortality worldwide. The heterogeneity of preeclampsia poses a challenge for understanding its etiology and molecular basis. Intriguingly, risk for the condition increases in high-altitude regions such as the Peruvian Andes. To investigate the genetic basis of preeclampsia in a population living at high altitude, we characterized genome-wide variation in a cohort of preeclamptic and healthy Andean families (n = 883) from Puno, Peru, a city located above 3,800 meters of altitude. Our study collected genomic DNA and medical records from case-control trios and duos in local hospital settings. We generated genotype data for 439,314 SNPs, determined global ancestry patterns, and mapped associations between genetic variants and preeclampsia phenotypes. A transmission disequilibrium test (TDT) revealed variants near genes of biological importance for placental and blood vessel function. The top candidate region was found on chromosome 13 of the fetal genome and contains clotting factor genes PROZ, F7, and F10. These findings provide supporting evidence that common genetic variants within coagulation genes play an important role in preeclampsia. A selection scan revealed a potential adaptive signal around the ADAM12 locus on chromosome 10, implicated in pregnancy disorders. Our discovery of an association in a functional pathway relevant to pregnancy physiology in an understudied population of Native American origin demonstrates the increased power of family-based study design and underscores the importance of conducting genetic research in diverse populations., Competing Interests: Declaration of interests J.W.C. is currently a full-time employee at Genentech, Inc. and holds stock in Roche Holding AG., (Copyright © 2022 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.)
- Published
- 2022
23. ClinGen Variant Curation Interface: a variant classification platform for the application of evidence criteria from ACMG/AMP guidelines.
- Author
-
Preston CG, Wright MW, Madhavrao R, Harrison SM, Goldstein JL, Luo X, Wand H, Wulf B, Cheung G, Mandell ME, Tong H, Cheng S, Iacocca MA, Pineda AL, Popejoy AB, Dalton K, Zhen J, Dwight SS, Babb L, DiStefano M, O'Daniel JM, Lee K, Riggs ER, Zastrow DB, Mester JL, Ritter DI, Patel RY, Subramanian SL, Milosavljevic A, Berg JS, Rehm HL, Plon SE, Cherry JM, Bustamante CD, and Costa HA
- Subjects
- Humans, Genetic Testing, Genomics, Genetic Variation, Genome, Human
- Abstract
Background: Identification of clinically significant genetic alterations involved in human disease has been dramatically accelerated by developments in next-generation sequencing technologies. However, the infrastructure and accessible comprehensive curation tools necessary for analyzing an individual patient genome and interpreting genetic variants to inform healthcare management have been lacking., Results: Here we present the ClinGen Variant Curation Interface (VCI), a global open-source variant classification platform for supporting the application of evidence criteria and classification of variants based on the ACMG/AMP variant classification guidelines. The VCI is among a suite of tools developed by the NIH-funded Clinical Genome Resource (ClinGen) Consortium and supports an FDA-recognized human variant curation process. Essential to this is the ability to enable collaboration and peer review across ClinGen Expert Panels supporting users in comprehensively identifying, annotating, and sharing relevant evidence while making variant pathogenicity assertions. To facilitate evidence-based improvements in human variant classification, the VCI is publicly available to the genomics community. Navigation workflows support users providing guidance to comprehensively apply the ACMG/AMP evidence criteria and document provenance for asserting variant classifications., Conclusions: The VCI offers a central platform for clinical variant classification that fills a gap in the learning healthcare system, facilitates widespread adoption of standards for clinical curation, and is available at https://curation.clinicalgenome.org., (© 2021. The Author(s).)
- Published
- 2022
24. Bayesian model comparison for rare-variant association studies.
- Author
-
Venkataraman GR, DeBoever C, Tanigawa Y, Aguirre M, Ioannidis AG, Mostafavi H, Spencer CCA, Poterba T, Bustamante CD, Daly MJ, Pirinen M, and Rivas MA
- Subjects
- Bayes Theorem, Female, Humans, Male, Phenotype, Genetic Variation, Genome-Wide Association Study, Models, Genetic
- Abstract
Whole-genome sequencing studies applied to large populations or biobanks with extensive phenotyping raise new analytic challenges. The need to consider many variants at a locus or group of genes simultaneously and the potential to study many correlated phenotypes with shared genetic architecture provide opportunities for discovery not addressed by the traditional one variant, one phenotype association study. Here, we introduce a Bayesian model comparison approach called MRP (multiple rare variants and phenotypes) for rare-variant association studies that considers correlation, scale, and direction of genetic effects across a group of genetic variants, phenotypes, and studies, requiring only summary statistic data. We apply our method to exome sequencing data (n = 184,698) across 2,019 traits from the UK Biobank, aggregating signals in genes. MRP demonstrates an ability to recover signals such as associations between PCSK9 and LDL cholesterol levels. We additionally find MRP effective in conducting meta-analyses in exome data. Non-biomarker findings include associations between MC1R and red hair color and skin color, IL17RA and monocyte count, and IQGAP2 and mean platelet volume. Finally, we apply MRP in a multi-phenotype setting; after clustering the 35 biomarker phenotypes based on genetic correlation estimates, we find that joint analysis of these phenotypes results in substantial power gains for gene-trait associations, such as in TNFRSF13B in one of the clusters containing diabetes- and lipid-related traits. Overall, we show that the MRP model comparison approach improves upon useful features from widely used meta-analysis approaches for rare-variant association analyses and prioritizes protective modifiers of disease risk., Competing Interests: Declaration of interests M.A.R. is on the SAB of 54Gene and Related Sciences; is scientific founder of Broadwing Bio; and has advised BioMarin, Third Rock Ventures, and MazeTx. C.D.B. is the owner and president of C.D.B. Consulting, LTD, and also a director at EdenRoc Sciences, LLC and BigData Bio LLC (LLC) and Etalon DX; founder of Arc Bio LLC (formerly IdentifyGenomics LLC and BigData Bio LLC); and an SAB member of Imprimed, FaunaBio, Columbia Care, and Digitalis Ventures. He is also a venture partner at F-Prime Capital Partners. M.J.D. is a founder of MazeTx., (Copyright © 2021 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.)
- Published
- 2021
25. Paths and timings of the peopling of Polynesia inferred from genomic networks.
- Author
-
Ioannidis AG, Blanco-Portillo J, Sandoval K, Hagelberg E, Barberena-Jonas C, Hill AVS, Rodríguez-Rodríguez JE, Fox K, Robson K, Haoa-Cardinali S, Quinto-Cortés CD, Miquel-Poblete JF, Auckland K, Parks T, Sofro ASM, Ávila-Arcos MC, Sockell A, Homburger JR, Eng C, Huntsman S, Burchard EG, Gignoux CR, Verdugo RA, Moraga M, Bustamante CD, Mentzer AJ, and Moreno-Estrada A
- Subjects
- Female, History, Medieval, Humans, Male, Polynesia, Genome, Human genetics, Genomics, Human Migration history, Native Hawaiian or Other Pacific Islander genetics
- Abstract
Polynesia was settled in a series of extraordinary voyages across an ocean spanning one third of the Earth, but the sequences of islands settled remain unknown and their timings disputed. Currently, several centuries separate the dates suggested by different archaeological surveys. Here, using genome-wide data from merely 430 modern individuals from 21 key Pacific island populations and novel ancestry-specific computational analyses, we unravel the detailed genetic history of this vast, dispersed island network. Our reconstruction of the branching Polynesian migration sequence reveals a serial founder expansion, characterized by directional loss of variants, that originated in Samoa and spread first through the Cook Islands (Rarotonga), then to the Society (Tōtaiete mā) Islands (11th century), the western Austral (Tuha'a Pae) Islands and Tuāmotu Archipelago (12th century), and finally to the widely separated, but genetically connected, megalithic statue-building cultures of the Marquesas (Te Henua 'Enana) Islands in the north, Raivavae in the south, and Easter Island (Rapa Nui), the easternmost of the Polynesian islands, settled in approximately AD 1200 via Mangareva., (© 2021. The Author(s), under exclusive licence to Springer Nature Limited.)
- Published
- 2021
26. Dynamic RNA Regulation in the Brain Underlies Physiological Plasticity in a Hibernating Mammal.
- Author
-
Fu R, Gillen AE, Grabek KR, Riemondy KA, Epperson LE, Bustamante CD, Hesselberth JR, and Martin SL
- Abstract
Hibernation is a physiological and behavioral phenotype that minimizes energy expenditure. Hibernators cycle between profound depression and rapid hyperactivation of multiple physiological processes, challenging our concept of mammalian homeostasis. How the hibernator orchestrates and survives these extremes while maintaining cell to organismal viability is unknown. Here, we enhance the genome integrity and annotation of a model hibernator, the 13-lined ground squirrel. Our new assembly brings this genome to near chromosome-level contiguity and adds thousands of previously unannotated genes. These new genomic resources were used to identify 6,505 hibernation-related, differentially-expressed and processed transcripts using RNA-seq data from three brain regions in animals whose physiological status was precisely defined using body temperature telemetry. A software tool, squirrelBox, was developed to foster further data analyses and visualization. SquirrelBox includes a comprehensive toolset for rapid visualization of gene level and cluster group dynamics, sequence scanning of k -mer and domains, and interactive exploration of gene lists. Using these new tools and data, we deconvolute seasonal from temperature-dependent effects on the brain transcriptome during hibernation for the first time, highlighting the importance of carefully timed samples for studies of differential gene expression in hibernation. The identified genes include a regulatory network of RNA binding proteins that are dynamic in hibernation along with the composition of the RNA pool. In addition to passive effects of temperature, we provide evidence for regulated transcription and RNA turnover during hibernation. Significant alternative splicing, largely temperature dependent, also occurs during hibernation. 
These findings form a crucial first step and provide a roadmap for future work toward defining novel mechanisms of tissue protection and metabolic depression that may one day be applied toward improving human health., Competing Interests: KG is CSO for Fauna Bio and SM and CB serve on its SAB. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2021 Fu, Gillen, Grabek, Riemondy, Epperson, Bustamante, Hesselberth and Martin.)
- Published
- 2021
- Full Text
- View/download PDF
27. Discovering prescription patterns in pediatric acute-onset neuropsychiatric syndrome patients.
- Author
-
Lopez Pineda A, Pourshafeie A, Ioannidis A, Leibold CM, Chan AL, Bustamante CD, Frankovich J, and Wojcik GL
- Subjects
- Child, Cohort Studies, Humans, Prescriptions, Autoimmune Diseases, Obsessive-Compulsive Disorder drug therapy, Streptococcal Infections
- Abstract
Objective: Pediatric acute-onset neuropsychiatric syndrome (PANS) is a complex neuropsychiatric syndrome characterized by an abrupt onset of obsessive-compulsive symptoms and/or severe eating restrictions, along with at least two concomitant debilitating cognitive, behavioral, or neurological symptoms. A wide range of pharmacological interventions along with behavioral and environmental modifications, and psychotherapies have been adopted to treat symptoms and underlying etiologies. Our goal was to develop a data-driven approach to identify treatment patterns in this cohort., Materials and Methods: In this cohort study, we extracted medical prescription histories from electronic health records. We developed a modified dynamic programming approach to perform global alignment of those medication histories. Our approach is unique since it considers time gaps in prescription patterns as part of the similarity strategy., Results: This study included 43 consecutive new-onset pre-pubertal patients who had at least 3 clinic visits. Our algorithm identified six clusters with distinct medication usage history which may represent clinicians' practice of treating PANS of different severities and etiologies, i.e., two most severe groups requiring high dose intravenous steroids; two arthritic or inflammatory groups requiring prolonged nonsteroidal anti-inflammatory drug (NSAID); and two mild relapsing/remitting groups treated with a short course of NSAID. The psychometric scores as outcomes in each cluster generally improved within the first two years., Discussion and Conclusion: Our algorithm shows potential to improve our knowledge of treatment patterns in the PANS cohort, while helping clinicians understand how patients respond to a combination of drugs., (Copyright © 2020. Published by Elsevier Inc.)
- Published
- 2021
- Full Text
- View/download PDF
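The global alignment of medication histories described in entry 27 can be illustrated with a minimal sketch. The abstract does not give the scoring scheme, so everything below is an assumption made for illustration: each history is encoded as a list of (drug, days since previous prescription) pairs, a matched pair of events is penalized in proportion to the difference in their time gaps, and the names `pair_score`, `align_score`, `gap_penalty`, and `time_weight` are hypothetical. The recurrence itself is the standard Needleman-Wunsch global alignment, not the authors' modified version.

```python
from typing import List, Tuple

# Hypothetical encoding: (drug name, days since the previous prescription).
Event = Tuple[str, int]


def pair_score(a: Event, b: Event, time_weight: float = 0.01) -> float:
    """Score for aligning two events (assumed form, not from the paper).

    +1 if the drugs match, -1 otherwise, minus a penalty proportional
    to the difference between the two time gaps.
    """
    base = 1.0 if a[0] == b[0] else -1.0
    return base - time_weight * abs(a[1] - b[1])


def align_score(x: List[Event], y: List[Event],
                gap_penalty: float = 1.0,
                time_weight: float = 0.01) -> float:
    """Needleman-Wunsch global alignment score between two histories."""
    n, m = len(x), len(y)
    # dp[i][j] = best score aligning the first i events of x with the first j of y.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = -gap_penalty * i
    for j in range(1, m + 1):
        dp[0][j] = -gap_penalty * j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score(x[i - 1], y[j - 1], time_weight),
                dp[i - 1][j] - gap_penalty,  # event in x left unaligned
                dp[i][j - 1] - gap_penalty,  # event in y left unaligned
            )
    return dp[n][m]
```

Under these assumptions, two histories with the same drugs but different spacing score lower than two identical histories, which is one simple way for time gaps to enter the similarity. Pairwise scores of this kind could then feed a clustering step.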
28. Human Demographic History Impacts Genetic Risk Prediction across Diverse Populations.
- Author
-
Martin AR, Gignoux CR, Walters RK, Wojcik GL, Neale BM, Gravel S, Daly MJ, Bustamante CD, and Kenny EE
- Published
- 2020
- Full Text
- View/download PDF
29. High-throughput SARS-CoV-2 and host genome sequencing from single nasopharyngeal swabs.
- Author
-
Gorzynski JE, De Jong HN, Amar D, Hughes CR, Ioannidis A, Bierman R, Liu D, Tanigawa Y, Kistler A, Kamm J, Kim J, Cappello L, Neff NF, Rubinacci S, Delaneau O, Shoura MJ, Seo K, Kirillova A, Raja A, Sutton S, Huang C, Sahoo MK, Mallempati KC, Montero-Martin G, Osoegawa K, Jimenez-Morales D, Watson N, Hammond N, Joshi R, Fernandez-Vina M, Christle JW, Wheeler MT, Febbo P, Farh K, Schroth G, Desouza F, Palacios J, Salzman J, Pinsky BA, Rivas MA, Bustamante CD, Ashley EA, and Parikh VN
- Abstract
During COVID19 and other viral pandemics, rapid generation of host and pathogen genomic data is critical to tracking infection and informing therapies. There is an urgent need for efficient approaches to this data generation at scale. We have developed a scalable, high throughput approach to generate high fidelity low pass whole genome and HLA sequencing, viral genomes, and representation of human transcriptome from single nasopharyngeal swabs of COVID19 patients.
- Published
- 2020
- Full Text
- View/download PDF
30. Clinical Genetics Lacks Standard Definitions and Protocols for the Collection and Use of Diversity Measures.
- Author
-
Popejoy AB, Crooks KR, Fullerton SM, Hindorff LA, Hooker GW, Koenig BA, Pino N, Ramos EM, Ritter DI, Wand H, Wright MW, Yudell M, Zou JY, Plon SE, Bustamante CD, and Ormond KE
- Subjects
- Adult, Child, Ethnicity, Female, Genetic Variation genetics, Genomics standards, Humans, Male, Precision Medicine standards, Prohibitins, Surveys and Questionnaires, Data Collection standards, Genetic Testing standards
- Abstract
Genetics researchers and clinical professionals rely on diversity measures such as race, ethnicity, and ancestry (REA) to stratify study participants and patients for a variety of applications in research and precision medicine. However, there are no comprehensive, widely accepted standards or guidelines for collecting and using such data in clinical genetics practice. Two NIH-funded research consortia, the Clinical Genome Resource (ClinGen) and Clinical Sequencing Evidence-generating Research (CSER), have partnered to address this issue and report how REA are currently collected, conceptualized, and used. Surveying clinical genetics professionals and researchers (n = 448), we found heterogeneity in the way REA are perceived, defined, and measured, with variation in the perceived importance of REA in both clinical and research settings. The majority of respondents (>55%) felt that REA are at least somewhat important for clinical variant interpretation, ordering genetic tests, and communicating results to patients. However, there was no consensus on the relevance of REA, including how each of these measures should be used in different scenarios and what information they can convey in the context of human genetics. A lack of common definitions and applications of REA across the precision medicine pipeline may contribute to inconsistencies in data collection, missing or inaccurate classifications, and misleading or inconclusive results. Thus, our findings support the need for standardization and harmonization of REA data collection and use in clinical genetics and precision health research., (Copyright © 2020 The Authors. Published by Elsevier Inc. All rights reserved.)
- Published
- 2020
- Full Text
- View/download PDF
31. Native American gene flow into Polynesia predating Easter Island settlement.
- Author
-
Ioannidis AG, Blanco-Portillo J, Sandoval K, Hagelberg E, Miquel-Poblete JF, Moreno-Mayar JV, Rodríguez-Rodríguez JE, Quinto-Cortés CD, Auckland K, Parks T, Robson K, Hill AVS, Avila-Arcos MC, Sockell A, Homburger JR, Wojcik GL, Barnes KC, Herrera L, Berríos S, Acuña M, Llop E, Eng C, Huntsman S, Burchard EG, Gignoux CR, Cifuentes L, Verdugo RA, Moraga M, Mentzer AJ, Bustamante CD, and Moreno-Estrada A
- Subjects
- Central America ethnology, Colombia ethnology, Europe ethnology, Genetics, Population, History, Medieval, Humans, Polymorphism, Single Nucleotide genetics, Polynesia, South America ethnology, Time Factors, Gene Flow genetics, Genome, Human genetics, Human Migration history, Indians, Central American genetics, Indians, South American genetics, Islands, Native Hawaiian or Other Pacific Islander genetics
- Abstract
The possibility of voyaging contact between prehistoric Polynesian and Native American populations has long intrigued researchers. Proponents have pointed to the existence of New World crops, such as the sweet potato and bottle gourd, in the Polynesian archaeological record, but nowhere else outside the pre-Columbian Americas [1-6], while critics have argued that these botanical dispersals need not have been human mediated [7]. The Norwegian explorer Thor Heyerdahl controversially suggested that prehistoric South American populations had an important role in the settlement of east Polynesia and particularly of Easter Island (Rapa Nui) [2]. Several limited molecular genetic studies have reached opposing conclusions, and the possibility continues to be as hotly contested today as it was when first suggested [8-12]. Here we analyse genome-wide variation in individuals from islands across Polynesia for signs of Native American admixture, analysing 807 individuals from 17 island populations and 15 Pacific coast Native American groups. We find conclusive evidence for prehistoric contact of Polynesian individuals with Native American individuals (around AD 1200) contemporaneous with the settlement of remote Oceania [13-15]. Our analyses suggest strongly that a single contact event occurred in eastern Polynesia, before the settlement of Rapa Nui, between Polynesian individuals and a Native American group most closely related to the indigenous inhabitants of present-day Colombia.
- Published
- 2020
- Full Text
- View/download PDF
32. FasTag: Automatic text classification of unstructured medical narratives.
- Author
-
Venkataraman GR, Pineda AL, Bear Don't Walk IV OJ, Zehnder AM, Ayyar S, Page RL, Bustamante CD, and Rivas MA
- Subjects
- Animals, Automation, Databases as Topic, Humans, Reproducibility of Results, Species Specificity, Data Mining, Narrative Medicine, Software
- Abstract
Unstructured clinical narratives are continuously being recorded as part of delivery of care in electronic health records, and dedicated tagging staff spend considerable effort manually assigning clinical codes for billing purposes. Despite these efforts, however, label availability and accuracy are both suboptimal. In this retrospective study, we aimed to automate the assignment of top-level International Classification of Diseases version 9 (ICD-9) codes to clinical records from human and veterinary data stores using minimal manual labor and feature curation. Automating top-level annotations could in turn enable rapid cohort identification, especially in a veterinary setting. To this end, we trained long short-term memory (LSTM) recurrent neural networks (RNNs) on 52,722 human and 89,591 veterinary records. We investigated the accuracy of both separate-domain and combined-domain models and probed model portability. We established relevant baseline classification performances by training Decision Trees (DT) and Random Forests (RF). We also investigated whether transforming the data using MetaMap Lite, a clinical natural language processing tool, affected classification performance. We showed that the LSTM-RNNs accurately classify veterinary and human text narratives into top-level categories with an average weighted macro F1 score of 0.74 and 0.68 respectively. In the "neoplasia" category, the model trained on veterinary data had a high validation accuracy in veterinary data and moderate accuracy in human data, with F1 scores of 0.91 and 0.70 respectively. Our LSTM method scored slightly higher than that of the DT and RF models. The use of LSTM-RNN models represents a scalable structure that could prove useful in cohort identification for comparative oncology studies. Digitization of human and veterinary health information will continue to be a reality, particularly in the form of unstructured narratives. 
Our approach is a step forward for these two domains to learn from and inform one another., Competing Interests: CDB is Principal and Chairman of CDB Consulting LTD. He has advised Fauna Bio, Inc., Imprimed, Embark Vet and Etalon DX as a member of their respective Scientific Advisory Boards, and is a Director of Etalon DX. AMZ is the CEO of Fauna Bio, Inc. MAR is on the SAB of 54Gene and has advised BioMarin, MazeTx, Related Sciences, and Goldfinch Bio. ALP declares that the research presented in this study was done while he was employed by Stanford University, but at the time of submission, he is now employed by Genentech, Inc., a member of the Roche group. This does not alter our adherence to PLOS ONE policies on sharing data and materials. The remaining authors declare no conflicts of interest.
- Published
- 2020
- Full Text
- View/download PDF
33. Development of a small panel of SNPs to infer ancestry in Chileans that distinguishes Aymara and Mapuche components.
- Author
-
Verdugo RA, Di Genova A, Herrera L, Moraga M, Acuña M, Berríos S, Llop E, Valenzuela CY, Bustamante ML, Digman D, Symon A, Asenjo S, López P, Blanco A, Suazo J, Barozet E, Caba F, Villalón M, Alvarado S, Cáceres D, Salgado K, Portales P, Moreno-Estrada A, Gignoux CR, Sandoval K, Bustamante CD, Eng C, Huntsman S, Burchard EG, Loira N, Maass A, and Cifuentes L
- Subjects
- Chile, Female, Gene Frequency genetics, Genetic Markers genetics, Genotype, Genotyping Techniques, Humans, Male, Phylogeography, Saliva, Ethnicity genetics, Genetics, Population organization & administration, Indians, South American genetics, Polymorphism, Single Nucleotide genetics, Population Groups genetics
- Abstract
Background: Current South American populations trace their origins mainly to three continental ancestries, i.e. European, Amerindian and African. Individual variation in relative proportions of each of these ancestries may be confounded with socio-economic factors due to population stratification. Therefore, ancestry is a potential confounder variable that should be considered in epidemiologic studies and in public health plans. However, there are few studies that have assessed the ancestry of the current admixed Chilean population. This is partly due to the high cost of genome-scale technologies commonly used to estimate ancestry. In this study we have designed a small panel of SNPs to accurately assess ancestry in the largest sampling to date of the Chilean mestizo population (n = 3349) from eight cities. Our panel is also able to distinguish between the two main Amerindian components of Chileans: Aymara from the north and Mapuche from the south., Results: A panel of 150 ancestry-informative markers (AIMs) of SNP type was selected to maximize ancestry informativeness and genome coverage. Of these, 147 were successfully genotyped by KASPar assays in 2843 samples, with an average missing rate of 0.012, and a 0.95 concordance with microarray data. The ancestries estimated with the panel of AIMs had relatively high correlations (0.88 for European, 0.91 for Amerindian, 0.70 for Aymara, and 0.68 for Mapuche components) with those obtained with the AXIOM LAT1 array. The country's average ancestry was 0.53 ± 0.14 European, 0.04 ± 0.04 African, and 0.42 ± 0.14 Amerindian, disaggregated into 0.18 ± 0.15 Aymara and 0.25 ± 0.13 Mapuche. However, Mapuche ancestry was highest in the south (40.03%) and Aymara in the north (35.61%) as expected from the historical location of these ethnic groups.
We make our results available through an online app and demonstrate how it can be used to adjust for ancestry when testing association between incidence of a disease and nongenetic risk factors., Conclusions: We have conducted the most extensive sampling, across many different cities, of the current Chilean population. Ancestry varied significantly by latitude and human development. The panel of AIMs is available to the community for estimating ancestry at low cost in Chileans and other populations with similar ancestry.
- Published
- 2020
- Full Text
- View/download PDF
34. Population History and Gene Divergence in Native Mexicans Inferred from 76 Human Exomes.
- Author
-
Ávila-Arcos MC, McManus KF, Sandoval K, Rodríguez-Rodríguez JE, Villa-Islas V, Martin AR, Luisi P, Peñaloza-Espinosa RI, Eng C, Huntsman S, Burchard EG, Gignoux CR, Bustamante CD, and Moreno-Estrada A
- Subjects
- Exome, Humans, Mexico, Phylogeography, Adaptation, Biological genetics, American Indian or Alaska Native genetics, Evolution, Molecular, Genetic Variation
- Abstract
Native American genetic variation remains underrepresented in most catalogs of human genome sequencing data. Previous genotyping efforts have revealed that Mexico's Indigenous population is highly differentiated and substructured, thus potentially harboring higher proportions of private genetic variants of functional and biomedical relevance. Here we have targeted the coding fraction of the genome and characterized its full site frequency spectrum by sequencing 76 exomes from five Indigenous populations across Mexico. Using diffusion approximations, we modeled the demographic history of Indigenous populations from Mexico with northern and southern ethnic groups splitting 7.2 KYA and subsequently diverging locally 6.5 and 5.7 KYA, respectively. Selection scans for positive selection revealed BCL2L13 and KBTBD8 genes as potential candidates for adaptive evolution in Rarámuris and Triquis, respectively. BCL2L13 is highly expressed in skeletal muscle and could be related to physical endurance, a well-known phenotype of the northern Mexico Rarámuri. The KBTBD8 gene has been associated with idiopathic short stature and we found it to be highly differentiated in Triqui, a southern Indigenous group from Oaxaca whose height is extremely low compared to other Native populations., (© The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
- Published
- 2020
- Full Text
- View/download PDF
35. Ancient DNA Reconstructs the Genetic Legacies of Precontact Puerto Rico Communities.
- Author
-
Nieves-Colón MA, Pestle WJ, Reynolds AW, Llamas B, de la Fuente C, Fowler K, Skerry KM, Crespo-Torres E, Bustamante CD, and Stone AC
- Subjects
- Bone and Bones, Fossils, Genetics, Population, Haplotypes, High-Throughput Nucleotide Sequencing, Human Migration, Humans, Puerto Rico ethnology, Tooth, Chromosomes, Human genetics, DNA, Ancient analysis, DNA, Mitochondrial genetics, Dental Calculus genetics, Indigenous Peoples genetics
- Abstract
Indigenous peoples have occupied the island of Puerto Rico since at least 3000 BC. Due to the demographic shifts that occurred after European contact, the origin(s) of these ancient populations, and their genetic relationship to present-day islanders, are unclear. We use ancient DNA to characterize the population history and genetic legacies of precontact Indigenous communities from Puerto Rico. Bone, tooth, and dental calculus samples were collected from 124 individuals from three precontact archaeological sites: Tibes, Punta Candelero, and Paso del Indio. Despite poor DNA preservation, we used target enrichment and high-throughput sequencing to obtain complete mitochondrial genomes (mtDNA) from 45 individuals and autosomal genotypes from two individuals. We found a high proportion of Native American mtDNA haplogroups A2 and C1 in the precontact Puerto Rico sample (40% and 44%, respectively). This distribution, as well as the haplotypes represented, supports a primarily Amazonian South American origin for these populations and mirrors the Native American mtDNA diversity patterns found in present-day islanders. Three mtDNA haplotypes from precontact Puerto Rico persist among Puerto Ricans and other Caribbean islanders, indicating that present-day populations are reservoirs of precontact mtDNA diversity. Lastly, we find similarity in autosomal ancestry patterns between precontact individuals from Puerto Rico and the Bahamas, suggesting a shared component of Indigenous Caribbean ancestry with close affinity to South American populations. Our findings contribute to a more complete reconstruction of precontact Caribbean population history and explore the role of Indigenous peoples in shaping the biocultural diversity of present-day Puerto Ricans and other Caribbean islanders., (© The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. 
For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2020
- Full Text
- View/download PDF
36. LitGen: Genetic Literature Recommendation Guided by Human Explanations.
- Author
-
Nie A, Pineda AL, Wright MW, Wand H, Wulf B, Costa HA, Patel RY, Bustamante CD, and Zou J
- Subjects
- Case-Control Studies, Humans, Computational Biology, Genetic Variation
- Abstract
As genetic sequencing costs decrease, the lack of clinical interpretation of variants has become the bottleneck in using genetics data. A major rate-limiting step in clinical interpretation is the manual curation of evidence in the genetic literature by highly trained biocurators. What makes curation particularly time-consuming is that the curator needs to identify papers that study variant pathogenicity using different types of approaches and evidence, e.g., biochemical assays or case-control analysis. In collaboration with the Clinical Genomic Resource (ClinGen), the flagship NIH program for clinical curation, we propose the first machine learning system, LitGen, that can retrieve papers for a particular variant and filter them by specific evidence types used by curators to assess for pathogenicity. LitGen uses semi-supervised deep learning to predict the type of evidence provided by each paper. It is trained on papers annotated by ClinGen curators and systematically evaluated on new test data collected by ClinGen. LitGen further leverages rich human explanations and unlabeled data to gain 7.9%-12.6% relative performance improvement over models learned only on the annotated papers. It is a useful framework to improve clinical variant curation.
- Published
- 2020
37. Genetic variation drives seasonal onset of hibernation in the 13-lined ground squirrel.
- Author
-
Grabek KR, Cooke TF, Epperson LE, Spees KK, Cabral GF, Sutton SC, Merriman DK, Martin SL, and Bustamante CD
- Subjects
- Animals, Female, Genetic Loci, Genetics, Population, Genome, Genomics methods, Geography, Inheritance Patterns, Male, Polymorphism, Single Nucleotide, Genetic Variation, Hibernation genetics, Sciuridae physiology, Seasons
- Abstract
Hibernation in sciurid rodents is a dynamic phenotype timed by a circannual clock. When housed in an animal facility, 13-lined ground squirrels exhibit variation in seasonal onset of hibernation, which is not explained by environmental or biological factors. We hypothesized that genetic factors instead drive variation in timing. After increasing genome contiguity, here, we employ a genotype-by-sequencing approach to characterize genetic variation in 153 ground squirrels. Combined with datalogger records (n = 72), we estimate high heritability (61-100%) for hibernation onset. Applying a genome-wide scan with 46,996 variants, we identify 2 loci significantly (p < 7.14 × 10⁻⁶), and 12 loci suggestively (p < 2.13 × 10⁻⁴), associated with onset. At the most significant locus, whole-genome resequencing reveals a putative causal variant in the promoter of FAM204A. Expression quantitative trait loci (eQTL) analyses further reveal gene associations for 8/14 loci. Our results highlight the power of applying genetic mapping to hibernation and present new insight into genetics driving its onset., Competing Interests: The authors declare no competing non-financial interests but the following competing financial interests: K.G. is co-founder of, equity owner in and chief scientific officer to Fauna Bio Incorporated. S.M. is an equity owner in and advisor to Fauna Bio Incorporated. C.B. is an equity owner in and advisor to both Fauna Bio Incorporated and Dovetail Genomics., (© The Author(s) 2019.)
- Published
- 2019
- Full Text
- View/download PDF
38. The inference of sex-biased human demography from whole-genome data.
- Author
-
Musharoff S, Shringarpure S, Bustamante CD, and Ramachandran S
- Subjects
- Bias, Chromosomes, Human, X genetics, Female, Genetic Variation genetics, Genome genetics, Humans, Male, Models, Genetic, Population Density, Selection, Genetic genetics, Whole Genome Sequencing methods, Demography methods, Genetics, Population methods, Sequence Analysis, DNA methods
- Abstract
Sex-biased demographic events ("sex-bias") involve unequal numbers of females and males. These events are typically inferred from the relative amount of X-chromosomal to autosomal genetic variation and have led to conflicting conclusions about human demographic history. Though population size changes alter the relative amount of X-chromosomal to autosomal genetic diversity even in the absence of sex-bias, this has generally not been accounted for in sex-bias estimators to date. Here, we present a novel method to identify sex-bias from genetic sequence data that models population size changes and estimates the female fraction of the effective population size during each time epoch. Compared to recent sex-bias inference methods, our approach can detect sex-bias that changes on a single population branch without requiring data from an outgroup or knowledge of divergence events. When applied to simulated data, conventional sex-bias estimators are biased by population size changes, especially recent growth or bottlenecks, while our estimator is unbiased. We next apply our method to high-coverage exome data from the 1000 Genomes Project and estimate a male bias in Yorubans (47% female) and Europeans (44%), possibly due to stronger background selection on the X chromosome than on the autosomes. Finally, we apply our method to the 1000 Genomes Project Phase 3 high-coverage Complete Genomics whole-genome data and estimate a female bias in Yorubans (63% female), Europeans (84%), Punjabis (82%), as well as Peruvians (56%), and a male bias in the Southern Han Chinese (45%). Our method additionally identifies a male-biased migration out of Africa based on data from Europeans (20% female). Our results demonstrate that modeling population size change is necessary to estimate sex-bias parameters accurately. 
Our approach gives insight into signatures of sex-bias in sexual species, and the demographic models it produces can serve as more accurate null models for tests of selection., Competing Interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: CDB is the Owner and President of CDB Consulting, LTD. and also a Director at EdenRoc Sciences, LLC and Etalon DX, founder of Arc Bio LLC (formerly IdentifyGenomics LLC and BigData Bio LLC), and an SAB member of Imprimed, FaunaBio, Columbia Care, and Digitalis Ventures. He is also a Venture Partner at F-Prime Capital Partners. None of these entities played a role in the design, execution, or interpretation of experiments or the results presented here.
- Published
- 2019
- Full Text
- View/download PDF
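For context on entry 38, the conventional estimators that the abstract describes as biased rest on a standard population-genetic relationship. Under a neutral, constant-size model, the autosomal and X-chromosomal effective sizes are N_A = 4·Nf·Nm / (Nf + Nm) and N_X = 9·Nf·Nm / (2·Nf + 4·Nm), so their ratio depends only on the female fraction p = Nf / (Nf + Nm): Q = N_X / N_A = 9 / (8·(2 − p)). The sketch below encodes that textbook relationship and its inverse. It is not the authors' method, which additionally models population size changes, and the function names are hypothetical.

```python
def expected_x_to_a_ratio(female_fraction: float) -> float:
    """Expected X-to-autosome effective size ratio Q = 9 / (8 * (2 - p)).

    Assumes neutrality and constant population size; p is the female
    fraction of the breeding population, strictly between 0 and 1.
    """
    if not 0.0 < female_fraction < 1.0:
        raise ValueError("female_fraction must be strictly between 0 and 1")
    return 9.0 / (8.0 * (2.0 - female_fraction))


def female_fraction_from_ratio(q: float) -> float:
    """Invert Q = 9 / (8 * (2 - p)) to get p = 2 - 9 / (8 * Q).

    Q must lie strictly between 9/16 (all male) and 9/8 (all female).
    """
    if not 9.0 / 16.0 < q < 9.0 / 8.0:
        raise ValueError("q must be strictly between 9/16 and 9/8")
    return 2.0 - 9.0 / (8.0 * q)
```

With equal numbers of females and males (p = 0.5) the ratio is 0.75. Because this mapping assumes constant size, an observed ratio shifted by growth or a bottleneck would be misread as sex-bias, which is the problem the abstract raises.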
39. Genomic Evidence for Local Adaptation of Hunter-Gatherers to the African Rainforest.
- Author
-
Lopez M, Choin J, Sikora M, Siddle K, Harmant C, Costa HA, Silvert M, Mouguiama-Daouda P, Hombert JM, Froment A, Le Bomin S, Perry GH, Barreiro LB, Bustamante CD, Verdu P, Patin E, and Quintana-Murci L
- Subjects
- Cameroon, Farmers, Gabon, Genome, Human, Humans, Rainforest, Regulatory Sequences, Nucleic Acid, Repressor Proteins genetics, Uganda, Adaptation, Biological, Gene Flow, Life Style, Multifactorial Inheritance
- Abstract
African rainforests support exceptionally high biodiversity and host the world's largest number of active hunter-gatherers [1-3]. The genetic history of African rainforest hunter-gatherers and neighboring farmers is characterized by an ancient divergence more than 100,000 years ago, together with recent population collapses and expansions, respectively [4-12]. While the demographic past of rainforest hunter-gatherers has been deeply characterized, important aspects of their history of genetic adaptation remain unclear. Here, we investigated how these groups have adapted (through classic selective sweeps, polygenic adaptation, and selection since admixture) to the challenging rainforest environments. To do so, we analyzed a combined dataset of 566 high-coverage exomes, including 266 newly generated exomes, from 14 populations of rainforest hunter-gatherers and farmers, together with 40 newly generated, low-coverage genomes. We find evidence for a strong, shared selective sweep among all hunter-gatherer groups in the regulatory region of TRPS1, which is primarily involved in morphological traits. We detect strong signals of polygenic adaptation for height and life history traits such as reproductive age; however, the latter appear to result from pervasive pleiotropy of height-associated genes. Furthermore, polygenic adaptation signals for functions related to responses of mast cells to allergens and microbes, the IL-2 signaling pathway, and host interactions with viruses support a history of pathogen-driven selection in the rainforest. Finally, we find that genes involved in heart and bone development and immune responses are enriched in both selection signals and local hunter-gatherer ancestry in admixed populations, suggesting that selection has maintained adaptive variation in the face of recent gene flow from farmers., (Copyright © 2019 Elsevier Ltd. All rights reserved.)
- Published
- 2019
- Full Text
- View/download PDF
40. Xrare: a machine learning method jointly modeling phenotypes and genetic evidence for rare disease diagnosis.
- Author
-
Li Q, Zhao K, Bustamante CD, Ma X, and Wong WH
- Subjects
- Computational Biology, Exome genetics, Genetic Testing, Genetic Variation genetics, Genotype, High-Throughput Nucleotide Sequencing, Humans, Mutation, Phenotype, Rare Diseases genetics, Genomics methods, Machine Learning, Rare Diseases diagnosis, Software
- Abstract
Purpose: Despite the successful progress next-generation sequencing technologies have achieved in diagnosing the genetic cause of rare Mendelian diseases, the current diagnostic rate is still far from satisfactory because of heterogeneity, imprecision, and noise in disease phenotype descriptions and insufficient utilization of expert knowledge in clinical genetics. To overcome these difficulties, we present a novel method called Xrare for the prioritization of causative gene variants in rare disease diagnosis., Methods: We propose a new phenotype similarity scoring method called Emission-Reception Information Content (ERIC), which is highly tolerant of noise and imprecision in clinical phenotypes. We utilize medical genetic domain knowledge by designing genetic features implementing American College of Medical Genetics and Genomics (ACMG) guidelines., Results: ERIC score ranked consistently higher for disease genes than other phenotypic similarity scores in the presence of imprecise and noisy phenotypes. Extensive simulations and real clinical data demonstrated that Xrare outperforms existing alternative methods by 10-40% across various genetic diagnosis scenarios., Conclusion: The Xrare model is learned from a large database of clinical variants, and derives its strength from the tight integration of medical genetics features and phenotypic features similarity scores. Xrare provides the clinical community with a robust and powerful tool for variant prioritization.
- Published
- 2019
- Full Text
- View/download PDF
41. Genetic analyses of diverse populations improves discovery for complex traits.
- Author
-
Wojcik GL, Graff M, Nishimura KK, Tao R, Haessler J, Gignoux CR, Highland HM, Patel YM, Sorokin EP, Avery CL, Belbin GM, Bien SA, Cheng I, Cullina S, Hodonsky CJ, Hu Y, Huckins LM, Jeff J, Justice AE, Kocarnik JM, Lim U, Lin BM, Lu Y, Nelson SC, Park SL, Poisner H, Preuss MH, Richard MA, Schurmann C, Setiawan VW, Sockell A, Vahi K, Verbanck M, Vishnu A, Walker RW, Young KL, Zubair N, Acuña-Alonso V, Ambite JL, Barnes KC, Boerwinkle E, Bottinger EP, Bustamante CD, Caberto C, Canizales-Quinteros S, Conomos MP, Deelman E, Do R, Doheny K, Fernández-Rhodes L, Fornage M, Hailu B, Heiss G, Henn BM, Hindorff LA, Jackson RD, Laurie CA, Laurie CC, Li Y, Lin DY, Moreno-Estrada A, Nadkarni G, Norman PJ, Pooler LC, Reiner AP, Romm J, Sabatti C, Sandoval K, Sheng X, Stahl EA, Stram DO, Thornton TA, Wassel CL, Wilkens LR, Winkler CA, Yoneyama S, Buyske S, Haiman CA, Kooperberg C, Le Marchand L, Loos RJF, Matise TC, North KE, Peters U, Kenny EE, and Carlson CS
- Subjects
- Body Height genetics, Cohort Studies, Female, Genetics, Medical methods, Health Equity trends, Health Status Disparities, Humans, Male, United States, Black or African American, Asian People genetics, Black People genetics, Genome-Wide Association Study methods, Hispanic or Latino genetics, Minority Groups, Multifactorial Inheritance genetics, Women's Health
- Abstract
Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry [1-3]. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific [4-10]. Additionally, effect sizes and the risk prediction scores derived from them in one population may not accurately extrapolate to other populations [11,12]. Here we demonstrate the value of diverse, multi-ethnic participants in large-scale genomic studies. The Population Architecture using Genomics and Epidemiology (PAGE) study conducted a GWAS of 26 clinical and behavioural phenotypes in 49,839 non-European individuals. Using strategies tailored for analysis of multi-ethnic and admixed populations, we describe a framework for analysing diverse populations, identify 27 novel loci and 38 secondary signals at known loci, as well as replicate 1,444 GWAS catalogue associations across these traits. Our data show evidence of effect-size heterogeneity across ancestries for published GWAS associations, substantial benefits for fine-mapping using diverse cohorts and insights into clinical implications. In the United States, where minority populations have a disproportionately higher burden of chronic conditions [13], the lack of representation of diverse populations in genetic research will result in inequitable access to precision medicine for those with the highest burden of disease. We strongly advocate for continued, large genome-wide efforts in diverse populations to maximize genetic discovery and reduce health disparities.
- Published
- 2019
42. Standardized Biogeographic Grouping System for Annotating Populations in Pharmacogenetic Research.
- Author
-
Huddart R, Fohner AE, Whirl-Carrillo M, Wojcik GL, Gignoux CR, Popejoy AB, Bustamante CD, Altman RB, and Klein TE
- Subjects
- Classification, Gene Frequency, Genetic Variation, Humans, Pharmacogenomic Testing, Topography, Medical, Genetics, Population methods, Geographic Mapping, Pharmacogenetics methods, Population Groups classification, Population Groups genetics
- Abstract
The varying frequencies of pharmacogenetic alleles among populations have important implications for the impact of these alleles in different populations. Current population grouping methods to communicate these patterns are insufficient as they are inconsistent and fail to reflect the global distribution of genetic variability. To facilitate and standardize the reporting of variability in pharmacogenetic allele frequencies, we present seven geographically defined groups: American, Central/South Asian, East Asian, European, Near Eastern, Oceanian, and Sub-Saharan African, and two admixed groups: African American/Afro-Caribbean and Latino. These nine groups are defined by global autosomal genetic structure and based on data from large-scale sequencing initiatives. We recognize that broadly grouping global populations is an oversimplification of human diversity and does not capture complex social and cultural identity. However, these groups meet a key need in pharmacogenetics research by enabling consistent communication of the scale of variability in global allele frequencies and are now used by the Pharmacogenomics Knowledgebase (PharmGKB). (© 2018 The Authors Clinical Pharmacology & Therapeutics © 2018 American Society for Clinical Pharmacology and Therapeutics.)
- Published
- 2019
43. Structural Variation Detection by Proximity Ligation from Formalin-Fixed, Paraffin-Embedded Tumor Tissue.
- Author
-
Troll CJ, Putnam NH, Hartley PD, Rice B, Blanchette M, Siddiqui S, Ganbat JO, Powers MP, Ramakrishnan R, Kunder CA, Bustamante CD, Zehnder JL, Green RE, and Costa HA
- Subjects
- DNA, Neoplasm genetics, Gene Rearrangement genetics, Humans, Neoplasms genetics, Paraffin Embedding methods, Tissue Fixation methods
- Abstract
The clinical management and therapy of many solid tumor malignancies depends on detection of medically actionable or diagnostically relevant genetic variation. However, a principal challenge for genetic assays from tumors is the fragmented and chemically damaged state of DNA in formalin-fixed, paraffin-embedded (FFPE) samples. From highly fragmented DNA and RNA there is no current technology for generating long-range DNA sequence data as is required to detect genomic structural variation or long-range genotype phasing. We have developed a high-throughput chromosome conformation capture approach for FFPE samples that we call Fix-C, which is similar in concept to Hi-C. Fix-C enables structural variation detection from archival FFPE samples. This method was applied to 15 clinical adenocarcinoma- and sarcoma-positive control specimens spanning a broad range of tumor purities. In this panel, Fix-C analysis achieves a 90% concordance rate with fluorescence in situ hybridization assays, the current clinical gold standard. In addition, novel structural variation undetected by other methods could be identified, and long-range chromatin configuration information recovered from these FFPE samples harboring highly degraded DNA. This powerful approach will enable detailed resolution of global genome rearrangement events during cancer progression from FFPE material and will inform the development of targeted molecular diagnostic assays for patient care. (Copyright © 2019 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.)
- Published
- 2019
44. A genetic counseling needs assessment of Mexico.
- Author
-
Bucio D, Ormond KE, Hernandez D, Bustamante CD, and Lopez Pineda A
- Subjects
- Health Workforce statistics & numerical data, Mexico, Education, Medical, Graduate statistics & numerical data, Facilities and Services Utilization statistics & numerical data, Genetic Counseling statistics & numerical data, Needs Assessment statistics & numerical data
- Abstract
Background: While genetic counseling has expanded globally, Mexico has not adopted it as a separate profession. Given the rapid expansion of genetic and genomic services, understanding the current genetic counseling landscape in Mexico is crucial to improving healthcare outcomes. Methods: Our needs assessment strategy has two components. First, we gathered quantitative data about genetics education and medical geneticists' geographic distribution through an exhaustive compilation of available information across several medical schools and public databases. Second, we conducted remote semi-structured interviews, digitally recorded and transcribed, with 19 key informants from 10 Mexican states. Results: Across 32 states, ~54% of enrolled medical students receive no medical genetics training, and only Mexico City averages at least one medical geneticist per 100,000 people. Barriers to genetic counseling services include: geographic distribution of medical geneticists, lack of access to diagnostic tools, patient health literacy and cultural beliefs, and education in medical genetics/genetic counseling. Participants reported generally positive attitudes towards a genetic counseling profession; concerns regarding a current shortage of available jobs for medical geneticists persisted. Conclusion: To create a foundation that can support a genetic counseling profession in Mexico, the clinical significance of medical genetics must be promoted nationwide. Potential approaches include: requiring medical genetics coursework, developing community genetics services, and increasing jobs for medical geneticists. (© 2019 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.)
- Published
- 2019
45. Mitogenomes illuminate the origin and migration patterns of the indigenous people of the Canary Islands.
- Author
-
Fregel R, Ordóñez AC, Santana-Cabrera J, Cabrera VM, Velasco-Vázquez J, Alberto V, Moreno-Benítez MA, Delgado-Darias T, Rodríguez-Rodríguez A, Hernández JC, Pais J, González-Montelongo R, Lorenzo-Salazar JM, Flores C, Cruz-de-Mercadal MC, Álvarez-Rodríguez N, Shapiro B, Arnay M, and Bustamante CD
- Subjects
- Africa, Northern ethnology, Europe ethnology, Genetic Drift, Genetics, Population, Genome, Mitochondrial, Humans, Middle East, Phylogeography, Sequence Analysis, DNA, Spain ethnology, Ethnicity genetics, High-Throughput Nucleotide Sequencing methods, Mitochondria genetics, Transients and Migrants classification
- Abstract
The Canary Islands' indigenous people have been the subject of substantial archaeological, anthropological, linguistic and genetic research pointing to a most probable North African Berber source. However, neither agreement about the exact point of origin nor a model for the indigenous colonization of the islands has been established. To shed light on these questions, we analyzed 48 ancient mitogenomes from 25 archaeological sites from the seven main islands. Most lineages observed in the ancient samples have a Mediterranean distribution, and belong to lineages associated with the Neolithic expansion in the Near East and Europe (T2c, J2a, X3a…). This phylogeographic analysis of Canarian ancient mitogenomes, the first of its kind, shows that some lineages are restricted to Central North Africa (H1cf, J2a2d and T2c1d3), while others have a wider distribution, including both West and Central North Africa, and, in some cases, Europe and the Near East (U6a1a1, U6a7a1, U6b, X3a, U6c1). In addition, we identify four new Canarian-specific lineages (H1e1a9, H4a1e, J2a2d1a and L3b1a12) whose coalescence dates correlate with the estimated time for the colonization of the islands (1st millennium CE). Additionally, we observe an asymmetrical distribution of mtDNA haplogroups in the ancient population, with certain haplogroups appearing more frequently in the islands closer to the continent. This reinforces results based on modern mtDNA and Y-chromosome data, and archaeological evidence suggesting the existence of two distinct migrations. Comparisons between insular populations show that some populations had high genetic diversity, while others were probably affected by genetic drift and/or bottlenecks. In spite of observing interinsular differences in the survival of indigenous lineages, modern populations, with the sole exception of La Gomera, are homogenous across the islands, supporting the theory of extensive human mobility after the European conquest. Competing Interests: Tibicena Arqueología y Patrimonio provided support in the form of salaries for authors V.A. and M.A.M.B and radiocarbon dating of four archaeological samples. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
- Published
- 2019
46. An admixture mapping meta-analysis implicates genetic variation at 18q21 with asthma susceptibility in Latinos.
- Author
-
Gignoux CR, Torgerson DG, Pino-Yanes M, Uricchio LH, Galanter J, Roth LA, Eng C, Hu D, Nguyen EA, Huntsman S, Mathias RA, Kumar R, Rodriguez-Santana J, Thakur N, Oh SS, McGarry M, Moreno-Estrada A, Sandoval K, Winkler CA, Seibold MA, Padhukasahasram B, Conti DV, Farber HJ, Avila P, Brigino-Buenaventura E, Lenoir M, Meade K, Serebrisky D, Borrell LN, Rodriguez-Cintron W, Thyne S, Joubert BR, Romieu I, Levin AM, Sienra-Monge JJ, Del Rio-Navarro BE, Gan W, Raby BA, Weiss ST, Bleecker E, Meyers DA, Martinez FJ, Gauderman WJ, Gilliland F, London SJ, Bustamante CD, Nicolae DL, Ober C, Sen S, Barnes K, Williams LK, Hernandez RD, and Burchard EG
- Subjects
- Chromosome Mapping, Humans, Polymorphism, Single Nucleotide, Asthma genetics, Chromosomes, Human, Pair 18, Genetic Predisposition to Disease, Hispanic or Latino genetics, Smad2 Protein genetics
- Abstract
Background: Asthma is a common but complex disease with racial/ethnic differences in prevalence, morbidity, and response to therapies. Objective: We sought to perform an analysis of genetic ancestry to identify new loci that contribute to asthma susceptibility. Methods: We leveraged the mixed ancestry of 3902 Latinos and performed an admixture mapping meta-analysis for asthma susceptibility. We replicated associations in an independent study of 3774 Latinos, performed targeted sequencing for fine mapping, and tested for disease correlations with gene expression in the whole blood of more than 500 subjects from 3 racial/ethnic groups. Results: We identified a genome-wide significant admixture mapping peak at 18q21 in Latinos (P = 6.8 × 10⁻⁶), where Native American ancestry was associated with increased risk of asthma (odds ratio [OR], 1.20; 95% CI, 1.07-1.34; P = .002) and European ancestry was associated with protection (OR, 0.86; 95% CI, 0.77-0.96; P = .008). Our findings were replicated in an independent childhood asthma study in Latinos (P = 5.3 × 10⁻³, combined P = 2.6 × 10⁻⁷). Fine mapping of 18q21 in 1978 Latinos identified a significant association with multiple variants 5' of SMAD family member 2 (SMAD2) in Mexicans, whereas a single rare variant in the same window was the top association in Puerto Ricans. Low versus high SMAD2 blood expression was correlated with case status (13.4% lower expression; OR, 3.93; 95% CI, 2.12-7.28; P < .001). In addition, lower expression of SMAD2 was associated with more frequent exacerbations among Puerto Ricans with asthma. Conclusion: Ancestry at 18q21 was significantly associated with asthma in Latinos and implicated multiple ancestry-informative noncoding variants upstream of SMAD2 with asthma susceptibility. Furthermore, decreased SMAD2 expression in blood was strongly associated with increased asthma risk and increased exacerbations. (Copyright © 2018. Published by Elsevier Inc.)
- Published
- 2019
47. Polygenic risk scores: a biased prediction?
- Author
-
De La Vega FM and Bustamante CD
- Subjects
- Humans, Sequence Analysis, DNA, Genetic Predisposition to Disease, Genome-Wide Association Study methods, Multifactorial Inheritance, Polymorphism, Single Nucleotide
- Abstract
A new study highlights the biases and inaccuracies of polygenic risk scores (PRS) when predicting disease risk in individuals from populations other than those used in their derivation. The design bias of workhorse tools used for research, particularly genotyping arrays, contributes to these distortions. To avoid further inequities in health outcomes, the inclusion of diverse populations in research, unbiased genotyping, and methods of bias reduction in PRS are critical.
- Published
- 2018
48. Rapid evolution of a skin-lightening allele in southern African KhoeSan.
- Author
-
Lin M, Siford RL, Martin AR, Nakagome S, Möller M, Hoal EG, Bustamante CD, Gignoux CR, and Henn BM
- Subjects
- Adult, Africa, Southern, Alleles, Antiporters metabolism, Asian People genetics, Black People genetics, Demography methods, Evolution, Molecular, Female, Gene Flow, Genetic Variation genetics, Genetics, Population methods, Genotype, Haplotypes, Humans, Male, Phenotype, Phylogeny, Polymorphism, Single Nucleotide genetics, White People genetics, Antiporters genetics, Skin Pigmentation genetics
- Abstract
Skin pigmentation is under strong directional selection in northern European and Asian populations. The indigenous KhoeSan populations of far southern Africa have lighter skin than other sub-Saharan African populations, potentially reflecting local adaptation to a region of Africa with reduced UV radiation. Here, we demonstrate that a canonical Eurasian skin pigmentation gene, SLC24A5, was introduced to southern Africa via recent migration and experienced strong adaptive evolution in the KhoeSan. To reconstruct the evolution of skin pigmentation, we collected phenotypes from over 400 ǂKhomani San and Nama individuals and high-throughput sequenced candidate pigmentation genes. The derived causal allele in SLC24A5, p.Ala111Thr, significantly lightens basal skin pigmentation in the KhoeSan and explains 8 to 15% of phenotypic variance in these populations. The frequency of this allele (33 to 53%) is far greater than expected from colonial period European gene flow; however, the most common derived haplotype is identical among European, eastern African, and KhoeSan individuals. Using four-population demographic simulations with selection, we show that the allele was introduced into the KhoeSan only 2,000 years ago via a back-to-Africa migration and then experienced a selective sweep (s = 0.04 to 0.05 in ǂKhomani and Nama). The SLC24A5 locus is both a rare example of intense, ongoing adaptation in very recent human history and a case of adaptive gene flow at a pigmentation locus in humans. Competing Interests: The authors declare no conflict of interest.
- Published
- 2018
49. Gut microbiome transition across a lifestyle gradient in Himalaya.
- Author
-
Jha AR, Davenport ER, Gautam Y, Bhandari D, Tandukar S, Ng KM, Fragiadakis GK, Holmes S, Gautam GP, Leach J, Sherchand JB, Bustamante CD, and Sonnenburg JL
- Subjects
- Adult, Bacteria genetics, Diet, Diet, Paleolithic, Feces microbiology, Female, Gastrointestinal Microbiome physiology, Genetics, Population methods, Geography, Humans, Male, Middle Aged, Nepal ethnology, RNA, Ribosomal, 16S genetics, Rural Population, Gastrointestinal Microbiome genetics, Life Style ethnology, Microbiota genetics
- Abstract
The composition of the gut microbiome in industrialized populations differs from that of populations living traditional lifestyles. However, it has been difficult to separate the contributions of human genetic and geographic factors from lifestyle. Whether shifts away from the foraging lifestyle that characterizes much of humanity's past influence the gut microbiome, and to what degree, remains unclear. Here, we characterize the stool bacterial composition of four Himalayan populations to investigate how the gut community changes in response to shifts in traditional human lifestyles. These groups led seminomadic hunting-gathering lifestyles until transitioning to varying levels of dependence on farming. The Tharu began farming 250-300 years ago, the Raute and Raji transitioned 30-40 years ago, and the Chepang retain many aspects of a foraging lifestyle. We assess the contributions of dietary and environmental factors to their gut-associated microbes and find that differences in the lifestyles of Himalayan foragers and farmers are strongly correlated with microbial community variation. Furthermore, the gut microbiomes of all four traditional Himalayan populations are distinct from those of Americans, indicating that industrialization may further exacerbate differences in the gut community. The Chepang foragers harbor an elevated abundance of taxa associated with foragers around the world. Conversely, the gut microbiomes of the populations that have transitioned to farming are more similar to those of Americans, with agricultural dependence and several associated lifestyle and environmental factors correlating with the extent of microbiome divergence from the foraging population. The gut microbiomes of Raute and Raji reveal an intermediate state between the Chepang and Tharu, indicating that divergence from a stereotypical foraging microbiome can occur within a single generation. Our results also show that environmental factors such as drinking water source and solid cooking fuel are significantly associated with the gut microbiome. Despite the pronounced differences in gut bacterial composition across populations, we found little difference in alpha diversity across lifestyles. These findings in genetically similar populations living in the same geographical region establish the key role of lifestyle in determining human gut microbiome composition and point to the next challenging steps of determining how large-scale gut microbiome reconfiguration impacts human biology. Competing Interests: The authors have declared that no competing interests exist.
- Published
- 2018
50. The clinical imperative for inclusivity: Race, ethnicity, and ancestry (REA) in genomics.
- Author
-
Popejoy AB, Ritter DI, Crooks K, Currey E, Fullerton SM, Hindorff LA, Koenig B, Ramos EM, Sorokin EP, Wand H, Wright MW, Zou J, Gignoux CR, Bonham VL, Plon SE, and Bustamante CD
- Subjects
- Alleles, Ethnicity, Genetic Testing methods, Genomics methods, Humans, Mutation, Prohibitins, Genetic Variation genetics
- Abstract
The Clinical Genome Resource (ClinGen) Ancestry and Diversity Working Group highlights the need to develop guidance on race, ethnicity, and ancestry (REA) data collection and use in clinical genomics. We present quantitative and qualitative evidence to characterize: (1) acquisition of REA data via clinical laboratory requisition forms, and (2) information disparity across populations in the Genome Aggregation Database (gnomAD) at clinically relevant sites ascertained from annotations in ClinVar. Our requisition form analysis showed substantial heterogeneity in clinical laboratory ascertainment of REA, as well as marked incongruity among terms used to define REA categories. There was also striking disparity across REA populations in the amount of information available about clinically relevant variants in gnomAD. European ancestral populations constituted the majority of observations (55.8%), allele counts (59.7%), and private alleles (56.1%) in gnomAD at 550 loci with "pathogenic" and "likely pathogenic" expert-reviewed variants in ClinVar. Our findings highlight the importance of implementing and supporting programs to increase diversity in genome sequencing and clinical genomics, as well as measuring uncertainty around population-level datasets that are used in variant interpretation. Finally, we suggest the need for a standardized REA data collection framework to be developed through partnerships and collaborations and adopted across clinical genomics. (© 2018 Wiley Periodicals, Inc.)
- Published
- 2018