1. Harmonizing Genetic Ancestry and Self-identified Race/Ethnicity in Genome-wide Association Studies
- Author
Themistocles L. Assimes, P. S. Sriram, Qin Hui, Stephen Mastorides, Zuhair Ballas, Yan V. Sun, Hua Tang, Marijana Vujkovic, Ronald G. Washburn, Samuel M. Aguayo, Jennifer Moser, Gwenevere Anderson, Mary A. Whooley, Sumitra Muralidhar, Agnes Wallbom, Adriana M. Hung, Xuan-Mai T. Nguyen, Huaying Fang, Jennifer Greco, Rachel B. Ramoni, Amparo Gutierrez, Saiju Pyarajan, Stuart R. Warren, Rob Striker, Pran Iruvanti, Mark B. Hamner, Scott L. DuVall, Elizabeth R. Hauser, Christopher J. O'Donnell, Donald E. Humphries, Jon B. Klein, Nora R. Ratcliffe, John M. Wells, Maureen Murdoch, Gerardo Villareal, Laurence Kaminsky, Peter W. F. Wilson, Mary E. Oehlert, Mary Brophy, Stacey B. Whitbourne, Louis J. Dell’Italia, Grant D. Huang, Ronald Fernando, Dean P. Argyres, Jie Huang, Hongyu Zhao, Scott Kinlay, Kelly Cho, Jeff Whittle, Scott M. Damrauer, Jacqueline Honerlaw, Sunil K. Ahuja, Laurence Meyer, Brooks Robey, John B. Harley, Gretchen Gibson, John Concato, Rachel McArdle, David Cohen, Krisann K. Oursler, Robin A. Hurley, Sujata Bhushan, Salvador Gutierrez, D Jhala, John J. Callaghan, Ron B. Schifman, Nhan Do, Junzhe Xu, Jim Breeling, Jessica V. Brewer, Elif Sonel, Kyong-Mi Chang, Brady Stephens, Shahpoor Shayan, Philip S. Tsao, Joseph Fayad, Michael Godschalk, Jack H. Lichy, Scott Sherman, Shing Shing Yeh, Alan Swann, Michael Rauchman, Samir Gupta, Satish C. Sharma, Edward Boyko, J. Michael Gaziano, Julie Lynch, Timothy R. Morgan, Jean C. Beckham, Todd Stapley, Malcolm Buford, Richard J. Servatius, Hermes Florez, and Kathlyn Sue Haddock
- Subjects
0303 health sciences, Genetic diversity, Support Vector Machine, Genetic genealogy, Racial Groups, Ethnic group, Genome-wide association study, Article, Genetic architecture, Machine Learning, 03 medical and health sciences, Race (biology), 0302 clinical medicine, Evolutionary biology, Ethnicity, Genetics, Trait, Humans, Psychology, Algorithms, 030217 neurology & neurosurgery, Genetics (clinical), Genome-Wide Association Study, 030304 developmental biology, Genetic association
- Abstract
Large-scale multi-ethnic cohorts offer unprecedented opportunities to elucidate the genetic factors influencing complex traits related to health and disease among minority populations. At the same time, the genetic diversity in these cohorts presents new challenges for analysis and interpretation. We consider the utility of race and/or ethnicity categories in genome-wide association studies (GWASs) of multi-ethnic cohorts. We demonstrate that race/ethnicity information enhances the ability to understand population-specific genetic architecture. To address the practical issue that self-identified racial/ethnic information may be incomplete, we propose a machine learning algorithm that produces a surrogate variable, termed HARE. We use height as a model trait to demonstrate the utility of HARE and ethnicity-specific GWASs.
- Published
- 2019