1. Enhanced statistical tests for GWAS in admixed populations: assessment using African Americans from CARe and a Breast Cancer Consortium
- Author
- David S. Siscovick, Cameron D. Palmer, Regina G. Ziegler, Sandra L. Deming, Sue A. Ingles, Stephen J. Chanock, Wei Zheng, Christine B. Ambrosone, Esther M. John, Sarah G. Buxbaum, Noah Zaitlen, Bogdan Pasaniuc, Sarah J. Nyante, Robert C. Millikan, Jorge L. Rodriguez-Gil, Ermeg L. Akylbekova, Alkes L. Price, Arti Tandon, Guillaume Lettre, Leslie Bernstein, Elisa V. Bandera, George J. Papanicolaou, Gary K. Chen, Emma K. Larkin, Mingyao Li, Joe Mychaleckyj, Nick Patterson, James G. Wilson, L. Adrienne Cupples, Solomon K. Musani, Michael F. Press, David Reich, Myriam Fornage, Leslie A. Lange, Qiong Yang, Jennifer J. Hu, Simon Myers, Joel N. Hirschhorn, Jasmin Divers, Brian E. Henderson, Ingo Ruczinski, Lynette Ekunwe, W. H. Linda Kao, Christopher A. Haiman, Xiaofeng Zhu, and Nicholas J. Schork
- Subjects
Male; Cancer Research; Linkage disequilibrium; Fibroblast Growth Factor; Coronary Disease; Genome-wide association study; Linkage Disequilibrium; 0302 clinical medicine; Gene Frequency; Odds Ratio; Genetics and Genomics/Genetics of Disease; Genetics (clinical); Cancer; Genetics and Genomics/Medical Genetics; African Americans; Genetics; Principal Component Analysis; 0303 health sciences; Genome; Chromosome Mapping; Single Nucleotide; 3. Good health; Phenotype; 030220 oncology & carcinogenesis; Genetics and Genomics/Gene Discovery; Female; Type 2; Algorithms; Research Article; Human; Receptor; lcsh:QH426-470; Genotype; Population; Breast Neoplasms; Locus (genetics); Single-nucleotide polymorphism; Computational biology; Genetics and Genomics/Complex Traits; Biology; Polymorphism, Single Nucleotide; 03 medical and health sciences; Genetics and Genomics/Population Genetics; Breast Cancer; Diabetes Mellitus; Humans; SNP; Receptor, Fibroblast Growth Factor, Type 2; Polymorphism; Molecular Biology; Allele frequency; Ecology, Evolution, Behavior and Systematics; 030304 developmental biology; Genetic association; Genome, Human; Human Genome; Genetic Variation; Black or African American; lcsh:Genetics; Genetics, Population; Genetics and Genomics/Disease Models; Diabetes Mellitus, Type 2; Mathematics/Statistics; Software; Imputation (genetics); Genome-Wide Association Study; Developmental Biology
- Abstract
While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. 
Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.
Author Summary: This paper presents improved methodologies for the analysis of genome-wide association studies in admixed populations, which are populations that came about by the mixing of two or more distant continental populations over a few hundred years (e.g., African Americans or Latinos). Studies of admixed populations offer the promise of capturing additional genetic diversity compared to studies over homogeneous populations such as Europeans. In admixed populations, correlation between genetic variants exists both at a fine scale in the ancestral populations and at a coarse scale due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered either one or the other type of correlation, but not both. In this work we develop novel statistical methods that account for both types of genetic correlation, and we show that the combined approach attains greater statistical power than that achieved by applying either approach separately. We provide analysis of simulated and real data from major studies performed in African-American men and women to show the improvement obtained by our methods over the standard methods for analyzing association studies in admixed populations.
- Published
- 2016