1. Examining the Impact of Imputation Errors on Fine-Mapping Using DNA Methylation QTL as a Model Trait
- Author
Chundru, V Kartik, Marioni, Riccardo E, Prendergast, James G D, Vallerga, Costanza L, Lin, Tian, Beveridge, Allan J, SGPD Consortium, Gratten, Jacob, Hume, David A, Deary, Ian J, Wray, Naomi R, Visscher, Peter M, and McRae, Allan F
- Subjects
Linkage disequilibrium; Quantitative Trait Loci; Bayesian probability; imputation; Investigations; Quantitative trait locus; Biology; Polymorphism, Single Nucleotide; 03 medical and health sciences; Quantitative Trait, Heritable; Genetics; Humans; 1000 Genomes Project; Allele; Association mapping; 030304 developmental biology; 0303 health sciences; Whole Genome Sequencing; 030305 genetics & heredity; Haplotype; Reproducibility of Results; DNA Methylation; Reference Standards; fine-mapping; CpG-SNPs; CpG Islands; DNA-methylation; Statistical Genetics and Genomics; Imputation (genetics); Genome-Wide Association Study
- Abstract
This study highlights dangers in over-interpreting fine-mapping results. Chundru et al. show that genotype imputation accuracy has a large impact on fine-mapping accuracy. They used DNA methylation at CpG-sites with a variant...
Genetic variants disrupting DNA methylation at CpG dinucleotides (CpG-SNPs) provide a set of known causal variants to serve as models to test fine-mapping methodology. We use 1716 CpG-SNPs to test three fine-mapping approaches (Bayesian imputation-based association mapping, Bayesian sparse linear mixed model, and the J-test), assessing the impact of imputation errors and the choice of reference panel by using both whole-genome sequence (WGS) and genotype array data on the same individuals (n = 1166). The choice of imputation reference panel had a strong effect on imputation accuracy, with the 1000 Genomes Project Phase 3 (1000G) reference panel (n = 2504 from 26 populations) giving a mean nonreference discordance rate between imputed and sequenced genotypes of 3.2%, compared to 1.6% when using the Haplotype Reference Consortium (HRC) reference panel (n = 32,470 Europeans). These imputation errors had an impact on whether the CpG-SNP was included in the 95% credible set, with a difference of ∼23% and ∼7% between the WGS and the 1000G and HRC imputed datasets, respectively. All of the fine-mapping methods failed to reach the expected 95% coverage of the CpG-SNP. This is attributed to secondary cis genetic effects that cannot be statistically separated from the CpG-SNP, and to a masking mechanism in which the effect of the methylation-disrupting allele at the CpG-SNP is hidden by the effect of a nearby SNP in strong linkage disequilibrium with the CpG-SNP. The reduced accuracy in fine-mapping a known causal variant in a low-level biological trait with imputed genetic data has implications for the study of higher-order complex traits and disease.
- Published
- 2019