Genotype Imputation and Evaluation of Single Nucleotide Polymorphisms in 437 Women of East Asian Descent
- Publication Year :
- 2019
- Publisher :
- The University of North Carolina at Chapel Hill University Libraries, 2019.
Abstract
- Growing interest in discovering single nucleotide polymorphisms (SNPs) associated with complex diseases, together with the high cost of existing sequencing technology, led to the development of genotype imputation, a statistical method that overcomes the limitations of current sequencing technology and increases the power of association testing in Genome-Wide Association Studies (GWAS). In this paper, genotype imputation is conducted on the Tianjin sample with two reference panels: the 1000 Genomes (1000G) reference panel and the TOPMed reference panel. First, pre-imputation quality control was applied to remove individuals or genetic markers that might induce high error rates during imputation. Principal component analysis was conducted to show the East Asian ancestry of the sample. After haplotype inference with the Eagle software, imputation was performed with Minimac3 on a sample of 437 individuals with 499,148 genetic variants genotyped on the Illumina 660W array. Approximately 47 million and 88 million genetic variants were imputed using the 1000G and TOPMed reference panels, respectively. The estimated squared Pearson correlation (R2) was used to determine which imputed SNPs passed post-imputation quality control. Approximately 9.5 million imputed SNPs from the 1000G reference panel and 11 million imputed SNPs from the TOPMed reference panel exceeded the R2 threshold. To assess imputation quality, imputation was performed again on the original 437 individuals, but with 5% of the directly genotyped variants randomly masked. The imputed variants on chromosomes 1, 11, and 21 were used to calculate the true squared Pearson correlation, and the masking results for the two reference panels were compared to determine which panel is better suited to this sample. Overall, the imputation provides an accurate set of genetic markers that can be used in downstream GWAS analysis to explore SNPs associated with lung cancer.
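The masking evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the author's pipeline: the function names `mask_variants` and `squared_pearson`, the variant identifiers, and the toy genotype and dosage values are all hypothetical, chosen only to show how a fraction of genotyped variants might be hidden and how a true squared Pearson correlation could then be computed between the known genotypes and the imputed dosages.

```python
import random


def mask_variants(variant_ids, fraction=0.05, seed=0):
    """Randomly pick a fraction of genotyped variants to hide before imputation.

    Hypothetical helper: the abstract only states that 5% were masked at random.
    """
    rng = random.Random(seed)
    n_masked = round(len(variant_ids) * fraction)
    return set(rng.sample(list(variant_ids), n_masked))


def squared_pearson(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0/1/2 allele counts)
    and imputed dosages (values in [0, 2]) for one variant across samples."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")  # monomorphic variant: correlation is undefined
    return cov * cov / (var_t * var_d)


# Toy data, made up for illustration; not taken from the study.
variant_ids = [f"var{i}" for i in range(100)]
masked = mask_variants(variant_ids, fraction=0.05)

true_gt = [0, 1, 2, 1, 0, 2]              # genotypes that were hidden
dosage = [0.1, 0.9, 1.8, 1.2, 0.0, 1.9]   # dosages returned by imputation
r2 = squared_pearson(true_gt, dosage)
```

In a real analysis the comparison would be repeated for every masked variant and summarized per reference panel; the sketch handles a single variant to keep the calculation visible.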
Details
- Language :
- English
- Database :
- OpenAIRE
- Accession number :
- edsair.doi...........d46dab9d66a00a14db916bf3fa181b0d
- Full Text :
- https://doi.org/10.17615/xs5q-9255