1. Human ancestry indentification under resource constraints -- what can one chromosome tell us about human biogeographical ancestry?
- Author
- Toma TT, Dawson JM, and Adjeroh DA
- Subjects
- Algorithms, Asian People classification, Asian People genetics, Black People classification, Black People genetics, Chromosomes classification, Genome-Wide Association Study, Humans, Polymorphism, Single Nucleotide, White People classification, White People genetics, Chromosomes genetics
- Abstract
Background: While continental-level ancestry identification is relatively simple using genomic information, distinguishing between individuals from closely associated sub-populations (e.g., from the same continent) remains a difficult challenge. Methods: We study the problem of predicting human biogeographical ancestry from genomic data under resource constraints. In particular, we focus on the case where the analysis is constrained to using single nucleotide polymorphisms (SNPs) from just one chromosome. We propose methods to construct such ancestry-informative SNP panels using correlation-based and outlier-based methods. Results: We assessed the performance of the proposed SNP panels derived from just one chromosome, using data from the 1000 Genomes Project, Phase 3. For continental-level ancestry classification, we achieved an overall classification rate of 96.75% using 206 SNPs. For sub-population level ancestry prediction, we achieved average pairwise binary classification rates as follows: subpopulations in Europe: 76.6% (58 SNPs); Africa: 87.02% (87 SNPs); East Asia: 73.30% (68 SNPs); South Asia: 81.14% (75 SNPs); America: 85.85% (68 SNPs). Conclusion: Our results demonstrate that a single chromosome (in particular, Chromosome 1), if carefully analyzed, could hold enough information for accurate prediction of human biogeographical ancestry. This has significant implications for the computational resources required for ancestry analysis, and for applications of such analyses, such as studies of genetic diseases, forensics, and soft biometrics.
- Published
- 2018