Back to Search Start Over

Evaluation of vicinity-based hidden Markov models for genotype imputation.

Authors :
Wang, Su
Kim, Miran
Jiang, Xiaoqian
Harmanci, Arif Ozgun
Source :
BMC Bioinformatics. 8/29/2022, Vol. 23 Issue 1, p1-26. 26p.
Publication Year :
2022

Abstract

Background: The decreasing cost of DNA sequencing has led to a great increase in our knowledge about genetic variation. While population-scale projects bring important insight into genotype–phenotype relationships, the cost of performing whole-genome sequencing on large samples is still prohibitive. In-silico genotype imputation coupled with genotyping-by-arrays is a cost-effective and accurate alternative for genotyping of common and uncommon variants. Imputation methods compare the genotypes of the typed variants with the large population-specific reference panels and estimate the genotypes of untyped variants by making use of the linkage disequilibrium patterns. Most accurate imputation methods are based on the Li–Stephens hidden Markov model, HMM, that treats the sequence of each chromosome as a mosaic of the haplotypes from the reference panel. Results: Here we assess the accuracy of vicinity-based HMMs, where each untyped variant is imputed using the typed variants in a small window around itself (as small as 1 centimorgan). Locality-based imputation is used recently by machine learning-based genotype imputation approaches. We assess how the parameters of the vicinity-based HMMs impact the imputation accuracy in a comprehensive set of benchmarks and show that vicinity-based HMMs can accurately impute common and uncommon variants. Conclusions: Our results indicate that locality-based imputation models can be effectively used for genotype imputation. The parameter settings that we identified can be used in future methods and vicinity-based HMMs can be used for re-structuring and parallelizing new imputation methods. The source code for the vicinity-based HMM implementations is publicly available at https://github.com/harmancilab/LoHaMMer. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
14712105
Volume :
23
Issue :
1
Database :
Academic Search Index
Journal :
BMC Bioinformatics
Publication Type :
Academic Journal
Accession number :
158783575
Full Text :
https://doi.org/10.1186/s12859-022-04896-4