Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes.

Authors :
Ebler J
Ebert P
Clarke WE
Rausch T
Audano PA
Houwaart T
Mao Y
Korbel JO
Eichler EE
Zody MC
Dilthey AT
Marschall T
Source :
Nature genetics [Nat Genet] 2022 Apr; Vol. 54 (4), pp. 518-525. Date of Electronic Publication: 2022 Apr 11.
Publication Year :
2022

Abstract

Typical genotyping workflows map reads to a reference genome before identifying genetic variants. Generating such alignments introduces reference biases and comes with substantial computational burden. Furthermore, short-read lengths limit the ability to characterize repetitive genomic regions, which are particularly challenging for fast k-mer-based genotypers. In the present study, we propose a new algorithm, PanGenie, that leverages a haplotype-resolved pangenome reference together with k-mer counts from short-read sequencing data to genotype a wide spectrum of genetic variation, a process we refer to as genome inference. Compared with mapping-based approaches, PanGenie is more than 4 times faster at 30-fold coverage and achieves better genotype concordances for almost all variant types and coverages tested. Improvements are especially pronounced for large insertions (≥50 bp) and variants in repetitive regions, enabling the inclusion of these classes of variants in genome-wide association studies. PanGenie efficiently leverages the increasing amount of haplotype-resolved assemblies to unravel the functional impact of previously inaccessible variants while being faster compared with alignment-based workflows. (© 2022. The Author(s).)

Details

Language :
English
ISSN :
1546-1718
Volume :
54
Issue :
4
Database :
MEDLINE
Journal :
Nature genetics
Publication Type :
Academic Journal
Accession number :
35410384
Full Text :
https://doi.org/10.1038/s41588-022-01043-w