Genotyping sequence-resolved copy-number variations using pangenomes reveals paralog-specific global diversity and expression divergence of duplicated genes.
- Source :
- BioRxiv : the preprint server for biology [bioRxiv] 2024 Oct 24. Date of Electronic Publication: 2024 Oct 24.
- Publication Year :
- 2024
Abstract
- Copy-number variable (CNV) genes are important in evolution and disease, yet sequence variation in CNV genes is a blind spot for large-scale studies. We present a method, ctyper, that leverages pangenomes to produce copy-number maps with allele-specific sequences containing locally phased variants of CNV genes from NGS reads. We extensively characterized accuracy and efficiency on a database of 3,351 CNV genes, including HLA, SMN, and CYP2D6, as well as 212 non-CNV medically relevant challenging genes. The genotypes capture 96.5% of underlying variants in new genomes, requiring 0.9 seconds per gene. Expression analysis of ctyper genotypes explains more variance than known eQTL variants. Comparing allele-specific expression revealed divergent expression in 7.94% of paralogs and tissue-specific biases in 4.7% of paralogs. We found reduced expression of SMN-1 converted from SMN-2, which potentially affects diagnosis of spinal muscular atrophy, and increased expression of a duplicative translocation of AMY2B. Overall, ctyper enables biobank-scale genotyping of CNV and challenging genes.
Details
- Language :
- English
- ISSN :
- 2692-8205
- Database :
- MEDLINE
- Journal :
- BioRxiv : the preprint server for biology
- Publication Type :
- Academic Journal
- Accession number :
- 39149335
- Full Text :
- https://doi.org/10.1101/2024.08.11.607269