1. VolcanoSV enables accurate and robust structural variant calling in diploid genomes from single-molecule long read sequencing.
- Author
Luo, Can; Liu, Yichen Henry; and Zhou, Xin Maizie
- Subjects
HUMAN genome, INDIVIDUALIZED medicine, GENE mapping, SINGLE nucleotide polymorphisms, GENOMES
- Abstract
Structural variants (SVs) significantly contribute to human genome diversity and play a crucial role in precision medicine. Although advancements in single-molecule long-read sequencing offer a groundbreaking resource for SV detection, identifying SV breakpoints and sequences accurately and robustly remains challenging. We introduce VolcanoSV, an innovative hybrid SV detection pipeline that utilizes both a reference genome and local de novo assembly to generate a phased diploid assembly. VolcanoSV uses phased SNPs and unique k-mer similarity analysis, enabling precise haplotype-resolved SV discovery. VolcanoSV is adept at constructing comprehensive genetic maps encompassing SNPs, small indels, and all types of SVs, making it well-suited for human genomics studies. Our extensive experiments demonstrate that VolcanoSV surpasses state-of-the-art assembly-based tools in the detection of insertion and deletion SVs, exhibiting superior recall, precision, F1 scores, and genotype accuracy across a diverse range of datasets, including low-coverage (10x) datasets. VolcanoSV outperforms assembly-based tools in the identification of complex SVs, including translocations, duplications, and inversions, in both simulated and real cancer data. Moreover, VolcanoSV is robust to various evaluation parameters and accurately identifies breakpoints and SV sequences.
Detecting genomic structural variants (SVs) using long-read sequencing remains challenging. Here the authors introduce a hybrid pipeline for precise haplotype-resolved SV discovery that outperforms current tools across diverse long-read sequencing datasets. [ABSTRACT FROM AUTHOR]
- Published
2024