
EndHiC: assemble large contigs into chromosomal-level scaffolds using the Hi-C links from contig ends

Authors :
Wang, Sen
Wang, Hengchao
Jiang, Fan
Wang, Anqi
Liu, Hangwei
Zhao, Hanbo
Yang, Boyuan
Xu, Dong
Zhang, Yan
Fan, Wei
Publication Year :
2021

Abstract

Motivation: The application of PacBio HiFi and ultra-long ONT reads has brought huge progress in contig-level assembly, but it is still challenging to assemble large contigs into chromosomes with available Hi-C scaffolding software, all of which compute the contact value between contigs using the Hi-C links from the whole contig regions. Because the Hi-C links between two adjacent contigs concentrate only at the neighboring ends of the contigs, a larger contig size reduces the power to differentiate adjacent (signal) from non-adjacent (noise) contig linkages, leading to a higher rate of mis-assembly.

Results: We present a software package, EndHiC, which is suited to assembling large contigs (> 1 Mb) into chromosomal-level scaffolds using Hi-C links from only the contig end regions instead of the whole contig regions. Benefiting from the increased signal-to-noise ratio, EndHiC achieves much higher scaffolding accuracy than the existing software LACHESIS, ALLHiC, and 3D-DNA. Moreover, EndHiC has few parameters, runs 10-1000 times faster than existing software, needs trivial memory, provides robustness evaluation, and allows graphic viewing of the scaffold results. The high scaffolding accuracy and user-friendly interface of EndHiC free users from labor-intensive manual checking and revision.

Availability and implementation: EndHiC is written in Perl and is freely available at https://github.com/fanagislab/EndHiC.

Contact: fanwei@caas.cn and milrazhang@163.com

Supplementary information: Supplementary data are available at Bioinformatics online.

25 pages, 1 figure, 6 supplemental figures, and 6 supplemental tables
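
The core idea in the abstract, counting only those Hi-C links whose two anchors both fall inside contig end regions, can be illustrated with a minimal Python sketch. This is not EndHiC's implementation (which is written in Perl); the input format (tuples of contig name and position), the function names, and the `end_size` parameter are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def end_label(pos, contig_len, end_size):
    """Return 'head' or 'tail' if pos lies in an end region, else None.

    Assumes contig_len >= 2 * end_size; for shorter contigs the head
    region takes priority in this sketch.
    """
    if pos < end_size:
        return "head"
    if pos >= contig_len - end_size:
        return "tail"
    return None


def count_end_links(links, contig_lens, end_size):
    """Count inter-contig Hi-C links whose two anchors both sit in end regions.

    links: iterable of (contig1, pos1, contig2, pos2) tuples (hypothetical format).
    Returns a Counter keyed by a sorted pair of (contig, end) labels.
    """
    counts = Counter()
    for c1, p1, c2, p2 in links:
        if c1 == c2:
            continue  # links within one contig carry no scaffolding signal
        e1 = end_label(p1, contig_lens[c1], end_size)
        e2 = end_label(p2, contig_lens[c2], end_size)
        if e1 is None or e2 is None:
            continue  # at least one anchor is in the contig interior
        key = tuple(sorted([(c1, e1), (c2, e2)]))
        counts[key] += 1
    return counts


# Toy data (invented for illustration, not from the paper)
contig_lens = {"ctgA": 5_000_000, "ctgB": 4_000_000}
links = [
    ("ctgA", 4_950_000, "ctgB", 20_000),     # tail of A to head of B
    ("ctgA", 4_990_000, "ctgB", 80_000),     # tail of A to head of B
    ("ctgA", 2_500_000, "ctgB", 2_000_000),  # interior to interior, ignored
    ("ctgA", 100, "ctgA", 4_999_000),        # same contig, ignored
]
counts = count_end_links(links, contig_lens, end_size=100_000)
for pair, n in counts.items():
    print(pair, n)
```

In this toy input, only the two links joining the tail of `ctgA` to the head of `ctgB` are counted; the interior-to-interior link, which a whole-contig contact value would include, is discarded.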

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....aa64510aa2df940ca0928a6aa9be6586