Correction to: SLR: a scaffolding algorithm based on long reads and contig classification
- Source :
- BMC Bioinformatics, Vol 21, Iss 1, Pp 1-4 (2020)
- Publication Year :
- 2020
- Publisher :
- BioMed Central, 2020.
Abstract
- Scaffolding is an important step in genome assembly that orders and orients the contigs produced by assemblers. However, repetitive regions in contigs usually prevent scaffolding from producing accurate results, and how to handle them has received a great deal of attention. In the past few years, long reads sequenced by third-generation sequencing technologies (Pacific Biosciences and Oxford Nanopore) have been demonstrated to be useful for sequencing repetitive regions in genomes. Although some stand-alone scaffolding algorithms based on long reads have been presented, scaffolding still requires a new strategy to take full advantage of the characteristics of long reads. Here, we present a new scaffolding algorithm based on long reads and contig classification (SLR). Using the alignment information between long reads and contigs, SLR classifies the contigs into unique contigs and ambiguous contigs to address the problem of repetitive regions. Next, SLR uses only the unique contigs to produce draft scaffolds. Then, SLR inserts the ambiguous contigs into the draft scaffolds and produces the final scaffolds. We compare SLR to three popular scaffolding tools using long read datasets sequenced with Pacific Biosciences and Oxford Nanopore technologies. The experimental results show that SLR can produce better results in terms of accuracy and completeness. The open-source code of SLR is available at https://github.com/luojunwei/SLR. In this paper, we describe SLR, which is designed to scaffold contigs using long reads. We conclude that SLR can improve the completeness of genome assembly.
- Subjects :
- Computer science
computer.software_genre
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
03 medical and health sciences
0302 clinical medicine
Structural Biology
Molecular Biology
lcsh:QH301-705.5
030304 developmental biology
Repetitive Sequences, Nucleic Acid
0303 health sciences
Genome
Contig
business.industry
Applied Mathematics
Correction
High-Throughput Nucleotide Sequencing
Computer Science Applications
lcsh:Biology (General)
030220 oncology & carcinogenesis
lcsh:R858-859.7
Artificial intelligence
DNA microarray
business
computer
Natural language processing
Algorithms
Software
Details
- Language :
- English
- ISSN :
- 1471-2105
- Volume :
- 21
- Database :
- OpenAIRE
- Journal :
- BMC Bioinformatics
- Accession number :
- edsair.doi.dedup.....e0020547473ad7187b95636da89ce431