PECC: Correcting contigs based on paired-end read distribution

Authors :
Jianxin Wang
Binbin Wu
Yi Pan
Xiaodong Yan
Min Li
Junwei Luo
Fang-Xiang Wu
Source :
Computational Biology and Chemistry. 69:178-184
Publication Year :
2017
Publisher :
Elsevier BV, 2017.

Abstract

Motivation: Cheap and fast next-generation sequencing (NGS) technologies have greatly facilitated research on de novo assembly. Reliable contigs are critical for constructing reliable scaffolds. However, the contigs generated by most assemblers contain errors because of the limitations of assembly strategies and computational complexity. Among these errors, misassembly errors are one of the most harmful types.

Results: In this paper, we propose a new method named “PECC” to identify and correct misassembly errors in contigs based on the paired-end read distribution. PECC extracts sequence regions with low paired-end read support and verifies them based on the distribution of paired-end support. To validate the effectiveness of PECC, we applied it to the contigs produced by five popular assemblers on four real datasets, and we also carried out experiments to analyze the influence of PECC on scaffolding. The results show that PECC can reduce misassembly errors and improve scaffolding results, which demonstrates the promising applications of PECC in de novo assembly.
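The abstract does not give PECC's actual procedure, so the following is only a minimal sketch of the general idea it mentions: finding contig regions that few paired-end reads span. Everything here is an assumption made for illustration, including the function names `spanning_support` and `low_support_regions`, the representation of a read pair as a half-open `(start, end)` interval in contig coordinates, the median-based cutoff, and the `fraction` and `margin` parameters. It is not the authors' implementation and it omits the verification and correction steps.

```python
from statistics import median


def spanning_support(contig_length, pairs):
    """Return, for each contig position, how many read pairs span it.

    Each pair is a hypothetical half-open interval (start, end) running
    from the leftmost base of one mate to just past the rightmost base
    of the other, with 0 <= start < end <= contig_length.
    """
    # Difference array: +1 where a pair begins, -1 where it ends.
    diff = [0] * (contig_length + 1)
    for start, end in pairs:
        diff[start] += 1
        diff[end] -= 1

    # The running sum of the difference array is the per-position support.
    support = []
    running = 0
    for i in range(contig_length):
        running += diff[i]
        support.append(running)
    return support


def low_support_regions(support, fraction=0.25, margin=0):
    """Return half-open (start, end) regions with unusually low support.

    A position is flagged when its support falls below `fraction` times
    the median support (an assumed rule, not PECC's criterion). The first
    and last `margin` positions are skipped, because contig ends are
    always weakly supported.
    """
    inner = support[margin:len(support) - margin]
    if not inner:
        return []
    threshold = fraction * median(inner)

    regions = []
    region_start = None
    for offset, value in enumerate(inner):
        pos = margin + offset
        if value < threshold:
            if region_start is None:
                region_start = pos
        elif region_start is not None:
            regions.append((region_start, pos))
            region_start = None
    if region_start is not None:
        regions.append((region_start, margin + len(inner)))
    return regions
```

A caller would build `pairs` from read alignments, compute `support` once per contig, and treat each returned region as a candidate breakpoint to examine further rather than as a confirmed misassembly.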

Details

ISSN :
1476-9271
Volume :
69
Database :
OpenAIRE
Journal :
Computational Biology and Chemistry
Accession number :
edsair.doi.dedup.....a5a114d4918dc627fad0944a3e4127e7
Full Text :
https://doi.org/10.1016/j.compbiolchem.2017.03.012