Scaffolding of long read assemblies using long range contact information
- Source :
- BMC Genomics, Vol 18, Iss 1, Pp 1-11 (2017)
- Publication Year :
- 2017
Abstract
- Background: Long read technologies have revolutionized de novo genome assembly by generating contigs orders of magnitude longer than those of short read assemblies. Although assembly contiguity has increased, an assembly usually does not reconstruct a full chromosome or a chromosome arm, leaving the chromosome-level assembly unfinished. To increase the contiguity of the assembly to the chromosome level, different strategies are used which exploit long range contact information between chromosomes in the genome.
Methods: We develop a scalable and computationally efficient scaffolding method that can substantially boost assembly contiguity using genome-wide chromatin interaction data such as Hi-C.
Results: We demonstrate an algorithm that uses Hi-C data for longer-range scaffolding of de novo long read genome assemblies. We tested our method on human and goat genome assemblies and compared our scaffolds with those generated by LACHESIS using various metrics.
Conclusion: Our new algorithm, SALSA, produces more accurate scaffolds than the existing state-of-the-art method, LACHESIS.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3879-z) contains supplementary material, which is available to authorized users.
- Subjects :
- 0106 biological sciences
0301 basic medicine
Scaffold
lcsh:QH426-470
Contiguity
lcsh:Biotechnology
Assembly
Sequence assembly
Computational biology
Biology
Long reads
Scaffolding
01 natural sciences
Genome
03 medical and health sciences
Contig Mapping
Hi-C
lcsh:TP248.13-248.65
Genetics
Animals
Humans
Contig
Goats
Methodology Article
Genomics
Range (mathematics)
lcsh:Genetics
030104 developmental biology
Scalability
DNA microarray
Algorithms
010606 plant biology & botany
Biotechnology
Details
- ISSN :
- 1471-2164
- Volume :
- 18
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- BMC Genomics
- Accession number :
- edsair.doi.dedup.....3f77693412f194a0696600d512c4d1fb