Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data
- Source :
- Scientific Reports
- Publication Year :
- 2016
- Publisher :
- Springer Science and Business Media LLC, 2016.
- Abstract :
- DNA assembly is a core methodological step in metagenomic pipelines used to study the structure and function of microbial communities. Here we investigate the utility of Pacific Biosciences long, high-accuracy circular consensus sequencing (CCS) reads for metagenomic projects. We compared the application and performance of PacBio CCS and Illumina HiSeq data with assembly and taxonomic binning algorithms, using metagenomic samples representing a complex microbial community. Eight SMRT cells produced approximately 94 Mb of CCS reads from a biogas reactor microbiome sample; the reads averaged 1319 nt in length and 99.7% accuracy. Assembly of the CCS data generated a number of large contigs (greater than 1 kb) comparable to that assembled from a ~190x larger HiSeq dataset (~18 Gb) produced from the same sample (i.e., approximately 62% of total contigs). Hybrid assemblies using PacBio CCS and HiSeq contigs improved assembly statistics, including increases in the average contig length and the number of large contigs. The incorporation of CCS data produced significant enhancements in taxonomic binning and genome reconstruction of two dominant phylotypes, which assembled and binned poorly using HiSeq data alone. Collectively, these results illustrate the value of PacBio CCS reads in certain metagenomics applications.
- Subjects :
- 0301 basic medicine
030106 microbiology
Sequence assembly
Hybrid genome assembly
Computational biology
Biology
Genome
Article
03 medical and health sciences
Consensus Sequence
DNA assembly
Phylogeny
030304 developmental biology
0303 health sciences
Base Composition
Multidisciplinary
Base Sequence
Contig
030306 microbiology
Nucleotides
Sequence Analysis, DNA
Structure and function
030104 developmental biology
Metagenomics
Metagenome
Pacific Biosciences
Data mining
DNA, Circular
Details
- ISSN :
- 20452322
- Volume :
- 6
- Database :
- OpenAIRE
- Journal :
- Scientific Reports
- Accession number :
- edsair.doi.dedup.....9fcbb3b37b74f05ab029716094eb3ea1
- Full Text :
- https://doi.org/10.1038/srep25373