The advantages and disadvantages of short- and long-read metagenomics to infer bacterial and eukaryotic community composition

Authors :: Nikki E. Freed
Olin K. Silander
William S. Pearman
Publication Year :: 2019
Publisher :: Research Square Platform LLC, 2019.
Abstract:
Background: The first step in understanding ecological community diversity and dynamics is quantifying community membership. An increasingly common method for doing so is metagenomics. Because of the rapidly increasing popularity of this approach, a large number of computational tools and pipelines are available for analysing metagenomic data. However, the majority of these tools have been designed and benchmarked using highly accurate short-read data (i.e. Illumina), with few studies benchmarking classification accuracy for long error-prone reads (PacBio or Oxford Nanopore). In addition, few tools have been benchmarked for non-microbial communities.
Results: Here we use simulated error-prone Oxford Nanopore and high-accuracy Illumina read sets to systematically investigate the effects of sequence length and taxon type on classification accuracy for metagenomic data from both microbial and non-microbial communities. We show that, in general, classification accuracy is far lower for non-microbial communities, even at low taxonomic resolution (e.g. family rather than genus).
Conclusions: We then show that for two popular taxonomic classifiers, long error-prone reads can significantly increase classification accuracy, and this is most pronounced for non-microbial communities. This work provides insight into the expected accuracy of metagenomic analyses for different taxonomic groups, and establishes the point at which read length becomes more important than error rate for assigning the correct taxon.

Subjects :: Taxon
Community composition
Metagenomics
Computer science
Nanopore sequencing
Taxonomic rank
Artificial intelligence
Machine learning

Details

Database :: OpenAIRE
Accession number :: edsair.doi...........4c2318b6edd5d8fd000e5efbcde7693b
Full Text :: https://doi.org/10.21203/rs.2.10271/v1
