1. Pseudoalignment for metagenomic read assignment
- Author
Páll Melsted, Lior Pachter, Nicolas Bray, Harold Pimentel, and Lorian Schaeffer
- Subjects
0301 basic medicine; Statistics and Probability; Sequence analysis; Computer science; Machine learning; computer.software_genre; Quantitative Biology - Quantitative Methods; Biochemistry; Genome; 03 medical and health sciences; 0302 clinical medicine; Quantitative Biology - Genomics; Taxonomic rank; Molecular Biology; Quantitative Methods (q-bio.QM); Genomics (q-bio.GN); Bacteria; business.industry; Sequence Analysis, RNA; Sequence Analysis, DNA; Original Papers; Computer Science Applications; Computational Mathematics; TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; 030104 developmental biology; Computational Theory and Mathematics; Metagenomics; FOS: Biological sciences; Taxonomy (biology); Artificial intelligence; business; computer; 030217 neurology & neurosurgery; Algorithms; Genome, Bacterial; Software
- Abstract
We explore connections between metagenomic read assignment and the quantification of transcripts from RNA-Seq data. In particular, we show that pseudoalignment, an idea recently introduced in the RNA-Seq context, is also suitable in the metagenomics setting. When pseudoalignment is coupled with the Expectation-Maximization (EM) algorithm, reads can be assigned far more accurately and quickly than is currently possible with state-of-the-art software.
Revision notes: Replaced accidentally duplicated figure with correct version; fixed some issues with figure generation and labeling; fixed problem with some missing genomes from database; added link to GitHub repo containing analysis code; included assessment of aggregate sensitivity and precision; clarified assessment metrics used.
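The abstract describes coupling pseudoalignment with the EM algorithm without giving details. As a rough illustration only, and not the authors' implementation, the following minimal sketch shows how EM could apportion ambiguous reads among reference genomes. It assumes that pseudoalignment has already grouped reads into equivalence classes (the set of references each read is compatible with), and it ignores genome length and other normalization. The function name `em_abundances`, its parameters, and the toy read counts are all hypothetical.

```python
def em_abundances(eq_classes, n_refs, n_iter=100):
    """Estimate relative abundances of references by EM.

    eq_classes: dict mapping a frozenset of reference ids (the references a
                group of reads is compatible with) to the number of such reads.
    n_refs:     total number of references in the (hypothetical) database.
    """
    theta = [1.0 / n_refs] * n_refs  # start from a uniform guess
    for _ in range(n_iter):
        counts = [0.0] * n_refs
        # E-step: split each class's reads in proportion to current abundances.
        for refs, n_reads in eq_classes.items():
            denom = sum(theta[r] for r in refs)
            if denom == 0.0:
                continue  # no compatible reference has any weight left
            for r in refs:
                counts[r] += n_reads * theta[r] / denom
        # M-step: renormalize expected counts into abundances.
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta


# Toy input (invented numbers): 60 reads match only reference 0,
# 10 match only reference 1, and 30 are ambiguous between 0 and 1.
toy_classes = {
    frozenset({0}): 60,
    frozenset({1}): 10,
    frozenset({0, 1}): 30,
}
theta = em_abundances(toy_classes, n_refs=3)
print([round(t, 4) for t in theta])
```

In this toy case the ambiguous reads are shared between references 0 and 1 in proportion to their estimated abundances, while reference 2, which no read is compatible with, ends up with zero weight.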
- Published
- 2016