1. TIPP2: metagenomic taxonomic profiling using phylogenetic markers
- Author
- Tandy Warnow, Erin K. Molloy, Mihai Pop, and Nidhi Shah
- Subjects
- Statistics and Probability, Profiling (computer programming), 0303 health sciences, Phylogenetic tree, AcademicSubjects/SCI01060, Computer science, Computational biology, Biochemistry, Genome, Original Papers, Computer Science Applications, Set (abstract data type), 03 medical and health sciences, Computational Mathematics, 0302 clinical medicine, Taxon, Computational Theory and Mathematics, Metagenomics, Microbiome, Precision and recall, Molecular Biology, Sequence Analysis, 030217 neurology & neurosurgery, 030304 developmental biology
- Abstract
- Motivation: Metagenomics has revolutionized microbiome research by enabling researchers to characterize the composition of complex microbial communities. Taxonomic profiling is one of the critical steps in metagenomic analyses. Marker genes, which are single-copy and universally found across Bacteria and Archaea, can provide accurate estimates of taxon abundances in the sample. Results: We present TIPP2, a marker gene-based abundance profiling method, which combines phylogenetic placement with statistical techniques to control classification precision and recall. TIPP2 includes an updated set of reference packages and several algorithmic improvements over the original TIPP method. We find that TIPP2 provides comparable or better estimates of abundance than other profiling methods (including Bracken, mOTUsv2 and MetaPhlAn2), and strictly dominates other methods when there are under-represented (novel) genomes present in the dataset. Availability and implementation: The code for our method is freely available in open-source form at https://github.com/smirarab/sepp/blob/tipp2/README.TIPP.md. The code and procedure to create new reference packages for TIPP2 are available at https://github.com/shahnidhi/TIPP_reference_package. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2021