1. BugSeq: a highly accurate cloud platform for long-read metagenomic analyses
- Author
Steven H. Huang, Jeremy Fan, and Samuel D. Chorlton
- Subjects
Nanopore, Computer science, Sample (material), Cloud computing, lcsh:Computer applications to medicine. Medical informatics, computer.software_genre, Microbiology, Biochemistry, Third-generation, 03 medical and health sciences, 0302 clinical medicine, Species level, Structural Biology, Classifier (linguistics), Humans, Sequencing, lcsh:QH301-705.5, Molecular Biology, 030304 developmental biology, 0303 health sciences, business.industry, Applied Mathematics, High-Throughput Nucleotide Sequencing, Biological classification, Cloud Computing, Computer Science Applications, Nanopore Sequencing, lcsh:Biology (General), Metagenomics, Scalability, lcsh:R858-859.7, Metagenome, Long-read, Nanopore sequencing, Data mining, business, computer, Software, 030217 neurology & neurosurgery
- Abstract
Background: As the use of nanopore sequencing for metagenomic analysis increases, tools capable of performing long-read taxonomic classification (i.e. determining the composition of a sample) in a fast and accurate manner are needed. Existing tools were either designed for short-read data (e.g. Centrifuge), take days to analyse modern sequencer outputs (e.g. MetaMaps) or suffer from suboptimal accuracy (e.g. CDKAM). Additionally, all tools require command line expertise and do not scale in the cloud.
Results: We present BugSeq, a novel, highly accurate metagenomic classifier for nanopore reads. We evaluate BugSeq on simulated data, mock microbial communities and real clinical samples. On the ZymoBIOMICS Even and Log communities, BugSeq (F1 = 0.95 at species level) offers better read classification than MetaMaps (F1 = 0.89–0.94) in a fraction of the time. BugSeq significantly improves on the accuracy of Centrifuge (F1 = 0.79–0.93) and CDKAM (F1 = 0.91–0.94) while offering competitive run times. When applied to 41 samples from patients with lower respiratory tract infections, BugSeq produces greater concordance with microbiological culture and qPCR compared with “What’s In My Pot” analysis.
Conclusion: BugSeq is deployed to the cloud for easy and scalable long-read metagenomic analyses. BugSeq is freely available for non-commercial use at https://bugseq.com/free.
- Published
2021
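The abstract compares classifiers by species-level F1 score. For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 value can be computed from read counts. The function name, the read counts, and the choice to count unclassified or misassigned reads as false negatives are assumptions made for illustration; the abstract does not state how the authors tallied reads, and none of the numbers below come from the paper.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score, the harmonic mean of precision and recall."""
    if true_positives == 0:
        # No correct assignments: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical read counts for one classifier on one sample (not from the paper):
# 950 reads assigned to the correct species, 50 assigned to a wrong species,
# and 50 reads of a species that were missed.
example = f1_score(true_positives=950, false_positives=50, false_negatives=50)
print(round(example, 2))  # 0.95
```

With these made-up counts, precision and recall are both 0.95, so their harmonic mean is also 0.95. F1 drops below the better of the two whenever precision and recall differ, which is why it penalises a classifier that gains recall by over-calling species.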