OGUs enable effective, phylogeny-aware analysis of even shallow metagenome community structures

Authors :
Leo Lahti
Lejzerowicz F
Shi Huang
Jack A. Gilbert
Pedro Belda-Ferre
Qiyun Zhu
Daniel McDonald
Das P
Sepich-Poore GD
Justin P. Shaffer
Yu J
Guillaume Méric
Rob Knight
Niina Haiminen
Hyun-Chul Kim
Teemu J. Niiranen
Michael Inouye
Aki S. Havulinna
Antonio Gonzalez
George Armstrong
Yoshiki Vázquez-Baeza
Austin D. Swafford
McGrath I
Salomaa
Kuczynski J
Miten Jain
Publication Year :
2021
Publisher :
Cold Spring Harbor Laboratory, 2021.

Abstract

We introduce the Operational Genomic Unit (OGU), a metagenome analysis strategy that directly exploits sequence alignment hits to individual reference genomes as the minimum unit for assessing the diversity of microbial communities and their relevance to environmental factors. This approach is independent of taxonomic classification, granting the possibility of maximal resolution of community composition, and organizes features into an accurate hierarchy using a phylogenomic tree. The outputs are suitable for contemporary analytical protocols for community ecology, differential abundance and supervised learning while supporting phylogenetic methods, such as UniFrac and phylofactorization, that are seldom applied to shotgun metagenomics despite being prevalent in 16S rRNA gene amplicon studies. As demonstrated in one synthetic and two real-world case studies, the OGU method produces biologically meaningful patterns from microbiome datasets. Such patterns further remain detectable at very low metagenomic sequencing depths. Compared with taxonomic unit-based analyses implemented in currently adopted metagenomics tools, and the analysis of 16S rRNA gene amplicon sequence variants, this method shows superiority in informing biologically relevant insights, including stronger correlation with body environment and host sex on the Human Microbiome Project dataset, and more accurate prediction of human age by the gut microbiomes in the Finnish population. We provide Woltka, a bioinformatics tool to implement this method, with full integration with the QIIME 2 package and the Qiita web platform, to facilitate OGU adoption in future metagenomics studies.

Importance

Shotgun metagenomics is a powerful, yet computationally challenging, technique compared to 16S rRNA gene amplicon sequencing for decoding the composition and structure of microbial communities. However, current analyses of metagenomic data are primarily based on taxonomic classification, which is limited in feature resolution compared to 16S rRNA amplicon sequence variant analysis. To solve these challenges, we introduce Operational Genomic Units (OGUs), which are the individual reference genomes derived from sequence alignment results, without further assigning them taxonomy. The OGU method advances current read-based metagenomics in two dimensions: (i) providing maximal resolution of community composition while (ii) permitting use of phylogeny-aware tools. Our analysis of real-world datasets shows several advantages over currently adopted metagenomic analysis methods and the finest-grained 16S rRNA analysis methods in predicting biological traits. We thus propose the adoption of OGU as standard practice in metagenomic studies.
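
The core idea, counting alignment hits per reference genome without assigning taxonomy, can be illustrated with a minimal Python sketch. This is not Woltka's implementation or interface: the function name `build_ogu_table`, the input layout, and the genome and read identifiers are all hypothetical. Splitting a read that hits several genomes equally among them is one simple choice assumed here for illustration; the record does not state how such reads are handled.

```python
from collections import defaultdict


def build_ogu_table(alignments):
    """Build a per-sample table of counts keyed by reference genome ID.

    alignments: dict mapping sample ID -> iterable of (read_id, genome_id)
    pairs, one pair per alignment hit (hypothetical input layout).
    """
    table = {}
    for sample, hits in alignments.items():
        # Collect the set of reference genomes hit by each read.
        genomes_per_read = defaultdict(set)
        for read_id, genome_id in hits:
            genomes_per_read[read_id].add(genome_id)

        # Each read contributes a total weight of 1, divided equally among
        # the genomes it hits (an assumption made for this sketch).
        counts = defaultdict(float)
        for genomes in genomes_per_read.values():
            share = 1.0 / len(genomes)
            for genome_id in genomes:
                counts[genome_id] += share
        table[sample] = dict(counts)
    return table


# Toy input: read r3 in sample1 hits two genomes.
example = {
    "sample1": [("r1", "G001"), ("r2", "G001"), ("r3", "G002"), ("r3", "G003")],
    "sample2": [("r1", "G002")],
}
ogu_table = build_ogu_table(example)
```

The features of the resulting table are the reference genome identifiers themselves, which is what would allow them to be matched to the tips of a phylogenomic tree for methods such as UniFrac.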

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........0d7498c3e8eb9b49f8542a933bfc83d8