Optimising high-throughput sequencing data analysis, from gene database selection to the analysis of compositional data: a case study on tropical soil nematodes

Authors :
Simin Wang
Dominik Schneider
Tamara R. Hartke
Johannes Ballauff
Carina Carneiro de Melo Moura
Garvin Schulz
Zhipeng Li
Andrea Polle
Rolf Daniel
Oliver Gailing
Bambang Irawan
Stefan Scheu
Valentyna Krashevska
Source :
Frontiers in Ecology and Evolution, Vol 12 (2024)
Publication Year :
2024
Publisher :
Frontiers Media S.A., 2024.

Abstract

Introduction: High-throughput sequencing (HTS) provides an efficient and cost-effective way to generate large amounts of sequence data, providing a very powerful tool to analyze biodiversity of soil organisms. However, marker-based methods and the resulting datasets come with a range of challenges and disputes, including incomplete reference databases, controversial sequence similarity thresholds for delimitating taxa, and downstream compositional data analysis.

Methods: Here, we use HTS data from a soil nematode biodiversity experiment to explore standardized HTS data processing procedures. We compared the taxonomic assignment performance of two main rDNA reference databases (SILVA and PR2). We tested whether the same ecological patterns are detected with Amplicon Sequence Variants (ASV; 100% similarity) versus classical Operational Taxonomic Units (OTU; 97% similarity). Further, we tested how different HTS data normalization methods affect the recovery of beta diversity patterns and the identification of differentially abundant taxa.

Results: At this time, the SILVA 138 eukaryotic database performed better than the PR2 4.12 database, assigning more reads to family level and providing higher phylogenetic resolution. ASV- and OTU-based alpha and beta diversity of nematodes correlated closely, indicating that OTU-based studies represent useful reference points. For downstream data analyses, our results indicate that loss of data during subsampling under rarefaction-based methods might reduce the sensitivity of the method, e.g. underestimate the differences between nematode communities under different treatments, while the clr-transformation-based methods may overestimate effects. The Analysis of Compositions of Microbiome with Bias Correction approach (ANCOM-BC) retains all data and accounts for uneven sampling fractions for each sample, suggesting that this is currently the optimal method to analyze compositional data.

Discussion: Overall, our study highlights the importance of comparing and selecting taxonomic reference databases before data analyses, and provides solid evidence for the similarity and comparability between OTU- and ASV-based nematode studies. Further, the results highlight the potential weakness of rarefaction-based and clr-transformation-based methods. We recommend future studies use ASV and that both the taxonomic reference databases and normalization strategies are carefully tested and selected before analyzing the data.
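
As an illustration of two of the normalization strategies the abstract contrasts, the following is a minimal sketch of rarefaction (random subsampling to a fixed depth) and the centered log-ratio (clr) transformation. It is not the authors' pipeline and does not implement ANCOM-BC. The function names, the example counts, and the pseudocount of 1 used to handle zero counts are all assumptions made for illustration.

```python
import math
import random


def clr_transform(counts, pseudocount=1.0):
    """Centered log-ratio: log of each count minus the mean log across taxa.

    The pseudocount is an illustrative assumption for handling zero counts.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from one sample."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    # Expand the count vector into one taxon index per read, then draw reads.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    rarefied = [0] * len(counts)
    for taxon in rng.sample(reads, depth):
        rarefied[taxon] += 1
    return rarefied


# Hypothetical read counts for four taxa in a single sample.
sample = [120, 30, 0, 850]
clr_values = clr_transform(sample)
rarefied = rarefy(sample, depth=100)
```

In this sketch, rarefying the hypothetical sample to 100 reads discards 900 of its 1,000 reads, which is the kind of data loss the abstract associates with reduced sensitivity. The clr values keep every read but sum to zero within a sample, so they describe abundances only relative to the other taxa.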

Details

Language :
English
ISSN :
2296701X
Volume :
12
Database :
Directory of Open Access Journals
Journal :
Frontiers in Ecology and Evolution
Publication Type :
Academic Journal
Accession number :
edsdoj.f5bb0cae83047f5bb0264f9eef691e6
Document Type :
article
Full Text :
https://doi.org/10.3389/fevo.2024.1168288