1. Connectivity Based Method for Clustering Microbial Communities from Metagenomics Data of Water and Soil Samples
- Author
- Rahman, JS, Li, J, Xie, J, Fogelman, S, and Blumenstein, M
- Abstract
© 2018 IEEE. Understanding the microbial community structure of metagenomics water and soil samples is a key process in discovering the functions and impact of microorganisms on human and animal health. The evolution of Next Generation Sequencing (NGS) technology has encouraged researchers to sequence large quantities of microbial data from environmental sources. Clustering marker gene sequences into Operational Taxonomic Units (OTUs) is the most significant task in microbial community analysis. Several methods have been developed over the years to improve OTU picking strategies. However, building strongly connected OTUs is a major issue in the majority of these methods. Herein we present ConClust, a novel method for clustering OTUs that is based on quantifying connectivity among the sequences. Experimental analysis on two synthetic datasets and two real-world datasets from water and soil samples demonstrates that our method can mine robust OTUs. Our method can be highly beneficial for studying the functions of known and unknown microbes and analyzing their positive and negative effects on the environment as well as on human and animal health.
- Published
- 2018