1. Identification of multivariable Boolean patterns in microbiome and microbial gene composition data.
- Author
- Golovko, George; Khanipov, Kamil; Reyes, Victor; Pinchuk, Irina; Fofanov, Yuriy
- Subjects
- *MICROBIAL genes, *DUODENAL ulcers, *STOMACH ulcers, *BIOLOGICAL systems, *MICROBIAL genomics, *METAGENOMICS, *FEATURE selection
- Abstract
- Virtually every biological system is governed by complex relations among its components. Identifying such relations requires a rigorous or heuristics-based search for patterns among variables/features of a system. Various algorithms have been developed to identify two-dimensional (involving two variables) patterns employing correlation, covariation, mutual information, etc. It seems obvious, however, that comprehensive descriptions of complex biological systems also need to include more complicated multivariable relations, which can only be described using patterns that simultaneously embrace 3, 4, or more variables. The goal of this manuscript is to (a) introduce a novel type of association (multivariable Boolean patterns) that can be manifested between features of complex systems but cannot be identified (described) by traditional pairwise metrics; (b) propose a pattern classification method; and (c) provide a novel definition of a pattern's strength (pattern's score) able to accommodate heterogeneous multi-omics data. To demonstrate the presence of such patterns, we performed a search for all possible 2-, 3-, and 4-dimensional patterns in historical data from the Human Microbiome Project (15 body sites) and a collection of H. pylori genomes associated with gastric ulcers, gastritis, and duodenal ulcers. In all datasets under consideration, we were able to identify hundreds of statistically significant multivariable patterns. These results suggest that such patterns can be common in microbial genomics/microbiomics systems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
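The abstract's central claim, that a relation among three features can be invisible to pairwise metrics, can be illustrated with a toy example. The sketch below is not the authors' method or scoring scheme; it assumes hypothetical presence/absence data in which feature `c` is present exactly when one of `a` or `b` is present (an XOR relation), and uses plain Pearson correlation as a stand-in for a pairwise metric. The names `pearson`, `samples`, `pairwise`, and `xor_agreement` are illustrative only.

```python
from itertools import product


def pearson(x, y):
    """Plain Pearson correlation for two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    return cov / (vx * vy) ** 0.5


# Hypothetical data (assumption): 100 samples, feature c = a XOR b,
# with every combination of a and b equally represented.
samples = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2)] * 25
a, b, c = (list(col) for col in zip(*samples))

# Pairwise view: every pair of features looks unrelated.
pairwise = {
    "a~b": pearson(a, b),
    "a~c": pearson(a, c),
    "b~c": pearson(b, c),
}

# Three-variable view: the Boolean rule c == a XOR b holds in every sample.
xor_agreement = sum((x ^ y) == z for x, y, z in samples) / len(samples)

print(pairwise)
print(xor_agreement)
```

In this constructed case all three pairwise correlations vanish while the three-variable rule holds for every sample, which is the kind of relation a search restricted to pairs would miss.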