iModulonMiner and PyModulon: Software for unsupervised mining of gene expression compendia.
- Source :
- PLoS computational biology [PLoS Comput Biol] 2024 Oct 23; Vol. 20 (10), pp. e1012546. Date of Electronic Publication: 2024 Oct 23 (Print Publication: 2024).
- Publication Year :
- 2024
Abstract
- Public gene expression databases are a rapidly expanding resource of organism responses to diverse perturbations, presenting both an opportunity and a challenge for bioinformatics workflows to extract actionable knowledge of transcription regulatory network function. Here, we introduce a five-step computational pipeline, called iModulonMiner, to compile, process, curate, analyze, and characterize the totality of RNA-seq data for a given organism or cell type. This workflow is centered around the data-driven computation of co-regulated gene sets using Independent Component Analysis, called iModulons, which have been shown to have broad applications. As a demonstration, we applied this workflow to generate the iModulon structure of Bacillus subtilis using all high-quality, publicly-available RNA-seq data. Using this structure, we predicted regulatory interactions for multiple transcription factors, identified groups of co-expressed genes that are putatively regulated by undiscovered transcription factors, and predicted properties of a recently discovered single-subunit phage RNA polymerase. We also present a Python package, PyModulon, with functions to characterize, visualize, and explore computed iModulons. The pipeline, available at https://github.com/SBRG/iModulonMiner, can be readily applied to diverse organisms to gain a rapid understanding of their transcriptional regulatory network structure and condition-specific activity.
- Competing Interests: The authors declare that they have no competing interests.
- Copyright: © 2024 Sastry et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Subjects :
- Databases, Genetic
Transcription Factors/genetics
Transcription Factors/metabolism
Gene Expression Profiling/methods
Gene Expression Regulation, Bacterial/genetics
Software
Bacillus subtilis/genetics
Bacillus subtilis/metabolism
Computational Biology/methods
Gene Regulatory Networks/genetics
Data Mining/methods
Details
- Language :
- English
- ISSN :
- 1553-7358
- Volume :
- 20
- Issue :
- 10
- Database :
- MEDLINE
- Journal :
- PLoS computational biology
- Publication Type :
- Academic Journal
- Accession number :
- 39441835
- Full Text :
- https://doi.org/10.1371/journal.pcbi.1012546