iModulonMiner and PyModulon: Software for unsupervised mining of gene expression compendia.

Authors :
Sastry AV
Yuan Y
Poudel S
Rychel K
Yoo R
Lamoureux CR
Li G
Burrows JT
Chauhan S
Haiman ZB
Al Bulushi T
Seif Y
Palsson BO
Zielinski DC
Source :
PLoS computational biology [PLoS Comput Biol] 2024 Oct 23; Vol. 20 (10), pp. e1012546. Date of Electronic Publication: 2024 Oct 23 (Print Publication: 2024).
Publication Year :
2024

Abstract

Public gene expression databases are a rapidly expanding resource of organism responses to diverse perturbations, presenting both an opportunity and a challenge for bioinformatics workflows to extract actionable knowledge of transcription regulatory network function. Here, we introduce a five-step computational pipeline, called iModulonMiner, to compile, process, curate, analyze, and characterize the totality of RNA-seq data for a given organism or cell type. This workflow is centered around the data-driven computation of co-regulated gene sets, called iModulons, using Independent Component Analysis; iModulons have been shown to have broad applications. As a demonstration, we applied this workflow to generate the iModulon structure of Bacillus subtilis using all high-quality, publicly available RNA-seq data. Using this structure, we predicted regulatory interactions for multiple transcription factors, identified groups of co-expressed genes that are putatively regulated by undiscovered transcription factors, and predicted properties of a recently discovered single-subunit phage RNA polymerase. We also present a Python package, PyModulon, with functions to characterize, visualize, and explore computed iModulons. The pipeline, available at https://github.com/SBRG/iModulonMiner, can be readily applied to diverse organisms to gain a rapid understanding of their transcriptional regulatory network structure and condition-specific activity.

Competing Interests: The authors declare that they have no competing interests.

Copyright: © 2024 Sastry et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
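As an illustration of the core idea in the abstract (deriving co-regulated gene sets from an expression compendium with Independent Component Analysis), the following is a minimal sketch on synthetic data. It does not use iModulonMiner or PyModulon. It assumes scikit-learn's `FastICA` as a stand-in ICA implementation, and the function name `toy_imodulons`, the z-score cutoff, and the simulated matrix are all hypothetical choices made for this example, not the pipeline's actual procedure.

```python
import numpy as np
from sklearn.decomposition import FastICA


def toy_imodulons(X, n_components, threshold=3.0, seed=0):
    """Decompose a genes x samples matrix as X ~ M @ A, then threshold M.

    M (genes x components) holds gene weights and A (components x samples)
    holds component activities. The z-score cutoff is an illustrative
    assumption, not the thresholding rule used by the published pipeline.
    """
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    # Genes are passed as rows, so the recovered sources are gene weights.
    M = ica.fit_transform(X)   # shape: (n_genes, n_components)
    A = ica.mixing_.T          # shape: (n_components, n_samples)
    # Standardize each component and keep genes with extreme weights.
    z = (M - M.mean(axis=0)) / M.std(axis=0)
    gene_sets = [
        np.flatnonzero(np.abs(z[:, k]) > threshold) for k in range(n_components)
    ]
    return M, A, gene_sets


# Synthetic compendium: three planted gene groups with strong weights.
rng = np.random.default_rng(0)
n_genes, n_samples, k = 300, 30, 3
M_true = rng.laplace(scale=0.1, size=(n_genes, k))
for j in range(k):
    M_true[j * 10:(j + 1) * 10, j] += 5.0
A_true = rng.normal(size=(k, n_samples))
X = M_true @ A_true + 0.01 * rng.normal(size=(n_genes, n_samples))

M, A, gene_sets = toy_imodulons(X, n_components=k)
for idx, genes in enumerate(gene_sets):
    print(f"component {idx}: {len(genes)} genes above threshold")
```

Each entry of `gene_sets` is an array of row indices into `X`. ICA recovers components only up to sign and ordering, so the sketch thresholds on absolute weight and makes no claim about which recovered component matches which planted group.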

Details

Language :
English
ISSN :
1553-7358
Volume :
20
Issue :
10
Database :
MEDLINE
Journal :
PLoS computational biology
Publication Type :
Academic Journal
Accession number :
39441835
Full Text :
https://doi.org/10.1371/journal.pcbi.1012546