1. Inferring and analyzing gene regulatory networks from multi-factorial expression data: a complete and interactive suite
- Author
- Océane Cassan, Sophie Lèbre, Antoine Martin; Biochimie et Physiologie Moléculaire des Plantes (BPMP), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Institut national d’études supérieures agronomiques de Montpellier (Montpellier SupAgro), Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE); Institut Montpelliérain Alexander Grothendieck (IMAG), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
0106 biological sciences, [SDV]Life Sciences [q-bio], Gene regulatory network, Inference, Multifactorial transcriptomic analysis, Biology, Ontology (information science), QH426-470, computer.software_genre, 01 natural sciences, MESH: Gene Expression Profiling, MESH: Software, 03 medical and health sciences, Model-based clustering, Gene regulatory network inference, Genetics, Cluster Analysis, [SDV.BV]Life Sciences [q-bio]/Vegetal Biology, Analysis workflow, Gene Regulatory Networks, Cluster analysis, MESH: Gene Regulatory Networks, 030304 developmental biology, 0303 health sciences, Gene Expression Profiling, MESH: Transcriptome, Computational Biology, MESH: Cluster Analysis, Random forest, Graphical user interface, Workflow, Data mining, User interface, Web service, Transcriptome, computer, Software, TP248.13-248.65, MESH: Computational Biology, 010606 plant biology & botany, Biotechnology
- Abstract
Background: High-throughput transcriptomic datasets are often examined to discover new actors and regulators of a biological response. To this end, graphical interfaces have been developed that allow a broad range of users to conduct standard analyses of RNA-seq data, even with little programming experience. Although existing solutions usually provide adequate procedures for normalization, exploration or differential expression, more advanced features, such as gene clustering or regulatory network inference, are often missing or do not reflect current state-of-the-art methodologies.
Results: We developed a user interface called DIANE (Dashboard for the Inference and Analysis of Networks from Expression data), designed to harness the potential of multi-factorial expression datasets from any organism through a precise set of methods. DIANE's interactive workflow provides normalization, dimensionality reduction, differential expression and ontology enrichment. Gene clustering can be performed and explored via configurable Mixture Models, and Random Forests are used to infer gene regulatory networks. DIANE also includes a novel procedure, based on permutations for Random Forest importance metrics, to assess the statistical significance of regulator-target influence measures. Throughout the pipeline, session reports and results can be downloaded to ensure clear and reproducible analyses.
Conclusions: We demonstrate the value and benefits of DIANE using a recently published dataset describing the transcriptional response of Arabidopsis thaliana to combined temperature, drought and salinity perturbations. We show that DIANE can intuitively carry out informative exploration and statistical procedures on RNA-seq data, perform model-based clustering of gene expression profiles, and go further into gene network reconstruction, providing relevant candidate genes or signalling pathways to explore.
DIANE is available as a web service (https://diane.bpmp.inrae.fr), or can be installed and locally launched as a complete R package.
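The permutation idea mentioned in the abstract can be illustrated with a small, generic sketch. This is not DIANE's procedure or code: it assumes a single regulator-target pair, uses absolute Pearson correlation as a simple stand-in for a Random Forest importance score, and runs on made-up expression values. The function names (`abs_pearson`, `permutation_p_value`) are hypothetical. The principle shown is only that an influence score is compared with the scores obtained after shuffling the target's expression values, which gives an empirical p-value.

```python
import random
from statistics import mean


def abs_pearson(x, y):
    """Absolute Pearson correlation, a stand-in influence score (assumption)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no association signal
    return abs(cov) / (var_x * var_y) ** 0.5


def permutation_p_value(regulator, target, score=abs_pearson, n_perm=1000, seed=0):
    """Empirical p-value: share of shuffled targets scoring at least as high."""
    rng = random.Random(seed)
    observed = score(regulator, target)
    shuffled = list(target)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any regulator-target link
        if score(regulator, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical expression values across 12 samples (made-up numbers).
regulator = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1, 8.9, 10.2, 11.0, 12.1]
linked_target = [2.0 * v + 0.5 for v in regulator]
noise = random.Random(42)
unlinked_target = [noise.gauss(0, 1) for _ in regulator]

p_linked = permutation_p_value(regulator, linked_target)
p_unlinked = permutation_p_value(regulator, unlinked_target)
print("linked pair p-value:", p_linked)
print("unlinked pair p-value:", p_unlinked)
```

In a network setting, a test of this kind would be applied to each candidate regulator-target edge, with the score replaced by the importance measure of the chosen inference method and the resulting p-values corrected for multiple testing.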
- Published
- 2021