Probing instructions for expression regulation in gene nucleotide compositions

Authors :
Chloé Bessière
May Taha
Florent Petitprez
Jimmy Vandel
Jean-Michel Marin
Laurent Bréhélin
Sophie Lèbre
Charles-Henri Lecellier
Institut de Biologie Computationnelle (IBC)
Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
Institut de Génétique Moléculaire de Montpellier (IGMM)
Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
Méthodes et Algorithmes pour la Bioinformatique (MAB)
Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM)
Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
Institut Montpelliérain Alexander Grothendieck (IMAG)
Université Paul-Valéry - Montpellier 3 (UPVM)
ANR-11-BINF-0002, IBC, Institut de Biologie Computationnelle de Montpellier (2011)
Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD)-Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
Source :
PLoS Computational Biology, Public Library of Science, 2018, 14 (1), pp.e1005921. ⟨10.1371/journal.pcbi.1005921⟩
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

Gene expression is orchestrated by distinct regulatory regions to ensure a wide variety of cell types and functions. A challenge is to identify which regulatory regions are active, what their associated features are, and how they work together in each cell type. Several approaches have tackled this problem by modeling gene expression based on epigenetic marks, with the ultimate goal of identifying driving regions and associated genomic variations that are clinically relevant, in particular in precision medicine. However, these models rely on experimental data, which are limited to specific samples (often only cell lines) and cannot be generated for all regulators and all patients. In addition, we show here that, although these approaches are accurate in predicting gene expression, inferring transcription factor (TF) combinations from this type of model is not straightforward. Furthermore, these methods are not designed to capture regulation instructions present at the sequence level, before the binding of regulators or the opening of the chromatin. Here, we probe sequence-level instructions for gene expression and develop a method to explain mRNA levels based solely on nucleotide features. Our method positions nucleotide composition as a critical component of gene expression. Moreover, our approach, which can rank regulatory regions according to their contribution, unveils a strong influence of the gene body sequence, in particular introns. We further provide evidence that the contribution of nucleotide content can be linked to co-regulations associated with genome 3D architecture and to associations of genes within topologically associated domains.

Author summary

Identifying as many DNA determinants of gene regulation as possible will accelerate genetic analyses and precision medicine approaches by pinpointing key gene features. In that context, decoding the sequence-level instructions for gene regulation is of prime importance. Among global efforts to achieve this objective, we propose a novel approach able to explain gene expression in each patient sample using only DNA features. Our approach, which is as accurate as methods based on epigenetic data, reveals a strong influence of the nucleotide content of gene body sequences, in particular introns. In contrast to canonical regulations mediated by specific DNA motifs, our model unveils a contribution of global nucleotide content, notably in co-regulations associated with genome 3D architecture and in associations of genes within topologically associated domains. Overall, our study confirms and takes advantage of the existence of sequence-level instructions for gene expression, which lie in genomic regions largely underestimated in regulatory genomics but which appear to be linked to chromatin architecture.
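
As an illustration of what "nucleotide features" can mean in practice, the sketch below computes the relative frequency of each nucleotide or dinucleotide in a DNA sequence. It is a minimal example under stated assumptions, not the authors' implementation: the function name `composition_features`, the choice of k-mer sizes, and the toy sequence are all hypothetical, and the record does not specify which features or regression model the paper uses.

```python
from collections import Counter
from itertools import product


def composition_features(seq, k=1):
    """Return the relative frequency of every k-mer over A/C/G/T in seq.

    Hypothetical helper for illustration only; k-mers containing other
    symbols (for example N) are ignored when computing frequencies.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Normalise only over valid A/C/G/T k-mers.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}


# Toy sequence standing in for one region of a gene (not real data).
region = "ACGTGGCCGCGATATCG"
mono = composition_features(region, k=1)  # 4 nucleotide frequencies
di = composition_features(region, k=2)    # 16 dinucleotide frequencies
```

Feature vectors of this kind, computed per region (for example promoter, introns, or UTRs), could then serve as predictors in a regression on measured mRNA levels; how regions are defined and which model is fitted are left open here.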

Details

Language :
English
ISSN :
1553-734X and 1553-7358
Database :
OpenAIRE
Journal :
PLoS Computational Biology
Accession number :
edsair.pmid.dedup....dc15a010928c6c41c23f8d241d95dcee
Full Text :
https://doi.org/10.1371/journal.pcbi.1005921