Probing instructions for expression regulation in gene nucleotide compositions
- Source :
- PLoS Computational Biology, Public Library of Science, 2018, 14 (1), pp.e1005921. ⟨10.1371/journal.pcbi.1005921⟩
- Publication Year :
- 2018
- Publisher :
- HAL CCSD, 2018.
Abstract
- Gene expression is orchestrated by distinct regulatory regions to ensure a wide variety of cell types and functions. A challenge is to identify which regulatory regions are active, what their associated features are, and how they work together in each cell type. Several approaches have tackled this problem by modeling gene expression based on epigenetic marks, with the ultimate goal of identifying driving regions and associated genomic variations that are clinically relevant, in particular in precision medicine. However, these models rely on experimental data, which are limited to specific samples (often even to cell lines) and cannot be generated for all regulators and all patients. In addition, we show here that, although these approaches are accurate in predicting gene expression, inference of TF combinations from this type of model is not straightforward. Furthermore, these methods are not designed to capture regulation instructions present at the sequence level, before the binding of regulators or the opening of the chromatin. Here, we probe sequence-level instructions for gene expression and develop a method to explain mRNA levels based solely on nucleotide features. Our method positions nucleotide composition as a critical component of gene expression. Moreover, our approach, able to rank regulatory regions according to their contribution, unveils a strong influence of the gene body sequence, in particular introns. We further provide evidence that the contribution of nucleotide content can be linked to co-regulations associated with genome 3D architecture and to associations of genes within topologically associated domains.
Author summary
- Identifying a maximum of DNA determinants implicated in gene regulation will accelerate genetic analyses and precision medicine approaches by identifying key gene features. In that context, decoding the sequence-level instructions for gene regulation is of prime importance. Among global efforts to achieve this objective, we propose a novel approach able to explain gene expression in each patient sample using only DNA features. Our approach, which is as accurate as methods based on epigenetics data, reveals a strong influence of the nucleotide content of gene body sequences, in particular introns. In contrast to canonical regulations mediated by specific DNA motifs, our model unveils a contribution of global nucleotide content, notably to co-regulations associated with genome 3D architecture and to associations of genes within topologically associated domains. Overall, our study confirms and takes advantage of the existence of sequence-level instructions for gene expression, which lie in genomic regions largely underestimated in regulatory genomics but which appear to be linked to chromatin architecture.
- Subjects :
- Decision Analysis
Gene Expression
Regulatory Sequences, Nucleic Acid
Biochemistry
Database and Informatics Methods
Neoplasms
Biology (General)
Promoter Regions, Genetic
Base Composition
DNA methylation
Genomics
Chromatin
Nucleic acids
Enhancer Elements, Genetic
Engineering and Technology
Epigenetics
DNA modification
Sequence Analysis
Management Engineering
Chromatin modification
Research Article
Chromosome biology
Cell biology
DNA Copy Number Variations
Bioinformatics
QH301-705.5
Quantitative Trait Loci
Nucleotide Sequencing
Research and Analysis Methods
Genome Complexity
Polymorphism, Single Nucleotide
Sequence Motif Analysis
[INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]
Genetics
Humans
Gene Regulation
RNA, Messenger
Gene Prediction
Molecular Biology Techniques
Sequencing Techniques
Molecular Biology
Models, Genetic
Genome, Human
Decision Trees
Computational Biology
Biology and Life Sciences
DNA
Genome Analysis
Introns
Gene Expression Regulation
Transcription Factors
Details
- Language :
- English
- ISSN :
- 1553-734X and 1553-7358
- Database :
- OpenAIRE
- Journal :
- PLoS Computational Biology, Public Library of Science, 2018, 14 (1), pp.e1005921. ⟨10.1371/journal.pcbi.1005921⟩
- Accession number :
- edsair.pmid.dedup....dc15a010928c6c41c23f8d241d95dcee
- Full Text :
- https://doi.org/10.1371/journal.pcbi.1005921