A novel computational framework for genome-scale alternative transcription units prediction.
- Source :
- Briefings in bioinformatics [Brief Bioinform] 2021 Nov 05; Vol. 22 (6).
- Publication Year :
- 2021
- Abstract :
- Alternative transcription units (ATUs) are dynamically encoded under different conditions and display overlapping patterns (sharing one or more genes) under a specific condition in bacterial genomes. Genome-scale identification of ATUs is essential for studying the emergence of human diseases caused by bacterial organisms. However, it is unrealistic to identify all ATUs using experimental techniques because of the complexity and dynamic nature of ATUs. Here, we present the first-of-its-kind computational framework, named SeqATU, for genome-scale ATU prediction based on next-generation RNA-Seq data. The framework utilizes a convex quadratic programming model to seek an optimum expression combination of all of the to-be-identified ATUs. The predicted ATUs in Escherichia coli reached a precision of 0.77/0.74 and a recall of 0.75/0.76 in the two RNA-Seq datasets when compared with the benchmark ATUs from third-generation RNA-Seq data. In addition, the proportion of 5'- or 3'-end genes of the predicted ATUs that have documented transcription factor binding sites and transcription termination sites was three times greater than that of genes not located at a 5'- or 3'-end. We further evaluated the predicted ATUs by Gene Ontology and Kyoto Encyclopedia of Genes and Genomes functional enrichment analyses. The results suggested that gene pairs frequently encoded in the same ATUs are more functionally related than those that can belong to two distinct ATUs. Overall, these results demonstrated the high reliability of the predicted ATUs. We expect that the new insights derived by SeqATU will not only improve the understanding of the transcription mechanism of bacteria but also guide the reconstruction of a genome-scale transcriptional regulatory network. (© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.)
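The abstract describes the core of the framework only as a convex quadratic programming model that seeks an optimum expression combination of candidate ATUs. The sketch below illustrates that general idea and is not the SeqATU formulation: it assumes a hypothetical 0/1 membership matrix (genes by candidate ATUs) and a hypothetical per-gene coverage vector, and it fits non-negative ATU expression levels by least squares, which is one simple instance of a convex quadratic program. The function name, the inputs, and the projected-gradient solver are all illustrative assumptions.

```python
# Illustrative sketch only; NOT the published SeqATU model.
# Assumption: per-gene coverage is roughly the sum of the expression
# levels of the candidate ATUs that contain that gene.

def fit_atu_expression(membership, coverage, iterations=5000):
    """Minimise ||membership * x - coverage||^2 subject to x >= 0
    using projected gradient descent (a simple convex QP solver)."""
    n_genes = len(membership)
    n_atus = len(membership[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so this step size is small enough for the iteration to converge.
    frob_sq = sum(v * v for row in membership for v in row)
    step = 1.0 / (2.0 * frob_sq)
    x = [0.0] * n_atus
    for _ in range(iterations):
        # Residual of the current fit for every gene.
        residual = [
            sum(membership[g][k] * x[k] for k in range(n_atus)) - coverage[g]
            for g in range(n_genes)
        ]
        # Gradient of the squared error with respect to each ATU level.
        grad = [
            2.0 * sum(membership[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_atus)
        ]
        # Gradient step followed by projection onto x >= 0.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_atus)]
    return x


# Hypothetical toy operon with genes a, b, c and three overlapping
# candidate ATUs: (a, b, c), (b, c) and (c).
membership = [
    [1, 0, 0],  # gene a
    [1, 1, 0],  # gene b
    [1, 1, 1],  # gene c
]
coverage = [10.0, 15.0, 15.0]  # made-up per-gene read coverage
expression = fit_atu_expression(membership, coverage)
```

In this toy case the fitted levels come out at approximately 10, 5 and 0, meaning the made-up coverage is explained by the first two candidate ATUs alone. A real model would need to handle noisy, non-uniform coverage and many more candidates, which is where a dedicated QP solver would replace the simple loop used here.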
- Subjects :
- Algorithms
Bacteria/genetics
Databases, Genetic
Escherichia coli/genetics
Genome, Bacterial
Genomics/methods
Humans
RNA, Messenger/genetics
RNA-Seq
Single-Cell Analysis/methods
Terminator Regions, Genetic
Transcription Initiation Site
Computational Biology/methods
Genome-Wide Association Study/methods
RNA Isoforms
Transcription, Genetic
Details
- Language :
- English
- ISSN :
- 1477-4054
- Volume :
- 22
- Issue :
- 6
- Database :
- MEDLINE
- Journal :
- Briefings in bioinformatics
- Publication Type :
- Academic Journal
- Accession number :
- 33957668
- Full Text :
- https://doi.org/10.1093/bib/bbab162