The human genome contains over a million autonomous exons.

Authors :
Stepankiw N
Yang AWH
Hughes TR
Source :
Genome research [Genome Res] 2023 Dec 01; Vol. 33 (11), pp. 1865-1878. Date of Electronic Publication: 2023 Dec 01.
Publication Year :
2023

Abstract

Mammalian mRNA and lncRNA exons are often small compared to introns. The exon definition model predicts that exons splice autonomously, dependent on proximal exon sequence features, explaining their delineation within large introns. This model has not been examined on a genome-wide scale, however, leaving open the question of how often mRNA and lncRNA exons are autonomous. It is also unknown how frequently such exons can arise by chance. Here, we directly assayed large fragments (500-1000 bp) of the human genome by exon trapping, which detects exons spliced into a heterologous transgene, here designed with a large intron context. We define the trapped exons as "autonomous." We obtained ∼1.25 million trapped exons, including most known mRNA and well-annotated lncRNA internal exons, demonstrating that human exons are predominantly autonomous. mRNA exons are trapped with the highest efficiency. Nearly a million of the trapped exons are unannotated, most located in intergenic regions and antisense to mRNA, with depletion from the forward strand of introns. These exons are not conserved, suggesting they are nonfunctional and arose from random mutations. They are nonetheless highly enriched with known splicing promoting sequence features that delineate known exons. Novel autonomous exons are more numerous than annotated lncRNA exons, and computational models also indicate they will occur with similar frequency in any randomly generated sequence. These results show that most human coding exons splice autonomously, and provide an explanation for the existence of many unconserved lncRNAs, as well as a new annotation and inclusion levels of spliceable loci in the human genome.
(© 2023 Stepankiw et al.; Published by Cold Spring Harbor Laboratory Press.)

Details

Language :
English
ISSN :
1549-5469
Volume :
33
Issue :
11
Database :
MEDLINE
Journal :
Genome research
Publication Type :
Academic Journal
Accession number :
37945377
Full Text :
https://doi.org/10.1101/gr.277792.123