Deeply Mining a Universe of Peptides Encoded by Long Noncoding RNAs
- Source :
- Molecular & Cellular Proteomics : MCP
- Publication Year :
- 2021
- Publisher :
- Elsevier BV, 2021.
Abstract
- Many small ORFs embedded in long noncoding RNA (lncRNA) transcripts have been shown to encode biologically functional polypeptides (small ORF-encoded polypeptides [SEPs]) in different organisms. Although some novel SEPs have been found, their identification is still hampered by their poor predictability, diminutive size, and low relative abundance. Here, we take advantage of NONCODE, a repository containing the most complete collection and annotation of lncRNA transcripts from different species, to build a novel database that attempts to maximize the collection of SEPs from human and mouse lncRNA transcripts. To further improve SEP discovery, we implemented two effective and complementary polypeptide enrichment strategies using a 30-kDa molecular weight cutoff filter and a C8 solid-phase extraction column. These combined strategies enabled us to discover 353 SEPs from eight human cell lines and 409 SEPs from three mouse cell lines and eight mouse tissues. Importantly, 19 of them were then verified through in vitro expression, immunoblotting, parallel reaction monitoring, and synthetic peptides. Subsequent bioinformatics analysis revealed that some of the physical and chemical properties of these novel SEPs, including amino acid composition and codon usage, differ from those commonly found in canonical proteins. Intriguingly, nearly 65% of the identified SEPs were found to be initiated with non-AUG start codons. The 762 novel SEPs probably represent the largest number of SEPs detected by MS reported to date. These novel SEPs might not only provide new clues for the annotation of noncoding elements in the genome but also serve as a valuable resource for functional study.
Highlights
• Complementary enrichment strategies combining membrane filtration and C8 SPE.
• A combined database of comprehensive putative SEPs and canonical proteins was used.
• Seven hundred sixty-two novel SEPs identified from human cell lines, mouse cell lines, and mouse tissues.
• Nineteen SEPs have been validated by fusion expression or synthetic peptides.
In Brief
This study proposes a new and effective strategy for the improved discovery and identification of novel SEPs, including the construction of databases maximally collecting all putative small ORFs from human and mouse lncRNA transcripts in NONCODE and the effective enrichment of polypeptides based on a 30-kDa molecular weight cutoff (MWCO) membrane and a C8 solid-phase extraction column. This effort led to the discovery of 762 novel lncRNA-encoded SEPs from multiple cell lines and tissues.
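The abstract describes collecting putative small ORFs from lncRNA transcripts, many of which start at non-AUG codons. As a rough illustration of that idea only, and not the authors' actual pipeline, the sketch below scans the three forward reading frames of one transcript for small ORFs. The function name `find_small_orfs`, the set of near-cognate start codons (CTG, GTG, TTG), and the length bounds of 10 to 100 codons are all assumptions chosen for the example; the record does not state which start codons or size limits the study used.

```python
from typing import Iterable, Iterator, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
# Assumption: ATG plus three near-cognate starts; the study's actual set is not given here.
START_CODONS = frozenset({"ATG", "CTG", "GTG", "TTG"})


class SmallOrf(NamedTuple):
    start: int        # 0-based index of the first base of the start codon
    end: int          # index just past the stop codon
    start_codon: str  # start codon in DNA alphabet
    codons: int       # sense codons, including the start and excluding the stop


def find_small_orfs(
    seq: str,
    min_codons: int = 10,   # hypothetical lower bound
    max_codons: int = 100,  # hypothetical upper bound
    starts: Iterable[str] = START_CODONS,
) -> Iterator[SmallOrf]:
    """Yield small ORFs on the forward strand of one transcript.

    Every in-frame start codon upstream of a stop is reported, so nested
    ORFs that share a stop codon are all returned.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    start_set = set(starts)
    for frame in range(3):
        open_starts = []  # start positions seen since the last stop in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                for s in open_starts:
                    n = (i - s) // 3
                    if min_codons <= n <= max_codons:
                        yield SmallOrf(s, i + 3, seq[s:s + 3], n)
                open_starts = []
            elif codon in start_set:
                open_starts.append(i)


if __name__ == "__main__":
    toy = "CCAUGGCUGCUGCUUAAG"  # made-up sequence, not from the study
    for orf in find_small_orfs(toy, min_codons=2):
        print(orf)
```

Reporting every upstream start rather than only the longest ORF keeps non-AUG candidates that sit inside a longer AUG-initiated frame; a real database build would also need to deduplicate peptides across transcripts, which this sketch omits.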
- Subjects :
- Male
enrichment
MWCO, molecular weight cutoff
smORF, short or small ORF
Bioinformatics analysis
Computational biology
Biology
ENCODE
Biochemistry
Genome
Mass Spectrometry
Cell Line
Analytical Chemistry
lincRNA, long intergenic noncoding RNA
Open Reading Frames
03 medical and health sciences
Start codon
PRM, parallel reaction monitoring
Animals
Humans
long noncoding RNA
ORFS
Molecular Biology
FA, formic acid
030304 developmental biology
EGFP, enhanced GFP
0303 health sciences
DMEM, Dulbecco's modified Eagle's medium
Research
NONCODE database
SPE, solid-phase extraction
030302 biochemistry & molecular biology
HEK 293 cells
MS
ACN, acetonitrile
HEK293T, human embryonic kidney 293T
MEF, mouse embryonic fibroblast
Long non-coding RNA
Mice, Inbred C57BL
smORF-encoded polypeptides
Codon usage bias
SEP, small ORF-encoded polypeptide
mESC, mouse embryonic stem cell
Female
RNA, Long Noncoding
LC–MS/MS, LC–tandem MS
lncRNA, long noncoding RNA
Peptides
Details
- ISSN :
- 15359476
- Volume :
- 20
- Database :
- OpenAIRE
- Journal :
- Molecular & Cellular Proteomics
- Accession number :
- edsair.doi.dedup.....fa75c51dd85f83820ceed91f57b9234f
- Full Text :
- https://doi.org/10.1016/j.mcpro.2021.100109