The Integration of Proteogenomics and Ribosome Profiling Circumvents Key Limitations to Increase the Coverage and Confidence of Novel Microproteins.

Authors :
de Souza EV
Bookout AL
Barnes CA
Miller B
Machado P
Basso LA
Bizarro CV
Saghatelian A
Source :
BioRxiv : the preprint server for biology [bioRxiv] 2023 Oct 02. Date of Electronic Publication: 2023 Oct 02.
Publication Year :
2023

Abstract

There has been a dramatic increase in the identification of non-canonical translation and a significant expansion of the protein-coding genome and proteome. Among the strategies used to identify novel small ORFs (smORFs), ribosome profiling (Ribo-Seq) is the gold standard for the annotation of novel coding sequences by reporting on smORF translation. In Ribo-Seq, ribosome-protected footprints (RPFs) that map to multiple sites in the genome are computationally removed since they cannot be unambiguously assigned to a specific genomic location, or to a specific transcript in the case of multiple isoforms. Furthermore, RPFs necessarily result in short (25-34 nucleotides) reads, increasing the chance of ambiguous and multi-mapping alignments, such that smORFs that reside in these regions cannot be identified by Ribo-Seq. Here, we show that the inclusion of proteogenomics to create a Ribosome Profiling and Proteogenomics Pipeline (RP3) bypasses this limitation to identify a group of microprotein-encoding smORFs that are missed by current Ribo-Seq pipelines. Moreover, we show that the microproteins identified by RP3 have different sequence compositions from the ones identified by Ribo-Seq-only pipelines, which can affect proteomics identification. In aggregate, the development of RP3 maximizes the detection and confidence of protein-encoding smORFs and microproteins.
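
The multi-mapping limitation described in the abstract can be illustrated with a toy sketch. This is not the RP3 implementation or any real aligner: the genome, the reads, and the helper names `find_all` and `split_by_mappability` are all hypothetical, and exact string matching stands in for read alignment. It only shows why a short read falling inside a duplicated region has no single genomic position and would be dropped by a unique-mapping filter.

```python
def find_all(genome: str, read: str) -> list[int]:
    """Return every 0-based start position where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Step forward by one so overlapping matches are also found.
        start = genome.find(read, start + 1)
    return positions


def split_by_mappability(genome: str, reads: list[str]):
    """Separate reads with exactly one match from reads with several matches."""
    unique, multi = {}, {}
    for read in reads:
        hits = find_all(genome, read)
        if len(hits) == 1:
            unique[read] = hits[0]
        elif len(hits) > 1:
            multi[read] = hits
    return unique, multi


# Hypothetical toy genome containing the same 15-nt segment twice.
repeat = "ATGGCCAAGCTGACC"
genome = "TTTT" + repeat + "GGGGCCCC" + repeat + "AAAACGTACGTT"

# Hypothetical footprints: the first lies entirely inside the repeated segment.
reads = ["GCCAAGCTGA", "CCAAAACGTA", "GGGGCCCC"]

unique, multi = split_by_mappability(genome, reads)
print("uniquely mapped:", unique)
print("multi-mapped (discarded by a unique-only filter):", multi)
```

In this toy case the read inside the repeat matches two positions, so a filter that keeps only uniquely mapped reads leaves both copies of the region without Ribo-Seq support, which is the gap the abstract says peptide evidence from proteogenomics can fill.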

Details

Language :
English
ISSN :
2692-8205
Database :
MEDLINE
Journal :
BioRxiv : the preprint server for biology
Accession number :
37808637
Full Text :
https://doi.org/10.1101/2023.09.27.559809