Back to Search
Start Over
Mapping the Arabidopsis thaliana proteome in PeptideAtlas and the nature of the unobserved (dark) proteome; strategies towards a complete proteome.
- Source :
-
BioRxiv : the preprint server for biology [bioRxiv] 2023 Jun 05. Date of Electronic Publication: 2023 Jun 05. - Publication Year :
- 2023
-
Abstract
- This study describes a new release of the Arabidopsis thaliana PeptideAtlas proteomics resource providing protein sequence coverage, matched mass spectrometry (MS) spectra, selected PTMs, and metadata. 70 million MS/MS spectra were matched to the Araport11 annotation, identifying ∼0.6 million unique peptides and 18267 proteins at the highest confidence level and 3396 lower confidence proteins, together representing 78.6% of the predicted proteome. Additional identified proteins not predicted in Araport11 should be considered for building the next Arabidopsis genome annotation. This release identified 5198 phosphorylated proteins, 668 ubiquitinated proteins, 3050 N-terminally acetylated proteins and 864 lysine-acetylated proteins and mapped their PTM sites. MS support was lacking for 21.4% (5896 proteins) of the predicted Araport11 proteome - the 'dark' proteome. This dark proteome is highly enriched for certain ( e.g. CLE, CEP, IDA, PSY) but not other ( e.g. THIONIN, CAP,) signaling peptides families, E3 ligases, TFs, and other proteins with unfavorable physicochemical properties. A machine learning model trained on RNA expression data and protein properties predicts the probability for proteins to be detected. The model aids in discovery of proteins with short-half life ( e.g. SIG1,3 and ERF-VII TFs) and completing the proteome. PeptideAtlas is linked to TAIR, JBrowse, PPDB, SUBA, UniProtKB and Plant PTM Viewer.
Details
- Language :
- English
- Database :
- MEDLINE
- Journal :
- BioRxiv : the preprint server for biology
- Accession number :
- 37333403
- Full Text :
- https://doi.org/10.1101/2023.06.01.543322