Annotating N termini for the human proteome project: N termini and Nα-acetylation status differentiate stable cleaved protein species from degradation remnants in the human erythrocyte proteome.

Authors :
Lange PF
Huesgen PF
Nguyen K
Overall CM
Source :
Journal of proteome research [J Proteome Res] 2014 Apr 04; Vol. 13 (4), pp. 2028-44. Date of Electronic Publication: 2014 Mar 10.
Publication Year :
2014

Abstract

A goal of the Chromosome-centric Human Proteome Project is to identify all human protein species. With 3844 proteins annotated as "missing", this is challenging. Moreover, proteolytic processing generates new protein species with characteristic neo-N termini that are frequently accompanied by altered half-lives, function, interactions, and location. Enucleated and largely void of internal membranes and organelles, erythrocytes are simple yet proteomically challenging cells due to the high hemoglobin content and wide dynamic range of protein concentrations that impedes protein identification. Using the N-terminomics procedure TAILS, we identified 1369 human erythrocyte natural and neo-N-termini and 1234 proteins. Multiple semitryptic N-terminal peptides exhibited improved mass spectrometric identification properties versus the intact tryptic peptide enabling identification of 281 novel erythrocyte proteins and six missing proteins identified for the first time in the human proteome. With an improved bioinformatics workflow, we developed a new classification system and the Terminus Cluster Score. Thereby we described a new stabilizing N-end rule for processed protein termini, which discriminates novel protein species from degradation remnants, and identified protein domain hot spots susceptible to cleavage. Strikingly, 68% of the N-termini were within genome-encoded protein sequences, revealing alternative translation initiation sites, pervasive endoproteolytic processing, and stabilization of protein fragments in vivo. The mass spectrometry proteomics data have been deposited to ProteomeXchange with the data set identifier <PXD000434>.

Details

Language :
English
ISSN :
1535-3907
Volume :
13
Issue :
4
Database :
MEDLINE
Journal :
Journal of proteome research
Publication Type :
Academic Journal
Accession number :
24555563
Full Text :
https://doi.org/10.1021/pr401191w