
Construction and assessment of individualized proteogenomic databases for large-scale analysis of nonsynonymous single nucleotide variants.

Authors :
Krug K
Popic S
Carpy A
Taumer C
Macek B
Source :
Proteomics [Proteomics] 2014 Dec; Vol. 14 (23-24), pp. 2699-708. Date of Electronic Publication: 2014 Nov 17.
Publication Year :
2014

Abstract

Next-generation sequencing projects focusing on genomes and transcriptomes identify millions of single nucleotide variants (SNVs), many of which result in single amino acid substitutions. These nonsynonymous (ns) SNVs are typically not incorporated into the protein sequence databases used to identify MS/MS spectra. Here, we perform a comparative analysis of the assembly of nsSNV-containing proteogenomic databases. We use a comprehensive transcriptome and proteome dataset of HeLa cells from the literature to derive SNVs, to incorporate them into databases applicable to proteomics search engines, and to assess the performance of these databases in the identification of nsSNVs. We assemble the databases by (1) translation of SNV-containing transcripts into all possible reading frames, (2) translation of the predicted reading frame, and (3) prediction of nsSNVs and subsequent incorporation into canonical protein sequences. We show substantial differences between the generated databases in terms of represented nsSNVs and theoretical search space, which affect the sensitivity and specificity of the database search. We query the databases with >2.2M high-resolution MS/MS spectra using MaxQuant software and identify 451 variant peptides containing 401 nsSNVs. We conclude that prediction of the reading frame and, if applicable, of the SNV effect results in comprehensive yet compact databases, which are necessary to retain sensitivity in large-scale analysis of nsSNVs called from transcriptomics data.
(© 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.)
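
The three assembly strategies named in the abstract can be illustrated on a toy sequence. The following is a minimal Python sketch, not the authors' pipeline: the function names, the example transcript, the SNV position, and the choice of six frames (three forward, three on the reverse complement) for "all possible reading frames" are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate a DNA string codon by codon; trailing partial codons are dropped."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame(seq):
    """Strategy 1 (assumed six frames): translate every frame of both strands."""
    rc = reverse_complement(seq)
    return [translate(strand[frame:]) for strand in (seq, rc) for frame in range(3)]


def apply_snv(seq, pos, alt):
    """Insert a single nucleotide variant at a 0-based position."""
    return seq[:pos] + alt + seq[pos + 1:]


def substitute(protein, pos, ref, alt):
    """Strategy 3: write a predicted amino acid change into a canonical protein."""
    if protein[pos] != ref:
        raise ValueError("reference residue does not match the canonical sequence")
    return protein[:pos] + alt + protein[pos + 1:]


# Hypothetical toy transcript and SNV (GAA -> GTA, i.e. Glu -> Val).
transcript = "ATGGCCGAATAA"
variant = apply_snv(transcript, 7, "T")

all_frames = six_frame(variant)          # strategy 1: six entries per transcript
predicted = translate(variant)           # strategy 2: frame 0 assumed to be predicted
canonical = substitute("MAE", 2, "E", "V")  # strategy 3: edit the canonical protein
```

In this sketch, strategy 1 yields six database entries per transcript while strategies 2 and 3 yield one each, which mirrors the difference in theoretical search space that the abstract describes.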

Details

Language :
English
ISSN :
1615-9861
Volume :
14
Issue :
23-24
Database :
MEDLINE
Journal :
Proteomics
Publication Type :
Academic Journal
Accession number :
25251379
Full Text :
https://doi.org/10.1002/pmic.201400219