Reinspection of a Clinical Proteomics Tumor Analysis Consortium (CPTAC) Dataset with Cloud Computing Reveals Abundant Post-Translational Modifications and Protein Sequence Variants
- Source :
- Cancers, Volume 13, Issue 20, p 5034 (2021). MDPI
- Publication Year :
- 2021
- Publisher :
- Multidisciplinary Digital Publishing Institute, 2021.
-
Abstract
- Simple Summary: We reanalyzed a publicly available breast cancer proteomics dataset consisting of 122 human tumor samples using a scalable cloud computing workflow. By doing so, we were able to search these files against millions of known human sequence variants and hundreds of common post-translational protein modifications, thereby demonstrating the power of cloud computing to address proteomic data in a true biological context. We identified thousands of relevant sequence variants and post-translational modifications (PTMs), indicating that the original studies may have only scratched the surface of the true value of the CPTAC studies completed to date. We present the results of this reanalysis in a searchable web interface for community analysis and validation.
- The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has provided some of the most in-depth analyses of the phenotypes of human tumors ever constructed. Today, the majority of proteomic data analysis is still performed using software housed on desktop computers, which limits the number of sequence variants and post-translational modifications that can be considered. The original CPTAC studies limited the search for PTMs to samples that were chemically enriched for those modified peptides. Similarly, the only sequence variants considered were those with strong evidence at the exon or transcript level. In this multi-institutional collaborative reanalysis, we utilized unbiased protein databases containing millions of human sequence variants in conjunction with hundreds of common post-translational modifications. Using these tools, we identified tens of thousands of high-confidence PTMs and sequence variants. We identified 4132 phosphorylated peptides in nonenriched samples, 93% of which were confirmed in the samples that were chemically enriched for phosphopeptides. In addition, our results cover 90% of the high-confidence variants reported by the original proteogenomics study, without the need for sample-specific next-generation sequencing. Finally, we report fivefold more somatic and germline variants that have independent evidence at the peptide level, including mutations in ERBB2 and BCAS1. In this reanalysis of CPTAC proteomic data with cloud computing, we present an openly available and searchable web resource of the highest-coverage proteomic profiling of human tumors described to date.
- Subjects :
- Cancer Research
CPTAC
Proteomic Profiling
business.industry
cloud computing
Neoplasms. Tumors. Oncology. Including cancer and carcinogens
tumor proteomics
Cloud computing
Computational biology
Biology
Proteogenomics
Proteomics
Phenotype
Article
Exon
Protein sequencing
proteomics
Oncology
proteogenomics
post-translational modifications
cancer
1112 Oncology and Carcinogenesis
business
RC254-282
Sequence (medicine)
Details
- Language :
- English
- ISSN :
- 2072-6694
- Database :
- OpenAIRE
- Journal :
- Cancers
- Accession number :
- edsair.doi.dedup.....d37b34935f469fcea3d43165ef7dcf77
- Full Text :
- https://doi.org/10.3390/cancers13205034