1. Strategies to enable large-scale proteomics for reproducible research
- Author
- Keith Ashman, Asim Anees, Terence P. Speed, Erin K. Sykes, Roger R. Reddel, Yansheng Liu, Jennifer M. S. Koh, Jean Yang, Merridee A. Wouters, Steven G. Williams, Peter J. Wild, Anna deFazio, Natasha Lucas, Max Wittman, Dylan Xavier, Michael Hecker, Sadia Mahboob, Michael Dausmann, Ruedi Aebersold, Peter G. Hains, Brett Tully, Rohan Shah, Phillip J. Robinson, Qing Zhong, Rosemary L. Balleine, Srikanth S. Manda, and Rebecca C. Poulos
- Subjects
Male; Proteomics; 0301 basic medicine; Proteome; Computer science; Science; Pipeline (computing); General Physics and Astronomy; Saccharomyces cerevisiae; Proteome informatics; computer.software_genre; Quantitative accuracy; Mass Spectrometry; Article; General Biochemistry, Genetics and Molecular Biology; 03 medical and health sciences; 0302 clinical medicine; Cell Line, Tumor; Biomarkers, Tumor; Humans; Data-independent acquisition; lcsh:Science; Cancer; Ovarian Neoplasms; Data processing; Reproducibility; Multidisciplinary; Scale (chemistry); High-throughput screening; Prostatic Neoplasms; Reproducibility of Results; General Chemistry; Missing data; HEK293 Cells; 030104 developmental biology; 030220 oncology & carcinogenesis; Female; lcsh:Q; Data mining; computer
- Abstract
Reproducible research is the bedrock of experimental science. To enable the deployment of large-scale proteomics, we assess the reproducibility of mass spectrometry (MS) over time and across instruments and develop computational methods for improving quantitative accuracy. We perform 1560 data independent acquisition (DIA)-MS runs of eight samples containing known proportions of ovarian and prostate cancer tissue and yeast, or control HEK293T cells. Replicates are run on six mass spectrometers operating continuously with varying maintenance schedules over four months, interspersed with ~5000 other runs. We utilise negative controls and replicates to remove unwanted variation and enhance biological signal, outperforming existing methods. We also design a method for reducing missing values. Integrating these computational modules into a pipeline (ProNorM), we mitigate variation among instruments over time and accurately predict tissue proportions. We demonstrate how to improve the quantitative analysis of large-scale DIA-MS data, providing a pathway toward clinical proteomics.
Nature Communications, 11 (1), ISSN:2041-1723
- Published
- 2020