51. Community Evaluation of Glycoproteomics Informatics Solutions Reveals High-Performance Search Strategies of Serum N- and O-Glycopeptide Data
- Author
-
Wantao Ying, Josef M. Penninger, Yehia Mechref, Gun Wook Park, Weiqian Cao, Morten Thaysen-Andersen, Jingfu Zhao, Rebeca Kawahara, Radoslav Goldman, Kai-Hooi Khoo, Mingqi Liu, Marcus Hoffmann, Nicolle H. Packer, Benjamin L. Schulz, Erdmann Rapp, Enes Sakalli, Miloslav Sanda, Yingwei Hu, Hui Zhang, Jonas Nilsson, Doron Kletter, Sriram Neelamegham, Nathan Edwards, Cassandra L. Pegg, Pengyuan Yang, Jong Shin Yoo, Hung-Yi Wu, Daniel Kolarich, Adam Pap, Robert J. Chalkley, Georgy Sofronov, Benjamin L. Parker, Terry Nguyen-Khuong, Kai Cheng, Yong Zhang, Bo Meng, Nichollas E. Scott, Benoit Liquet-Weiland, Joseph Zaia, Sergey Y. Vakhrushev, Markus Pioch, Johannes Stadlmann, Toan K. Phung, Marshall Bern, Christina M. Woo, Katalin F. Medzihradszky, Stuart M. Haslam, Giuseppe Palmisano, Anastasia Chernykh, Göran Larson, Matthew S F Choo, Jin Young Kim, Yifan Huang, and Kathirvel Alagesan
- Subjects
Profiling (computer programming), Software, Computer science, business.industry, Informatics, Human proteome project, business, Community evaluation, Data science, Tandem mass spectrum, Glycoproteomics
- Abstract
Glycoproteome profiling (glycoproteomics) is a powerful yet analytically challenging research tool. The complex tandem mass spectra generated from glycopeptide mixtures require sophisticated analysis pipelines for structural determination. Diverse software aiding the process have appeared, but their relative performance remains untested. Conducted through the HUPO Human Proteome Project – Human Glycoproteomics Initiative, this community study, comprising both developers and users of glycoproteomics software, evaluates the performance of informatics solutions for system-wide glycopeptide analysis. Mass spectrometry-based glycoproteomics datasets from human serum were shared with all teams. The relative team performance for N- and O-glycopeptide data analysis was comprehensively established and validated through orthogonal performance tests. Excitingly, several high-performance glycoproteomics informatics solutions were identified. While the study illustrated that significant informatics challenges remain, as indicated by a high discordance between annotated glycopeptides, lists of high-confidence (consensus) glycopeptides were compiled from the standardised team reports. Deep analysis of the performance data revealed key performance-associated search variables and led to recommendations for improved “high coverage” and “high accuracy” glycoproteomics search strategies. This study concludes that diverse software for comprehensive glycopeptide data analysis exist, points to several high-performance search strategies, and specifies key variables that may guide future software developments and assist informatics decision-making in glycoproteomics.
- Published
- 2021