A learned score function improves the power of mass spectrometry database search.

Authors :: Ananth, Varun
Sanders, Justin
Yilmaz, Melih
Wen, Bo
Oh, Sewoong
Source :: Bioinformatics. 2024 Supplement, Vol. 40, p. i410-i417. 8p.
Publication Year :: 2024
Abstract:
Motivation: One of the core problems in the analysis of protein tandem mass spectrometry data is the peptide assignment problem: determining, for each observed spectrum, the peptide sequence that was responsible for generating the spectrum. Two primary classes of methods are used to solve this problem: database search and de novo peptide sequencing. State-of-the-art methods for de novo sequencing use machine learning methods, whereas most database search engines use hand-designed score functions to evaluate the quality of a match between an observed spectrum and a candidate peptide from the database. We hypothesized that machine learning models for de novo sequencing implicitly learn a score function that captures the relationship between peptides and spectra, and thus may be re-purposed as a score function for database search. Because this score function is trained from massive amounts of mass spectrometry data, it could potentially outperform existing, hand-designed database search tools.
Results: To test this hypothesis, we re-engineered Casanovo, which has been shown to provide state-of-the-art de novo sequencing capabilities, to assign scores to given peptide-spectrum pairs. We then evaluated the statistical power of this Casanovo score function, Casanovo-DB, to detect peptides on a benchmark of three mass spectrometry runs from three different species. In addition, we show that re-scoring with the Percolator post-processor benefits Casanovo-DB more than other score functions, further increasing the number of detected peptides.
[ABSTRACT FROM AUTHOR]
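As a rough illustration of the idea described in the abstract, the sketch below shows how a model that predicts the next residue of a peptide from a spectrum could, in principle, be reused to score candidate peptides from a database. It is not the Casanovo-DB implementation: `toy_model`, `score_peptide`, and `search` are hypothetical names, the "spectrum" is a toy encoding, and averaging per-residue log-probabilities is an assumed scoring rule chosen for simplicity.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# Hypothetical model interface: given a spectrum and the peptide prefix so far,
# return a probability for each possible next residue.
NextResidueModel = Callable[[Sequence[int], str], Dict[str, float]]


def toy_model(spectrum: Sequence[int], prefix: str) -> Dict[str, float]:
    """Stand-in for a trained network (assumption, for illustration only).

    The toy spectrum stores, for each position, the index of the residue it
    favors; that residue gets probability 0.9 and the rest share 0.1.
    """
    pos = len(prefix)
    if pos >= len(spectrum):
        return {aa: 1.0 / len(ALPHABET) for aa in ALPHABET}
    favored = ALPHABET[spectrum[pos] % len(ALPHABET)]
    rest = 0.1 / (len(ALPHABET) - 1)
    return {aa: (0.9 if aa == favored else rest) for aa in ALPHABET}


def score_peptide(model: NextResidueModel, spectrum: Sequence[int], peptide: str) -> float:
    """Mean log-probability of the peptide's residues, feeding the true prefix at each step."""
    if not peptide:
        raise ValueError("peptide must be non-empty")
    total = 0.0
    for i, residue in enumerate(peptide):
        probs = model(spectrum, peptide[:i])
        # Floor the probability so an unknown residue yields a very low score, not an error.
        total += math.log(max(probs.get(residue, 0.0), 1e-12))
    return total / len(peptide)


def search(model: NextResidueModel, spectrum: Sequence[int], candidates: List[str]) -> Tuple[str, float]:
    """Score every candidate peptide and return the best (peptide, score) pair."""
    scored = [(p, score_peptide(model, spectrum, p)) for p in candidates]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    spectrum = [12, 3, 12]  # toy encoding that favors P, E, P
    print(search(toy_model, spectrum, ["PEK", "PEP", "AAA"]))
```

In a real search engine the candidate list would first be restricted to database peptides whose mass matches the observed precursor mass, and the resulting scores would typically be passed to a post-processor for error-rate control; both steps are omitted here.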

Subjects :: *MACHINE learning
*TANDEM mass spectrometry
*AMINO acid sequence
*DATABASE searching
*DATABASES

Details

Language :: English
ISSN :: 13674803
Volume :: 40
Database :: Academic Search Index
Journal :: Bioinformatics
Publication Type :: Academic Journal
Accession number :: 178778991
Full Text :: https://doi.org/10.1093/bioinformatics/btae218
