Predicting protein functions using positive-unlabeled ranking with ontology-based priors.

Authors :
Zhapa-Camacho, Fernando
Tang, Zhenwei
Kulmanov, Maxat
Hoehndorf, Robert
Source :
Bioinformatics. 2024 Supplement, Vol. 40, p. i401-i409. 9p.
Publication Year :
2024

Abstract

Automated protein function prediction is a crucial and widely studied problem in bioinformatics. Computationally, protein function prediction is a multilabel classification problem where only positive samples are defined and there is a large number of unlabeled annotations. Most existing methods assume that the unlabeled protein function annotations are negatives, which induces a false-negative problem: potential positive samples are treated as negatives during training. We introduce a novel approach named PU-GO, wherein we address function prediction as a positive-unlabeled ranking problem. We apply empirical risk minimization, i.e. we minimize the classification risk of a classifier where class priors are obtained from the Gene Ontology hierarchical structure. We show that our approach is more robust than other state-of-the-art methods on similarity-based and time-based benchmark datasets. Availability and implementation: Data and code are available at https://github.com/bio-ontology-research-group/PU-GO. [ABSTRACT FROM AUTHOR]
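
To make the idea of risk minimization with class priors concrete, the sketch below computes a generic non-negative positive-unlabeled risk estimate for a single function class. It is an illustration only and not the PU-GO loss: the sigmoid surrogate loss, the helper names (`sigmoid_loss`, `pu_risk`), and passing the class prior as a plain number are all assumptions; the abstract states only that priors come from the Gene Ontology hierarchy, not how they are computed.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss; close to 0 when the sign of score matches label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk estimate for one class (hypothetical helper).

    pos_scores: classifier scores for proteins annotated with the class
    unl_scores: classifier scores for unlabeled proteins
    prior:      assumed fraction of true positives for this class
    """
    # Risk of known positives scored as positives, weighted by the prior.
    risk_pos = prior * mean([sigmoid_loss(s, +1) for s in pos_scores])
    # Risk of unlabeled samples scored as negatives, corrected for the
    # positives assumed to be hidden in the unlabeled set.
    risk_neg = (mean([sigmoid_loss(s, -1) for s in unl_scores])
                - prior * mean([sigmoid_loss(s, -1) for s in pos_scores]))
    # Clamp at zero so the corrected negative risk cannot go below zero.
    return risk_pos + max(0.0, risk_neg)
```

With `prior = 0.0` the estimate reduces to treating every unlabeled sample as a negative, which is the assumption the abstract attributes to most existing methods; a larger prior discounts the penalty for scoring unlabeled samples highly.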

Details

Language :
English
ISSN :
1367-4803
Volume :
40
Database :
Academic Search Index
Journal :
Bioinformatics
Publication Type :
Academic Journal
Accession number :
178779010
Full Text :
https://doi.org/10.1093/bioinformatics/btae237