Predicting protein functions using positive-unlabeled ranking with ontology-based priors.
- Source :
- Bioinformatics. 2024 Supplement, Vol. 40, p. i401-i409. 9p.
- Publication Year :
- 2024
Abstract
- Automated protein function prediction is a crucial and widely studied problem in bioinformatics. Computationally, protein function is a multilabel classification problem where only positive samples are defined and there is a large number of unlabeled annotations. Most existing methods rely on the assumption that the unlabeled set of protein function annotations are negatives, inducing the false negative issue, where potential positive samples are trained as negatives. We introduce a novel approach named PU-GO, wherein we address function prediction as a positive-unlabeled ranking problem. We apply empirical risk minimization, i.e. we minimize the classification risk of a classifier where class priors are obtained from the Gene Ontology hierarchical structure. We show that our approach is more robust than other state-of-the-art methods on similarity-based and time-based benchmark datasets. Availability and implementation: Data and code are available at https://github.com/bio-ontology-research-group/PU-GO. [ABSTRACT FROM AUTHOR]
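The abstract frames function prediction as positive-unlabeled (PU) learning with empirical risk minimization under a class prior. As a rough illustration of that general idea, the sketch below computes a generic non-negative PU risk estimate for a single function class. It is not taken from the PU-GO code and may differ from the loss the authors use. The names `sigmoid_loss` and `nn_pu_risk` are hypothetical, the sigmoid surrogate loss is an assumption, and the class prior is passed in as a plain number rather than derived from the Gene Ontology.

```python
import numpy as np


def sigmoid_loss(scores, label):
    # Surrogate loss (an assumption of this sketch): close to 0 when the
    # sign of the score agrees with the label (+1 or -1), close to 1 otherwise.
    return 1.0 / (1.0 + np.exp(label * scores))


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Generic non-negative PU risk estimate for one class.

    pos_scores: classifier scores for proteins annotated with the function.
    unl_scores: classifier scores for proteins without that annotation.
    prior:      assumed fraction of true positives for this class.
    """
    pos_scores = np.asarray(pos_scores, dtype=float)
    unl_scores = np.asarray(unl_scores, dtype=float)

    # Risk on known positives, weighted by the class prior.
    risk_pos = prior * sigmoid_loss(pos_scores, +1).mean()

    # Negative-class risk estimated from unlabeled samples, minus the
    # contribution of the positives hidden inside the unlabeled set.
    risk_neg = (
        sigmoid_loss(unl_scores, -1).mean()
        - prior * sigmoid_loss(pos_scores, -1).mean()
    )

    # Clamp at zero so the corrected negative risk never goes below zero.
    return risk_pos + max(0.0, risk_neg)


# Example call with made-up scores and a made-up prior.
example_risk = nn_pu_risk([2.1, 0.7, 1.5], [-1.0, 0.3, -2.2, 0.9], prior=0.1)
```

Treating unlabeled annotations as plain negatives corresponds to dropping the correction term in `risk_neg`, which is the false negative issue the abstract describes.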
- Subjects :
- *GENE ontology
*SET functions
*CLASSIFICATION
*ANNOTATIONS
*BIOINFORMATICS
- Language :
- English
- ISSN :
- 13674803
- Volume :
- 40
- Database :
- Academic Search Index
- Journal :
- Bioinformatics
- Publication Type :
- Academic Journal
- Accession number :
- 178779010
- Full Text :
- https://doi.org/10.1093/bioinformatics/btae237