On the design of a similarity function for sparse binary data with application on protein function annotation
- Source :
- Knowledge-Based Systems, 2022, 238, pp.107863. ⟨10.1016/j.knosys.2021.107863⟩
- Publication Year :
- 2022
- Publisher :
- Elsevier BV, 2022.
Abstract
- Automatic protein function annotation is a challenging task that is fundamental to many medical applications. The ability to predict whether a protein has a given function is a key step in understanding disease and designing drugs. For these reasons, many authors have proposed computational methods for protein function prediction. A key element of many of these proposals is a similarity function, typically used to compute the pairwise similarity between two proteins. It is commonly accepted that proteins with similar structures share the same function. Nevertheless, no previous work has focused on a similarity function designed specifically for protein function annotation. In this work, we analyze the best similarity functions for the protein function annotation task and propose a new one. We ran experiments in a simple pairwise similarity scenario and also used our proposal as a component of a more complex protein function annotation method. The results indicate that our proposal is a valid alternative as a building block for many protein function annotation methods.
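For orientation, pairwise similarity on sparse binary data can be sketched with a standard baseline. The snippet below uses the Jaccard index, which is not the similarity function proposed in the paper (the abstract does not define it). The protein names, the feature indices, and the convention of returning 0.0 for two empty vectors are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard_similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard index of two sparse binary vectors stored as sets of active indices."""
    if not a and not b:
        # Assumed convention: two all-zero vectors are treated as not similar.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical proteins: each set lists the indices of features whose value is 1.
proteins = {
    "P1": frozenset({2, 17, 40, 93}),
    "P2": frozenset({2, 17, 58}),
    "P3": frozenset({5, 61}),
}

# Similarity for every unordered pair of proteins.
pairwise = {
    (p, q): jaccard_similarity(proteins[p], proteins[q])
    for p, q in combinations(sorted(proteins), 2)
}

for pair, score in pairwise.items():
    print(pair, round(score, 3))
```

Storing only the indices of the active features keeps the cost proportional to the number of ones rather than to the full vector length, which is the usual reason to treat sparse binary data this way.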
- Subjects :
- [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB]
Information Systems and Management
Sparse data
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Management Information Systems
ComputingMethodologies_PATTERNRECOGNITION
Similarity function
[INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
Artificial Intelligence
Protein function annotation
Protein structure
[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]
[INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
Software
Details
- ISSN :
- 0950-7051 and 1872-7409
- Volume :
- 238
- Database :
- OpenAIRE
- Journal :
- Knowledge-Based Systems
- Accession number :
- edsair.doi.dedup.....12640508e971f986c444bce1ad1e0b2a
- Full Text :
- https://doi.org/10.1016/j.knosys.2021.107863