On the design of a similarity function for sparse binary data with application on protein function annotation

Authors :
Marcelo B.A. Veras
Bishnu Sarker
Sabeur Aridhi
João P.P. Gomes
José A.F. Macêdo
Engelbert Mephu Nguifo
Marie-Dominique Devignes
Malika Smaïl-Tabbone
State University of Ceara / Universidade Estadual do Ceara (UECE)
Computational Algorithms for Protein Structures and Interactions (CAPSID)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Université Clermont Auvergne (UCA)
Source :
Knowledge-Based Systems, 2022, 238, pp. 107863. ⟨10.1016/j.knosys.2021.107863⟩
Publication Year :
2022
Publisher :
Elsevier BV, 2022.

Abstract

Automatic protein function annotation is a challenging task that is fundamental to many medical applications. Indeed, the ability to predict whether a protein has a given function is a key step in disease understanding and drug design. For these reasons, many authors have proposed computational methods for protein function prediction. A key element of many of these proposals is a similarity function, which is often used to compute the pairwise similarity between two proteins. It is commonly accepted that proteins with similar structures share the same function. Nevertheless, no previous work has focused on a similarity function designed specifically for protein function annotation. In this work, we analyze the best similarity functions for the protein function annotation task and propose a new one. We performed experiments in a simple pairwise similarity scenario and also with our proposal used as part of a more complex protein function annotation method. Based on the results, we can state that our proposal is a valid alternative as a building block for many protein function annotation methods.
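
To make "pairwise similarity between two proteins" on sparse binary data concrete, here is a minimal sketch using the Jaccard index, a standard similarity for binary feature sets. It is not the similarity function proposed in the paper, which this record does not describe. The feature labels, the protein names, and the helper `most_similar` are hypothetical and exist only for illustration.

```python
from typing import Dict, Hashable, Set


def jaccard(a: Set[Hashable], b: Set[Hashable]) -> float:
    """Jaccard index between two sparse binary vectors.

    Each vector is stored as the set of its active (value 1) features,
    which is a compact representation when most entries are 0.
    """
    union = a | b
    if not union:
        # Convention assumed here: two all-zero vectors have similarity 0.
        return 0.0
    return len(a & b) / len(union)


def most_similar(query: Set[Hashable], candidates: Dict[str, Set[Hashable]]) -> str:
    """Return the name of the candidate with the highest Jaccard similarity."""
    return max(candidates, key=lambda name: jaccard(query, candidates[name]))


# Hypothetical proteins, each described by the binary features it contains.
annotated = {
    "protein_A": {"f1", "f2", "f3"},
    "protein_B": {"f7", "f8"},
}
query = {"f2", "f3", "f4"}

print(jaccard(query, annotated["protein_A"]))  # 2 shared / 4 total features
print(most_similar(query, annotated))
```

In a nearest-neighbour style annotation scheme, the functions of the most similar annotated protein would then be candidate annotations for the query; any similarity function with the same signature could be swapped in for `jaccard`.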

Details

ISSN :
0950-7051 and 1872-7409
Volume :
238
Database :
OpenAIRE
Journal :
Knowledge-Based Systems
Accession number :
edsair.doi.dedup.....12640508e971f986c444bce1ad1e0b2a
Full Text :
https://doi.org/10.1016/j.knosys.2021.107863