A Refined 3-in-1 Fused Protein Similarity Measure: Application in Threshold-Free Hub Detection
- Source :
- IEEE/ACM transactions on computational biology and bioinformatics. 19(1)
- Publication Year :
- 2020
Abstract
- A survey of the literature shows that measuring protein/gene similarity is an important step in many bioinformatics problems, such as predicting protein-protein interactions, analyzing Protein-Protein Interaction Networks (PPINs), gene prioritization, and disease gene/protein detection. In this article, we propose an improved 3-in-1 fused protein similarity measure called FuSim-II. It is built by combining, through a weighted average, biological knowledge extracted from three genomic/proteomic resources: Gene Ontology (GO), PPIN, and protein sequence. Furthermore, we apply the proposed measure to detecting potential hub proteins in a given PPIN. To that end, we propose a multi-objective clustering-based protein hub detection framework with FuSim-II as the underlying proximity measure. The PPINs of H. sapiens and M. musculus are used for the experiments. Unlike most existing hub-detection methods, the proposed technique does not require any protein degree cut-off or threshold to define hubs. The proposed measure is thoroughly assessed against eight existing protein similarity measures, along with eight single/multi-objective clustering methods. Internal cluster validity indices such as Silhouette and Davies-Bouldin (DB) are used for the analysis. A comparative performance analysis between the proposed and five existing hub-protein detection algorithms is also conducted through an essentiality enrichment study. The reported results show improved performance of FuSim-II over existing protein similarity measures in identifying functionally related proteins as well as relevant hub proteins. Supplementary material is available at http://csse.szu.edu.cn/staff/cuilz/eng/index.html.
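The abstract describes the measure only as a weighted average over three sources (GO, PPIN, and sequence) and does not give the formula. As a rough illustration of that general idea, a minimal sketch follows. It assumes each per-source similarity is already scaled to [0, 1]. The function name `fused_similarity`, the equal default weights, and the normalization by the weight sum are all hypothetical choices for this example and are not taken from FuSim-II.

```python
from typing import Sequence


def fused_similarity(
    go_sim: float,
    ppin_sim: float,
    seq_sim: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0),  # assumed equal weights, not the paper's
) -> float:
    """Weighted average of three per-source similarity scores.

    Each score is assumed to lie in [0, 1]; the result then also lies in [0, 1].
    """
    scores = (go_sim, ppin_sim, seq_sim)
    if len(weights) != 3:
        raise ValueError("expected exactly three weights")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Normalize by the weight sum so the weights need not add up to 1.
    return sum(w * s for w, s in zip(weights, scores)) / total


# Hypothetical scores for one protein pair, with GO weighted twice as heavily.
example = fused_similarity(0.8, 0.5, 0.2, weights=(2.0, 1.0, 1.0))
```

Normalizing by the weight sum keeps the fused score on the same scale as its inputs, which matters when the result is passed to a clustering method as a proximity measure.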
- Subjects :
- Proteomics
Proximity measure
Gene ontology
Computer science
Applied Mathematics
0206 medical engineering
Computational Biology
Proteins
02 engineering and technology
computer.software_genre
Measure (mathematics)
Protein sequencing
Gene Ontology
Genetic similarity
Protein similarity
Genetics
Cluster Analysis
Data mining
Literature survey
Cluster analysis
computer
020602 bioinformatics
Algorithms
Biotechnology
Details
- ISSN :
- 15579964
- Volume :
- 19
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- IEEE/ACM transactions on computational biology and bioinformatics
- Accession number :
- edsair.doi.dedup.....063227a494ca336633fbb718cada3b2e