Comparative Analysis of Unsupervised Protein Similarity Prediction Based on Graph Embedding.

Authors :
Zhang, Yuanyuan
Wang, Ziqi
Wang, Shudong
Shang, Junliang
Source :
Frontiers in Genetics; 9/22/2021, Vol. 12, p1-11, 11p
Publication Year :
2021

Abstract

The study of protein–protein interactions and the determination of protein functions are important parts of proteomics. Computational methods use Gene Ontology (GO) to measure the similarity between proteins and thereby explore their functions and possible interactions. GO is a set of standardized terms that describe gene products in terms of molecular functions, biological processes, and cellular components. Previous studies primarily measured protein similarity using the Information Content (IC) of GO terms. However, these methods tend to ignore the structural information between GO terms. Therefore, taking this structural information into account, we systematically analyze the performance of the GO graph and the GO Annotation (GOA) graph in calculating protein similarity using different graph embedding methods. On real Human and Yeast datasets, the feature vectors of GO terms and proteins are learned with different graph embedding methods. To measure the similarity of proteins annotated with different numbers of GO terms, we used Dynamic Time Warping (DTW) and cosine similarity to calculate protein similarity in the GO graph and the GOA graph, respectively. Link prediction experiments were then performed to evaluate the reliability of the protein similarity networks constructed by the different methods. The results show that graph embedding methods have clear advantages over traditional IC-based methods. Random walk graph embedding methods, in particular, performed very well in calculating protein similarity. A comparison of the link prediction results from the GO(DTW) and GOA(cosine) methods shows that GO(DTW) features provide highly effective information for analyzing the similarity among proteins. [ABSTRACT FROM AUTHOR]
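The two similarity measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that, on the GO graph, each protein is represented as a sequence of GO-term embedding vectors whose length can differ between proteins, so DTW is used to align the sequences. It also assumes that, on the GOA graph, each protein has a single embedding vector, so cosine similarity applies directly. The function names, the Euclidean distance between term vectors, and the toy vectors are all illustrative assumptions; the abstract does not state how terms are ordered or how a DTW distance is converted into a similarity score.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dtw_distance(seq_a, seq_b):
    """Classic dynamic-programming DTW between two sequences of vectors.

    The sequences may have different lengths, which is what lets proteins
    annotated with different numbers of GO terms be compared.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance only in seq_a
                cost[i][j - 1],      # advance only in seq_b
                cost[i - 1][j - 1],  # advance in both sequences
            )
    return cost[n][m]


# Toy GO-term embeddings (hypothetical values, not from the paper).
protein_a_terms = [[1.0, 0.0], [0.0, 1.0]]
protein_b_terms = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
go_dtw = dtw_distance(protein_a_terms, protein_b_terms)

# Toy protein embeddings, as if learned on the GOA graph (hypothetical values).
goa_cos = cosine_similarity([1.0, 2.0], [2.0, 4.0])
```

A smaller DTW distance means the two proteins' GO-term sequences are more alike, whereas a larger cosine value means the two protein vectors are more alike, so the two scores run in opposite directions and would need to be put on a common scale before being compared.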

Details

Language :
English
ISSN :
16648021
Volume :
12
Database :
Complementary Index
Journal :
Frontiers in Genetics
Publication Type :
Academic Journal
Accession number :
152579748
Full Text :
https://doi.org/10.3389/fgene.2021.744334