Back to Search Start Over

Accurately Modeling Biased Random Walks on Weighted Graphs Using $\textit{Node2vec+}$

Authors :
Liu, Renming
Hirn, Matthew
Krishnan, Arjun
Publication Year :
2021

Abstract

Node embedding is a powerful approach for representing the structural role of each node in a graph. $\textit{Node2vec}$ is a widely used method for node embedding that works by exploring the local neighborhoods via biased random walks on the graph. However, $\textit{node2vec}$ does not consider edge weights when computing walk biases. This intrinsic limitation prevents $\textit{node2vec}$ from leveraging all the information in weighted graphs and, in turn, limits its application to many real-world networks that are weighted and dense. Here, we naturally extend $\textit{node2vec}$ to $\textit{node2vec+}$ in a way that accounts for edge weights when calculating walk biases, but which reduces to $\textit{node2vec}$ in the cases of unweighted graphs or unbiased walks. We empirically show that $\textit{node2vec+}$ is more robust to additive noise than $\textit{node2vec}$ in weighted graphs using two synthetic datasets. We also demonstrate that $\textit{node2vec+}$ significantly outperforms $\textit{node2vec}$ on a commonly benchmarked multi-label dataset (Wikipedia). Furthermore, we test $\textit{node2vec+}$ against GCN and GraphSAGE using various challenging gene classification tasks on two protein-protein interaction networks. Despite some clear advantages of GCN and GraphSAGE, they show comparable performance with $\textit{node2vec+}$. Finally, $\textit{node2vec+}$ can be used as a general approach for generating biased random walks, benefiting all existing methods built on top of $\textit{node2vec}$. $\textit{Node2vec+}$ is implemented as part of $\texttt{PecanPy}$, which is available at https://github.com/krishnanlab/PecanPy .<br />Comment: The final analysis on gene classification in the previous version was incorrect. A bug in the code for GNN evaluation causes the GNN to have access to part of the testing data during training, and thus significantly biased the true testing evaluation for GNNs. A revision will be released that correct for this error will be released shortly

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2109.08031
Document Type :
Working Paper