1. Topical network embedding
- Author
- Yufei Tang, Min Shi, Xingquan Zhu, Jianxun Liu, and Haibo He
- Subjects
Topic model; Dependency (UML); Theoretical computer science; Computer Networks and Communications; Computer science; Context (language use); 02 engineering and technology; Computer Science Applications; CORA dataset; 020204 information systems; Node (computer science); 0202 electrical engineering, electronic engineering, information engineering; Embedding; 020201 artificial intelligence & image processing; Relevance (information retrieval); Pairwise comparison; Information Systems
- Abstract
Networked data involve complex information from multifaceted channels, including topology structures, node content, and/or node labels, where structure and content are often correlated but not always consistent. A typical scenario is the citation relationships in scholarly publications, where a paper is cited by others not because they have the same content, but because they share one or more subject matters. To date, while many network embedding methods take node content into consideration, they all treat node content as a simple flat word/attribute set, and nodes sharing connections are assumed to have dependency with respect to all words or attributes. In this paper, we argue that considering topic-level semantic interactions between nodes is crucial to learning discriminative node embedding vectors. To model pairwise topic relevance between linked text nodes, we propose topical network embedding, where interactions between nodes are built on their shared latent topics. Accordingly, we propose a unified optimization framework to simultaneously learn topic and node representations from the network text contents and structures, respectively. Meanwhile, the structure modeling takes the learned topic representations as conditional context, under the principle that two nodes can infer each other contingent on their shared latent topics. Experiments on three real-world datasets demonstrate that our approach can learn significantly better network representations, e.g., a 4.1% improvement over the state-of-the-art methods in terms of Micro-F1 on the Cora dataset. (The source code of the proposed method is available on GitHub: https://github.com/codeshareabc/TopicalNE.)
- Published
- 2019