1. Learning Distilled Graph for Large-Scale Social Network Data Clustering
- Author
-
Liu W, Gong D, Tan M, Shi JQ, Yang Y, Hauptmann AG, Liu W, Gong D, Tan M, Shi JQ, Yang Y, and Hauptmann AG
- Abstract
© 1989-2012 IEEE. Spectral analysis is critical in social network analysis. As a vital step of the spectral analysis, the graph construction in many existing works utilizes content data only. Unfortunately, the content data often consists of noisy, sparse, and redundant features, which makes the resulting graph unstable and unreliable. In practice, besides the content data, social network data also contain link information, which provides additional information for graph construction. Some of previous works utilize the link data. However, the link data is often incomplete, which makes the resulting graph incomplete. To address these issues, we propose a novel Distilled Graph Clustering (DGC) method. It pursuits a distilled graph based on both the content data and the link data. The proposed algorithm alternates between two steps: in the feature selection step, it finds the most representative feature subset w.r.t. an intermediate graph initialized with link data; in graph distillation step, the proposed method updates and refines the graph based on only the selected features. The final resulting graph, which is referred to as the distilled graph, is then utilized for spectral clustering on the large-scale social network data. Extensive experiments demonstrate the superiority of the proposed method.
- Published
- 2020