Clustering Uncertain Graphs

Authors :
Ceccarello, Matteo
Fantozzi, Carlo
Pietracaprina, Andrea
Pucci, Geppino
Vandin, Fabio
Publication Year :
2016

Abstract

An uncertain graph $\mathcal{G} = (V, E, p : E \rightarrow (0,1])$ can be viewed as a probability space whose outcomes (referred to as \emph{possible worlds}) are subgraphs of $\mathcal{G}$ where any edge $e\in E$ occurs with probability $p(e)$, independently of the other edges. These graphs naturally arise in many application domains where data management systems are required to cope with uncertainty in interrelated data, such as computational biology, social network analysis, network reliability, and privacy enforcement, among others. For this reason, it is important to devise fundamental querying and mining primitives for uncertain graphs. This paper contributes to this endeavor with the development of novel strategies for clustering uncertain graphs. Specifically, given an uncertain graph $\mathcal{G}$ and an integer $k$, we aim at partitioning its nodes into $k$ clusters, each featuring a distinguished center node, so as to maximize the minimum/average connection probability of any node to its cluster's center, in a random possible world. We assess the NP-hardness of maximizing the minimum connection probability, even in the presence of an oracle for the connection probabilities, and develop efficient approximation algorithms for both problems and some useful variants. Unlike previous works in the literature, our algorithms feature provable approximation guarantees and are capable of keeping the granularity of the returned clustering under control. Our theoretical findings are complemented with several experiments that compare our algorithms against some relevant competitors, with respect to both running time and quality of the returned clusterings.
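To make the two objectives in the abstract concrete, the following is a minimal Python sketch that scores a given clustering of an uncertain graph. It is not the authors' algorithm: it only evaluates a fixed assignment of nodes to centers, and it assumes that connection probabilities are estimated by naive Monte Carlo sampling of possible worlds, which the abstract does not specify. All names (`sample_world`, `components`, `estimate_objectives`) and the parameters `samples` and `seed` are hypothetical.

```python
import random


def sample_world(edges, rng):
    """Draw one possible world: keep each edge (u, v, p) independently with probability p."""
    return [(u, v) for u, v, p in edges if rng.random() < p]


def components(nodes, world_edges):
    """Map every node to a representative of its connected component (union-find)."""
    parent = {v: v for v in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in world_edges:
        parent[find(u)] = find(v)
    return {v: find(v) for v in nodes}


def estimate_objectives(nodes, edges, assignment, samples=1000, seed=0):
    """Estimate the minimum and the average connection probability of nodes
    to their assigned centers. `assignment` maps node -> center (assumed input)."""
    rng = random.Random(seed)
    hits = {v: 0 for v in nodes}
    for _ in range(samples):
        comp = components(nodes, sample_world(edges, rng))
        for v in nodes:
            # v is connected to its center in this world if they share a component
            if comp[v] == comp[assignment[v]]:
                hits[v] += 1
    probs = {v: hits[v] / samples for v in nodes}
    return min(probs.values()), sum(probs.values()) / len(probs)


# Illustrative usage on a made-up path a - b - c with a single cluster centered at b.
nodes = ["a", "b", "c"]
edges = [("a", "b", 0.9), ("b", "c", 0.5)]
assignment = {"a": "b", "b": "b", "c": "b"}
min_prob, avg_prob = estimate_objectives(nodes, edges, assignment)
```

Here `min_prob` corresponds to the minimum connection probability objective and `avg_prob` to the average one; a clustering algorithm would search over choices of the $k$ centers and the assignment to make these values as large as possible.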

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1612.06675
Document Type :
Working Paper