1. Clustering and expansion improvement for well-clustered graphs
- Author
- Manghiuc, Bogdan-Adrian, Sun, He, and Heunen, Chris
- Subjects
- Well-Clustered Graph, Hierarchical clustering, Distributed graph clustering, graph expansion
- Abstract
- A well-clustered graph is a collection of densely connected components (clusters) such that the vertices inside each cluster are better connected to one another than to the vertices of other clusters. Well-clustered graphs constitute one of the most important families of graphs, occurring in many practical domains, including data science, social network analysis and bioinformatics. This thesis conducts algorithmic studies of well-clustered graphs and, focusing on the clustering and expansion improvement problems, presents three algorithms which not only broaden our theoretical understanding but also perform well in practice. Specifically, the main contributions are summarised as follows. Firstly, we consider the hierarchical clustering problem on well-clustered graphs. Hierarchical clustering studies the recursive partitioning of a data set into clusters of increasingly smaller size, represented by a binary tree. Based on the cost function introduced by Dasgupta, we present a polynomial time $O(1)$-approximation algorithm that computes a hierarchical representation of a well-clustered graph. This algorithm is based on our linear time $O(1)$-approximation algorithm for graphs of high expansion, whose design bypasses complicated routines known in the literature. While constructing $O(1)$-approximate hierarchical trees for general graphs is NP-hard under the Small Set Expansion Hypothesis, our result shows that constructing such trees is tractable for well-clustered graphs. Secondly, we consider the scenario in which the input graph represents a network distributed among many sites. The design of most graph clustering algorithms is based on complicated techniques which are inapplicable in the distributed setting. We present a novel distributed algorithm for graph clustering that works for well-clustered graphs with clusters of arbitrary sizes, and the approximation guarantee of our algorithm holds with respect to every individual cluster. In addition, our algorithm is easy to implement, and only requires a poly-logarithmic number of synchronous rounds for many input graphs. Thirdly, we study the class of well-clustered graphs from the perspective of improving the overall expansion. The objective of the expansion improvement problem is to add a certain number of external edges to the input graph, such that the resulting graph is very well connected. We present a fast algorithm that, given a set of suitable candidate edges, finds a small subset of edges which drastically improve the overall expansion if added to the input graph. Overall, the thesis presents three results that improve the state of the art for different problems concerning well-clustered graphs. We believe that these results, together with the new techniques introduced, will inspire future related studies.
- Published
- 2022