Clustering and expansion improvement for well-clustered graphs
View/ Open
Date
09/12/2022Author
Manghiuc, Bogdan-Adrian
Metadata
Abstract
A well-clustered graph is a collection of densely connected components (clusters) such that the vertices inside each cluster are better connected than those between different clusters. Well-clustered graphs constitute one of the most important families of graphs occurring in various practical domains of scientific disciplines, including data science, social network analysis and bioinformatics. This thesis conducts algorithmic studies of well-clustered graphs and, by focusing on the clustering and expansion improvement problems, the thesis presents three algorithms which not only broaden our theoretical understanding but also perform well in practice. Specifically, the main contributions are summarised as follows:
Firstly, we consider the hierarchical clustering problem on well-clustered graphs. Hierarchical clustering studies a recursive representation of a data set into clusters of increasingly smaller size via a binary tree. Based on the cost function introduced by Dasgupta, we present a polynomial time $O(1)$-approximation algorithm that computes a hierarchical representation of a well-clustered graph. This algorithm is based on our linear time $O(1)$-approximation algorithm for graphs of high expansion, whose design bypasses complicated routines known in the literature. While constructing $O(1)$-approximate hierarchical trees for general graphs is NP-hard under the Small Set Expansion Hypothesis, our result shows that constructing such trees is tractable for well-clustered graphs.
Secondly, we consider the scenario when the input graph represents a network distributed among many sites. The design of most graph clustering algorithms is based on complicated techniques which are inapplicable in the distributed setting. We present a novel distributed algorithm for graph clustering that works for well-clustered graphs with clusters of arbitrary sizes, and the approximation guarantee of our algorithm is with respect to every individual cluster. In addition, our algorithm is easy to implement, and only requires a poly-logarithmic number of synchronous rounds for many input graphs.
Thirdly, we study the class of well-clustered graphs from the perspective of improving the overall expansion. The objective of the problem of improving the expansion is to add a certain number of external edges to our input, such that the resulting graph is very well connected. We present a fast algorithm that, given a set of suitable candidate edges, finds a small subset of edges which drastically improve the overall expansion if added to the input graph.
Overall, the thesis presents three results that improve the state-of-the-art for different problems concerning well-clustered graphs. We believe that these results together with the new techniques introduced would inspire future related studies.
Collections
Related items
Showing items related by title, author, creator and subject.
-
Observational study of galaxy clustering
Hewett, Paul Charles (The University of Edinburgh, 1982)An observational study of the clustering of faint galaxies in the magnitude range 19.5< mj <22.0 is undertaken. High quality plate material from the United Kingdom Schmidt Telescope, and automated machine measures from ... -
Interstellar matter in two star clusters
Samson, Willliam B. (The University of Edinburgh, 1971) -
Distances and ages of galactic globular clusters via infrared photometry
Buckley, David R. V. (The University of Edinburgh, 1994)One of the most promising routes to the formation of a quantitative model of the processes and timescales involved in transforming primordial, pre-galactic material into the galaxy as we know it today is the study of the ...