Performance, memory efficiency and programmability: the ambitious triptych of combining vertex-centricity with HPC
dc.contributor.advisor
Brown, Nicholas
dc.contributor.advisor
Cheney, James
dc.contributor.advisor
Bull, Mark
dc.contributor.author
Capelli, Ludovic A. R.
dc.contributor.sponsor
Engineering and Physical Sciences Research Council (EPSRC)
en
dc.date.accessioned
2023-06-15T13:16:14Z
dc.date.available
2023-06-15T13:16:14Z
dc.date.issued
2023-06-15
dc.description.abstract
The field of graph processing has grown significantly due to the flexibility and wide
applicability of the graph data structure. In the meantime, so has interest from the
community in developing new approaches to graph processing applications. In 2010,
Google introduced the vertex-centric programming model through their framework Pregel. This consists of expressing computation from the perspective of a vertex, whilst inter-vertex communications are achieved via data exchanges along incoming and outgoing edges, using the message-passing abstraction provided. Pregel ’s high-level programming interface, designed around a set of simple functions, provides ease of programmability to the user. The aim is to enable the development of graph processing applications without requiring expertise in optimisation or parallel programming. Such challenges are instead abstracted from the user and offloaded to the underlying framework. However, fine-grained synchronisation, unpredictable memory access patterns and multiple sources of load imbalance make it difficult to implement the vertex centric model efficiently on high performance computing platforms without sacrificing programmability.
This research focuses on combining vertex-centric and High-Performance Comput-
ing (HPC), resulting in the development of a shared-memory framework, iPregel, which
demonstrates that a performance and memory efficiency similar to that of non-vertex-
centric approaches can be achieved while preserving the programmability benefits of
vertex-centric. Non-volatile memory is then explored to extend single-node capabilities, during which multiple versions of iPregel are implemented to experiment with the various data movement strategies.
Then, distributed memory parallelism is investigated to overcome the resource limitations of single node processing. A second framework named DiP, which ports applicable iPregel ’s optimisations to distributed memory, prioritises performance to high scalability.
This research has resulted in a set of techniques and optimisations illustrated through a shared-memory framework iPregel and a distributed-memory framework DiP. The former closes a gap of several orders of magnitude in both performance and memory efficiency, even able to process a graph of 750 billion edges using non-volatile memory. The latter has proved that this competitiveness can also be scaled beyond a single node, enabling the processing of the largest graph generated in this research, comprising 1.6 trillion edges. Most importantly, both frameworks achieved these performance and capability gains whilst also preserving programmability, which is the cornerstone of the vertex-centric programming model. This research therefore demonstrates that by combining vertex-centricity and High-Performance Computing (HPC), it is possible to maintain performance, memory efficiency and programmability.
en
dc.identifier.uri
https://hdl.handle.net/1842/40673
dc.identifier.uri
http://dx.doi.org/10.7488/era/3434
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Ludovic A. R. Capelli, Z. Hu, T. A. K. Zakian, “iPregel: A combiner-based in-memory shared-memory vertex-centric framework” in Proceedings of the 47th International Conference on Parallel Processing Companion (2018). DOI: 10.1145/3229710.3229719
en
dc.relation.hasversion
Ludovic A. R. Capelli, Z. Hu, T. A. K. Zakian, N. Brown, J. M. Bull, “iPre gel: Vertex-centric programmability vs memory efficiency and performance, why choose?” in Journal of Parallel Computing 86 (2019) 45 – 56. DOI: 10.1016/j.parco.2019.04.005
en
dc.relation.hasversion
Ludovic A. R. Capelli, N. Brown, J. M. Bull, “iPregel: Strategies to deal with an extreme form of irregularity in vertex-centric graph processing” in Proceedings of The International Conference for High Performance Computing, Networking, Storage, and Analysis (2019). DOI: 10.1109/IA349570.2019.00013
en
dc.relation.hasversion
Ludovic A. R. Capelli, N. Brown, J. M. Bull, “NVRAM as an enabler to new horizons in graph processing” in Springer Nature Computer Science, Volume 3, Article 385 (2022) 1–13. DOI: 10.1007/s42979-022-01317-4
en
dc.relation.hasversion
T. A. Zakian, L. A. Capelli, Z. Hu, Incrementalization of Vertex-Centric Programs, in: 2019 IEEE International Parallel and Distributed Processing Symposium (IP DPS), IEEE, 2019, pp. 1019–1029
en
dc.subject
HPC
en
dc.subject
vertex-centric
en
dc.subject
graph algorithms
en
dc.subject
optimisation
en
dc.subject
graph processing
en
dc.subject
Pregel
en
dc.subject
iPregel
en
dc.subject
high performance computing
en
dc.subject
parallel programming
en
dc.title
Performance, memory efficiency and programmability: the ambitious triptych of combining vertex-centricity with HPC
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
1 - 1 of 1
- Name:
- Capelli2023.pdf
- Size:
- 6.46 MB
- Format:
- Adobe Portable Document Format
- Description:
This item appears in the following Collection(s)

