Show simple item record

dc.contributor.advisor: Cole, Murray
dc.contributor.advisor: O'Boyle, Michael
dc.contributor.author: Mohanty, Siddharth
dc.date.accessioned: 2015-09-11T14:23:57Z
dc.date.available: 2015-09-11T14:23:57Z
dc.date.issued: 2015-06-29
dc.identifier.uri: http://hdl.handle.net/1842/10557
dc.description.abstract: Manual tuning of applications for heterogeneous parallel systems is tedious and complex. Optimizations are often not portable, and the whole process must be repeated when moving to a new system, or sometimes even to a different problem size. Pattern-based parallel programming models were originally designed to provide programmers with an abstraction layer, hiding tedious parallel boilerplate code and allowing a focus on application-specific issues only. However, the constrained algorithmic model associated with each pattern also enables the creation of pattern-specific optimization strategies. These can capture more complex variations than would be accessible by analysis of equivalent unstructured source code. These variations create complex optimization spaces. Machine learning offers well-established techniques for exploring such spaces. In this thesis we use machine learning to create autotuning strategies for heterogeneous parallel implementations of applications which follow the wavefront pattern. In a wavefront, computation starts from one corner of the problem grid and proceeds diagonally, like a wave, to the opposite corner in either two or three dimensions. Our framework partitions and optimizes the work created by these applications across systems comprising multicore CPUs and multiple GPU accelerators. The tuning opportunities for a wavefront include controlling the amount of computation to be offloaded onto GPU accelerators, choosing the number of CPU and GPU threads to process tasks, tiling for both CPU and GPU memory structures, and trading redundant halo computation against communication for multiple GPUs. Our exhaustive search of the problem space shows that these parameters are very sensitive to the combination of architecture, wavefront instance and problem size. We design and investigate a family of autotuning strategies, targeting single and multiple CPU + GPU systems, and both two- and three-dimensional wavefront instances. These yield an average of 87% of the performance found by offline exhaustive search, with up to 99% in some cases.
dc.contributor.sponsor: other
dc.language.iso: en
dc.publisher: The University of Edinburgh
dc.relation.hasversion: Siddharth Mohanty and Murray Cole. Autotuning wavefront abstractions for heterogeneous architectures. In 2012 Third Workshop on Applications for Multi-Core Architectures (WAMCA), pages 42–47. IEEE, 2012.
dc.relation.hasversion: Siddharth Mohanty and Murray Cole. Autotuning wavefront applications for multicore multi-GPU hybrid architectures. In Proceedings of Programming Models and Applications on Multicores and Manycores (PMAM '14), pages 1:1–1:9, New York, NY, USA, 2014. ACM.
dc.subject: autotuning
dc.subject: wavefronts
dc.subject: GPU
dc.subject: machine learning
dc.title: Autotuning wavefront patterns for heterogeneous architectures
dc.type: Thesis or Dissertation
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: PhD Doctor of Philosophy
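
The abstract above describes the wavefront pattern, in which cells along each anti-diagonal of the problem grid are mutually independent and can be computed in parallel once the previous diagonal is complete. The minimal C sketch below is an illustration of that dependency structure only, not the thesis's framework; the grid size, boundary values and stencil update are hypothetical. Its inner, per-diagonal loop is the unit of work that an autotuner of the kind described could split between CPU threads and GPU offload.

/* Minimal sequential 2D wavefront sweep (illustrative sketch only).
 * Cells on the same anti-diagonal d = i + j are independent of one another;
 * successive diagonals must be processed in order. */
#include <stdio.h>

#define N 6
#define M 6

int main(void) {
    double grid[N][M] = {{0}};

    /* Hypothetical boundary values along the top row and left column. */
    for (int i = 0; i < N; i++) grid[i][0] = 1.0;
    for (int j = 0; j < M; j++) grid[0][j] = 1.0;

    /* Sweep anti-diagonals from the top-left corner towards the bottom-right. */
    for (int d = 2; d <= (N - 1) + (M - 1); d++) {
        /* Every cell (i, j) with i + j == d depends only on the already-computed
         * neighbours (i-1, j), (i, j-1) and (i-1, j-1), so this inner loop is
         * the parallel unit a framework could partition across devices. */
        for (int i = 1; i < N; i++) {
            int j = d - i;
            if (j < 1 || j >= M) continue;
            grid[i][j] = grid[i - 1][j] + grid[i][j - 1] - 0.5 * grid[i - 1][j - 1];
        }
    }

    printf("corner value: %f\n", grid[N - 1][M - 1]);
    return 0;
}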

