Edinburgh Research Archive logo

Edinburgh Research Archive

University of Edinburgh homecrest
View Item 
  •   ERA Home
  • Informatics, School of
  • Informatics thesis and dissertation collection
  • View Item
  •   ERA Home
  • Informatics, School of
  • Informatics thesis and dissertation collection
  • View Item
  • Login
JavaScript is disabled for your browser. Some features of this site may not work without it.

Hotscotch: exploiting workload characteristics to improve read-throughput

View/Open
Pugh2022.pdf (2.188Mb)
Date
14/06/2022
Item status
Restricted Access
Embargo end date
14/06/2023
Author
Pugh, Matthew
Metadata
Show full item record
Abstract
Key-value stores are ubiquitous at all layers of the computational stack; offering constant average lookup, insertion, and deletion time. This dissertation looks at how we can exploit skew in the workloads to improve the temporal locality afforded in these systems, thereby resulting in an amortised throughput gain. Using Hopscotch hashing as the basis of this approach, an analysis is performed, and model are developed to quantify the potential gains available by performing in-memory data-reorganisation based on the number of accesses each tuple accrues during the execution of a skewed workload. Skew is modelled by drawing samples from numerous Zipf distributions, to measure the effect of the reordering across a wide parameter set. A simulator is developed based on this model to explore many random configurations of the table and quantify the distribution of potential performance improvements. These predictions are then tested experimentally by designing and implementing a number of novel approaches, each with different trade-offs between complexity and optimal ordering. Results demonstrate up to 3.75x, with outlier up to almost 20x in specific parameters, performance improvement on read-paths. This work is then extended to a concurrent programming model, where the baseline Hopscotch is lock-free. An analysis of the critical regions of execution leads to further new algorithms that relax these constraints more, and altering the design of prior approaches to achieve a trade-off of optimal ordering versus performance. This is evaluated on numerous configurations of skewed read workloads, resulting in a mean performance of approximately 2x when load-factor and number of queries are high, with up to 5x in high-opportunity, low-probability configurations. A simple Symmetric Hash Join experiment is also developed, demonstrating an approximate up to 1.5x improvement. As a final extension, this dissertation explores what opportunities are available to perform adaptive policy selection at a fine-granularity. For cases where there is little opportunity to benefit from the reordering, approaches including insight-based rules-methods and reinforcement-learning inspired are proposed, leading the way to future engineering work.
URI
https://hdl.handle.net/1842/39102

http://dx.doi.org/10.7488/era/2353
Collections
  • Informatics thesis and dissertation collection

Library & University Collections HomeUniversity of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact
Privacy & Cookies
Takedown Policy
Accessibility
Contact
feed RSS Feeds

RSS Feed not available for this page

 

 

All of ERACommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsPublication TypeSponsorSupervisorsThis CollectionBy Issue DateAuthorsTitlesSubjectsPublication TypeSponsorSupervisors
LoginRegister

Library & University Collections HomeUniversity of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact
Privacy & Cookies
Takedown Policy
Accessibility
Contact
feed RSS Feeds

RSS Feed not available for this page