Edinburgh Research Archive logo

Edinburgh Research Archive

University of Edinburgh homecrest
View Item 
  •   ERA Home
  • Informatics, School of
  • Informatics thesis and dissertation collection
  • View Item
  •   ERA Home
  • Informatics, School of
  • Informatics thesis and dissertation collection
  • View Item
  • Login
JavaScript is disabled for your browser. Some features of this site may not work without it.

FluiDB: adaptive storage layout using reversible relational operators

View/Open
PerivolaropoulosC_2022.pdf (1.496Mb)
Date
20/01/2023
Item status
Restricted Access
Embargo end date
20/01/2024
Author
Perivolaropoulos, Christos
Metadata
Show full item record
Abstract
It is a popular practice to use materialized intermediate results to improve the perfor- mance of RDBMSes. Work in this area has focused either optimisers matching existing materialized results in the cache and selecting intermediate results from a plan to survive the plan execution. To our knowledge, few attempts have been made to create plans with cached intermediate results in mind, and none that make any attempt to deduplicate the stored data to alleviate the storage cost of maintaining possibly large queries. We built FluiDB to explore a novel approach to integrating the selection of materialized results with the planner to optimize the logical representation of data in memory. FluiDB materializes common intermediate results and deduplucates data to alleviate the cost of maintaining them. This is achieved by introducing reversible operations: versions of normal relational operators that may produce complementary tables alongside the normal output, that allow the reconstruction of the input relations. A planner aware of such operations can build query plans that dynamically adapt the data layout to the plan under a constrained memory budget. This thesis revolves around four main chapters, each of which describes in detail a different part of FluiDB and a final one that goes into evaluation of the system. The first chapter focuses on query processing and the relational algebra semantics that FluiDB operates under. FluiDB parses queries into DAGs of sub-queries connected by reversible operators. Each such graph of the workload is merged into a global query graph that is used to infer properties of each relation like cardinality and extent. The next chapter is dedicated to the planner and a novel monad for weighted backtracking that the planer is based on. The planner attempts to generate a plan by traversing the query graph so that, besides solving the query at hand, it leaves in memory a curated set of queries aiming to maximize the amortized performance of the workload being run. In this chapter, the garbage collector is also discussed, which is the part of the planner responsible for inserting plan fragments that delete nodes when required such that the available storage budget is respected while no information is lost from the database. After that, we go into Antisthenis, a framework we built for defining incremental com- putation systems. Antisthenis is used to build modules of the planner that efficiently determine whether a relation materializable, and estimate the cost of materializing a relation, given a set of materialized relations. Besides computation reuse, Antisthenis is able to prune the computation taking advantage of properties of the operators involved like absorbing group elements and bounded partial results. These techniques are also used to allow evaluation of some classes of self-referential computations. The final chapter about the FluiDB architecture describes the transpilation of plans gen- erated by the planner to C++, as well as the supporting libraries that enable the tran- spilation of queries to highly specialized C++ code, and the low level data organization of the database. The thesis closes with a chapter that describes our methods for benchmarking and some experimental results.
URI
https://hdl.handle.net/1842/39747

http://dx.doi.org/10.7488/era/2995
Collections
  • Informatics thesis and dissertation collection

Library & University Collections HomeUniversity of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact
Privacy & Cookies
Takedown Policy
Accessibility
Contact
feed RSS Feeds

RSS Feed not available for this page

 

 

All of ERACommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsPublication TypeSponsorSupervisorsThis CollectionBy Issue DateAuthorsTitlesSubjectsPublication TypeSponsorSupervisors
LoginRegister

Library & University Collections HomeUniversity of Edinburgh Information Services Home
Privacy & Cookies | Takedown Policy | Accessibility | Contact
Privacy & Cookies
Takedown Policy
Accessibility
Contact
feed RSS Feeds

RSS Feed not available for this page