FluiDB: adaptive storage layout using reversible relational operators
dc.contributor.advisor
Viglas, Stratis
dc.contributor.advisor
Grot, Boris
dc.contributor.author
Perivolaropoulos, Christos
dc.date.accessioned
2023-01-20T16:03:13Z
dc.date.available
2023-01-20T16:03:13Z
dc.date.issued
2023-01-20
dc.description.abstract
It is a popular practice to use materialized intermediate results to improve the perfor- mance of RDBMSes. Work in this area has focused either optimisers matching existing materialized results in the cache and selecting intermediate results from a plan to survive the plan execution. To our knowledge, few attempts have been made to create plans with cached intermediate results in mind, and none that make any attempt to deduplicate the stored data to alleviate the storage cost of maintaining possibly large queries.
We built FluiDB to explore a novel approach to integrating the selection of materialized results with the planner to optimize the logical representation of data in memory. FluiDB materializes common intermediate results and deduplucates data to alleviate the cost of maintaining them. This is achieved by introducing reversible operations: versions of normal relational operators that may produce complementary tables alongside the normal output, that allow the reconstruction of the input relations. A planner aware of such operations can build query plans that dynamically adapt the data layout to the plan under a constrained memory budget. This thesis revolves around four main chapters, each of which describes in detail a different part of FluiDB and a final one that goes into evaluation of the system.
The first chapter focuses on query processing and the relational algebra semantics that FluiDB operates under. FluiDB parses queries into DAGs of sub-queries connected by reversible operators. Each such graph of the workload is merged into a global query graph that is used to infer properties of each relation like cardinality and extent.
The next chapter is dedicated to the planner and a novel monad for weighted backtracking that the planer is based on. The planner attempts to generate a plan by traversing the query graph so that, besides solving the query at hand, it leaves in memory a curated set of queries aiming to maximize the amortized performance of the workload being run. In this chapter, the garbage collector is also discussed, which is the part of the planner responsible for inserting plan fragments that delete nodes when required such that the available storage budget is respected while no information is lost from the database.
After that, we go into Antisthenis, a framework we built for defining incremental com- putation systems. Antisthenis is used to build modules of the planner that efficiently determine whether a relation materializable, and estimate the cost of materializing a relation, given a set of materialized relations. Besides computation reuse, Antisthenis is able to prune the computation taking advantage of properties of the operators involved like absorbing group elements and bounded partial results. These techniques are also used to allow evaluation of some classes of self-referential computations.
The final chapter about the FluiDB architecture describes the transpilation of plans gen- erated by the planner to C++, as well as the supporting libraries that enable the tran- spilation of queries to highly specialized C++ code, and the low level data organization of the database.
The thesis closes with a chapter that describes our methods for benchmarking and some experimental results.
en
dc.identifier.uri
https://hdl.handle.net/1842/39747
dc.identifier.uri
http://dx.doi.org/10.7488/era/2995
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.subject
relational database
en
dc.subject
efficiency
en
dc.subject
materialized views
en
dc.subject
FluiDB
en
dc.subject
relational algebra semantics
en
dc.subject
query processing
en
dc.subject
garbage collector
en
dc.title
FluiDB: adaptive storage layout using reversible relational operators
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle
1 - 1 of 1
- Name:
- PerivolaropoulosC_2022.pdf
- Size:
- 1.5 MB
- Format:
- Adobe Portable Document Format
- Description:
This item appears in the following Collection(s)

