Edinburgh Research Archive

Extending SQL with marked nulls: from design to implementation and application in computing certain answers

dc.contributor.advisor
Guagliardo, Paolo
dc.contributor.advisor
Libkin, Leonid
dc.contributor.author
Wang, Guozhi
dc.contributor.sponsor
School of Informatics, University of Edinburgh
dc.date.accessioned
2026-02-11T16:47:38Z
dc.date.issued
2025-12-02
dc.description.abstract
Marked nulls, a theoretical framework for handling missing values in incomplete databases, have been extensively studied in the literature since the 1980s. However, their practical application has remained unexplored until now. This research introduces marked nulls through marked data types, which encode marked nulls alongside constants and SQL nulls. We explore two possible encodings of marked nulls, and we define the semantics and behaviour of casts, comparisons, operations, and aggregations for marked data types. Based on these concepts, we develop two prototype implementations as PostgreSQL extensions: one written in SQL Standard for cross-platform compatibility and the other in C, targeting PostgreSQL only, for optimized performance. Comprehensive benchmarking evaluates both prototypes in terms of space usage and query performance. Performance overhead is analysed at multiple levels, including individual functions, join strategies, the Join Order Benchmark, and TPC benchmarks for practical workloads. Results show that the SQL Standard implementation, designed for broad compatibility, faces significant performance challenges in both space and time, and that it suffers from incompatibility issues stemming from inconsistent implementations of SQL across commercial database systems. In contrast, the C implementation leverages PostgreSQL’s extensibility to deliver satisfactory performance, incurring at worst a 19.1% increase in disk space usage and a 9.2% geometric mean query performance overhead on the TPC-H benchmark. With a functional marked null implementation in place, we further explore its application to the problem of computing certain answers. Building on a recent study that provides a translation of queries with a correctness guarantee, we simplify this translation using marked nulls. The correctness guarantee is maintained and query performance is improved.
Overall, this work demonstrates the feasibility of marked nulls and illustrates a concrete example of their practical use. It bridges the gap between theory and practice, paving the way for further research and adoption of marked nulls in real-world database systems.
dc.identifier.uri
https://era.ed.ac.uk/handle/1842/44397
dc.identifier.uri
https://doi.org/10.7488/era/6917
dc.language.iso
en
dc.publisher
The University of Edinburgh
dc.rights.embargodate
2027-02-11
dc.subject
Marked null
dc.subject
SQL
dc.subject
PostgreSQL
dc.subject
Benchmarking
dc.subject
Certain answers
dc.title
Extending SQL with marked nulls: from design to implementation and application in computing certain answers
dc.type
Thesis
dc.type.qualificationlevel
Doctoral
dc.type.qualificationname
PhD Doctor of Philosophy
dcterms.accessRights
RESTRICTED ACCESS

Files

Original bundle

Name:
WangG_2025.pdf
Size:
2.89 MB
Format:
Adobe Portable Document Format
