Edinburgh Research Archive

Foundation for synthesising programming language semantics

dc.contributor.advisor
Cheney, James
dc.contributor.advisor
Belle, Vaishak
dc.contributor.author
Bartha, Sándor
dc.date.accessioned
2024-02-06T11:57:12Z
dc.date.available
2024-02-06T11:57:12Z
dc.date.issued
2024-02-06
dc.description.abstract
Programming or scripting languages used in real-world systems are seldom designed with a formal semantics in mind from the outset. Therefore, the first step for developing well-founded analysis tools for these systems is to reverse-engineer a formal semantics. This can take months or years of effort. Could we automate this process, at least partially? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging, as found by Krishnamurthi, Lerner and Elberty. They propose automatically learning desugaring translation rules, mapping the language whose semantics we seek to a simplified, core version, whose semantics are much easier to write. The present thesis contains an analysis of their challenge, as well as the first steps towards a solution. Scaling methods with the size of the language is very difficult due to state space explosion, so this thesis proposes an incremental approach to learning the translation rules. I present a formalisation that both clarifies the informal description of the challenge by Krishnamurthi et al, and re-formulates the problem, shifting the focus to the conditions for incremental learning. The central definition of the new formalisation is the desugaring extension problem, i.e. extending a set of established translation rules by synthesising new ones. In a synthesis algorithm, the choice of search space is important and non-trivial, as it needs to strike a good balance between expressiveness and efficiency. The rest of the thesis focuses on defining search spaces for translation rules via typing rules. Two prerequisites are required for comparing search spaces. The first is a series of benchmarks, a set of source and target languages equipped with intended translation rules between them. The second is an enumerative synthesis algorithm for efficiently enumerating typed programs. I show how algebraic enumeration techniques can be applied to enumerating well-typed translation rules, and discuss the properties expected from a type system for ensuring that typed programs be efficiently enumerable. The thesis presents and empirically evaluates two search spaces. A baseline search space yields the first practical solution to the challenge. The second search space is based on a natural heuristic for translation rules, limiting the usage of variables so that they are used exactly once. I present a linear type system designed to efficiently enumerate translation rules, where this heuristic is enforced. Through informal analysis and empirical comparison to the baseline, I then show that using linear types can speed up the synthesis of translation rules by an order of magnitude.
en
dc.identifier.uri
https://hdl.handle.net/1842/41415
dc.identifier.uri
http://dx.doi.org/10.7488/era/4149
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Bartha, S., Cheney, J., and Belle, V. (2021). One down, 699 to go: Or, synthesising compositional desugarings. Proceedings of the ACM on Programming Languages, 5(OOPSLA)
en
dc.relation.hasversion
Bartha, S. and Cheney, J. (2020). Towards meta-interpretive learning of programming language semantics. In Proceedings of the 29th International Conference on Inductive Logic Programming (ILP 2019), number 11770 in LNCS, pages 16–25.
en
dc.subject
semantics
en
dc.subject
programming languages
en
dc.subject
program synthesis
en
dc.title
Foundation for synthesising programming language semantics
en
dc.title.alternative
A foundation for synthesising programming language semantics
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en

Files

Original bundle

Now showing 1 - 1 of 1
Name:
Bartha2024.pdf
Size:
720.61 KB
Format:
Adobe Portable Document Format
Description:

This item appears in the following Collection(s)