Large language model enabled program synthesis
dc.contributor.advisor
Polgreen, Elizabeth
dc.contributor.advisor
O'Boyle, Michael
dc.contributor.author
Li, Yixuan
dc.date.accessioned
2026-01-22T16:20:34Z
dc.date.issued
2025-12-02
dc.description.abstract
Program synthesis aims to automate the construction of programs that satisfy user-defined specifications, yet existing approaches face trade-offs between correctness, scalability, and efficiency. One established approach is syntax-guided synthesis (SyGuS), in which the user provides both a formal specification of the desired program behaviour and a syntactic template, typically expressed as a context-free grammar. However, SyGuS becomes computationally expensive as the grammar grows and the space of candidate programs expands. In contrast, machine learning methods, including large language models (LLMs), can generate code quickly by learning patterns from large datasets, but they often produce incorrect or unverifiable programs.
This thesis presents two complementary lines of research for integrating LLMs into program synthesis. The first develops prompt engineering techniques to automatically select the most effective prompt-LLM configuration for a given synthesis task. The second investigates hybrid synthesis methods that combine the correctness guarantees of classical SyGuS techniques with the speed and generalisation capabilities of LLMs. The thesis further demonstrates the applicability of hybrid synthesis methods to domain-specific program lifting, where translating low-level code into high-level representations is essential for machine learning workloads. Together, these contributions advance the development of program synthesis frameworks that are both principled and practical.
en
dc.identifier.uri
https://era.ed.ac.uk/handle/1842/44340
dc.identifier.uri
https://doi.org/10.7488/era/6860
dc.language.iso
en
en
dc.publisher
The University of Edinburgh
en
dc.relation.hasversion
Leiqi Ye, Yixuan Li, Guy Frankel, Jianyi Cheng, and Elizabeth Polgreen. Unlocking hardware verification with oracle guided synthesis. In The 25th Conference on Formal Methods in Computer-Aided Design, pages 235–245. TU Wien Academic Press, 2025.
en
dc.relation.hasversion
Yixuan Li, Lewis Frampton, Federico Mora, and Elizabeth Polgreen. Online prompt selection for program synthesis. Proceedings of the AAAI Conference on Artificial Intelligence, 39(11):11282–11289, Apr. 2025.
en
dc.relation.hasversion
Yixuan Li, José Wesley de Souza Magalhães, Alexander Brauckmann, Michael F. P. O’Boyle, and Elizabeth Polgreen. Guided tensor lifting. Proc. ACM Program. Lang., 9(PLDI), June 2025.
en
dc.relation.hasversion
Yixuan Li, Federico Mora, Elizabeth Polgreen, and Sanjit A. Seshia. Genetic algorithms for searching a matrix of metagrammars for synthesis. arXiv preprint arXiv:2306.00521, 2023.
en
dc.relation.hasversion
Yixuan Li, Julian Parsert, and Elizabeth Polgreen. Guiding enumerative program synthesis with large language models. In CAV (2), volume 14682 of Lecture Notes in Computer Science, pages 280–301. Springer, 2024.
en
dc.relation.hasversion
Weizhi Tang, Yixuan Li, Chris Sypherd, Elizabeth Polgreen, and Vaishak Belle. HyGenar: An LLM-driven hybrid genetic algorithm for few-shot grammar generation. arXiv preprint arXiv:2505.16978, 2025.
en
dc.relation.hasversion
Weizhi Tang, Yixuan Li, Chris Sypherd, Elizabeth Polgreen, and Vaishak Belle. HyGenar: An LLM-driven hybrid genetic algorithm for few-shot grammar generation. In Findings of the Association for Computational Linguistics: ACL 2025.
en
dc.subject
program synthesis
en
dc.subject
machine learning
en
dc.subject
large language models
en
dc.subject
code generation
en
dc.subject
tensor algebra
en
dc.subject
programming languages
en
dc.subject
verification
en
dc.subject
reasoning
en
dc.title
Large language model enabled program synthesis
en
dc.type
Thesis or Dissertation
en
dc.type.qualificationlevel
Doctoral
en
dc.type.qualificationname
PhD Doctor of Philosophy
en
Files
Original bundle (1 of 1)
- Name: Li2025.pdf
- Size: 886.05 KB
- Format: Adobe Portable Document Format