Normalisation by Evaluation in the Compilation of Typed Functional Programming Languages
This thesis presents a critical analysis of normalisation by evaluation as a technique for speeding up compilation of typed functional programming languages. Our investigation focuses on the SML.NET compiler and its typed intermediate language MIL. We implement and measure the performance of normalisation by evaluation for MIL across a range of benchmarks. Taking a different approach, we also implement and measure the performance of a graph-based shrinking reductions algorithm for SML.NET. MIL is based on Moggi’s computational metalanguage. As a stepping stone to normalisation by evaluation, we investigate strong normalisation of the computational metalanguage by introducing an extension of Girard-Tait reducibility. Inspired by previous work on local state and parametric polymorphism, we define reducibility for continuations and more generally reducibility for frame stacks. First we prove strong normalistion for the computational metalanguage. Then we extend that proof to include features of MIL such as sums and exceptions. Taking an incremental approach, we construct a collection of increasingly sophisticated normalisation by evaluation algorithms, culminating in a range of normalisation algorithms for MIL. Congruence rules and alpha-rules are captured by a compositional parameterised semantics. Defunctionalisation is used to eliminate eta-rules. Normalisation by evaluation for the computational metalanguage is introduced using a monadic semantics. Variants in which the monadic effects are made explicit, using either state or control operators, are also considered. Previous implementations of normalisation by evaluation with sums have relied on continuation-passing-syle or control operators. We present a new algorithm which instead uses a single reference cell and a zipper structure. This suggests a possible alternative way of implementing Filinski’s monadic reflection operations. In order to obtain benchmark results without having to take into account all of the features of MIL, we implement two different techniques for eliding language constructs. The first is not semantics-preserving, but is effective for assessing the efficiency of normalisation by evaluation algorithms. The second is semantics-preserving, but less flexible. In common with many intermediate languages, but unlike the computational metalanguage, MIL requires all non-atomic values to be named. We use either control operators or state to ensure each non-atomic value is named. We assess our normalisation by evaluation algorithms by comparing them with a spectrum of progressively more optimised, rewriting-based normalisation algorithms. The SML.NET front-end is used to generate MIL code from ML programs, including the SML.NET compiler itself. Each algorithm is then applied to the generated MIL code. Normalisation by evaluation always performs faster than the most naıve algorithms— often by orders of magnitude. Some of the algorithms are slightly faster than normalisation by evaluation. Closer inspection reveals that these algorithms are in fact defunctionalised versions of normalisation by evaluation algorithms. Our normalisation by evaluation algorithms perform unrestricted inlining of functions. Unrestricted inlining can lead to a super-exponential blow-up in the size of target code with respect to the source. Furthermore, the worst-case complexity of compilation with unrestricted inlining is non-elementary in the size of the source code. SML.NET alleviates both problems by using a restricted form of normalisation based on Appel and Jim’s shrinking reductions. The original algorithm is quadratic in the worst case. Using a graph-based representation for terms we implement a compositional linear algorithm. This speeds up the time taken to perform shrinking reductions by up to a factor of fourteen, which leads to an improvement of up to forty percent in total compile time.