Automatic differentiation via effects and handlers

Sigal, Jesse Aaron

Automatic differentiation via effects and handlers

Files

Sigal2024.pdf (6.88 MB)

Date

2024-06-26

Authors

Sigal, Jesse Aaron

Full item page

Abstract

Machine learning, artificial intelligence, scientific modelling, information analysis, and other data heavy fields have driven the demand for tools that enable derivative based optimization. Automatic differentiation is a family of algorithms used to calculate the derivatives of programs with only a constant factor slowdown. There are many implementation strategies, some built into a language and some outside of it, and there are many different members of the family. The utility of automatic differentiation makes it worthwhile to implement it in as many languages as possible. Effects and handlers are a powerful control flow construct in programming languages based upon delimited continuations. They are a structured method of including side effects into programs, and have found many uses including nondeterminism, state management, and concurrency. Effects and handlers excel in facilitating non-local control flow and also provide methods of abstracting and composing effects. Mainstream programming languages are increasingly incorporating effects and handlers, notably OCaml and WebAssembly. We show that effects and handlers are well-suited for implementing automatic differentiation algorithms while maintaining the desirable asymptotic efficiency. In particular, effects and handlers allow for succinctness in the presence of complex control flow. On a practical level, we implement eight automatic differentiation algorithms in four languages with effects and handlers. The implementations range from standard AD algorithms such as forward mode and continuation-based reverse mode, to more advanced modes such as checkpointed reverse mode. We benchmark the standard modes to empirically show that we can reach the correct asymptotic complexity. Furthermore, we build up a mathematical framework in which to prove correctness of selected standard modes. To do so, we extend the set-theoretic denotational semantics of a simple effect and handler language to a category-theoretic semantics. We then describe how to perform a generalized proof by logical relations in this setting, and identify sufficient conditions for our proof method to apply. Equipped with our conditions, we show that diffeological spaces (a generalization of Euclidean spaces) admit proof by logical relations. Ultimately, this enables us to prove our implementations of forward mode and continuation reverse mode correct.

URI

https://hdl.handle.net/1842/41919
http://dx.doi.org/10.7488/era/4642

This item appears in the following Collection(s)

Informatics thesis and dissertation collection