Machine learning in compilers
Abstract
Designing a compiler that produces optimised code is difficult because modern processors are complex. Compiler writers can spend months fine-tuning a heuristic for a single architecture, yet whenever a new processor is released those heuristics must be retuned. This is typically too much effort, and so in practice most compilers are out of date. Machine learning has been shown to help, creating tools that can predict how best to compile new programs from observations made about programs compiled in the past. Many hurdles remain, however: while experts no longer have to worry about the details of heuristic parameters, they must instead focus on the details of the machine learning process.
This thesis develops techniques so that, where human compiler experts were needed to manage the machine learning experiments before, they are required no longer; paving the way for a completely automatic, retuning compiler.
First, we tackle the most prominent area still requiring human involvement: feature generation. In all previous machine learning work for compilers, the features, which describe the important aspects of each example to the machine learning tools, must be constructed by an expert. Should that expert choose features poorly, crucial information is lost, without which the machine learning algorithm can never excel. We show not only that good features can be derived automatically, but that these features outperform those of human experts. We demonstrate our approach on loop unrolling, where we improve on previous work, obtaining 76% of the available performance compared with 59% for the previous state of the art.
Next, we demonstrate a new method to efficiently capture the raw data needed for machine learning tasks. The iterative compilation on which machine learning in compilers depends is time consuming, often requiring months of compute time, and the underlying processes are noisy. Most prior work therefore falls into one of two categories: work that gathers clean data by executing each configuration a large number of times, and work that ignores the statistical validity of its data to keep experiment times feasible. Our approach guarantees clean data while adapting to the experiment at hand, requiring an order of magnitude less work than prior techniques.
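One simple way to make measurement adapt to the experiment at hand, sketched here under our own assumptions rather than as the thesis's actual algorithm, is sequential sampling: keep re-running a benchmark until the 95% confidence interval around the mean runtime is tight enough, so stable programs finish after a handful of runs while noisy ones automatically receive more.

```python
import random
import statistics

def adaptive_mean(measure, rel_tol=0.02, min_runs=5, max_runs=1000):
    """Re-run `measure` until the 95% confidence half-width is within
    rel_tol of the running mean; return (mean, number of runs used)."""
    samples = [measure() for _ in range(min_runs)]
    while len(samples) < max_runs:
        mean = statistics.mean(samples)
        half_width = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
        if half_width <= rel_tol * mean:
            break  # the estimate is already clean enough
        samples.append(measure())
    return statistics.mean(samples), len(samples)

# Simulated noisy benchmark: true runtime 100ms with 5ms jitter.
random.seed(0)
mean, runs = adaptive_mean(lambda: random.gauss(100.0, 5.0))
```

Because the stopping rule depends on the observed variance, the number of executions scales with the noise actually present, instead of being fixed in advance at either a wastefully large or statistically inadequate count.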