Reducing code size with function merging
Resource-constrained devices for embedded systems are becoming increasingly important. In such systems, primary and secondary memories are highly restrictive, making code size in most cases even more important than performance. Compared to more traditional platforms, memory is a larger part of the cost and code occupies much of it. Despite that, compilers make little effort to reduce code size. One important optimisation for code-size reduction is function merging. This technique eliminates redundant code across functions by merging them into a single function. However, production compilers only apply this optimisation to identical functions, while research compilers improve on that by merging the few functions with identical control-flow graphs and signatures. Overall, existing solutions are insufficient and we end up having to either increase cost by adding more memory or remove functionality from programs. This thesis introduces the first techniques capable of merging arbitrary pairs of functions. Our insight is that the weak results of prior function merging techniques are due to their rigid and overly restrictive solutions instead of the lack of duplicate code in the input program. Our solution corroborates this insight, resulting in drastic code size reductions while also reducing end-to-end compilation time. First, we introduce FMSA, a novel technique that can merge arbitrary functions through sequence alignment, a bioinformatics algorithm for identifying regions of similarity between sequences. We combine this technique with an intelligent exploration mechanism to direct the search towards the most promising function pairs. Our evaluation on the SPEC 2006 benchmark suite shows that FMSA is more than 2.4x better than the previous state of the art, proposed by von Koch et al., reducing code size by up to 25%, with an overall average of 6%. FMSA increases end-to-end compilation time by an average of 15%. While representing a leap forward, experiments show that FMSA fails to reduce code size in some cases where it would be intuitively expected to work. This limitation stems from its inability to directly handle phi-nodes. Instead, FMSA applies register demotion to replace all such nodes with memory operations, in an attempt to simplify the code generation process. We build on this technique and develop SalSSA that fully supports the SSA form, removing any need for register demotion. By doing so, we notably increase the number of profitably merged functions. Experimental results on the SPEC 2006 and 2017 suites show that our approach delivers on average, 7.9% to 9.7% reduction on the final size of the compiled code. Moreover, as a result of aligning shorter sequences of instructions and reducing the number of wasteful merge operations, our new approach incurs an average compile-time overhead of only 5%, while also reducing memory usage by over 2x. Finally, we continue to build on SalSSA by developing HyFM, which delivers similar levels of code size reduction for significantly lower compilation time and memory usage. To this end, we introduce an alignment strategy that works at the basic block level. Since basic blocks are usually much shorter than functions, even a quadratic alignment is acceptable. However, we also propose a linear algorithm for aligning blocks at a much lower cost. We extend this strategy with a multi-tier profitability analysis that bails out early from unprofitable merging attempts. By aligning individual pairs of blocks, we are able to decide their alignment's profitability before actually generating code. Experimental results on SPEC 2006 and 2017 show that HyFM needs orders of magnitude less memory, using up to 48 MB or 5.6 MB, depending on the variant used, while SalSSA requires 32 GB in the worst case. HyFM also runs over 4.5x faster, while still achieving comparable code size reduction. Combined with the speedup of later compilation stages due to the reduced number of functions, HyFM contributes to a reduced end-to-end compilation time.