Applications of information sharing for code generation in process virtual machines
Kyle, Stephen Christopher
As the backbone of many computing environments today, it is important that process virtual machines be both performant and robust in mobile, personal desktop, and enterprise applications. This thesis focusses on code generation within these virtual machines, particularly addressing situations where redundant work is being performed. The goal is to exploit information sharing in order to improve the performance and robustness of virtual machines that are accelerated by native code generation. First, the thesis investigates the potential to share generated code between multiple threads in a dynamic binary translator used to perform instruction set simulation. This is done through a code generation design that allows native code to be executed by any simulated core and adding a mechanism to share native code regions between threads. This is shown to improve the average performance of multi-threaded benchmarks by 1.4x when simulating 128 cores on a quad-core host machine. Secondly, the ahead-of-time code generation system used for executing Android applications is improved through the use of profiling. The thesis investigates the potential for profiles produced by individual users of applications to be shared and merged together to produce a generic profile that still provides a lot of benefit for a new user who is then able to skip the expensive profiling phase. These profiles can not only be used for selective compilation to reduce code-size and installation time, but can also be used for focussed optimisation on vital code regions of an application in order to improve overall performance. With selective compilation applied to a set of popular Android applications, code-size can be reduced by 49.9% on average, while installation time can be reduced by 31.8%, with only an average 8.5% increase in the amount of sequential runtime required to execute the collected profiles. The thesis also shows that, among the tested users, the use of a crowd-sourced and merged profile does not significantly affect their estimated performance loss from selective compilation (0.90x-0.92x) in comparison to when they they perform selective compilation with their own unique profile (0.93x). Furthermore, by proposing a new, more powerful code generator for Android’s virtual machine, these same profiles can be used to perform focussed optimisation, which preliminary results show to increase runtime performance across a set of common Android benchmarks by 1.46x-10.83x. Finally, in such a situation where a new code generator is being added to a virtual machine, it is also important to test the code generator for correctness and robustness. The methods of execution of a virtual machine, such as interpreters and code generators, must share a set of semantics about how programs must be executed, and this can be exploited in order to improve testing. This is done through the application of domain-aware binary fuzzing and differential testing within Android’s virtual machine. The thesis highlights a series of actual code generation and verification bugs that were found in Android’s virtual machine using this testing methodology, as well as comparing the proposed approach to other state-of-the-art fuzzing techniques.