Shader optimization and specialization
In the field of real-time graphics for computer games, performance has a significant effect on the player’s enjoyment and immersion. Graphics processing units (GPUs) are hardware accelerators that run small parallelized shader programs to speed up computationally expensive rendering calculations. This thesis examines optimizing shader programs and explores ways in which data patterns on both the CPU and GPU can be analyzed to automatically speed up rendering in games. Initially, the effect of traditional compiler optimizations on shader source-code was explored. Techniques such as loop unrolling or arithmetic reassociation provided speed-ups on several devices, but different GPU hardware responded differently to each set of optimizations. Analyzing execution traces from numerous popular PC games revealed that much of the data passed from CPU-based API calls to GPU-based shaders is either unused, or remains constant. A system was developed to capture this constant data and fold it into the shaders’ source-code. Re-running the game’s rendering code using these specialized shader variants resulted in performance improvements in several commercial games without impacting their visual quality.