Experiences in designing a robust and scalable interpreter profiling framework

I Gartley, N Grcevski, M Pirvu, V Sundaresan

doi:10.1109/cgo.2013.6494981

Abstract

Profile directed feedback (PDF) is a well-known technique used to drive many compiler optimizations, such as basic block ordering and guarded devirtualization. These optimizations are particularly important for achieving good throughput in JEE applications that have a large code footprint. To apply optimizations that rely on profiling information effectively, a just-in-time (JIT) compiler must have access to accurate profiling information. One common source of profiling information in a Java virtual machine (JVM) is the interpreter. Typically, methods are interpreted as a program ramps up, and profiling information can be collected during that time. However, obtaining useful and accurate information for large enterprise-class applications can be a challenge because of the memory and performance overhead associated with collecting and processing the large volumes of profiling data that are generated. This paper describes the challenges in maintaining the balance between throughput performance and profiling overhead in a production JIT compiler that is used by the IBM JDK. We introduce the scope of the performance overhead in terms of throughput, memory footprint and startup speed for large JEE-class applications, then detail the engineering solutions that were tried and compare them using experimental results. We found that the throughput improvement due to interpreter profiling (IP) can be as high as 58%, whereas the overhead, measured in terms of application startup time, could be up to 57%. Our solutions reduced the startup cost to only a few percent while maintaining the full throughput benefit. By discussing these approaches, this paper offers a balanced and practical overview of how to make PDF work well for enterprise-class applications in a production JIT compiler.

Full Text