A Hierarchical Specialization Approach for Generating Optimized Sorting Code

Minhaj Ahmad Khan

doi:10.1007/s13369-014-1254-9

Abstract

Most existing sorting implementations are optimized by hand, because compilers cannot generate optimized code when the information they need is missing at compile time. This information becomes available only while the code executes. However, specialization can expose it at compile time so that the compiler can perform the optimizations itself. This paper presents an automated, specialization-based approach to generating optimized sorting code for different architectures. The sorting kernel is specialized iteratively and hierarchically to produce an optimized version comprising one high-level kernel and three low-level kernels: the insertion, base, and merge kernels. The high-level kernel, working together with the low-level kernels, is embedded into a quicksort kernel and invoked when the data fit within the cache. Experiments were performed on Intel Core-2 Duo and Power 4 (PowerPC) processors using the icc and gcc compilers, respectively. The sorting code optimized through hierarchical specialization executes fast and, in many cases, outperforms manually optimized implementations.
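To make the structure described above concrete, here is a minimal C sketch of a quicksort that hands cache-sized partitions to a high-level kernel built from low-level kernels. It is not the paper's generated code. The names (insertion_kernel, merge_kernel, high_level_kernel, hybrid_sort) and the constants BASE_N and CACHE_ELEMS are hypothetical, and the sketch models only an insertion kernel and a merge kernel, since the abstract does not define the base kernel. The one point it tries to illustrate is that fixing the run length at compile time gives the compiler a constant loop bound it can unroll.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BASE_N 8          /* hypothetical run length, fixed at compile time */
#define CACHE_ELEMS 1024  /* hypothetical count of ints assumed to fit in cache */

/* Insertion kernel specialized for exactly BASE_N elements: the constant
   trip count is what lets a compiler unroll the outer loop. */
static void insertion_kernel(int *a) {
    for (size_t i = 1; i < BASE_N; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) { a[j] = a[j - 1]; j--; }
        a[j] = key;
    }
}

/* Unspecialized fallback for a tail shorter than BASE_N. */
static void insertion_generic(int *a, size_t n) {
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) { a[j] = a[j - 1]; j--; }
        a[j] = key;
    }
}

/* Merge kernel: merges sorted a[0..mid) and a[mid..n) through tmp. */
static void merge_kernel(int *a, size_t mid, size_t n, int *tmp) {
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n) tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++];
    while (i < mid) tmp[k++] = a[i++];
    while (j < n) tmp[k++] = a[j++];
    memcpy(a, tmp, n * sizeof *a);
}

/* High-level kernel: sorts at most CACHE_ELEMS elements by sorting fixed-size
   runs, then merging them bottom-up. The static buffer is not thread-safe. */
static void high_level_kernel(int *a, size_t n) {
    static int tmp[CACHE_ELEMS];
    size_t i = 0;
    assert(n <= CACHE_ELEMS);
    for (; i + BASE_N <= n; i += BASE_N) insertion_kernel(a + i);
    if (i < n) insertion_generic(a + i, n - i);
    for (size_t w = BASE_N; w < n; w *= 2) {
        for (size_t lo = 0; lo + w < n; lo += 2 * w) {
            size_t len = (n - lo < 2 * w) ? n - lo : 2 * w;
            merge_kernel(a + lo, w, len, tmp);
        }
    }
}

/* Hoare partition around the middle element; returns the size of the left
   part, which is always between 1 and n - 1 for n >= 2. */
static size_t partition(int *a, size_t n) {
    int pivot = a[(n - 1) / 2];
    long i = -1, j = (long)n;
    for (;;) {
        do { i++; } while (a[i] < pivot);
        do { j--; } while (a[j] > pivot);
        if (i >= j) return (size_t)j + 1;
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }
}

/* Quicksort that stops partitioning once a piece fits the assumed cache size
   and passes that piece to the high-level kernel. */
void hybrid_sort(int *a, size_t n) {
    while (n > CACHE_ELEMS) {
        size_t left = partition(a, n);
        if (left < n - left) {           /* recurse on the smaller side */
            hybrid_sort(a, left);
            a += left; n -= left;
        } else {
            hybrid_sort(a + left, n - left);
            n = left;
        }
    }
    high_level_kernel(a, n);
}
```

In this sketch the specialization is done by hand through the BASE_N macro. An automated approach such as the one the paper describes would instead generate and select such specialized versions itself, and the values best suited to a given processor are not something this sketch attempts to determine.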
