Abstract

Vectorization of programs is crucial for achieving high performance on modern processors with SIMD (Single Instruction Multiple Data) extensions. Programs with IF-statements suffer from control flow divergence, which seriously complicates automatic vectorization. Contemporary compilers therefore employ IF-conversion, which converts control flow to data flow and relies on predicated execution techniques (i.e., masked or select SIMD instructions). In this paper, we enhance the compiler's capabilities to generate efficiently vectorized code for processors without masked instructions. We improve the state of the art in program vectorization by developing a novel approach, the IF-select transformation, which is applicable to arbitrarily nested IF-statements. We implement our approach in the open-source Open64 compiler and evaluate its performance on the SW26010 processor, which is used in the Sunway TaihuLight supercomputer (currently #3 in the TOP500 list) and does not support masked instructions. We extend our vectorization approach with an additional LLVM optimization pass that reduces the number of masked memory accesses on processors without masked instructions, e.g., IBM Power8 and ARM Cortex-A8. Experimental results demonstrate the performance advantages of the suggested vectorization techniques.
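
To make the idea of converting control flow to data flow concrete, the following is a minimal sketch in C of generic IF-conversion using select operations on a nested IF-statement. It is not the IF-select transformation developed in the paper, whose details the abstract does not give. The loop body, the function names, and the helper `select_lane` (which models one lane of a SIMD select instruction) are assumptions chosen for illustration only.

```c
#include <stddef.h>

/* Scalar reference: a loop with a nested IF-statement.
 * The divergent branches are what hinders automatic vectorization. */
void nested_if_scalar(const int *a, const int *b, int *c, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] > 0) {
            if (b[i] > 0)
                c[i] = a[i] + b[i];
            else
                c[i] = a[i] - b[i];
        } else {
            c[i] = 0;
        }
    }
}

/* Hypothetical helper (an assumption, not from the paper): models one lane
 * of a SIMD select, returning x where the mask is set and y otherwise. */
static inline int select_lane(int mask, int x, int y) {
    return mask ? x : y;
}

/* IF-converted form: both inner branches are computed unconditionally and
 * the results are merged with selects, innermost condition first. The loop
 * body has no control flow left, so each statement maps to a vector
 * operation on a target that offers select but no masked instructions. */
void nested_if_select(const int *a, const int *b, int *c, size_t n) {
    for (size_t i = 0; i < n; i++) {
        int m_outer = a[i] > 0;          /* condition of the outer IF */
        int m_inner = b[i] > 0;          /* condition of the inner IF */
        int then_val = a[i] + b[i];      /* inner THEN branch */
        int else_val = a[i] - b[i];      /* inner ELSE branch */
        int inner = select_lane(m_inner, then_val, else_val);
        c[i] = select_lane(m_outer, inner, 0);
    }
}
```

Both functions produce the same output for the same input. The second form does more arithmetic per element, and that extra work is what buys a branch-free loop body. The sketch stays safe only because the single store to `c[i]` is unconditional and the speculated branches have no side effects; conditional loads or stores would need additional care.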
