Autovesk: Automatic Vectorized Code Generation from Unstructured Static Kernels Using Graph Transformations

Hayfa Tayeb, Ludovic Paillat, Bérenger Bramas

doi:10.1145/3631709

Abstract

Leveraging the SIMD capability of modern CPU architectures is mandatory to take full advantage of their increased performance. To exploit this capability, binary executables must be vectorized, either manually by developers or automatically by a tool. For this reason, the compilation research community has developed several strategies for transforming scalar code into a vectorized implementation. However, most existing automatic vectorization techniques in modern compilers are designed for regular codes, leaving irregular applications with non-contiguous data access patterns at a disadvantage. In this article, we present a new tool, Autovesk, that automatically generates vectorized code from scalar code, specifically targeting irregular data access patterns. We describe how our method transforms a graph of scalar instructions into a vectorized one, using different heuristics to reduce the number or cost of instructions. Finally, we demonstrate the effectiveness of our approach on various computational kernels using Intel AVX-512 and ARM SVE. We compare the speedups of Autovesk vectorized code over GCC, Clang LLVM, and Intel automatic vectorization optimizations. We achieve competitive results on linear kernels and up to 11× speedups on irregular kernels.
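To make the distinction between regular and irregular access patterns concrete, the following is a minimal illustrative sketch and is not taken from the article. The function names `axpy_contiguous` and `axpy_indirect` are hypothetical. The first loop reads and writes contiguous memory, the case loop auto-vectorizers are generally designed for. The second reads its input through an index table, so consecutive iterations touch non-contiguous addresses.

```cpp
#include <cstddef>

// Regular kernel (hypothetical example): contiguous loads and stores.
// Consecutive iterations access consecutive addresses of x and y.
void axpy_contiguous(const double* x, double* y, double a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}

// Irregular kernel (hypothetical example): the input is read through an
// index table, so the loads from x are non-contiguous (gather-style).
void axpy_indirect(const double* x, const std::size_t* idx, double* y,
                   double a, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += a * x[idx[i]];
    }
}
```

Both functions compute the same kind of update; only the addressing of `x` differs, which is the property the abstract identifies as problematic for existing compiler vectorizers.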

Journal: ACM Transactions on Architecture and Code Optimization
Publication Date: Dec 15, 2023
Citations: 1

Similar Papers

Dynamic Data Shapers Optimize Performance in Dynamic Binary Optimization (DBO) Environment
Varun Venkatesan ... Swamy D Ponpandi
01 Dec 2015

Systematic Approach in Optimizing Numerical Memory-Bound Kernels on GPU
Ahmad Abdelfattah ... David Keyes
01 Jan 2013

An Efficient GPU Cache Architecture for Applications with Irregular Memory Access Patterns
Bingchao Li ... Jizeng Wei
ACM Transactions on Architecture and Code Optimization | VOL. 16
17 Jun 2019

Systematic Memory MDS Sliding Window Codes Over Erasure Channels
Xiangyu Chen ... Qifu Tyler Sun
IEEE Transactions on Communications | VOL. 69
30 Nov 2020
