Abstract

SIMD extensions provide an energy-efficient platform for mobile systems, but using SIMD instructions to improve program performance remains a challenge. SLP (superword level parallelism) is an efficient, SIMD-oriented technique for exploiting the parallelism between statements within a basic block, and it is used in almost all mainstream compilers. SLP relies on finding isomorphic statements that can be packed together into vectors; its ability to autovectorize nonisomorphic statements is insufficient. In this paper, we introduce SLP-E, a novel autovectorization method that automatically vectorizes code containing nonisomorphic statements. SLP-E translates nonisomorphic statements into isomorphic ones through an equivalent extended transformation of expressions and then vectorizes the resulting isomorphic statements. This widens the application scope of SLP and increases its benefits. We implement SLP-E in LLVM and compare it with prior approaches, using a set of applications from the SPEC CPU 2017 benchmark that benefit from autovectorization. Experimental results show that SLP-E achieves more than 43.9% speedup, on average, over other similar methods.

Highlights

  • The number of mobile phones in the world continues to grow

  • The results show that, compared with the SLP method implemented by LLVM, the average performance of the SLP-E method in the kernel test and the overall test is improved by 43.9% and 3.8%, respectively

  • The SPEC CPU 2017/2006/2000 and MediaBench2 [43] benchmarks, which include media and artificial intelligence tests widely used in mobile systems, were used to evaluate the proposed method in both a kernel test and an overall test, in comparison with the SLP method implemented in LLVM itself


Summary

Introduction

The number of mobile phones in the world continues to grow, and these devices require systems that combine high energy efficiency with high performance.

(1) The SLP-E method uses a new transformation (the equivalent extended transformation) and combines it with the structural characteristics of the statements' data dependency graphs to convert nonisomorphic statements that have different numbers of operations into isomorphic form. This extends the application scope of the SLP method and improves the vectorization capability of SLP.

(2) To solve the problem of identifying the objects of the isomorphic transformation (problem 1), an analysis method based on the maximum common subgraph is proposed. Using the similarity information of the nodes in the data dependency graphs that correspond to the statements, it obtains both the isomorphic part shared by the graphs of multiple statements and the part in which they differ. This lays the foundation for determining the isomorphic transformation object and the position of the equivalent extended transformation.

The results show that, compared with the SLP method implemented by LLVM (including the LSLP and SN-SLP methods), the average performance of the SLP-E method in the kernel test and the overall test is improved by 43.9% and 3.8%, respectively.
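The idea of extending a shorter statement so that it becomes isomorphic to a longer one can be illustrated with a minimal sketch. The C example below is not taken from the paper: it assumes that the extension pads the statement lacking a multiplication with the identity operand 1, and the names `scalar_original`, `extended_isomorphic`, and `forms_agree` are hypothetical. Integer arithmetic is used so that the padded form is exactly equivalent to the original.

```c
#include <assert.h>
#include <stddef.h>

#define N 4

/* Original form: statements 0 and 2 have an add and a multiply,
 * statements 1 and 3 have only an add, so the group is nonisomorphic. */
void scalar_original(int *a, const int *b, const int *c, const int *d) {
    assert(a && b && c && d);
    a[0] = b[0] + c[0] * d[0];
    a[1] = b[1] + c[1];
    a[2] = b[2] + c[2] * d[2];
    a[3] = b[3] + c[3];
}

/* Extended form (assumed padding rule): every statement now has the
 * shape b + c * m, where m is d[i] for the long statements and the
 * multiplicative identity 1 for the short ones. All four statements
 * share one shape, so a vectorizer could in principle pack them. */
void extended_isomorphic(int *a, const int *b, const int *c, const int *d) {
    assert(a && b && c && d);
    const int m[N] = { d[0], 1, d[2], 1 };
    for (size_t i = 0; i < N; ++i) {
        a[i] = b[i] + c[i] * m[i];
    }
}

/* Returns 1 when both forms compute the same results, 0 otherwise. */
int forms_agree(const int *b, const int *c, const int *d) {
    int x[N], y[N];
    scalar_original(x, b, c, d);
    extended_isomorphic(y, b, c, d);
    for (size_t i = 0; i < N; ++i) {
        if (x[i] != y[i]) {
            return 0;
        }
    }
    return 1;
}
```

Calling `forms_agree` with three four-element input arrays checks that the padded statements compute the same values as the originals. Whether a given compiler actually vectorizes the extended loop depends on the target and on its cost model.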

Motivation
The Framework of SLP-E
The SLP-E Method
Basic Conception
Common Subgraph Method
Optimization Object Recognition for Equivalent Extended Transform
Implementation of Equivalent Extended Transformation
Evaluation
Kernel Test
Overall Test
Conclusions and Future Works
