Abstract
SIMD extensions provide an energy-efficient platform for mobile systems, but using SIMD instructions effectively to improve program performance remains a challenge. Superword-level parallelism (SLP) is an efficient way to exploit SIMD-oriented parallelism between statements within a basic block, and it is widely used in almost all mainstream compilers. SLP relies on finding isomorphic statements to pack together into vectors, so its ability to autovectorize nonisomorphic statements is limited. In this paper, we introduce SLP-E, a novel autovectorization method that automatically vectorizes code containing nonisomorphic statements: it translates the nonisomorphic statements into isomorphic ones through an equivalent extended transformation of expressions and then vectorizes the resulting isomorphic statements. SLP-E broadens the application scope and the benefits of SLP. We implement SLP-E in LLVM and compare it with prior approaches on a set of applications from the SPEC CPU 2017 benchmark that benefit from autovectorization. Experimental results show that SLP-E achieves a speedup of more than 43.9%, on average, over other similar methods.
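To make the idea of turning nonisomorphic statements into isomorphic ones concrete, the following is a minimal sketch, not the SLP-E implementation. It assumes integer arithmetic, expressions encoded as nested tuples, and a single padding rule (insert `x + 0` or `x * 1` where an operation is missing); the names `shape`, `extend`, and `evaluate` are hypothetical, and the abstract does not state which rewriting rules SLP-E actually applies.

```python
# Hypothetical encoding: ("+", lhs, rhs), ("*", lhs, rhs), a variable name
# (str), or an integer constant. Integer arithmetic is assumed so that
# padding with identity elements preserves the value exactly.
IDENTITY = {"+": 0, "*": 1}

def shape(e):
    """Operator skeleton of an expression; every leaf collapses to '_'."""
    if isinstance(e, tuple):
        op, lhs, rhs = e
        return (op, shape(lhs), shape(rhs))
    return "_"

def extend(e, template):
    """Pad e with identity operations until it has template's shape.
    Returns None when padding alone cannot make the shapes match."""
    if not isinstance(template, tuple):
        return None if isinstance(e, tuple) else e
    op, t_lhs, t_rhs = template
    ident = IDENTITY[op]
    candidates = []
    if isinstance(e, tuple) and e[0] == op:
        candidates.append((e[1], e[2]))   # operation already present
    candidates.append((e, ident))         # rewrite e as `e op identity`
    candidates.append((ident, e))         # rewrite e as `identity op e`
    for lhs, rhs in candidates:
        new_lhs, new_rhs = extend(lhs, t_lhs), extend(rhs, t_rhs)
        if new_lhs is not None and new_rhs is not None:
            return (op, new_lhs, new_rhs)
    return None

def evaluate(e, env):
    """Evaluate an expression given integer values for its variables."""
    if isinstance(e, tuple):
        op, lhs, rhs = e
        a, b = evaluate(lhs, env), evaluate(rhs, env)
        return a + b if op == "+" else a * b
    return env[e] if isinstance(e, str) else e

# a[0] = b0 + c0 * d0   and   a[1] = b1   are not isomorphic.
s0 = ("+", "b0", ("*", "c0", "d0"))
s1 = "b1"
s1_extended = extend(s1, s0)   # b1 + 0 * 1, same shape as s0
```

After the padding, both statements share the skeleton `+(_, *(_, _))`, so an SLP-style packer could place them in the same vector lanes. Whether the extra operations pay off is a cost-model question that this sketch does not address.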
Highlights
The number of mobile phones worldwide keeps growing
The results show that, compared with the SLP method implemented by LLVM, the average performance of the SLP-E method in the kernel test and the overall test is improved by 43.9% and 3.8%, respectively
The SPEC CPU 2017/2006/2000 and MediaBench2 [43] benchmarks, which include media and artificial intelligence tests widely used in mobile systems, were used to evaluate the proposed method in both a kernel test and an overall test, and the results were compared with the SLP method implemented by LLVM itself
Summary
The number of mobile phones worldwide keeps growing, and these devices require systems with high energy efficiency and high performance. The SLP-E method uses a new transformation (the equivalent extended transformation), combined with the structural characteristics of the statements' data dependency graphs, to convert nonisomorphic statements with different numbers of operations into isomorphic form, which extends the application scope of SLP and improves its vectorization capability. (2) To solve the problem of identifying the objects of the isomorphic transformation (problem 1), an analysis method based on the maximum common subgraph is proposed: using the similarity information of the nodes in the data dependency graphs corresponding to the statements, it obtains both the isomorphic part of the graphs of multiple statements and the part in which they differ. This lays the foundation for determining the isomorphic transformation objects and the positions at which the equivalent extended transformation is applied. The results show that, compared with the SLP method implemented by LLVM (including the LSLP and SN-SLP methods), the average performance of the SLP-E method in the kernel test and the overall test is improved by 43.9% and 3.8%, respectively.
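The split into an isomorphic part and a differing part can be illustrated with a small sketch. It is a deliberate simplification and not the paper's algorithm: instead of a general maximum common subgraph search over data dependency graphs, it walks two expression trees from the root and records where they stop matching. The tuple encoding and the name `common_part` are assumptions made for illustration.

```python
# Hypothetical encoding: ("+", lhs, rhs) or ("*", lhs, rhs) for operations,
# and a plain string for a leaf operand.

def common_part(a, b, path=()):
    """Return (common, diffs) for two expression trees.

    common is the shared top-down skeleton, with '_' for matching leaves
    and '?' where the trees disagree. diffs lists (path, sub_a, sub_b)
    for each disagreement; path uses 0 for the left child and 1 for the
    right child.
    """
    a_op = a[0] if isinstance(a, tuple) else None
    b_op = b[0] if isinstance(b, tuple) else None
    if a_op is None and b_op is None:
        return "_", []                     # two leaves match
    if a_op == b_op:
        left, left_diffs = common_part(a[1], b[1], path + (0,))
        right, right_diffs = common_part(a[2], b[2], path + (1,))
        return (a_op, left, right), left_diffs + right_diffs
    return "?", [(path, a, b)]             # structures diverge here

# a[0] = b0 + c0 * d0   versus   a[1] = b1 + c1
stmt0 = ("+", "b0", ("*", "c0", "d0"))
stmt1 = ("+", "b1", "c1")
common, diffs = common_part(stmt0, stmt1)
```

Here `common` is `("+", "_", "?")` and `diffs` holds a single entry at path `(1,)`, which marks the right operand of the addition as the position where an extending transformation would have to be applied to make the two statements isomorphic.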