Abstract

Vectorization is increasingly important to achieve high performance on modern hardware with SIMD instructions. Assembly of matrices and vectors in the finite element method, which is characterized by iterating a local assembly kernel over unstructured meshes, poses difficulties to effective vectorization. Maintaining a user-friendly high-level interface with a suitable degree of abstraction while generating efficient, vectorized code for the finite element method is a challenge for numerical software systems and libraries. In this work, we study cross-element vectorization in the finite element framework Firedrake via code transformation and demonstrate the efficacy of such an approach by evaluating a wide range of matrix-free operators spanning different polynomial degrees and discretizations on two recent CPUs using three mainstream compilers. Our experiments show that our approaches for cross-element vectorization achieve 30% of theoretical peak performance for many examples of practical significance, and exceed 50% for cases with high arithmetic intensities, with consistent speed-up over (intra-element) vectorization restricted to the local assembly kernels.

Highlights

  • The realization of efficient solution procedures for partial differential equations (PDEs) using finite element methods on modern computer systems requires the combination of diverse skills across mathematics, programming languages and high-performance computing

  • Due to the large number of combinations of experimental parameters, we only report an illustrative portion of the results here, with the entire suite of experiments made available on the interactive online repository CodeOcean (Sun 2019a)

  • 4.2.1 Compiler comparison and vector extensions When vectorizing with OpenMP SIMD directives, Intel C compiler (ICC) gives the best performance for almost all test cases, followed by Clang, while GCC is significantly less competitive

Read more

Summary

Introduction

The realization of efficient solution procedures for partial differential equations (PDEs) using finite element methods on modern computer systems requires the combination of diverse skills across mathematics, programming languages and high-performance computing. Automated code generation is one of the promising approaches to manage this complexity. It has been increasingly adopted in software systems and libraries. Recent successful examples include FEniCS (Logg et al 2012), Firedrake (Rathgeber et al 2016) and FreeFem++ (Hecht 2012). These software packages provide users with high-level interfaces for high productivity while relying on optimizations and transformations in the code generation pipeline to generate efficient low-level code. The challenge, as in all compilers, is to use appropriate abstraction layers that enable optimizations to be applied that achieve high performance on a broad set of programs and machines

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call