On parallelization of the loop over elements in FEAP

P Jarzebski, R L Taylor, K Wisniewski

doi:10.1007/s00466-015-1156-z

Abstract

In this paper, we consider parallelization of the loop over elements using OpenMP in FEAP (Taylor, 2014), a research FE code widely used at universities. Even for the serial version of FEAP (a cluster version also exists), such parallelization is a non-trivial task because the existing architecture of the code complicates efficient parallelization. First, we compare the serial version of FEAP with the parallel code Warp3D (Dodds et al., 2014) in terms of run time and memory usage. We find that Warp3D is much faster but uses more memory than FEAP. An analysis of Warp3D helps us to devise our method of parallelizing the loop over elements. Next, we describe several changes to FEAP that were necessary to parallelize the loop over elements using OpenMP. In particular, the subroutine that assembles elemental matrices is identified as crucial to good performance, and several OpenMP directives for mutual-exclusion synchronization are implemented and tested. Finally, we demonstrate the performance of the parallelized FEAP, designated ompFEAP, on numerical examples involving the 3D and shell elements of FEAP as well as user elements. We conclude that ompFEAP, using the ATOMIC directive to synchronize the assembly, provides very good speedup and efficiency.
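To make the role of the ATOMIC directive in the assembly step concrete, the following is a minimal sketch in C. It is not FEAP or ompFEAP code (FEAP is written in Fortran), and the function name `assemble_bars`, the dense storage of the global matrix, and the chain of 2-node bar elements are all assumptions chosen only for illustration. Each thread computes its own elemental matrix privately, and only the additions into the shared global matrix are protected, since neighbouring elements write to the same entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration (not FEAP code): assemble a chain of nel
   2-node bar elements, each with stiffness k, into a dense row-major
   global matrix K of size (nel+1) x (nel+1). */
void assemble_bars(size_t nel, double k, double *K)
{
    assert(K != NULL);
    const size_t nn = nel + 1;              /* number of nodes (one DOF each) */

    for (size_t i = 0; i < nn * nn; ++i)    /* clear the global matrix */
        K[i] = 0.0;

    /* Parallel loop over elements. Compiled without OpenMP support,
       the pragmas are ignored and the loop runs serially. */
    #pragma omp parallel for
    for (long e = 0; e < (long)nel; ++e) {
        /* Elemental matrix and DOF map are private to each iteration. */
        const double ke[2][2] = { { k, -k }, { -k, k } };
        const size_t dof[2] = { (size_t)e, (size_t)e + 1 };

        for (int a = 0; a < 2; ++a) {
            for (int b = 0; b < 2; ++b) {
                /* Adjacent elements share a node, so two threads may
                   update the same global entry; ATOMIC makes each
                   single addition indivisible. */
                #pragma omp atomic
                K[dof[a] * nn + dof[b]] += ke[a][b];
            }
        }
    }
}
```

With three elements of stiffness 2.0, the interior diagonal entries receive contributions from two elements and sum to 4.0, while the end entries stay at 2.0. ATOMIC protects one scalar update at a time, which is typically cheaper than wrapping the whole assembly in a CRITICAL section that serializes all threads.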

Full Text