The high computational cost of elastoplastic analysis is often handled by using high-performance parallel computers. However, the presence of both elastic and plastic states leads to branching, which prevents the realization of true parallel performance. The computational efficiency of an elastoplastic analysis using the finite element method is largely determined by the performance of the repeated solution of the linearized system of equations. In this paper, we propose GPU-based matrix-free strategies to compute the sparse matrix–vector multiplication (SpMV) in the Conjugate Gradient (CG) iterative solver, in order to accelerate the solution of the linear system of equations. Matrix-free solvers never assemble the large sparse global tangent matrix; they perform the computation directly with small dense elemental matrices, which reduces the storage requirement and avoids the use of problematic sparse storage formats. A uniform treatment of elements in elastic and plastic regions is achieved by the proposed single-kernel strategy, which prevents branching, avoids redundant computation, and provides efficient memory access. In addition, we propose node-based and degrees-of-freedom (DOF)-based parallel strategies for an effective implementation of matrix-free SpMV on a GPU. The proposed strategies use a single elemental tangent matrix for all elements in the elastic region and individual tangent matrices for elements in the plastic region. Computational experiments on three large-scale benchmark examples of elastoplasticity reveal that the performance of the node-based and DOF-based parallelization strategies depends on the amount of plasticity in the body. At low plasticity levels, the node-based strategy performs best, achieving a 3.2× speedup over an existing GPU-based matrix-free SpMV strategy from the literature. For moderate to high amounts of plasticity, the DOF-based strategy outperforms every other strategy and obtains speedups of up to 3.5× over the existing SpMV strategy.
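The DOF-based matrix-free product described above can be illustrated with a minimal serial sketch. It is not the paper's GPU kernel: the function names, the data layout (`connectivity`, `k_elastic`, `k_plastic`), and the 1D two-node test mesh are assumptions made for illustration only. Each pass of the outer loop corresponds to the work one GPU thread would do for one DOF; because every output entry is written by exactly one DOF, a parallel version of this gather pattern would not need atomic updates.

```python
def build_dof_to_elements(connectivity, n_dofs):
    """For each global DOF, list the (element, local row) pairs that touch it."""
    adjacency = [[] for _ in range(n_dofs)]
    for e, dofs in enumerate(connectivity):
        for local_row, d in enumerate(dofs):
            adjacency[d].append((e, local_row))
    return adjacency


def spmv_dof_based(x, connectivity, adjacency, k_elastic, k_plastic):
    """Compute y = K x without assembling the global matrix K.

    k_elastic : one dense element matrix shared by all elastic elements.
    k_plastic : dict mapping a plastic element index to its own dense matrix.
    (Both containers are hypothetical stand-ins for the tangent matrices.)
    """
    y = [0.0] * len(x)
    for d, entries in enumerate(adjacency):  # one thread per DOF on a GPU
        acc = 0.0
        for e, local_row in entries:
            # Same code path for elastic and plastic elements:
            # only the matrix that is looked up differs.
            ke = k_plastic.get(e, k_elastic)
            dofs = connectivity[e]
            for local_col, g in enumerate(dofs):
                acc += ke[local_row][local_col] * x[g]
        y[d] = acc
    return y


# Hypothetical 1D chain: 3 two-node elements, 4 DOFs, element 1 is plastic.
connectivity = [(0, 1), (1, 2), (2, 3)]
k_elastic = [[1.0, -1.0], [-1.0, 1.0]]
k_plastic = {1: [[0.5, -0.5], [-0.5, 0.5]]}
adjacency = build_dof_to_elements(connectivity, 4)
y = spmv_dof_based([0.0, 1.0, 2.0, 3.0], connectivity, adjacency,
                   k_elastic, k_plastic)
```

In this sketch the shared elastic matrix is stored once, and only the plastic elements carry their own matrices, which mirrors the storage argument made in the abstract.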