Abstract

PurposeThe purpose of this paper is to provide an automatic parallelization toolkit for unstructured mesh-based computation. Among all kinds of mesh types, unstructured meshes are dominant in engineering simulation scenarios and play an essential role in scientific computations for their geometrical flexibility. However, the high-fidelity applications based on unstructured grids are still time-consuming, no matter for programming or running.Design/methodology/approachThis study develops an efficient UNstructured Acceleration Toolkit (UNAT), which provides friendly high-level programming interfaces and elaborates lower level implementation on the target hardware to get nearly hand-optimized performance. At the present state, two efficient strategies, a multi-level blocks method and a row-subsections method, are designed and implemented on Sunway architecture. Random memory access and write–write conflict issues of unstructured meshes have been handled by partitioning, coloring and other hardware-specific techniques. Moreover, a data-reuse mechanism is developed to increase the computational intensity and alleviate the memory bandwidth bottleneck.FindingsThe authors select sparse matrix-vector multiplication as a performance benchmark of UNAT across different data layouts and different matrix formats. Experimental results show that the speed-ups reach up to 26× compared to single management processing element, and the utilization ratio tests indicate the capability of achieving nearly hand-optimized performance. Finally, the authors adopt UNAT to accelerate a well-tuned unstructured solver and obtain speed-ups of 19× and 10× on average for main kernels and overall solver, respectively.Originality/valueThe authors design an unstructured mesh toolkit, UNAT, to link the hardware and numerical algorithm, and then, engineers can focus on the algorithms and solvers rather than the parallel implementation. For the many-core processor SW26010 of the fastest supercomputer in China, UNAT yields up to 26× speed-ups and achieves nearly hand-optimized performance.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.