Abstract

High performance computing platforms can bring us great benefits on processing various ubiquitous computing tasks. The Sunway TaihuLight supercomputer is a novel high performance computing platform, which is ranked No. 1 among the TOP500 list in the world. In this paper, we focus on how to optimize the Portable and Extensible Toolkit for Scientific computation (PETSc), running on supercomputers. The main motivations for this study are twofold: (i) PETSc is widely and frequently used in many scientific research fields such as biology, fusion, artificial intelligence, geosciences, etc; and (ii) the current nuclear PETSc does not fully utilize the potential of the Sunway TaighLight system, especially its powerful processor, i.e., SW26010 processor. To achieve high efficiency of PETSc, the central idea of our optimizations is to fully promote the performance of time-consuming and frequently used computation components (e.g., matrix and vector modules). To this end, we propose (i) accelerating kernel codes with computing processing elements (CPEs), in which new compression format and targeted optimizations for vector and matrix operations are devised; and (ii) using more efficient memory access schemes. We have implemented our proposals and evaluated its effectiveness and efficiency through a real world application — Structural Finite Element Analysis (SFEA). We obtain 16~32 times speedup for a single SW26010 processor. As an extra finding, the results also show a high scalability on over 8,000 computing nodes, i.e., 532,500 cores.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call