Abstract

The Sunway TaihuLight supercomputer is the worlds first system with a peak performance greater than 100 PFlops. In this paper, we provide a detailed introduction to the TaihuLight system. In contrast with other existing heterogeneous supercomputers, which include both CPU processors and PCIe-connected many-core accelerators (NVIDIA GPU or Intel Xeon Phi), the computing power of TaihuLight is provided by a homegrown many-core SW26010 CPU that includes both the management processing elements (MPEs) and computing processing elements (CPEs) in one chip. With 260 processing elements in one CPU, a single SW26010 provides a peak performance of over three TFlops. To alleviate the memory bandwidth bottleneck in most applications, each CPE comes with a scratch pad memory, which serves as a user-controlled cache. To support the parallelization of programs on the new many-core architecture, in addition to the basic C/C++ and Fortran compilers, the system provides a customized Sunway OpenACC tool that supports the OpenACC 2.0 syntax. This paper also reports our preliminary efforts on developing and optimizing applications on the TaihuLight system, focusing on key application domains, such as earth system modeling, ocean surface wave modeling, atomistic simulation, and phase-field simulation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call