Abstract
NEC SX-series vector supercomputers have provided outstanding memory bandwidth to meet the strong demand for efficient execution of memory-intensive scientific applications in practice. Inheriting this advantage, the 2nd generation SX-Aurora TSUBASA, Type 20B, provides an extremely high memory bandwidth of 1.53 TB/s per vector processor. Unlike conventional SX-series systems, SX-Aurora TSUBASA also offers various execution modes to execute a diversity of emerging scientific workloads efficiently. As a result, application developers need to understand their workloads and the performance characteristics of SX-Aurora TSUBASA, and select an optimization strategy that assumes an appropriate execution mode, to fully exploit the system performance. Therefore, this paper discusses workload characterization by performance bottleneck analysis to determine an optimization strategy for the 2nd generation SX-Aurora TSUBASA. The evaluation results with benchmarks and real-world applications demonstrate that the workload characterization approach can accurately find the bottleneck and characterize various workloads, thereby helping application developers decide on optimization strategies for individual workloads. Since SX-Aurora TSUBASA can be considered a typical example of the latest processors with high memory bandwidth, the workload characterization approach will also be helpful for other future processors.