A Comprehensive Analysis of User Job Data on a Petascale Supercomputer Dedicated to CFD

Wenxiang Yang, Zhigong Yang, Yueqing Wang, Cheng Chen, Yongguo Zhou, Fang Wang

doi:10.1109/iccc47050.2019.9064094

Abstract

High performance computing (HPC) systems play a crucial role in running large-scale scientific applications, and improving their efficiency is imperative. This paper aims to comprehensively understand job characteristics and the factors that affect system efficiency and performance, laying a solid foundation for proposing and evaluating job scheduling and resource management methods. To achieve this goal, we collect two years of job data from a petascale HPC system dedicated to computational fluid dynamics (CFD) applications. Furthermore, we conduct a detailed analysis of failed jobs and waiting time based on this dataset. Our analysis reveals several important characteristics of submitted jobs, which can not only help system owners understand the state of CFD applications on the system, but also provide guidance and ideas for optimizing job scheduling and resource management algorithms.

Full Text