Abstract

Supercomputers are advanced computing systems consisting of independent computational nodes interconnected through high‐speed communication networks. In the big data era, their computational capabilities play a pivotal role in scientific computing. Although supercomputers execute numerous advanced computational science and engineering tasks, many submitted jobs fail due to various factors, which makes users less efficient. These failures consume system resources and reduce the overall efficiency of the system. Previous research often couples job performance features with a single machine learning method to predict job failure. However, these features are costly to gather, which limits their real‐world applicability. To address this challenge, our study establishes correlations among job applications through extensive job log analysis. Leveraging these correlations, we propose a predictive framework based on job application sequence correlation (called FP‐JSC). The framework employs multiple machine learning models to offer holistic predictions and selects the most suitable model based on its learning effectiveness. Moreover, the framework reduces the cost of feature collection without adversely affecting job execution. We identify job applications using both job paths and job names; the job path is a novel feature derived from supplementary monitoring data. Empirical results underscore FP‐JSC's effectiveness: it accurately identifies over 89% of jobs, with 95% specificity and 89% sensitivity, outperforming the single prediction methods employed in related works.
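As a rough illustration of the evaluation terms used above, the following minimal sketch computes sensitivity and specificity for a failure predictor and picks the better of several candidate predictors. It is not the FP‐JSC implementation: the function names, the toy predictors, the `prev_failed` field, and the use of the mean of sensitivity and specificity as the selection score are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Job = Dict[str, int]  # hypothetical job record; 1 means "failed"


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity), treating label 1 (failure) as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # share of failed jobs caught
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # share of good jobs passed
    return sensitivity, specificity


def select_model(candidates: Dict[str, Callable[[Job], int]],
                 jobs: List[Job],
                 y_true: Sequence[int]) -> Tuple[str, float]:
    """Pick the candidate with the highest mean of sensitivity and specificity.

    The scoring rule is an assumption for this sketch, not the paper's criterion.
    """
    best_name, best_score = "", -1.0
    for name, predict in candidates.items():
        y_pred = [predict(job) for job in jobs]
        sens, spec = sensitivity_specificity(y_true, y_pred)
        score = (sens + spec) / 2
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Toy validation data: did the previous job of the same application fail?
jobs = [{"prev_failed": 1}, {"prev_failed": 0},
        {"prev_failed": 1}, {"prev_failed": 0}]
y_true = [1, 0, 0, 0]

candidates = {
    "always_ok": lambda job: 0,                     # never predicts failure
    "repeat_last": lambda job: job["prev_failed"],  # repeats the previous outcome
}

best_name, best_score = select_model(candidates, jobs, y_true)
print(best_name, round(best_score, 3))
```

In practice the candidates would be trained machine learning models evaluated on held‐out job logs; the toy predictors here only stand in for them.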
