Identifying Degree and Sources of Non-Determinism in MPI Applications Via Graph Kernels

Dylan Chapp, Sanjukta Bhowmick, Nigel Tan, Michela Taufer

doi:10.1109/tpds.2021.3081530

Abstract

As the scientific community prepares to deploy an increasingly complex and diverse set of applications on exascale platforms, the need to assess the reproducibility of simulations and identify the root causes of reproducibility failures increases correspondingly. One of the greatest challenges to reproducibility at exascale is the inherent non-determinism at the level of inter-process communication. The use of non-deterministic communication constructs is necessary to boost performance, but communication non-determinism can also hamper software correctness and result reproducibility. To address this challenge, we propose a software framework for identifying the percentage and sources of communication non-determinism. We model parallel executions as directed graphs and leverage graph kernels to characterize run-to-run variations in inter-process communication. We demonstrate the effectiveness of graph kernel similarity as a proxy for non-determinism by showing that these kernels can quantify the type and degree of non-determinism present in communication patterns. To demonstrate our framework's ability to link runtime non-determinism to its root sources and quantify it, we apply it to an adaptive mesh refinement application, where our framework automatically quantifies the impact of function calls on non-determinism, and to a Monte Carlo application, where our framework automatically quantifies the impact of parameter configurations on non-determinism.
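The idea of comparing executions through graph kernels can be illustrated with a minimal sketch. The abstract does not say which kernel or graph encoding the framework uses, so everything below is an assumption made for illustration: the kernel is a simple Weisfeiler-Lehman style subtree kernel, events are labeled by hypothetical strings such as `send@1` and `recv@0`, and the names `wl_features`, `kernel`, `normalized_similarity`, `run_a`, and `run_b` are invented here. The two toy runs differ only in which sender matches the first of two receives on rank 0, which is the kind of variation a wildcard receive could produce.

```python
from collections import Counter
from math import sqrt


def wl_features(nodes, edges, iterations=2):
    """Count Weisfeiler-Lehman style labels of a directed graph.

    nodes: dict mapping node id -> initial label (hypothetical event labels)
    edges: list of (src, dst) pairs (message edges and program-order edges)
    """
    preds = {n: [] for n in nodes}
    for src, dst in edges:
        preds[dst].append(src)

    labels = dict(nodes)
    features = Counter(labels.values())
    for _ in range(iterations):
        # Each node's new label combines its own label with the sorted
        # labels of its predecessors, so it encodes incoming structure.
        labels = {
            n: labels[n] + "(" + ",".join(sorted(labels[p] for p in preds[n])) + ")"
            for n in nodes
        }
        features.update(labels.values())
    return features


def kernel(fa, fb):
    """Inner product of two label-count feature vectors."""
    return sum(fa[k] * fb[k] for k in fa.keys() & fb.keys())


def normalized_similarity(ga, gb, iterations=2):
    """Kernel value scaled to [0, 1]; 1.0 means identical label counts."""
    fa = wl_features(*ga, iterations)
    fb = wl_features(*gb, iterations)
    return kernel(fa, fb) / sqrt(kernel(fa, fa) * kernel(fb, fb))


# Assumed toy event graphs: ranks 1 and 2 each send once to rank 0,
# which posts two receives in program order (r_a before r_b).
nodes = {"s1": "send@1", "s2": "send@2", "r_a": "recv@0", "r_b": "recv@0"}
run_a = (nodes, [("s1", "r_a"), ("s2", "r_b"), ("r_a", "r_b")])
run_b = (nodes, [("s2", "r_a"), ("s1", "r_b"), ("r_a", "r_b")])

print("run A vs run A:", normalized_similarity(run_a, run_a))
print("run A vs run B:", normalized_similarity(run_a, run_b))
```

In this sketch, a similarity of 1.0 between two runs means their labeled communication structure is indistinguishable to the kernel, while a lower value signals that message matching differed between runs. A real event graph would carry far more events and richer labels than this toy example.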

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: May 18, 2021
Citations: 3
License type: CC BY-NC-ND 4.0

Similar Papers

Guest editors' introduction to special section on asynchronous real-time distributed systems
E.D Jensen ... B Ravindran
IEEE Transactions on Computers | VOL. 51
01 Aug 2002

Approximation of Graph Kernel Similarities for Chemical Graphs by Kernel Principal Component Analysis
Georg Hinselmann ... Nikolas Fechner
01 Jan 2010

Marrying Graph Kernel with Deep Neural Network: A Case Study for Network Anomaly Detection
Yepeng Yao ... Chen Zhang
01 Jan 2019

Graph kernels combined with the neural network on protein classification.
Jiang Qiangrong ... Qiu Guang
Journal of bioinformatics and computational biology | VOL. 17
01 Oct 2019
