Abstract
Predicting performance of an application running on parallel computing platforms is increasingly becoming important due to the long development time of an application and the high resource management cost of parallel computing platforms. However, predicting overall performance is complex and must take into account both parallel calculation time and communication time. Difficulty in accurate performance modeling is compounded by myriad design choices along multiple dimensions, namely (i) process level parallelism, (ii) distribution of cores on multi-processor platforms, (iii) application related parameters, and (iv) characteristics of datasets. This research proposes a fast and accurate performance prediction approach to predict the calculation and communication time of an application running on a distributed computing platform. The major contribution of our prediction approach is that it can provide an accurate prediction of execution times for new datasets which have much larger sizes than the training datasets. Our approach consists of two models, i.e., a probabilistic self-learning model to predict calculation time and a simulation queuing model to predict network communication time. The combination of these two models provides data analysts a useful insight of optimal configuration of parallel resources (e.g., number of processes and number of cores) and application parameters setting.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have