A Performance Prediction Framework for Grid-Based Data Mining Applications

Leonid Glimcher, Gagan Agrawal

doi:10.1109/ipdps.2007.370274

Abstract

For a grid middleware to perform resource allocation, it needs prediction models that can determine how long an application will take to complete on a particular platform or configuration. In this paper, we take the approach that by focusing on the characteristics of the class of applications a middleware is suited for, we can develop simple performance models that are very accurate in practice. The particular middleware we consider is FREERIDE-G (FRamework for Rapid Implementation of Datamining Engines in Grid), which supports a high-level interface for developing data mining and scientific data processing applications that involve data stored in remote repositories. The FREERIDE-G system needs detailed performance models for resource selection, i.e., choosing the computing nodes and a replica of the dataset. This paper presents and evaluates such a performance model. By exploiting the fact that the processing structure of data mining and scientific data analysis applications developed on FREERIDE-G involves generalized reductions, we are able to develop an accurate performance prediction model. We have evaluated our model using implementations of three well-known data mining algorithms and two scientific data analysis applications developed using FREERIDE-G. Results from these five applications show that we are able to accurately predict execution times as we vary the number of storage nodes, the number of nodes available for computation, the dataset size, the network bandwidth, and the underlying hardware.
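
The generalized-reduction structure mentioned in the abstract can be illustrated with a minimal sketch. This is not FREERIDE-G's API: the function names (`local_reduction`, `global_combine`, `kmeans_step`), the one-dimensional data, and the choice of a k-means update as the example are all assumptions made for illustration. The idea shown is only that each node folds its data chunks into a small reduction object, and that the partial objects are then merged with an associative, commutative operation.

```python
from functools import reduce


def local_reduction(chunk, centers):
    """Fold one data chunk into a reduction object: a [sum, count] pair per center."""
    acc = [[0.0, 0] for _ in centers]
    for x in chunk:
        # Assign the point to its nearest center (1-D data, for brevity).
        i = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
        acc[i][0] += x
        acc[i][1] += 1
    return acc


def global_combine(a, b):
    """Merge two reduction objects element-wise; the order of merging does not matter."""
    return [[sa + sb, ca + cb] for (sa, ca), (sb, cb) in zip(a, b)]


def kmeans_step(chunks, centers):
    """One k-means update expressed as local reductions followed by a global combine."""
    # Hypothetical setup: each chunk stands in for the data held by one compute node.
    partials = [local_reduction(chunk, centers) for chunk in chunks]
    merged = reduce(global_combine, partials)
    # Keep the old center when no point was assigned to it.
    return [s / n if n else c for (s, n), c in zip(merged, centers)]
```

Because the per-chunk work is independent and the merge is order-insensitive, the cost of such a computation is plausibly dominated by how much data each node processes and how large the reduction object is, which is the kind of regularity a simple prediction model could exploit.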

Similar Papers

Middleware for data mining applications on clusters and grids
Leonid Glimcher ... Gagan Agrawal
Journal of Parallel and Distributed Computing | VOL. 68
10 Jul 2007

FREERIDE-G: Supporting Applications that Mine Remote
L Glimcher ... G Agrawal
14 Aug 2006

A translation system for enabling data mining applications on GPUs
Wenjing Ma ... Gagan Agrawal
08 Jun 2009

Evaluation of a process for the Experimental Development of Data Mining, AI and Data Science applications aligned with the Strategic Planning
Methanias Colaço Júnior ... Fátima De L S Nunes
Journal of Information Systems and Technology Management | VOL. 19
01 Nov 2022
