Abstract

Data mining tasks are often composed of multiple stages that may be linked to each other to form various execution flows. Moreover, data mining tasks are often distributed, since they involve data and tools located across geographically distributed environments such as the Grid. It is therefore essential to use effective formalisms, such as workflows, to model data mining tasks that are both multi-staged and distributed. The goal of this work is to define a workflow formalism and to provide a visual software environment, named DIS3GNO, for designing and executing distributed data mining tasks over the Knowledge Grid, a service-oriented framework for distributed data mining on the Grid. DIS3GNO supports all the phases of a distributed data mining task, including composition, execution, and results visualization. The paper describes DIS3GNO, presents some relevant use cases implemented with it, and reports a performance evaluation of the system.

Keywords: Data Mining, Computing Node, Grid Resource, Execution Plan, Data Mining Task. These keywords were added by machine and not by the authors.

