Abstract

Distributed Data Mining (DDM) has been proposed as a means of analyzing distributed data: DDM discovers patterns and makes predictions based on multiple distributed data sources. However, DDM faces several problems in terms of autonomy, privacy, performance, and implementation. DDM requires homogeneity of environment, control, administration, and classification algorithm(s), and such requirements are too strict and inflexible for many applications. In this paper, we propose combining a Multi-Agent System (MAS) with DDM (MAS-DDM). MAS is a mechanism for creating goal-oriented autonomous agents within shared environments with communication and coordination facilities. We show that MAS-DDM is both desirable and beneficial. In MAS-DDM, agents can communicate their beliefs (calculated classifications) while keeping private and non-sharable data hidden, and other agents decide whether to use such beliefs in classifying instances and in adjusting their prior assumptions about each class of data. In MAS-DDM, we develop and use a modified Naive Bayesian algorithm because (1) Naive Bayesian has been shown to be the most used algorithm for dealing with uncertain data, and (2) we want to show that even if all agents in MAS-DDM use the same algorithm, MAS-DDM performs better than DDM approaches with non-communicating processes. Point (2) provides evidence that the exchange of information between agents significantly increases the accuracy of the classification task.

Highlights

  • In the last few years, we have witnessed a tremendous increase in distributed data, cloud computing, wide usability of micro-processor devices, and data that is generated or obtained at multiple data acquisition devices

  • We propose the employment of a Multi-Agent System (MAS) to be combined with Distributed Data Mining (DDM) (MAS-DDM) to solve the problems we mentioned above for data intensive applications [5]

  • Agents involved in the collaborative classification tasks promote information diversity and particularity based on their own data and share information with one another to enhance the results

Summary

INTRODUCTION

In the last few years, we have witnessed a tremendous increase in distributed data, cloud computing, wide usability of micro-processor devices (e.g., mobiles and sensors), and data that is generated or obtained at multiple data acquisition devices. MAS is appropriate for distributed problem solving because it allows the creation of autonomous, goal-oriented entities/agents that operate in shared environments with coordination and communication capabilities. This mechanism is beneficial for DDM as it allows us to combine and integrate different distributed clustering, prediction, and classification methods. MAS-based DDM allows agents to establish individual learned models and to control the transfer of their learned information or results to global and central agents, which produce the final output. This approach is beneficial in obtaining enhanced results by combining the results of multiple classifiers. The components of the proposed collaborative classification technique are discussed in the following subsections.
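
As a rough illustration of the idea above, the following Python sketch shows agents that each train a Naive Bayesian model on private data and share only their class beliefs, which a central step then combines. It is not the modified Naive Bayesian algorithm of the paper: the class name NaiveBayesAgent, the function combine_beliefs, the Laplace smoothing, the simple averaging rule, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesAgent:
    """Hypothetical agent that trains on its own private categorical data."""

    def __init__(self, rows, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)  # (class, feature index) -> value counts
        self.values = defaultdict(set)              # feature index -> values seen locally
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.feature_counts[(label, i)][value] += 1
                self.values[i].add(value)

    def belief(self, row, classes):
        """Return a normalized posterior over classes; only this leaves the agent."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for c in classes:
            # Laplace-smoothed prior (smoothing is an assumption of this sketch).
            score = math.log((self.class_counts[c] + 1) / (total + len(classes)))
            for i, value in enumerate(row):
                numerator = self.feature_counts[(c, i)][value] + 1
                # One extra slot in the denominator accounts for unseen values.
                denominator = self.class_counts[c] + len(self.values[i]) + 1
                score += math.log(numerator / denominator)
            log_scores[c] = score
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        weights = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(weights.values())
        return {c: w / norm for c, w in weights.items()}


def combine_beliefs(beliefs):
    """Average the shared posteriors (an assumed rule, not the paper's)."""
    classes = list(beliefs[0])
    return {c: sum(b[c] for b in beliefs) / len(beliefs) for c in classes}


classes = ["spam", "ham"]
# Toy private data sets; the raw rows are never exchanged between agents.
agent_a = NaiveBayesAgent(
    [("free", "yes"), ("free", "no"), ("hello", "no")], ["spam", "spam", "ham"]
)
agent_b = NaiveBayesAgent(
    [("hello", "no"), ("hello", "yes"), ("free", "yes")], ["ham", "ham", "spam"]
)

instance = ("free", "yes")
shared = [agent.belief(instance, classes) for agent in (agent_a, agent_b)]
combined = combine_beliefs(shared)
prediction = max(combined, key=combined.get)
print(prediction, combined)
```

Averaging is only one possible combination rule; in the approach described above, a receiving agent may also decide whether to use a shared belief at all, which this sketch does not model.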

MUTUAL COLLABORATIVE NAIVE BAYESIAN CALCULATION
COLLABORATIVE VALIDATION WITH DISTRIBUTED
EXPERIMENTS AND RESULTS
RESULTS
CONCLUSION