Abstract

Distributed Data Mining (DDM) deals with the mining of data within a distributed framework, such as a local area or wide area network. One strong case for DDM systems is the need to mine for patterns in very large databases. This requires partitioning the databases into smaller sets that can be mined locally on distributed hosts. Distributing the data incurs communication costs, because the results from processing the local databases must be combined. This paper considers the development of a DDM system on a cluster. Specifically, we address the problem of data partitioning for data mining. We present a prototype DDM system that uses a data partitioning mechanism based on Bayesian mixture modeling. Results from a comparison with standard techniques show plausible support for our system and its applicability.
