Abstract

Distributed clustering is an important approach to large-scale data mining problems. Existing distributed clustering methods still suffer from two problems: a performance bottleneck on the master node and network congestion caused by global broadcasting. This paper proposes a decentralized clustering method based on density clustering and the content-addressable network (CAN) technique. Each cluster is formed using only several surrounding nodes, which gives the method excellent scalability and load balancing. In addition, a method is presented for optimizing how clustering results are gathered in different application scenarios. In our extensive experiments, the proposed approach outperforms the benchmark algorithms by a factor of three in efficiency and maintains a stable expanding ratio of about 0.6 on large-scale data sets.
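To make the idea of density clustering over surrounding nodes concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a simplified CAN-like layout in which the unit square is split into a fixed grid of zones, each owned by one node, and it uses a DBSCAN-style core-point test. The names `GRID`, `EPS`, `MIN_PTS`, `zone_of`, `neighbor_zones`, and `core_points` are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from math import dist

GRID = 4      # assumption: 4 x 4 zones over the unit square, one node per zone
EPS = 0.1     # assumption: DBSCAN-style neighborhood radius (must be <= 1 / GRID)
MIN_PTS = 3   # assumption: minimum neighbors (including the point itself) for a core point


def zone_of(point):
    """Map a 2-D point in the unit square to the grid zone (node) that owns it."""
    x, y = point
    return (min(int(x * GRID), GRID - 1), min(int(y * GRID), GRID - 1))


def neighbor_zones(zone):
    """Return the zone itself plus its adjacent zones (no wrap-around)."""
    zx, zy = zone
    return [(zx + dx, zy + dy)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if 0 <= zx + dx < GRID and 0 <= zy + dy < GRID]


def core_points(points):
    """Find core points using only data held by each zone and its neighbors."""
    store = defaultdict(list)
    for p in points:
        store[zone_of(p)].append(p)   # route each point to its owning zone

    cores = []
    for zone, local in store.items():
        # A node consults only adjacent zones; there is no global broadcast.
        nearby = [q for z in neighbor_zones(zone) for q in store.get(z, [])]
        for p in local:
            if sum(dist(p, q) <= EPS for q in nearby) >= MIN_PTS:
                cores.append(p)
    return cores
```

Because `EPS` is no larger than the zone width, every point within `EPS` of a given point lies in the same zone or an adjacent one, so the local test matches what a centralized scan would find. A real CAN splits and merges zones dynamically as nodes join and leave, which this fixed grid does not model.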
