A decentralized algorithm for distributed ensemble clustering

Antonello Rosato, Rosa Altilio, Massimo Panella

doi:10.1016/j.ins.2021.07.028

Abstract

In this paper, we consider the problem of distributed unsupervised learning, where the data to be clustered are partitioned over a set of agents having limited connectivity. To solve this problem, we propose a novel extension of the ensemble clustering procedure that makes it suitable for a fully distributed scenario. The proposed algorithm can deal with the case where each agent has a different local dataset. Additionally, to reduce the total amount of exchanged information, only the local cluster prototypes are forwarded among neighbors. Cluster similarity indexes are adopted to resolve conflicts among agents and to achieve a common structure at the end of the communication process. The experimental results demonstrate the feasibility of this approach, which achieves performance comparable to that of a fully centralized implementation, that is, one where the data are collected beforehand on a single clustering agent.
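As an informal illustration of the general idea of exchanging only cluster prototypes among neighbors, the following minimal sketch may help. It is not the algorithm proposed in the paper. It assumes plain k-means as the local clusterer, the Euclidean distance between prototypes as a stand-in for a cluster similarity index, a ring topology, and simple averaging of matched prototypes as the conflict-resolution rule. All function names (`local_kmeans`, `align`, `consensus_round`, `run_demo`) and the synthetic data are hypothetical.

```python
import math
import random
from itertools import permutations


def local_kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on one agent's local data; returns k prototypes."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return centers


def align(own, other):
    """Reorder `other` so that other[i] is matched to own[i].

    Assumption: similarity is measured by Euclidean distance, and the best
    one-to-one matching is found by brute force (fine only for small k).
    """
    k = len(own)
    best = min(
        permutations(range(k)),
        key=lambda perm: sum(math.dist(own[i], other[perm[i]]) for i in range(k)),
    )
    return [other[j] for j in best]


def consensus_round(prototypes, neighbors):
    """One synchronous exchange: each agent averages its prototypes with the
    aligned prototypes received from its neighbors. No raw data is sent."""
    updated = []
    for agent, own in enumerate(prototypes):
        aligned = [own] + [align(own, prototypes[n]) for n in neighbors[agent]]
        updated.append([
            tuple(sum(x) / len(aligned) for x in zip(*(s[i] for s in aligned)))
            for i in range(len(own))
        ])
    return updated


def run_demo(n_agents=4, rounds=20, seed=42):
    """Synthetic example: every agent holds a different sample of two blobs."""
    rng = random.Random(seed)
    blobs = [(0.0, 0.0), (10.0, 10.0)]
    prototypes = []
    for agent in range(n_agents):
        data = [
            (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in blobs
            for _ in range(30)
        ]
        prototypes.append(local_kmeans(data, k=2, seed=agent))
    # Ring topology: each agent only talks to its two adjacent agents.
    neighbors = {
        a: [(a - 1) % n_agents, (a + 1) % n_agents] for a in range(n_agents)
    }
    for _ in range(rounds):
        prototypes = consensus_round(prototypes, neighbors)
    return prototypes
```

Calling `run_demo()` returns one list of prototypes per agent. Under these assumptions, repeated local averaging drives the agents toward a shared set of prototypes, although each agent may list them in a different order.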

Full Text