Abstract

Hierarchical Agglomerative Clustering (HAC) algorithms are used in many applications where clusters have a hierarchical relationship. Their parallelization is challenging because every agglomeration step depends on all previous agglomerations. Although a few parallel algorithms have been proposed for the SLINK HAC algorithm, only limited work has been done to parallelize other HAC algorithms. In this paper, we present a high-level abstraction that provides a uniform way to specify any HAC algorithm, together with a framework that automatically parallelizes such specifications for distributed-memory systems. The abstraction is supported by constructs in a high-level, domain-specific language, and a compiler translates algorithms expressed in this language into efficient parallel code targeting distributed systems. Our experiments on multiple HAC algorithms show that the runtime performance achieved is comparable to state-of-the-art manual parallel implementations on Spark and MPI while requiring only a fraction of the programming effort. At runtime, master-slave execution is used, and load is balanced among the slaves in an algorithm-agnostic way, in significant contrast to the custom load-balancing techniques seen in the literature on parallel HAC algorithms.
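To illustrate the sequential dependency the abstract refers to, the following is a minimal, hypothetical sketch of naive single-linkage HAC on 1-D points (it is not the paper's abstraction or framework). Each merge step must inspect the clusters produced by all earlier merges, which is what makes HAC hard to parallelize.

```python
# Naive single-linkage HAC sketch (illustrative only; the function name,
# data, and linkage choice are assumptions, not taken from the paper).
# Every iteration of the outer loop depends on the cluster set left by
# all previous iterations -- the core obstacle to parallelization.

def single_linkage_hac(points):
    """Return the sequence of merges as (cluster_a, cluster_b, distance)."""
    clusters = [[p] for p in points]   # start from singleton clusters
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters under single linkage:
        # distance = min over all cross-cluster point pairs.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((tuple(clusters[i]), tuple(clusters[j]), d))
        # Agglomerate: replace the two clusters with their union.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

merges = single_linkage_hac([1.0, 2.0, 9.0, 10.0, 20.0])
for m in merges:
    print(m)
```

Changing the `min` in the distance computation to, say, `max` (complete linkage) or a centroid distance yields a different HAC algorithm from the same skeleton, which is the kind of variation a uniform specification abstraction can capture.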
