Spark2Fires: A New Parallel Approximate Subspace Clustering Algorithm

Bo Zhu, Alberto Mozo

doi:10.1007/978-3-319-44066-8_16

Abstract

Subspace clustering is an interesting research field that has been studied intensively over the last two decades. Its objective is to find all lower-dimensional clusters hidden in subspaces of high-dimensional data. Although most existing subspace clustering algorithms adopt heuristic pruning techniques to reduce the search space, their time complexity remains exponential in the highest dimensionality of the hidden subspace clusters. Even with the help of parallelism, these techniques require extremely high computation time in practice. In this paper we propose a novel subspace clustering technique that reduces the exponential time complexity to quadratic via approximation. We also provide a parallel implementation of the proposed algorithm on top of Apache Spark to further accelerate our approach on large data sets. Preliminary experimental results show that our algorithm performs much better, especially in terms of scalability with regard to the dimensionality of hidden clusters.
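
As a rough illustration of the exponential growth the abstract refers to: under the downward-closure assumption commonly used by bottom-up methods, a cluster hidden in a k-dimensional subspace is also visible in every non-empty subset of those k dimensions, so a search that builds candidates one dimension at a time may examine up to 2^k - 1 subspaces. The sketch below only counts and enumerates those subspaces. It is not the Spark2Fires algorithm, and both function names are hypothetical.

```python
from itertools import combinations
from math import comb


def count_subspace_projections(k: int) -> int:
    """Number of non-empty subsets of k dimensions (hypothetical helper).

    Equals 2**k - 1, summed here level by level to mirror a bottom-up search.
    """
    return sum(comb(k, size) for size in range(1, k + 1))


def enumerate_subspaces(dims):
    """Yield every non-empty subset of `dims`, lowest dimensionality first."""
    for size in range(1, len(dims) + 1):
        yield from combinations(dims, size)


if __name__ == "__main__":
    # Candidate count grows exponentially with the cluster dimensionality k.
    for k in (5, 10, 20):
        print(k, count_subspace_projections(k))
```

Even a 20-dimensional hidden cluster implies over a million lower-dimensional projections, which is why pruning alone may not keep the search tractable.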

Similar Papers

A Scalable Framework for Data-Driven Subspace Representation and Clustering
Eunwoo Kim ... Songhwai Oh
Pattern Recognition Letters | VOL. 125
01 Jul 2019

Fusion of evolvable genome structure and multi-objective optimization for subspace clustering
Dipanjyoti Paul ... Jimson Mathew
Pattern Recognition | VOL. 95
31 May 2019

Improved subspace clustering algorithm using multi-objective framework and subspace optimization
Dipanjyoti Paul ... Jimson Mathew
Expert Systems with Applications | VOL. 158
11 May 2020

Grouping points by shared subspaces for effective subspace clustering
Ye Zhu ... Mark J Carman
Pattern Recognition | VOL. 83
31 May 2018
