A H-K Clustering Algorithm For High Dimensional Data Using Ensemble Learning

Bharat Tidke, Rashmi Paithankar

doi:10.5121/ijitcs.2014.4601

Abstract

Advances on traditional clustering algorithms address problems such as the curse of dimensionality and the sparsity of data with many attributes. The traditional H-K clustering algorithm removes the randomness and a priori choice of the initial centers in the K-means clustering algorithm. When applied to high dimensional data, however, it runs into the dimensional disaster problem because of its high computational complexity. Advanced clustering algorithms such as subspace and ensemble clustering improve performance on high dimensional datasets from different aspects and to different extents, but each improves performance from a single perspective only. The objective of the proposed model is to improve the performance of traditional H-K clustering and to overcome its limitations on high dimensional data, namely high computational complexity and poor accuracy, by combining three clustering approaches: subspace clustering, ensemble clustering, and H-K clustering.
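The idea behind H-K clustering, as the abstract describes it, is that a hierarchical pass supplies the initial centers that K-means would otherwise pick at random. The following is a minimal sketch of that idea under stated assumptions, not the authors' implementation: the function names (`hierarchical_seeds`, `kmeans`, `hk_cluster`), the use of centroid linkage for the agglomerative step, and the toy data are all hypothetical choices made for illustration. The cubic-time merge loop also hints at why the computational cost grows quickly on larger or higher dimensional data.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def hierarchical_seeds(points, k):
    """Agglomerative step (assumed centroid linkage): repeatedly merge the two
    clusters whose centroids are closest until k clusters remain, then return
    the k centroids as seeds for K-means."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return [centroid(c) for c in clusters]


def kmeans(points, centers, max_iter=100):
    """Lloyd's iterations starting from the given (non-random) centers."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        labels = [min(range(len(centers)), key=lambda c: dist(p, centers[c]))
                  for p in points]
        # Update step: recompute each center; keep the old one if it is empty.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def hk_cluster(points, k):
    """Hierarchical seeding followed by K-means refinement."""
    return kmeans(points, hierarchical_seeds(points, k))


if __name__ == "__main__":
    toy = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    labels, centers = hk_cluster(toy, 2)
    print(labels)  # [0, 0, 0, 1, 1, 1]
```

Because the seeds come from a deterministic merge procedure, repeated runs on the same data give the same partition, which is the property the abstract attributes to H-K clustering.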

Highlights

As an important technique in data mining, clustering analysis groups observations with similar properties. It can be regarded as unsupervised classification [1] and helps extract relevant information from high dimensional data.
The proposed model combines three techniques (subspace clustering, H-K clustering, and ensemble clustering) and their advantages to improve clustering results on high dimensional data while overcoming the limitations of the H-K clustering algorithm on such data.
Much work has been done in the area of clustering. Based on the research to date, approaches to clustering high dimensional datasets fall into four general categories: 1- Dimension reduction, 2- Subspace clustering, 3- Ensemble clustering and 4- H-K clustering [1] [11] [14].

Summary

INTRODUCTION

As an important technique in data mining, clustering analysis groups observations with similar properties. It can be regarded as unsupervised classification [1] and helps extract relevant information from high dimensional data. Ensemble clustering, ‘the knowledge reuse framework’ first proposed by Strehl and Ghosh [11], relies on two mechanisms: a generation mechanism, which produces clusterings using different criteria, and a consensus function, which chooses the most appropriate solution from the set of solutions. It overcomes the challenges created by high dimensional data and performs well on real-world datasets in applications such as Internet applications and medical diagnostics [2,3,12,13,19,20]. The proposed model combines three techniques (subspace clustering, H-K clustering, and ensemble clustering) and their advantages to improve clustering results on high dimensional data while overcoming the limitations of the H-K clustering algorithm on such data (high computational complexity and poor accuracy).
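To make the two ensemble mechanisms concrete, the sketch below takes the output of a generation step (several base partitions of the same objects) and applies one simple consensus function. It is an illustration under assumptions: the co-association majority vote used here is a common textbook choice and is not claimed to be the consensus function of Strehl and Ghosh [11] or of the proposed model. The function names `co_association` and `consensus`, the 0.5 threshold, and the example partitions are hypothetical.

```python
def co_association(partitions):
    """For every pair of objects, the fraction of base partitions that put
    the two objects in the same cluster."""
    n = len(partitions[0])
    m = len(partitions)
    return [[sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
            for i in range(n)]


def consensus(partitions, threshold=0.5):
    """Assumed consensus rule: link two objects when they co-occur in more
    than `threshold` of the partitions, then label the connected components
    of the resulting graph."""
    co = co_association(partitions)
    n = len(co)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


if __name__ == "__main__":
    # Three base partitions of five objects, e.g. from different subspaces.
    # Cluster ids are arbitrary per partition; only co-membership matters.
    base = [
        [0, 0, 1, 1, 2],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1],
    ]
    print(consensus(base))  # [0, 0, 1, 1, 1]
```

Working on co-membership rather than raw labels sidesteps the problem that cluster ids from different base clusterings do not correspond to one another.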

MOTIVATION

RELATED WORK

Dimension reduction

Subspace clustering

Ensemble Clustering

H-K clustering

Method

Findings

5. CONCLUSION

Journal: International Journal of Information Technology Convergence and Services
Publication Date: Dec 31, 2014
Citations: 11
License type: cc-by
