Predicting Class Label Using Clustering-Classification Technique: A Comparative Study

Aseel Alshaibanee

doi:10.31642/jokmc/2018/100101

Abstract

Among the techniques, algorithms, and applications of data mining, predicting the class label of unlabeled objects (objects whose class label is undefined) is a central task. The most common approach is to use classification techniques such as decision trees (DT), Bayes classifiers, support vector machines (SVM), k-nearest neighbors (KNN), and others, which represent what is known as supervised learning. However, in many cases neither target class labels nor class boundaries are available to perform the prediction, so a newer approach, the clustering-classification technique, is used instead. This paper surveys the most prominent studies conducted in this field and discusses their experiments, the algorithms they used, the types and sizes of the data they utilized, and the results they reported. According to those results, applying clustering before classification improved classification accuracy and reduced experiment execution time. Some of the researchers found the Cluster Classifier to be a suitable approach for summarizing data: it achieves a summarization rate of over 50%, which represents a considerable reduction in the size of the test datasets. The findings of the surveyed studies also indicate that, in addition to feature selection and feature extraction, data preprocessing (handling missing data and applying effective outlier detection techniques) enhanced classifier performance and accuracy while reducing classification error.
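The general idea of clustering before classification can be illustrated with a minimal sketch. It is not taken from any of the surveyed studies: the choice of k-means for the clustering step, a nearest-centroid rule for the classification step, the toy data, and the helper names `nearest` and `kmeans` are all assumptions made for illustration.

```python
import math

def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iterations=20):
    """Plain k-means. The first k points seed the centroids (an assumption,
    chosen only to keep the example deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(column) / len(g) for column in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

# Hypothetical unlabeled objects: no class labels are available up front.
unlabeled = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]

# Step 1 (clustering): discover groups and treat the cluster ids as class labels.
centroids = kmeans(unlabeled, k=2)
cluster_labels = [nearest(p, centroids) for p in unlabeled]

# Step 2 (classification): predict the label of a new object from the
# cluster-derived classes, here with a simple nearest-centroid rule.
predicted = nearest((4.7, 5.3), centroids)
```

In this sketch the six objects are summarized by two centroids, which hints at how a cluster-based classifier can reduce the amount of data a later classification step has to process. In practice, any of the supervised classifiers named above could be trained on the cluster-derived labels in place of the nearest-centroid rule.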
