Abstract

When dealing with a classification problem on mixed data, most conventional supervised learning algorithms cannot perform well because they are designed for numerical attributes. However, some clustering algorithms, such as the k-prototypes algorithm, show potential in clustering mixed data. Therefore, the current study develops a novel clustering-based classification algorithm for mixed data that combines the merits of classification and clustering. The proposed algorithm employs a sine-cosine algorithm (SCA) to find attribute weights and initial centroids for a k-prototypes algorithm. The objective function of the algorithm is formulated as a summed purity. To improve the performance of SCA, a mutation strategy comprising Gaussian mutation, Cauchy mutation, Levy mutation, and single-point mutation is embedded into the original SCA. The proposed algorithm is compared with several metaheuristic-based classification algorithms and existing classification algorithms. On 10 data sets from UCI, the experimental results indicate that the proposed algorithm achieves superior classification performance in terms of accuracy and Cohen's Kappa. In addition, the mutation mechanism improves the performance of SCA.
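The two ingredients named above, an attribute-weighted k-prototypes dissimilarity and a purity-based objective, can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption: the function names `mixed_distance` and `summed_purity` are hypothetical, the distance is taken to be weighted squared difference on numeric attributes plus weighted mismatch on categorical ones, and "summed purity" is taken to mean the sum over clusters of the majority-class fraction. The SCA search and the mutation operators are not shown.

```python
from collections import Counter


def mixed_distance(x_num, x_cat, c_num, c_cat, w_num, w_cat):
    """Weighted k-prototypes-style dissimilarity (assumed form, not the paper's).

    Numeric attributes contribute weighted squared differences;
    categorical attributes contribute their weight on a mismatch.
    """
    numeric_part = sum(w * (a - b) ** 2 for w, a, b in zip(w_num, x_num, c_num))
    categorical_part = sum(w * (a != b) for w, a, b in zip(w_cat, x_cat, c_cat))
    return numeric_part + categorical_part


def summed_purity(cluster_ids, labels):
    """Sum over clusters of the majority-class fraction (assumed definition)."""
    members = {}
    for cluster, label in zip(cluster_ids, labels):
        members.setdefault(cluster, []).append(label)
    return sum(
        Counter(ys).most_common(1)[0][1] / len(ys) for ys in members.values()
    )


if __name__ == "__main__":
    # One numeric attribute (weight 0.5) and one categorical attribute (weight 2.0).
    print(mixed_distance([1.0], ["x"], [3.0], ["y"], [0.5], [2.0]))
    # Cluster 0 holds labels a, a, b; cluster 1 holds labels b, b.
    print(summed_purity([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"]))
```

Under these assumptions, a metaheuristic such as SCA would treat the attribute weights and initial centroids as the decision variables and use the purity score of the resulting clustering as the fitness to maximize.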
