Abstract

Classification of purely numerical or purely categorical data is well developed. Real-world data, however, are frequently of mixed type, containing both numerical and categorical values, and classifying such data quickly and efficiently remains a critical yet challenging task. Existing classification models for mixed data usually treat mixed data processing and the subsequent classification as two independent phases, without considering their compatibility. By fusing mixed data processing into the classification algorithm, this paper proposes an extended version of RBF-ELM (Radial Basis Function-Extreme Learning Machine), called Mixed Data RBF-ELM (MD-RBF-ELM for short), which achieves direct, fast, and efficient classification of mixed data. Specifically, a distance metric for mixed data is first designed to calculate the distances between the input data and the RBF centers; these distances are then used to train the network structure and weights of MD-RBF-ELM, thereby fusing data processing with model learning. In addition, to alleviate the unstable performance of MD-RBF-ELM caused by randomly selecting the RBF centers, we propose an improved density peak clustering algorithm and use it to select the optimal RBF centers automatically and adaptively. Extensive experimental results on 34 data sets demonstrate that MD-RBF-ELM significantly enhances classification performance compared with seven state-of-the-art competitors (an F1-score increase of 2.37%, the best result on 14 of the 34 data sets, and an average rank of 2.4 out of 8).
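
To make the idea of feeding mixed-data distances directly into an RBF-ELM concrete, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that the abstract does not confirm: the mixed distance is taken to be squared Euclidean distance on numerical attributes plus a Hamming mismatch count on categorical attributes, the hidden units are Gaussian with a single shared width `sigma`, and the centers are drawn at random (the plain RBF-ELM baseline, not the improved density peak clustering). All function and parameter names (`mixed_distance`, `hidden_layer`, `train`, `predict`, `sigma`, `n_centers`) are hypothetical.

```python
import numpy as np

def mixed_distance(X_num, X_cat, C_num, C_cat):
    """Pairwise distances between samples and centers, shape (n, k).

    ASSUMPTION: squared Euclidean on numerical attributes plus a Hamming
    mismatch count on categorical attributes. The paper's actual metric
    is not specified in the abstract.
    """
    d_num = ((X_num[:, None, :] - C_num[None, :, :]) ** 2).sum(axis=2)
    d_cat = (X_cat[:, None, :] != C_cat[None, :, :]).sum(axis=2)
    return np.sqrt(d_num + d_cat)

def hidden_layer(D, sigma):
    """Gaussian RBF activations computed from precomputed distances."""
    return np.exp(-(D ** 2) / (2.0 * sigma ** 2))

def train(X_num, X_cat, y, n_centers, sigma=1.0, seed=0):
    """Fit output weights in closed form, as in a standard ELM."""
    rng = np.random.default_rng(seed)
    # ASSUMPTION: random center selection (baseline behaviour only).
    idx = rng.choice(len(y), size=n_centers, replace=False)
    C_num, C_cat = X_num[idx], X_cat[idx]
    H = hidden_layer(mixed_distance(X_num, X_cat, C_num, C_cat), sigma)
    classes = np.unique(y)
    T = (y[:, None] == classes[None, :]).astype(float)  # one-hot targets
    beta = np.linalg.pinv(H) @ T  # Moore-Penrose least-squares solution
    return {"C_num": C_num, "C_cat": C_cat, "beta": beta,
            "classes": classes, "sigma": sigma}

def predict(model, X_num, X_cat):
    """Assign each sample the class with the largest output score."""
    D = mixed_distance(X_num, X_cat, model["C_num"], model["C_cat"])
    scores = hidden_layer(D, model["sigma"]) @ model["beta"]
    return model["classes"][np.argmax(scores, axis=1)]

# Toy mixed data: one numerical and one categorical attribute (made up).
X_num = np.array([[0.0], [0.5], [1.0], [3.0], [3.5], [4.0]])
X_cat = np.array([["a"], ["a"], ["b"], ["c"], ["c"], ["b"]])
y = np.array([0, 0, 0, 1, 1, 1])

model = train(X_num, X_cat, y, n_centers=6, sigma=1.0)
pred = predict(model, X_num, X_cat)
print(pred)
```

Because the distances are computed on the raw mixed attributes, no separate encoding step precedes the classifier, which is the fusion the abstract describes. Replacing the random `idx` with indices chosen by a clustering procedure would be the natural place to plug in a center-selection method.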
