Abstract

Hybrid data can lead to overfitting in machine learning models, which may reduce classification accuracy. Feature selection can both reduce the computational cost of processing hybrid data and improve classification accuracy. The particle swarm optimization (PSO) algorithm has clear advantages in feature selection. This paper presents an oscillatory particle swarm optimization feature selection algorithm for hybrid data based on mutual information entropy. First, a new distance function is defined on the object set of a hybrid information system (HIS), which induces a tolerance relation on this object set. Then, mutual information entropy is introduced to measure the uncertainty of the HIS. On this basis, a maximum-relevance and minimum-redundancy (MRMR) model for the HIS is proposed, and a feature selection algorithm based on this model (denoted MRMR) is designed. Because combining the MRMR model with PSO allows the space of feature subsets to be searched effectively, an oscillatory particle swarm optimization algorithm based on the MRMR model (denoted OPSO-MRMR) is also designed for the HIS. In OPSO-MRMR, the MRMR model defines the fitness function that evaluates the quality of particles, and the particle position update is modified by means of a second-order oscillatory equation. Finally, an experimental analysis compares the two designed algorithms with five other algorithms. Statistical analysis shows that, relative to the other six algorithms, OPSO-MRMR improves classification accuracy and F1 score by 5.8% and 10.7%, respectively.
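The maximum-relevance, minimum-redundancy idea can be illustrated with a minimal sketch. It is not the method of this paper: it assumes purely discrete features, substitutes ordinary Shannon mutual information for the tolerance-relation-based mutual information entropy, and uses greedy forward selection in place of the PSO search. All function names (`mutual_information`, `mrmr_score`, `greedy_mrmr`) are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Shannon mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x,y) * log2( p(x,y) / (p(x) p(y)) ) over the observed pairs.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_score(candidate, selected, features, labels):
    """Relevance to the labels minus mean redundancy with selected features."""
    relevance = mutual_information(features[candidate], labels)
    if not selected:
        return relevance
    redundancy = sum(
        mutual_information(features[candidate], features[s]) for s in selected
    ) / len(selected)
    return relevance - redundancy


def greedy_mrmr(features, labels, k):
    """Greedily pick up to k feature names with the best MRMR score.

    Assumption: `features` maps a feature name to its column of discrete values.
    """
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: mrmr_score(f, selected, features, labels))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # "b" duplicates "a", so once one of them is chosen the other is redundant.
    features = {"a": [0, 0, 1, 1], "b": [0, 0, 1, 1], "c": [0, 1, 0, 1]}
    labels = [0, 1, 2, 3]
    print(greedy_mrmr(features, labels, k=2))
```

In a PSO setting, a score of this kind could serve as the fitness of a particle that encodes a candidate feature subset; how the paper defines its fitness function in detail is not stated in the abstract.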
