Abstract

Feature selection (FS) is one of the most critical tasks in data mining; it aims to reduce the dimensionality of the data while maximizing classification accuracy. The FS problem is NP-hard. Recently, various swarm intelligence (SI) algorithms have been applied to the FS problem to avoid the expensive computation of exact methods. However, the performance of these SI algorithms is limited because they do not comprehensively take the characteristics of the FS problem into account. Therefore, a promising variant of the salp swarm algorithm (SSA), called NCSSA, is presented to solve this problem. NCSSA introduces four strategies: a multi-perspective initialization strategy, a Newton interpolation inertia weight, an improved followers’ update model, and cosine opposition-based learning (COBL). In most SI-based FS methods, the initial search agents are generated either randomly or with a single filter method. However, a single filter method performs differently on different datasets. Therefore, a multi-perspective initialization strategy based on minimal redundancy maximal relevance (MRMR) and ReliefF is proposed, which can select the optimal subsets from different perspectives. Furthermore, the Newton interpolation inertia weight is presented to balance the algorithm’s exploration and exploitation. Compared with existing inertia weights, the proposed inertia weight can be adjusted more flexibly. Additionally, the followers update their positions according to the ReliefF and MRMR values, which makes full use of the relationship between data and labels. Finally, COBL is introduced to accelerate convergence and to help the algorithm escape local optima. COBL offers more randomness than opposition-based learning (OBL) and takes the characteristics of the FS problem into account.
The proposed NCSSA is compared with a series of non-SI-based and SI-based methods on standard datasets from the UCI Machine Learning Repository. Experimental results show that NCSSA is a promising algorithm for the FS problem. The contribution analysis of each strategy indicates that COBL is the most effective strategy in improving the SSA.
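
For readers unfamiliar with the OBL baseline mentioned above, the following is a minimal sketch of classical opposition-based learning only, in which a candidate x with bounds [lb, ub] is mirrored to lb + ub - x and the better of the two is kept. It is not the COBL variant proposed here, whose cosine-based formulation is not given in this abstract. The function names, the 0.5 selection threshold, and the toy fitness (number of selected features, to be minimized) are assumptions made for illustration.

```python
def opposite_position(position, lower, upper):
    """Classical OBL: mirror each coordinate across the midpoint of its bounds."""
    return [lo + hi - x for x, lo, hi in zip(position, lower, upper)]


def obl_step(position, lower, upper, fitness):
    """Return the candidate or its opposite, whichever has the lower fitness."""
    opposite = opposite_position(position, lower, upper)
    return opposite if fitness(opposite) < fitness(position) else position


def toy_fitness(position, threshold=0.5):
    """Hypothetical objective: count features whose value exceeds the threshold.

    A real wrapper-based FS objective would also include classification error.
    """
    return sum(1 for x in position if x > threshold)


if __name__ == "__main__":
    agent = [0.9, 0.8, 0.7]  # continuous position, one value per feature
    lower, upper = [0.0] * 3, [1.0] * 3
    print(obl_step(agent, lower, upper, toy_fitness))
```

Because the opposite point is a deterministic function of the candidate, classical OBL adds no randomness of its own, which is the limitation the abstract says COBL addresses.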
