Abstract

Selecting discriminative features to build effective learning models is a significant research problem in machine learning. In practical applications, data distributions are diverse, and their uncertainties make it challenging to build learning models that are robust and generalize well. Because one-hot encoding represents independent labels well, the label matrix of regression-based feature selection (FS) methods is usually one-hot encoded. However, one-hot encoding does not adapt well to different data distributions. This paper proposes a sparse orthogonal supervised FS model with global redundancy minimization, label scaling, and robustness (GRMLSRSOFS) to address these problems. The model uses the label scaling technique proposed in this paper to adapt better to different data distributions. An iterative optimization method is given, and its convergence is demonstrated both theoretically and experimentally. Further, experimental results on 12 public datasets show that 1) GRMLSRSOFS achieves higher classification accuracy with fewer features than several state-of-the-art FS methods in most cases; for example, it reaches 100% classification accuracy using only 20 features on the warpPIE10P dataset and obtains nearly 6% improvement over other methods on the Yale dataset. 2) GRMLSRSOFS converges faster after label scaling.
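For readers unfamiliar with the encoding issue the abstract raises, the sketch below shows the standard one-hot label matrix used in regression-based FS, together with one plausible column-wise rescaling. The function `scale_labels` and its 1/sqrt(n_k) rule are illustrative assumptions only; the paper's actual label scaling technique is defined in the full text, not in this abstract.

```python
# Minimal sketch (assumptions: numpy only; the scaling rule is
# illustrative and is NOT the paper's label scaling technique).
import numpy as np

def one_hot(labels):
    """Encode integer class labels as a one-hot label matrix Y (n x c)."""
    classes = np.unique(labels)
    return (labels[:, None] == classes[None, :]).astype(float)

def scale_labels(Y):
    """Hypothetical label scaling: rescale each class column by
    1/sqrt(n_k), where n_k is the class size, so that classes of
    different sizes contribute comparably to a regression loss."""
    n_k = Y.sum(axis=0)          # samples per class
    return Y / np.sqrt(n_k)      # column-wise rescaling

labels = np.array([0, 0, 1, 2, 2, 2])
Y = one_hot(labels)              # plain one-hot label matrix
Y_scaled = scale_labels(Y)       # distribution-aware variant
print(Y_scaled)
```

One motivation for rescaling of this kind is that a plain one-hot target treats all classes identically regardless of how the samples are distributed among them, which is the mismatch with diverse data distributions that the abstract points to.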
