Abstract

Feature selection (FS) aims to select the most valuable features from high-dimensional data, reducing dimensionality to improve the performance and generalization ability of machine learning models. In semi-supervised FS (SSFS), regression-based methods rely on the label matrix, whose quality directly affects FS performance; a reliable label matrix is therefore essential. Orthogonal regression (OR) can retain more information in the subspace than least squares regression (LSR). This paper therefore introduces OR into SSFS and uses a label scaling technique to learn a reliable label matrix. It also employs adaptive graph learning to exploit more of the data's structural information. Two constraints, a Frobenius-norm constraint and a maximum-information-entropy constraint, are imposed on the similarity matrix, yielding two adaptive graph learning semi-supervised orthogonal FS (AGLSOFS) methods with reliable label matrix learning. The impact of these two constraints on the construction of dynamic similarity graphs and on FS results is discussed. Efficient optimization algorithms for the two methods are derived based on the Alternating Direction Method of Multipliers (ADMM) and Generalized Power Iteration (GPI). Experiments on 15 benchmark datasets show that: (1) similarity graphs constructed using both the original and the projected data are more accurate; (2) both constraints are valid; (3) both proposed methods perform well on most datasets; (4) OR outperforms LSR for FS; and (5) the scaling factor affects the convergence speed of the model.
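To illustrate why GPI fits the orthogonal-regression subproblem mentioned above, the following is a minimal sketch, not the paper's actual algorithm, of the standard GPI update for min ||XW − Y||²_F subject to WᵀW = I; the function name and all dimensions are illustrative assumptions.

```python
import numpy as np

def gpi_orthogonal_regression(X, Y, n_iter=100):
    """Sketch: solve min ||X W - Y||_F^2 s.t. W^T W = I via Generalized
    Power Iteration (GPI). X is n x d, Y is n x c, W is d x c."""
    A = X.T @ X                        # quadratic term of the objective
    B = X.T @ Y                        # linear term of the objective
    alpha = np.linalg.eigvalsh(A)[-1]  # largest eigenvalue, so alpha*I - A is PSD
    A_tilde = alpha * np.eye(X.shape[1]) - A
    rng = np.random.default_rng(0)
    # random orthonormal initialization of W
    W, _ = np.linalg.qr(rng.standard_normal((X.shape[1], Y.shape[1])))
    for _ in range(n_iter):
        M = 2 * A_tilde @ W + 2 * B
        # projection of M onto the Stiefel manifold via thin SVD
        U, _, Vt = np.linalg.svd(M, full_matrices=False)
        W = U @ Vt
    return W
```

Each iteration only needs a thin SVD of a d×c matrix, which is why GPI is attractive when the number of selected components c is small relative to the feature dimension d.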
