Abstract
The authors consider algorithms for reducing the dimensionality of the input space of linear and logistic regression models for pattern recognition with a large number of input attributes. The proposed algorithms are based on non-smooth regularization that suppresses non-informative attributes; the Tibshirani lasso method is a special case. A comparative analysis of the efficiency of the logistic and linear models shows that they are practically equivalent on the studied data sets. Exploiting the properties of linear models, the authors propose a new, rapidly converging algorithm that finds a solution under non-smooth regularization without resorting to complex and resource-intensive non-differentiable optimization. A comparative analysis confirms its efficiency against methods based on non-smooth optimization. The presented method makes it possible to process noisy data with a large number of redundant attributes in reasonable time. Given its high convergence rate, the authors propose using the method for preliminary reduction of the dimensionality of a large input attribute space. The article presents examples of processing data from practically important classification problems.
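To illustrate the attribute-suppression idea the abstract refers to, the following sketch shows the standard lasso special case (l1 regularization) applied to both a linear and a logistic model, where the penalty drives the coefficients of non-informative attributes to exactly zero. This is not the authors' rapidly converging algorithm, only a minimal illustration using scikit-learn; the synthetic dataset and the hyperparameters (alpha, C) are assumptions chosen for demonstration.

```python
# Minimal sketch of lasso-style (l1) attribute suppression, NOT the
# authors' proposed algorithm. Dataset and hyperparameters are assumed.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic problem with many redundant/noisy input attributes.
X, y = make_classification(n_samples=500, n_features=200,
                           n_informative=10, n_redundant=40,
                           random_state=0)
X = StandardScaler().fit_transform(X)

# Linear model with l1 penalty (Tibshirani lasso): many coefficients
# are shrunk to exactly zero, removing the corresponding attributes.
lin = Lasso(alpha=0.05).fit(X, y)
kept_lin = np.flatnonzero(lin.coef_)

# Logistic model with l1 penalty; the liblinear solver supports it.
log = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
kept_log = np.flatnonzero(log.coef_.ravel())

print(f"linear model keeps   {kept_lin.size} of {X.shape[1]} attributes")
print(f"logistic model keeps {kept_log.size} of {X.shape[1]} attributes")
```

The number of retained attributes is controlled by the regularization strength (alpha for the linear model, 1/C for the logistic one); stronger penalties suppress more attributes, which is the mechanism exploited for preliminary dimensionality reduction.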