Abstract
The authors consider algorithms for reducing the dimensionality of the input space of linear and logistic regression models in pattern recognition problems with a large number of input attributes. The proposed algorithms rely on non-smooth regularization that suppresses non-informative attributes; Tibshirani's lasso method is one special case. A comparative analysis of the logistic and linear models indicates that they are practically equivalent in efficiency on the studied data sets. Exploiting the features of linear models, the authors propose a new, rapidly converging algorithm that searches for a solution under non-smooth regularization without resorting to complex and resource-intensive algorithms of non-differentiable optimization. A comparison with methods based on non-smooth optimization confirms the efficiency of the proposed algorithm. The presented method makes it possible to process noisy data with a large number of redundant attributes in a reasonable time. Given its high rate of convergence, the authors propose using the method for preliminary reduction of the dimension of an input space with a large number of attributes. The article presents examples of data processing for practically important classification problems.