Abstract
Logistic regression is a classic classification algorithm, but in its basic form it can only be applied to linearly separable data. For linearly inseparable data, a kernel trick maps the data into a higher-dimensional space in which it is easier to separate. However, as the scale of data grows, the kernel trick becomes increasingly restrictive: once the data reaches a certain scale, the cost of storing and computing the kernel matrix is prohibitive. To mitigate this overhead, we employ a low-rank approximation of the kernel matrix to speed up the solution of kernel logistic regression (KLR) and propose a framework for solving the KLR problem quickly. We use a fast iterative algorithm, similar to the sequential minimal optimization (SMO) algorithm, to solve the dual problem of the KLR. Within this framework, the low-rank approximation is further combined with gradient descent and with the Newton iteration algorithm, respectively. The low-rank approximation reduces redundant information in the data, which not only speeds up the solution of the KLR but also improves classification accuracy. Finally, extensive experiments show that the KLR optimization algorithms based on our proposed framework outperform state-of-the-art algorithms.
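To make the low-rank approximation concrete, the following is a minimal sketch of a Nyström-style factorization of an RBF kernel matrix. The uniform landmark sampling, the RBF kernel choice, and the names rbf_kernel and nystrom_features are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    """RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - y_j||^2)."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def nystrom_features(X, m=100, gamma=0.1, seed=0):
    """Rank-m Nystrom features Z such that Z @ Z.T approximates K(X, X)."""
    rng = np.random.default_rng(seed)
    landmarks = X[rng.choice(len(X), size=m, replace=False)]
    C = rbf_kernel(X, landmarks, gamma)          # n x m slice of K
    W = rbf_kernel(landmarks, landmarks, gamma)  # m x m landmark block
    # K ~= C @ pinv(W) @ C.T; factor pinv(W) via its eigendecomposition
    s, U = np.linalg.eigh(W)
    s = np.maximum(s, 1e-12)                     # guard near-zero eigenvalues
    return C @ (U / np.sqrt(s))                  # n x m feature matrix
```

The factor Z needs O(nm) storage instead of the O(n^2) full kernel matrix, which is the saving the abstract appeals to; any kernel-matrix product K @ v can then be replaced by Z @ (Z.T @ v).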
Highlights
The logistic regression (LR) algorithm is a classic classification algorithm in statistical analysis, machine learning, and data mining
The main contributions of this paper can be summarized as follows: (1) Based on low-rank approximate optimization algorithms, we propose a kernel framework to accelerate the solution of kernel logistic regression (KLR); a toy SMO-style dual solver in the spirit of this framework is sketched after these highlights
(2) We explore the efficiency of solving the KLR on data of different scales with three low-rank approximation optimization algorithms
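The fast SMO-like dual solver mentioned above can plausibly be read as coordinate-wise Newton steps on the entropy-regularized KLR dual. The sketch below is one such instantiation, assuming the standard LIBLINEAR-style dual form of logistic regression and a single Newton step per coordinate visit; it is not the paper's exact algorithm.

```python
import numpy as np

def klr_dual_cd(K, y, C=1.0, n_epochs=50, eps=1e-12):
    """SMO-style coordinate descent on the KLR dual:
    min_a 1/2 a^T Q a + sum_i [a_i log a_i + (C - a_i) log(C - a_i)],
    with 0 < a_i < C and Q[i, j] = y_i * y_j * K[i, j]."""
    n = len(y)
    Q = (y[:, None] * y[None, :]) * K
    alpha = np.full(n, C / 2.0)              # interior starting point
    Qa = Q @ alpha                           # maintained product Q @ alpha
    for _ in range(n_epochs):
        for i in np.random.permutation(n):   # randomized sweeps
            a = alpha[i]
            g = Qa[i] + np.log(a / (C - a))            # coordinate gradient
            h = Q[i, i] + C / (a * (C - a))            # second derivative
            a_new = np.clip(a - g / h, eps, C - eps)   # clipped Newton step
            Qa += Q[:, i] * (a_new - a)                # keep Qa consistent
            alpha[i] = a_new
    # decision value at x: sum_i alpha[i] * y[i] * k(x_i, x)
    return alpha
```

Each update touches a single dual variable and one column of Q, which is the SMO-like property; with a low-rank factor Z such that K is approximately Z @ Z.T, the column Q[:, i] can be formed on the fly instead of storing the full kernel matrix.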
Summary
The logistic regression (LR) algorithm is a classic classification algorithm in statistical analysis, machine learning, and data mining. When the iterative update value is solved via the truncated Newton method, the fast dual algorithm introduces the computation of the kernel matrix (even though the kernel matrix does not participate in the iterative process), which increases the time cost of the algorithm, so the algorithm still needs improvement. Inspired by the combination of the kernel SVM and low-rank approximation methods, we combine the fast dual algorithm with the Nyström method to solve the KLR, which improves both the accuracy and the efficiency of the classifier. The Nyström method approximates the original kernel matrix by a low-rank factorization, reducing redundant information in the data and the computational overhead of the kernel matrix. Finally, we explore the efficiency of solving the KLR on data of different scales with the three low-rank approximation optimization algorithms.
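A hedged sketch of the combination the summary describes: Newton iteration on Nyström features Z (e.g., produced by the nystrom_features sketch above), so the full n x n kernel matrix is never formed during training. This is an illustrative primal solver written for clarity, not a reproduction of the paper's fast dual algorithm; the name klr_newton, the regularizer lam, and labels in {-1, +1} are assumptions.

```python
import numpy as np

def klr_newton(Z, y, lam=1e-3, n_iter=20, tol=1e-8):
    """Fit w minimizing sum_i log(1 + exp(-y_i * z_i^T w)) + lam/2 ||w||^2.

    Z : (n, m) Nystrom feature matrix; y : labels in {-1, +1}.
    """
    n, m = Z.shape
    w = np.zeros(m)
    for _ in range(n_iter):
        margins = y * (Z @ w)
        p = 1.0 / (1.0 + np.exp(-np.clip(margins, -35.0, 35.0)))  # stable sigmoid
        grad = -Z.T @ (y * (1.0 - p)) + lam * w
        D = p * (1.0 - p)                           # Hessian weights
        H = Z.T @ (Z * D[:, None]) + lam * np.eye(m)
        step = np.linalg.solve(H, grad)             # Newton direction
        w -= step
        if np.linalg.norm(step) < tol:              # converged
            break
    return w
```

Under these assumptions, training would look like Z = nystrom_features(X, m=100) followed by w = klr_newton(Z, y); predictions on new points take the sign of their feature vectors (built from the same landmarks and transform) dotted with w. Each Newton step costs O(nm^2 + m^3) rather than anything quadratic in n, which is the acceleration the summary attributes to the low-rank approximation.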