Abstract

Logistic regression is a classic classification algorithm, but it is limited to linearly separable data. For data that are not linearly separable, a kernel trick maps the samples into a higher-dimensional space in which they become easier to separate. However, as the scale of the data grows, the kernel trick becomes increasingly restrictive: beyond a certain data size, storing and computing the kernel matrix is prohibitively expensive. To mitigate this problem, we employ a low-rank approximation of the kernel matrix to speed up the solution of kernel logistic regression (KLR) and propose a framework for solving KLR quickly. We use a fast iterative algorithm, similar to sequential minimal optimization (SMO), to solve the dual problem of KLR. Within this framework, the low-rank approximation is combined with gradient descent and with Newton iteration, respectively. The low-rank approximation reduces redundant information in the data, which not only speeds up the solution of KLR but also improves classification accuracy. Finally, extensive experiments show that the KLR optimization algorithms based on the proposed framework outperform state-of-the-art algorithms.
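The central device in the abstract is the low-rank approximation of the kernel matrix. The snippet below is a minimal sketch (not the paper's implementation) of one standard way to build a rank-m Nystrom approximation; the RBF kernel, the uniform landmark sampling, and all function names are assumptions made here for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def nystrom_factor(X, m, gamma=0.5, seed=None):
    """Rank-m Nystrom approximation K ~ C W^{-1} C^T of the full kernel matrix.

    Only an n x m block C and an m x m block W are ever formed, so the
    full n x n kernel matrix is never materialized. Returns a factor L
    with K_approx = L @ L.T.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)   # landmark points
    C = rbf_kernel(X, X[idx], gamma)             # n x m block
    W = C[idx, :]                                # m x m block on the landmarks
    eigval, eigvec = np.linalg.eigh(W)           # W^{-1/2} via eigendecomposition
    eigval = np.maximum(eigval, 1e-12)           # guard against tiny/negative eigenvalues
    return C @ (eigvec / np.sqrt(eigval))        # L = C W^{-1/2}, shape n x m
```

With a factor L of this form, any product K @ v needed by a solver can be replaced by L @ (L.T @ v), which costs O(nm) instead of O(n^2).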

Highlights

  • Logistic regression (LR) is a classic classification algorithm in statistical analysis, machine learning, and data mining

  • The main contributions of this paper can be summarized as follows: (1) Based on low-rank approximate optimization algorithms, we propose a kernel framework to accelerate the solution of kernel logistic regression (KLR)

  • (2) We explore the efficiency of solving KLR on data of different scales with the three low-rank approximation optimization algorithms above


Summary

INTRODUCTION

Logistic regression (LR) is a classic classification algorithm in statistical analysis, machine learning, and data mining. When the iterative update is computed via the truncated Newton method, the fast dual algorithm introduces the calculation of the kernel matrix (even though the kernel matrix does not participate in the iterations), which increases the algorithm's time cost, so the algorithm still needs improvement. Inspired by the combination of the kernel SVM and low-rank approximation methods, we combine the fast dual algorithm with the Nystrom method to solve KLR, which improves both the accuracy and the efficiency of the classifier. The Nystrom method estimates the original kernel matrix with a low-rank approximation, reducing redundant information in the data and cutting the computational overhead of the kernel matrix, as sketched below.
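The following sketch shows how a low-rank factor of the kernel matrix could be plugged into a plain gradient-descent solver for KLR. It assumes labels in {-1, +1} and a Nystrom factor L (with K ~ L @ L.T) such as the one from the earlier snippet; the paper's actual NGD-KLR and NNI-KLR update rules may differ, and all names here are illustrative.

```python
import numpy as np

def klr_gradient_descent(L, y, lam=1e-3, lr=0.1, n_iter=200):
    """Gradient descent for kernel logistic regression with K ~ L @ L.T.

    Minimizes  sum_i log(1 + exp(-y_i (K alpha)_i)) + (lam/2) alpha^T K alpha
    over the dual coefficients alpha, with labels y in {-1, +1}.
    Every product with K is replaced by two thin products with L.
    """
    alpha = np.zeros(L.shape[0])
    for _ in range(n_iter):
        Ka = L @ (L.T @ alpha)                 # K alpha in O(nm) instead of O(n^2)
        p = 1.0 / (1.0 + np.exp(y * Ka))       # sigmoid(-y_i (K alpha)_i)
        grad = L @ (L.T @ (-y * p + lam * alpha))  # K(-y*p) + lam * K alpha
        alpha -= lr * grad
    return alpha

def klr_decision_values(L, alpha):
    """Decision values f = K alpha on the training points."""
    return L @ (L.T @ alpha)
```

Because the kernel matrix is only ever applied through the thin factor L, each iteration costs O(nm) rather than O(n^2), which is precisely the saving the low-rank approximation is meant to deliver; the same trick can be applied inside a Newton-type update.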

KERNEL LOGISTIC REGRESSION ALGORITHM
GRADIENT DESCENT METHOD FOR KLR
1) NYSTROM METHOD APPROXIMATION EIGENFUNCTION
GRADIENT DESCENT COMBINED WITH THE APPROXIMATE KERNEL MATRIX
NEWTON ITERATION COMBINED WITH THE APPROXIMATE KERNEL MATRIX
CONVERGENCE ANALYSIS OF THE NGD-KLR ALGORITHM AND NNI-KLR ALGORITHM
COMPUTATIONAL COMPLEXITY
EXPERIMENTAL SETUP
EXPERIMENTAL RESULTS
DISCUSSION
Findings
VIII. CONCLUSION

