Abstract

This paper proposes a novel binary classification approach for semi-supervised learning, named the Laplacian quadratic surface optimal margin distribution machine (LapQSODM). The model exploits the geometric information embedded in unlabeled samples through a manifold regularizer to overcome the problem of insufficient labeled samples. Unlike traditional support vector machines (SVMs), which are built on the largest-minimum-margin idea and use the kernel trick to handle nonlinearity in the data, our proposed model optimizes the margin distribution and directly generates a quadratic surface to perform the classification. It thereby not only improves generalization performance but also avoids the difficulty of searching for an appropriate kernel function and tuning its parameters. For regular-scale datasets, we extend the classical conjugate gradient algorithm to the proposed method and design a simple iterative scheme to compute its exact step size; for large-scale datasets, we develop an unbiased estimator of the LapQSODM gradient based on one or several samples and design an efficient and effective stochastic variance-reduced gradient (SVRG) algorithm. We then conduct comprehensive numerical experiments on both artificial and public benchmark datasets. The results show that LapQSODM generalizes better than most well-known benchmark methods and is more robust, owing to the optimization of the margin distribution. Finally, we apply LapQSODM to three realistic credit risk assessment problems; the promising numerical results demonstrate the great potential of our model in practice.
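The large-scale solver described above follows the standard SVRG template: an outer loop fixes a snapshot point and computes the full gradient there, and an inner loop takes steps with a variance-reduced stochastic gradient built from one sampled component. The sketch below shows that generic template only; the objective here is an ordinary least-squares loss used as a stand-in, since the actual LapQSODM objective (with its margin-distribution and Laplacian terms) is not spelled out in the abstract. All function and parameter names are illustrative.

```python
import numpy as np

def svrg(grad_i, full_grad, w0, n, eta=0.1, epochs=30, inner=None, seed=0):
    """Generic SVRG loop (a sketch, not the paper's exact algorithm).

    grad_i(w, i): gradient of the i-th component function at w
    full_grad(w): full (average) gradient at w
    """
    rng = np.random.default_rng(seed)
    m = inner if inner is not None else n  # inner-loop length
    w = w0.astype(float).copy()
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)             # full gradient at the snapshot
        for _ in range(m):
            i = rng.integers(n)
            # variance-reduced stochastic gradient:
            # unbiased for full_grad(w), with vanishing variance near the optimum
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= eta * g
    return w

# Stand-in problem: min_w (1/2n) ||X w - y||^2 (noiseless, so w* is recoverable)
rng = np.random.default_rng(42)
n, d = 200, 3
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

grad_i = lambda w, i: X[i] * (X[i] @ w - y[i])
full_grad = lambda w: X.T @ (X @ w - y) / n

w_hat = svrg(grad_i, full_grad, np.zeros(d), n, eta=0.05, epochs=30)
```

Each inner step costs two component-gradient evaluations, and the correction term `grad_i(w_snap, i) - mu` shrinks as the iterates approach the snapshot, which is what gives SVRG its reduced variance compared with plain SGD.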
