Abstract

Pairwise learning is receiving increasing attention because it covers many important machine learning tasks, e.g., metric learning, AUC maximization, and ranking. Investigating the generalization behavior of pairwise learning is thus of great significance. However, existing generalization analysis focuses mainly on convex objective functions, leaving nonconvex pairwise learning far less explored. Moreover, the current learning rates for pairwise learning are mostly of slow order. Motivated by these problems, we study the generalization performance of nonconvex pairwise learning and provide improved learning rates. Specifically, we develop uniform convergence results for the gradients of pairwise learning under different assumptions, based on which we characterize the empirical risk minimizer, gradient descent, and stochastic gradient descent. We first establish learning rates for these algorithms in a general nonconvex setting, where the analysis sheds light on the trade-off between optimization and generalization and on the role of early stopping. We then derive faster learning rates of order O(1/n) for nonconvex pairwise learning under a gradient dominance curvature condition, where n is the sample size. Provided that the optimal population risk is small, we further improve the learning rates to O(1/n²), which, to the best of our knowledge, are the first O(1/n²) rates for pairwise learning.
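For concreteness, below is a minimal sketch, not taken from the paper, of the kind of pairwise stochastic gradient descent the abstract refers to. It assumes a linear model and a squared pairwise hinge loss for AUC maximization; the loss, model, and step size are illustrative choices, and the empirical pairwise risk being minimized is the usual average of f(w; z_i, z_j) over all pairs i ≠ j.

```python
import numpy as np

def pairwise_sgd(X, y, n_iters=1000, lr=0.01, seed=0):
    """Illustrative SGD for the empirical pairwise risk
        R_n(w) = 1/(n(n-1)) * sum_{i != j} f(w; z_i, z_j),
    sampling one pair (z_i, z_j) per step. Here f is a squared
    pairwise hinge loss for AUC maximization with labels in {-1, +1}:
        f(w; z_i, z_j) = max(0, 1 - (y_i - y_j)/2 * w.(x_i - x_j))^2.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iters):
        i, j = rng.choice(n, size=2, replace=False)
        if y[i] == y[j]:
            continue  # the AUC loss is defined on pairs with opposite labels
        s = 0.5 * (y[i] - y[j])          # +1 or -1
        diff = X[i] - X[j]
        margin = s * (w @ diff)
        if margin < 1.0:
            # gradient of (1 - margin)^2 with respect to w
            grad = -2.0 * (1.0 - margin) * s * diff
            w -= lr * grad
    return w

# Example usage on synthetic data:
# X = np.random.randn(200, 5)
# y = np.sign(X[:, 0] + 0.1 * np.random.randn(200))
# w = pairwise_sgd(X, y)
```

The early stopping highlighted in the abstract would correspond here to capping n_iters before the empirical risk is fully minimized, trading optimization accuracy for generalization.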
