Abstract

The minibatching technique has been widely adopted to facilitate stochastic first-order methods because of its computational efficiency in parallel computing for large-scale machine learning and data mining. However, how to choose the optimal minibatch size for accelerated stochastic gradient methods is not fully understood. Indeed, there is a trade-off between the iteration complexity and the total computational complexity: the number of iterations (minibatch queries) can be decreased by increasing the minibatch size, but an excessively large minibatch size results in an unnecessarily large total computational cost. In this study, we give a sharp characterization of the minimax optimal minibatch size for achieving the optimal iteration complexity by providing an attainable lower bound for minimizing a finite sum of convex functions and, surprisingly, show that an optimal method run with the minimax optimal minibatch size achieves both the optimal iteration complexity and the optimal total computational complexity simultaneously. Finally, this feature is verified experimentally.
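
To make the trade-off concrete, below is a minimal sketch of a minibatch stochastic gradient loop for a finite-sum problem. It uses plain minibatch SGD on hypothetical least-squares components, not the accelerated method analyzed in the paper; the data, step size, minibatch size, and iteration count are illustrative assumptions. The point it illustrates is that the total computational cost scales with (number of iterations) x (minibatch size), so enlarging the minibatch reduces the iteration count only up to a point, beyond which it merely inflates the total work.

```python
import numpy as np

# Sketch: minibatch stochastic gradient for min_x (1/n) * sum_i f_i(x),
# with hypothetical quadratic components f_i(x) = 0.5 * (a_i^T x - y_i)^2.
# All problem data and hyperparameters below are illustrative assumptions.

rng = np.random.default_rng(0)
n, d = 1000, 20                      # number of components and dimension
A = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def minibatch_grad(x, batch_size):
    """Unbiased gradient estimate from `batch_size` sampled components."""
    idx = rng.choice(n, size=batch_size, replace=False)
    residual = A[idx] @ x - y[idx]
    return A[idx].T @ residual / batch_size

x = np.zeros(d)
b, step = 32, 1e-2                   # minibatch size and step size
num_iters = 500                      # iteration count = minibatch queries
for _ in range(num_iters):
    x -= step * minibatch_grad(x, b)

# Total computational cost is driven by the number of component-gradient
# evaluations, i.e. num_iters * b: a larger b cuts the iterations needed
# for a target accuracy, but past a threshold it only grows this product.
print("component gradients evaluated:", num_iters * b)
```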
