One way to speed up convergence in a large optimization problem is to introduce a smaller, approximate version of the problem at a coarser scale and to alternate between relaxation steps for the fine-scale and coarse-scale problems. Such an optimization method for neural networks governed by quite general objective functions is presented. At the coarse scale, there is a smaller approximating neural net which, like the original net, is nonlinear and has a nonquadratic objective function. The transitions and information flow from fine to coarse scale and back do not disrupt the optimization, and the user need only specify a partition of the original fine-scale variables. Thus, the method can be applied easily to many problems and networks. There is generally about a fivefold improvement in estimated cost under the multiscale method. In the networks to which it was applied, a nontrivial speedup by a constant factor of between two and five was observed, independent of problem size. Further improvements in computational cost are very likely to be available, especially for problem-specific multiscale neural net methods.
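As an illustration of the alternation described above, the following is a minimal two-scale sketch in Python, not the paper's algorithm. Everything in it is an assumption made for the example: the chain-structured objective with a log-cosh data term, the piecewise-constant coarse correction `y` (one value per partition block), the step sizes, and the names `objective`, `gradient`, and `multiscale_minimize`. The only problem-specific input besides the objective is the partition `labels`, in keeping with the abstract's statement that the user need only specify a partition of the fine-scale variables.

```python
import numpy as np

def objective(x, d):
    # Hypothetical nonquadratic objective: quadratic smoothness along a chain
    # plus a log-cosh data term (computed stably via logaddexp).
    z = x - d
    return 0.5 * np.sum(np.diff(x) ** 2) + np.sum(np.logaddexp(z, -z) - np.log(2.0))

def gradient(x, d):
    # Gradient of the objective above with respect to the fine-scale variables.
    g = np.tanh(x - d)
    dx = np.diff(x)
    g[:-1] -= dx
    g[1:] += dx
    return g

def multiscale_minimize(x0, d, labels, cycles=20, fine_steps=2, coarse_steps=5):
    # labels[i] is the block index of fine variable i (the user-supplied partition).
    x = x0.copy()
    m = labels.max() + 1
    max_block = np.bincount(labels).max()
    # Step sizes of 1/L, where L bounds the curvature at each scale:
    # the fine Hessian is at most 4 (chain term) + 1 (log-cosh term) = 5,
    # and the coarse Hessian is at most 5 * (largest block size).
    fine_lr = 1.0 / 5.0
    coarse_lr = 1.0 / (5.0 * max_block)
    history = [objective(x, d)]
    for _ in range(cycles):
        # Fine-scale relaxation.
        for _ in range(fine_steps):
            x -= fine_lr * gradient(x, d)
        # Coarse-scale relaxation on F(y) = objective(x + y[labels]), which is
        # itself nonquadratic. Its gradient is the fine gradient summed per block.
        y = np.zeros(m)
        for _ in range(coarse_steps):
            g_fine = gradient(x + y[labels], d)
            y -= coarse_lr * np.bincount(labels, weights=g_fine, minlength=m)
        # Transfer the coarse correction back to the fine scale.
        x += y[labels]
        history.append(objective(x, d))
    return x, history

# Example: 64 fine variables partitioned into 8 contiguous blocks of 8.
n = 64
labels = np.arange(n) // 8
d = 3.0 + np.sin(np.linspace(0.0, 2.0 * np.pi, n))
x_final, history = multiscale_minimize(np.zeros(n), d, labels)
```

Because the coarse problem is defined as the fine objective restricted to block-constant corrections, each coarse step is also a descent step on the fine objective, so moving between scales does not undo earlier progress in this sketch. It makes no attempt to reproduce the speedups reported above.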