We introduce a new swarm-based gradient descent (SBGD) method for non-convex optimization. The swarm consists of agents, each identified with a position, $\mathbf{x}$, and a mass, $m$. The key to their dynamics is communication: mass is transferred from agents at high ground to those at low(-est) ground. At the same time, agents change position with a step size adjusted to their relative mass. Accordingly, the swarm is dynamically divided between heavier ‘leaders’, which proceed with small time steps and are expected to approach local minima, and lighter ‘explorers’, which proceed with a larger-step protocol and are expected to encounter improved positions for the swarm. If they do, they assume the role of heavy swarm leaders, and so on. We prove local convergence, and we present numerical simulations demonstrating that the added layer of SBGD explorers improves global convergence behavior on one-, two-, and 20-dimensional benchmarks.
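To make the dynamics concrete, the following Python sketch illustrates one possible reading of the abstract: agents carry positions and masses, mass flows from high-lying agents toward the agent at the lowest ground, and each agent's step size shrinks with its relative mass so that heavy leaders refine locally while light explorers search more broadly. The specific transfer rule, step-size schedule, and the Rastrigin-type test function are our assumptions for illustration, not the paper's exact SBGD protocol.

```python
import numpy as np

def sbgd(f, grad_f, x0, n_agents=10, n_iters=200, base_step=0.1,
         transfer_rate=0.1, rng=None):
    """Illustrative swarm-based gradient descent sketch (assumed rules,
    not the paper's exact protocol)."""
    rng = np.random.default_rng(rng)
    dim = len(x0)
    # Spread agents around the initial point; give each equal mass.
    X = x0 + rng.normal(scale=1.0, size=(n_agents, dim))
    m = np.full(n_agents, 1.0 / n_agents)

    for _ in range(n_iters):
        heights = np.array([f(x) for x in X])
        leader = np.argmin(heights)

        # Mass transfer (assumed rule): each non-leader sheds a fraction of
        # its mass proportional to how high it sits above the leader.
        span = heights.max() - heights[leader] + 1e-12
        shed = transfer_rate * m * (heights - heights[leader]) / span
        shed[leader] = 0.0
        m -= shed
        m[leader] += shed.sum()

        # Position update: step size decreases with relative mass, so heavy
        # 'leaders' take small steps and light 'explorers' take large ones.
        rel = m / m.max()
        steps = base_step * (1.0 - rel + 1e-3)
        for i in range(n_agents):
            X[i] -= steps[i] * grad_f(X[i])

    best = np.argmin([f(x) for x in X])
    return X[best], f(X[best])

if __name__ == "__main__":
    # Toy non-convex benchmark: 2D Rastrigin function.
    f = lambda x: np.sum(x**2 - 10.0 * np.cos(2 * np.pi * x) + 10.0)
    grad_f = lambda x: 2.0 * x + 20.0 * np.pi * np.sin(2 * np.pi * x)
    x_best, f_best = sbgd(f, grad_f, x0=np.array([3.0, -2.5]), rng=0)
    print(x_best, f_best)
```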