Abstract

A wide range of combinatorial optimization problems can be reduced to the Ising model, and equivalently the quadratic unconstrained binary optimization (QUBO) problem. Thus, in recent years, researchers have proposed to solve QUBO on FPGAs, GPUs, and special-purpose processors. The adaptive bulk search (ABS) is a previously-proposed framework for solving QUBO in parallel on multiple GPUs. In the ABS, a CPU host performs a GA-based global search while GPUs asynchronously perform many local searches in parallel. An original ABS adopts a simple local search algorithm called a cyclic-min algorithm, which does not use pseudo random numbers. However, the lack of randomness may cause a potential drawback of restricted bit-flipping operations in a local search. To avoid this drawback, this paper proposes a cyclic-min algorithm with randomly-generated multiple bit permutations, which enables a more effective local search with random number generation in CPUs (not in GPUs). Furthermore, this paper introduces a scalable implementation of the ABS with MPI and OpenMP. Our experimental results on TSUBAME3.0 show that the solution quality improves and the throughput linearly increases as the number of GPUs increases; with 256 GPUs, it evaluates 20.1×1012 solutions per second.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call