Abstract

The N-Queens problem plays an important role in academic research and practical applications. Heuristic algorithms are often used to solve variant 2 of the N-Queens problem. During solving, the evaluation of a candidate solution, namely the fitness function, often occupies the vast majority of the running time and is therefore the key to improving speed. In this paper, three parallel schemes based on CPU and four parallel schemes based on GPU are proposed, and a serial scheme is implemented as the baseline. The experimental results show that, for a large-scale N-Queens problem, the coarse-grained GPU scheme achieves a maximum 307-fold speedup over its single-threaded CPU counterpart in evaluating a candidate solution. When the coarse-grained GPU scheme is applied to simulated annealing for solving N-Queens problem variant 2 with a problem size of no more than 3000, the speedup is up to 9.3.

Highlights

  • The Eight Queens problem was first proposed by Max Bezzel in a Berlin chess magazine in 1848 [1]. The original question was how to place eight queens on the chessboard so that no two of them can attack each other

  • To facilitate mutation, crossover, synthesis, splitting, and other operations in the evolution process of a heuristic algorithm, the candidate solution is usually encoded with integers and expressed as a one-dimensional array or vector. The subscripts of the array or vector are used as abscissas, and the element values are used as ordinates

  • In the simulated annealing (SA) algorithm, the performance gain of the fitness function brought by GPU parallelism directly improves the performance of the SA algorithm
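The link between fitness evaluation and overall SA running time can be illustrated with a minimal Python sketch. This is not the paper's implementation: it assumes a permutation encoding, a swap move, and a geometric cooling schedule, and the names `conflicts` and `anneal` as well as every parameter value are hypothetical. The point it shows is that the O(N^2) fitness function is called once per iteration, so any speedup of that call carries over to the whole loop.

```python
import math
import random


def conflicts(nq):
    """Serial fitness: number of attacking queen pairs.

    nq[x] is the row (ordinate) of the queen in column x (abscissa).
    """
    n = len(nq)
    total = 0
    for xi in range(n):
        for xj in range(xi + 1, n):
            # Same row, or same diagonal (row distance equals column distance).
            if nq[xi] == nq[xj] or abs(nq[xj] - nq[xi]) == xj - xi:
                total += 1
    return total


def anneal(n, steps=20000, t0=1.0, alpha=0.999, seed=0):
    """Simulated annealing sketch (assumed schedule and move, not from the paper)."""
    rng = random.Random(seed)
    current = list(range(n))          # a permutation: one queen per row and column
    rng.shuffle(current)
    cur_cost = conflicts(current)
    best, best_cost = current[:], cur_cost
    t = t0
    for _ in range(steps):
        if best_cost == 0:
            break
        i, j = rng.sample(range(n), 2)
        current[i], current[j] = current[j], current[i]   # swap move
        new_cost = conflicts(current)                     # fitness call: the hot spot
        delta = new_cost - cur_cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = current[:], cur_cost
        else:
            current[i], current[j] = current[j], current[i]  # reject: undo the swap
        t *= alpha                                        # geometric cooling
    return best, best_cost
```

Calling `anneal(8)` returns the best permutation found together with its conflict count; a count of 0 means the permutation is a valid placement.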


Summary

Parallel Schemes of Fitness Function

An N-Queens problem is a two-dimensional optimization problem. To facilitate mutation, crossover, synthesis, splitting, and other operations in the evolution process of a heuristic algorithm, the candidate solution is usually encoded with integers and expressed as a one-dimensional array or vector. The subscripts of the array or vector are used as abscissas, and the element values are used as ordinates.

ALGORITHM 8: Calculation of conflicts with fine-grained scheme 1 in GPU.
Input: N, NQ, Pairs
Output: conflicts
    tid ⟵ global thread id in GPU kernel
    Xi ⟵ Pairs[tid].x1
    Xj ⟵ Pairs[tid].x2
    Yi ⟵ NQ[Xi]
    Yj ⟵ NQ[Xj]
    if Yi = Yj or |Yj − Yi| = Xj − Xi
        atomicAdd(conflicts)
    end

Input: N, NQ
Output: conflicts
    Xi ⟵ global thread id in kernel
    Yi ⟵ NQ[Xi]
    for Xj ⟵ Xi + 1; Xj
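The two decompositions above can be emulated on the CPU to check that they count the same conflicts. The following Python sketch is only an illustration under stated assumptions: an ordinary loop over `tid` stands in for GPU threads, `sum` stands in for `atomicAdd` on a shared counter, and the function names (`build_pairs`, `pair_kernel`, `column_kernel`, and so on) are hypothetical. The second listing is truncated in the text, so the sketch assumes its loop runs over all columns Xj > Xi and applies the same conflict test as Algorithm 8.

```python
from itertools import combinations


def build_pairs(n):
    """Precompute every column pair (x1, x2) with x1 < x2, one per emulated thread."""
    return list(combinations(range(n), 2))


def pair_kernel(tid, nq, pairs):
    """Work of one emulated thread in the pair-per-thread scheme (Algorithm 8)."""
    xi, xj = pairs[tid]
    yi, yj = nq[xi], nq[xj]
    # Conflict if the queens share a row or a diagonal.
    return 1 if (yi == yj or abs(yj - yi) == xj - xi) else 0


def conflicts_by_pair(nq):
    pairs = build_pairs(len(nq))
    # sum() plays the role of atomicAdd on the shared conflicts counter.
    return sum(pair_kernel(tid, nq, pairs) for tid in range(len(pairs)))


def column_kernel(tid, nq):
    """Work of one emulated thread in the column-per-thread scheme.

    Assumption: the truncated loop visits every column xj > xi.
    """
    xi = tid
    yi = nq[xi]
    local = 0
    for xj in range(xi + 1, len(nq)):
        yj = nq[xj]
        if yi == yj or abs(yj - yi) == xj - xi:
            local += 1
    return local


def conflicts_by_column(nq):
    return sum(column_kernel(tid, nq) for tid in range(len(nq)))
```

The pair-per-thread version needs N(N − 1)/2 threads that each do constant work, while the column-per-thread version needs N threads with uneven loop lengths; both return the same count for any board.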

Experiment
Application of the GPU Coarse-Grained Scheme to Simulated Annealing
Findings
Conclusions
