Abstract
The N-Queens problem plays an important role in academic research and practical applications. Heuristic algorithms are often used to solve variant 2 of the N-Queens problem. During solving, evaluating candidate solutions with the fitness function often takes up the vast majority of the running time, so accelerating it is the key to improving speed. In this paper, three parallel schemes based on CPU and four parallel schemes based on GPU are proposed, and a serial scheme is implemented as the baseline. The experimental results show that, for a large-scale N-Queens problem, the coarse-grained GPU scheme achieves a maximum 307-fold speedup over its single-threaded CPU counterpart in evaluating a candidate solution. When the coarse-grained GPU scheme is applied to simulated annealing for solving N-Queens problem variant 2 with a problem size of no more than 3000, the speedup is up to 9.3.
Highlights
Introduction: The Eight Queens problem was first proposed by Max Bezzel in a Berlin chess magazine in 1848 [1]. The original question was how to place eight queens on the chessboard so that none of them can attack another.
To facilitate mutation, crossover, synthesis, splitting, and other operations in the evolution process of the heuristic algorithm, the candidate solution is usually encoded with integers and expressed as a one-dimensional array or vector. The subscripts of the array or vector are used as abscissas, and the element values are used as ordinates.
Speedup of fitness function; speedup of SA algorithm: in the simulated annealing (SA) algorithm, the performance gain of the fitness function brought by GPU parallelism directly improves the performance of the SA algorithm.
Summary
An N-Queens problem is a two-dimensional optimization problem. To facilitate mutation, crossover, synthesis, splitting, and other operations in the evolution process of the heuristic algorithm, the candidate solution is usually encoded with integers and expressed as a one-dimensional array or vector. The subscripts of the array or vector are used as abscissas, and the element values are used as ordinates.

Input: N, NQ, Pairs
Output: conflicts
(1) tid ⟵ global thread id in GPU Kernel
(2) Xi ⟵ Pairs[tid].x1
(3) Xj ⟵ Pairs[tid].x2
(4) Yi ⟵ NQ[Xi]
(5) Yj ⟵ NQ[Xj]
(6) if Yi = Yj or |Yj − Yi| = Xj − Xi
(7)     atomicAdd(conflicts)
(8) end
ALGORITHM 8: Calculation of conflicts with fine-grained scheme 1 in GPU.

Input: N, NQ
Output: conflicts
(1) Xi ⟵ global thread id in Kernel;
(2) Yi ⟵ NQ[Xi];
(3) for Xj ⟵ Xi + 1; Xj
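The encoding and the pairwise conflict test described above can be illustrated with a minimal serial sketch in Python. It is not the paper's implementation: the function name conflicts_pairwise and the sample boards are hypothetical, and the loop over all column pairs simply stands in for the per-thread work that the pairwise GPU pseudocode assigns to one thread (an assumption about how the pair list is built). The list index plays the role of the abscissa X and the stored value the ordinate Y.

```python
from itertools import combinations


def conflicts_pairwise(nq):
    """Count attacking queen pairs in a candidate solution.

    nq[x] is the row (ordinate) of the queen in column x (abscissa).
    Hypothetical helper for illustration; not taken from the paper.
    """
    conflicts = 0
    # Enumerate every column pair (xi, xj) with xi < xj, so xj - xi > 0.
    for xi, xj in combinations(range(len(nq)), 2):
        yi, yj = nq[xi], nq[xj]
        # Conflict: same row, or same diagonal
        # (row distance equals column distance).
        if yi == yj or abs(yj - yi) == xj - xi:
            conflicts += 1
    return conflicts


if __name__ == "__main__":
    print(conflicts_pairwise([1, 3, 0, 2]))  # a conflict-free 4-queens board
    print(conflicts_pairwise([0, 1, 2, 3]))  # all queens on one diagonal
```

Each pair is tested independently of every other pair, which is why the check maps naturally onto one GPU thread per pair; in that setting the shared counter would need an atomic update, as the atomicAdd step in the pseudocode indicates.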