GPU-Accelerated Population Annealing Algorithm: Frustrated Ising Antiferromagnet on the Stacked Triangular Lattice

Michal Borovský, Lev Yu Barash, Milan Žukovič, Martin Weigel, Gh Adam, J Buša, M Hnatič

doi:10.1051/epjconf/201610802016

Open Access

Abstract

The population annealing algorithm is a novel approach to studying systems with rough free-energy landscapes, such as spin glasses. It combines simulated annealing, Boltzmann-weighted differential reproduction and a sequential Monte Carlo process to bring a population of replicas to equilibrium even in the low-temperature region. Moreover, it provides a very good estimate of the free energy. Because population annealing operates on a large number of replicas, each requiring many spin updates, it is a good candidate for massive parallelism. We chose GPU programming with a CUDA implementation to create a highly optimized simulation. It has previously been shown for the frustrated Ising antiferromagnet on the stacked triangular lattice with a ferromagnetic interlayer coupling that standard Markov Chain Monte Carlo simulations fail to equilibrate at low temperatures due to kinetic freezing of the ferromagnetically ordered chains. We applied population annealing to study the case with isotropic intra- and interlayer antiferromagnetic coupling (J_2/|J_1| = −1). The ground states reached are non-magnetic degenerate states in which the chains are antiferromagnetically ordered but there is no long-range ordering between them, analogous to the Wannier phase of the 2D triangular Ising antiferromagnet.

Highlights

We chose graphics processing unit (GPU) programming with a CUDA implementation to create a highly optimized simulation. It has previously been shown for the frustrated Ising antiferromagnet on the stacked triangular lattice with a ferromagnetic interlayer coupling that standard Markov Chain Monte Carlo simulations fail to equilibrate at low temperatures due to kinetic freezing of the ferromagnetically ordered chains.
In all simulations we successfully converged to the ground-state (GS) configuration, although the slopes of the energy curves differ slightly.
The best performance of the population annealing (PA) code achieved so far was in simulation C, with 0.208 ns per spin flip on an NVIDIA GTX Titan, a speedup of up to 443 times over the sequential Markov Chain Monte Carlo (MCMC) code (92.274 ns per spin flip), which ran on a single core of an Intel i7-4790K processor at 4.4 GHz.
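
As a quick arithmetic check on the quoted timings, the short sketch below recomputes the speedup and the implied GPU throughput from the two per-spin-flip times. Only the two timings come from the text; the variable names are illustrative.

```python
# Timings quoted in the text, in nanoseconds per spin flip.
t_gpu_ns = 0.208    # PA code, simulation C, NVIDIA GTX Titan
t_cpu_ns = 92.274   # sequential MCMC code, one core of an Intel i7-4790K

speedup = t_cpu_ns / t_gpu_ns        # ratio of the per-flip times
gpu_flips_per_s = 1.0e9 / t_gpu_ns   # spin flips per second on the GPU

print(f"speedup: {speedup:.1f}x")
print(f"GPU throughput: {gpu_flips_per_s:.2e} spin flips/s")
```

The ratio comes out at about 443.6, consistent with the quoted speedup of up to 443 times, and corresponds to roughly 4.8 × 10^9 spin flips per second on the GPU.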

Summary

Introduction

The speedup that can be gained compared to sequential CPU computing depends strongly on our knowledge of the GPU CUDA architecture and the way it executes kernels. It takes a lot of thought and caution to incorporate all of this into a highly optimized CUDA program. Our second goal is to study the highly frustrated Ising antiferromagnet on the stacked triangular lattice, which suffers from slow spin dynamics in the low-temperature region [6], where standard Markov Chain Monte Carlo (MCMC) simulations fail. This problem is briefly discussed in Section 4. To deal with it, we applied the PA algorithm to this system; the results are presented in Section 5.

Population annealing algorithm
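
The abstract describes population annealing as a combination of simulated annealing, Boltzmann-weighted differential reproduction and a sequential Monte Carlo process that also yields a free-energy estimate. The following is a minimal single-threaded sketch of those three ingredients, not the authors' CUDA code. It assumes the Hamiltonian convention H = −J1 Σ(intralayer) s_i s_j − J2 Σ(interlayer) s_i s_k with J1 = J2 = −1 (the text gives only the ratio J_2/|J_1| = −1), a small periodic lattice, a linear schedule in inverse temperature, and Metropolis updates. All function names, lattice sizes and parameter values are hypothetical.

```python
import math
import random

def neighbors(L, Lz):
    """Neighbor tables for a periodic L x L x Lz stacked triangular lattice."""
    def idx(x, y, z):
        return ((z % Lz) * L + (y % L)) * L + (x % L)
    intra, inter = [], []
    for z in range(Lz):
        for y in range(L):
            for x in range(L):
                # Six in-plane neighbors (triangular lattice in skewed coordinates).
                intra.append([idx(x + 1, y, z), idx(x - 1, y, z),
                              idx(x, y + 1, z), idx(x, y - 1, z),
                              idx(x + 1, y - 1, z), idx(x - 1, y + 1, z)])
                # Two neighbors along the stacking direction.
                inter.append([idx(x, y, z + 1), idx(x, y, z - 1)])
    return intra, inter

def energy(s, intra, inter, J1, J2):
    """Total energy; each bond appears twice in the tables, hence the 0.5."""
    e = 0.0
    for i, si in enumerate(s):
        e -= si * (J1 * sum(s[j] for j in intra[i]) +
                   J2 * sum(s[k] for k in inter[i]))
    return 0.5 * e

def metropolis_sweep(s, intra, inter, J1, J2, beta, rng):
    """One sequential Metropolis sweep over all spins at inverse temperature beta."""
    for i in range(len(s)):
        h = J1 * sum(s[j] for j in intra[i]) + J2 * sum(s[k] for k in inter[i])
        dE = 2.0 * s[i] * h  # energy change if spin i is flipped
        if dE <= 0.0 or rng.random() < math.exp(-beta * dE):
            s[i] = -s[i]

def population_annealing(L=3, Lz=4, J1=-1.0, J2=-1.0, R0=100,
                         betas=None, sweeps=3, seed=1):
    """Return (population, energies, estimate of ln Z at the final beta)."""
    rng = random.Random(seed)
    intra, inter = neighbors(L, Lz)
    N = L * L * Lz
    if betas is None:
        betas = [0.15 * k for k in range(21)]  # beta = 0 ... 3 (assumed schedule)
    # At beta = 0 every configuration is equally likely and ln Z = N ln 2.
    pop = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(R0)]
    log_z = N * math.log(2.0)
    for b_old, b_new in zip(betas, betas[1:]):
        db = b_new - b_old
        energies = [energy(s, intra, inter, J1, J2) for s in pop]
        e_min = min(energies)  # shift energies to avoid overflow in exp
        w = [math.exp(-db * (e - e_min)) for e in energies]
        q = sum(w) / len(w)
        log_z += math.log(q) - db * e_min  # accumulates ln of the mean weight
        # Differential reproduction: replica r gets tau_r copies on average.
        new_pop = []
        for s, wr in zip(pop, w):
            tau = (R0 / len(pop)) * wr / q
            copies = int(tau) + (rng.random() < tau - int(tau))
            new_pop.extend(list(s) for _ in range(copies))
        pop = new_pop
        # Re-equilibrate every replica at the new temperature.
        for s in pop:
            for _ in range(sweeps):
                metropolis_sweep(s, intra, inter, J1, J2, b_new, rng)
    energies = [energy(s, intra, inter, J1, J2) for s in pop]
    return pop, energies, log_z
```

Calling `population_annealing()` returns the final population, the replica energies and the running estimate of ln Z, from which the free energy follows as F = −ln Z / β. The replicas are independent between resampling steps, which is the property that makes the method map naturally onto GPU threads. Under the assumed couplings the energy per spin is bounded below by −2, so `min(energies) / 36` for the default lattice can be compared against that bound as a rough convergence check.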

GPU realization

Stacked triangular Ising antiferromagnet

Results

Conclusions