Finite-difference time-domain (FDTD) is a popular but computationally intensive method for solving Maxwell's equations when simulating electrical and optical devices. This paper presents implementations of three-dimensional FDTD with convolutional perfectly matched layer (CPML) absorbing boundary conditions on graphics processing units (GPUs). Electromagnetic fields in Yee cells are calculated in parallel by millions of threads arranged as a grid of blocks under the Compute Unified Device Architecture (CUDA) programming model, and considerable speedup factors are obtained over sequential CPU code. We extend the parallel algorithm to multiple GPUs in order to solve electrically large structures. An asynchronous memory copy scheme is used in the data exchange procedure to improve computational efficiency. We use this technique to simulate radiation from a point source and validate the result against a high-precision computation, with which it shows good agreement. With four commodity GTX295 graphics cards in a single personal computer, more than 4000 million Yee cells can be updated per second, which is hundreds of times faster than traditional CPU computation.
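
To make the per-cell parallelism concrete, the following is a minimal sequential sketch of one Yee update for a single field component (Ex), not the paper's implementation. The names `idx` and `update_ex`, the flat array layout, and the lumped coefficients `coeff_y` (standing for dt/(eps*dy)) and `coeff_z` (standing for dt/(eps*dz)) are assumptions made for illustration; CPML terms and the multi-GPU data exchange are omitted. In a CUDA version, the body of the innermost loop would presumably be the work done by one thread for one cell.

```python
# Sequential reference sketch of a Yee-cell update for the Ex component.
# Assumption: fields are stored as flat lists, with k as the fastest index.


def idx(i, j, k, ny, nz):
    """Flatten a 3D cell index (i, j, k) into a 1D array offset."""
    return (i * ny + j) * nz + k


def update_ex(ex, hy, hz, nx, ny, nz, coeff_y, coeff_z):
    """Advance Ex by one time step using the discrete curl of H.

    Discretizes dEx/dt = (1/eps) * (dHz/dy - dHy/dz), where
    coeff_y = dt / (eps * dy) and coeff_z = dt / (eps * dz).
    Cells with j == 0 or k == 0 are skipped (boundary handling omitted).
    """
    for i in range(nx):
        for j in range(1, ny):
            for k in range(1, nz):
                c = idx(i, j, k, ny, nz)
                # Backward differences of H on the staggered grid.
                dhz_dy = hz[c] - hz[idx(i, j - 1, k, ny, nz)]
                dhy_dz = hy[c] - hy[idx(i, j, k - 1, ny, nz)]
                # Each cell depends only on old H values, so all cells
                # can be updated independently (one GPU thread per cell).
                ex[c] += coeff_y * dhz_dy - coeff_z * dhy_dz
    return ex
```

Because every Ex value is computed only from H values of the previous half step, no cell update depends on another in the same pass, which is what allows the loop nest to be mapped onto a grid of thread blocks.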