A High-Performance Parallel FDTD Method Enhanced by Using SSE Instruction Set

Dau-Chyrh Chang, Lihong Zhang, Xiaoling Yang, Shao-Hsiang Yen, Wenhua Yu

doi:10.1155/2012/851465

Abstract

We introduce a hardware acceleration technique for the parallel finite difference time domain (FDTD) method using the SSE (streaming SIMD (single instruction, multiple data) extensions) instruction set. Applying the SSE instruction set to the parallel FDTD method significantly improves simulation performance. Benchmarks of the SSE acceleration on both a multi-CPU workstation and a computer cluster demonstrate the advantages of vector arithmetic logic unit (VALU) acceleration over GPU acceleration. Several engineering applications are used to demonstrate the performance of the parallel FDTD method enhanced by the SSE instruction set.

Highlights

Since the finite difference time domain (FDTD) method was first proposed by Yee in 1966 [1] as a difference algorithm for solving Maxwell’s equations, it has grown over decades of development into a popular computational electromagnetic technique and is a major electromagnetic simulation tool today.
To overcome this disadvantage of the FDTD method, a large number of publications address related topics such as the conformal technique [2] to enlarge the cell size, the subgridding scheme [3] to use a local fine mesh, and the ADI algorithm [4] to increase the time step size.
Parallel processing based on the MPI library [5] and OpenMP [6] uses multicore PCs, multi-CPU workstations, and computer clusters to speed up the electromagnetic simulation.

Summary

Introduction

Since the FDTD method was first proposed by Yee in 1966 [1] as a difference algorithm for solving Maxwell’s equations, it has grown over decades of development into a popular computational electromagnetic technique and is a major electromagnetic simulation tool today. Unlike GPU acceleration, VALU acceleration fully uses features built into a regular CPU. Although the VALU has been inside the CPU for many years, since the Intel Pentium III, it has not been used to accelerate engineering simulations. The VALU acceleration is illustrated through several typical examples on both Intel and AMD processors. FDTD simulations accelerated with the VALU benefit more on AMD processors than on Intel processors. We first briefly introduce the parallel FDTD method and describe the basic architecture of a regular CPU in Section 3. Finally, we summarize the work in this paper and point out future work.
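To make the idea of VALU acceleration concrete, the following is a minimal sketch, not the authors' implementation, of how one FDTD-style field update might be vectorized with SSE intrinsics. It assumes a simplified one-dimensional update of the form ex[i] += coef * (hy[i] - hy[i-1]) with a single uniform coefficient. The function names `update_e_scalar` and `update_e_sse` and the array names `ex` and `hy` are hypothetical and chosen only for illustration. The SSE path processes four single-precision values per instruction and falls back to a scalar loop for the remaining cells, or for the whole array on CPUs without SSE.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_*_ps */
#endif

/* Scalar reference update (hypothetical 1D form, uniform coefficient):
 * ex[i] += coef * (hy[i] - hy[i-1]) for i = 1 .. n-1. */
void update_e_scalar(float *ex, const float *hy, float coef, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        ex[i] += coef * (hy[i] - hy[i - 1]);
}

/* Same update, four cells at a time when SSE is available. */
void update_e_sse(float *ex, const float *hy, float coef, size_t n)
{
    size_t i = 1;
#if defined(__SSE__)
    const __m128 c = _mm_set1_ps(coef);          /* broadcast coefficient */
    for (; i + 4 <= n; i += 4) {
        __m128 h1 = _mm_loadu_ps(hy + i);        /* hy[i .. i+3]   */
        __m128 h0 = _mm_loadu_ps(hy + i - 1);    /* hy[i-1 .. i+2] */
        __m128 e  = _mm_loadu_ps(ex + i);        /* ex[i .. i+3]   */
        e = _mm_add_ps(e, _mm_mul_ps(c, _mm_sub_ps(h1, h0)));
        _mm_storeu_ps(ex + i, e);                /* write back four cells */
    }
#endif
    /* Scalar tail: leftover cells, or everything without SSE. */
    for (; i < n; ++i)
        ex[i] += coef * (hy[i] - hy[i - 1]);
}
```

Unaligned loads and stores are used here because the shifted access `hy + i - 1` cannot be 16-byte aligned at the same time as `hy + i`. A production code would more likely align the arrays and handle the shifted operand separately, which this sketch does not attempt.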

Parallel FDTD Acceleration Method

FDTD Performance Investigation

Engineering Applications

Findings

Conclusions