Abstract

The non-equilibrium Green’s function (NEGF) formalism is widely used in nanoscience to predict the transport behavior of electronic devices. This work explores how much quantum transport simulations, whose core numerical operation is a recursive process of matrix multiplication, can be accelerated with manycore computing. The main techniques adopted for performance enhancement are data restructuring, matrix tiling, thread scheduling, and offload computing, and we present technical details on how they are applied to optimize simulation performance on computing hardware including Intel Xeon Phi Knights Landing (KNL) systems and NVIDIA general-purpose graphics processing unit (GPU) devices. Using a target structure of a silicon nanowire that consists of 100,000 atoms and is described with an atomistic tight-binding model, the effects of the optimization techniques on simulation performance are rigorously tested on a KNL node equipped with two Quadro GV100 GPU devices, and we observe that computation is accelerated by a factor of up to ∼20 relative to the unoptimized case. The feasibility of handling large-scale workloads in a large computing environment is also examined with nanowire simulations over a wide energy range, where good scalability is obtained on up to 2048 KNL nodes.

Highlights

  • The non-equilibrium Green’s function (NEGF) formalism [1] is essential to predict quantum transport behaviors of carriers in ultra-scale electronic devices such as nanowire transistors [2], quantum dot photodetectors [3], and low-dimensional devices [4]

  • Using our in-house code package, named the quantum simulation tool for advanced nanoscale device designs (QAND) [7,8], which employs tight-binding models for atomistic representation of semiconductor nanostructures [5,9] and has been actively used for modeling studies of device designs with solid connections to experiments [10,11], we apply the optimization strategies to our NEGF solver and rigorously conduct performance tests to understand how they affect performance on a single computing node equipped with a 64-core Intel Xeon Phi Knights Landing (KNL) processor [12] and two NVIDIA Quadro GV100 general-purpose graphics processing unit (GPU) devices [13]

  • Since details of available computing resources vary depending on the GPU computing capability, it is critical to have a precise understanding of the architecture and the hardware specification of GPU devices that will be utilized to accelerate the kernel code


Summary

Introduction

The non-equilibrium Green’s function (NEGF) formalism [1] is essential to predict quantum transport behaviors of carriers (electrons or holes) in ultra-scale electronic devices such as nanowire transistors [2], quantum dot photodetectors [3], and low-dimensional devices [4]. Using our in-house code package, named the quantum simulation tool for advanced nanoscale device designs (QAND) [7,8], which employs tight-binding models for atomistic representation of semiconductor nanostructures [5,9] and has been actively used for modeling studies of device designs with solid connections to experiments [10,11], we apply the optimization strategies to our NEGF solver and rigorously conduct performance tests to understand how they affect performance on a single computing node equipped with a 64-core Intel Xeon Phi Knights Landing (KNL) processor [12] and two NVIDIA Quadro GV100 general-purpose graphics processing unit (GPU) devices [13]. To verify that our NEGF solver can handle large-scale problems in large computing environments, strong scalability is tested on up to 2048 KNL nodes of the NURION supercomputer (the 21st fastest supercomputer in the world) [14] for end-to-end simulations of quantum transport over a wide energy range. Backed by the excellent speed-up and scalability observed in these tests, the technical details we deliver can serve as a practical guideline for how manycore computing resources can be used to accelerate quantum transport simulations, as well as other numerical problems involving multiplication of dense and complex matrices.
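
As a rough illustration of the matrix tiling mentioned above, the sketch below multiplies two small dense complex matrices block by block in plain Python. It is not taken from QAND: the names tiled_matmul and naive_matmul, the tile parameter, and the sample matrices are all hypothetical, and a production kernel would be written for the KNL or GPU hardware rather than in interpreted code. The point is only the loop structure, in which each step touches a tile-sized sub-block so that the working set can stay in fast memory.

```python
# Illustrative sketch only; tiled_matmul, naive_matmul, and tile are
# hypothetical names and are not part of the QAND package.

def tiled_matmul(a, b, tile=2):
    """Multiply a (n x k) by b (k x m) tile by tile; entries may be complex."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0j] * m for _ in range(n)]
    # The three outer loops walk over tile origins in the i, j, and p directions.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # The inner loops accumulate one tile-sized partial product.
                for i in range(i0, min(i0 + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = row_a[p]
                        row_b = b[p]
                        for j in range(j0, min(j0 + tile, m)):
                            row_c[j] += a_ip * row_b[j]
    return c


def naive_matmul(a, b):
    """Reference triple-loop product used to check the tiled version."""
    return [
        [sum(a[i][p] * b[p][j] for p in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Small made-up complex matrices, chosen so the arithmetic is exact.
a = [[1 + 1j, 2, 0], [0, 1j, 3], [4, 0, 1 - 1j]]
b = [[1, 0, 1j], [0, 2, 0], [1j, 0, 1]]
c_tiled = tiled_matmul(a, b, tile=2)
c_naive = naive_matmul(a, b)
```

The tile size here is arbitrary. In practice it would be chosen from the cache or shared-memory capacity of the target device, which this sketch does not model.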

