Abstract
The Cell Broadband Engine architecture is a revolutionary processor architecture well suited for many scientific codes. This paper reports on an effort to implement several traditional high-performance scientific computing applications on the Cell Broadband Engine processor, including molecular dynamics, quantum chromodynamics and quantum chemistry codes. The paper discusses data and code restructuring strategies necessary to adapt the applications to the intrinsic properties of the Cell processor and demonstrates performance improvements achieved on the Cell architecture. It concludes with the lessons learned and provides practical recommendations on optimization techniques that are believed to be most appropriate.
Highlights
In this paper, three case studies are presented in which computationally demanding applications are implemented on the Cell Broadband Engine (Cell/B.E.) processor.
The molecular dynamics code is derived from the data layout and inner loop of Nanoscale Molecular Dynamics (NAMD) to form a compact benchmark for SPEC CPU2006, a CPU-intensive benchmark suite.
The reference implementation [23] of the two-electron repulsion integrals (ERIs) evaluation code, which is the computational core of the direct self-consistent field (SCF) method, is written from scratch following the well-known algorithms from the General Atomic and Molecular Electronic Structure System (GAMESS) [18] and other ab initio quantum chemistry packages.
Summary
Three case studies are presented in which computationally demanding applications are implemented on the Cell Broadband Engine (Cell/B.E.) processor. The applications chosen for the Cell/B.E. implementation are classic examples of high-performance computing (HPC) codes that are typically executed on large-scale parallel systems. MILC (MIMD Lattice Computation), the quantum chromodynamics code, belongs to a class of applications based on structured grids [2], where the majority of computations on the grid points are vector algebra operations. The reference implementation [23] of the two-electron repulsion integrals (ERIs) evaluation code, which is the computational core of the direct self-consistent field (SCF) method, is written from scratch following the well-known algorithms from the General Atomic and Molecular Electronic Structure System (GAMESS) [18] and other ab initio quantum chemistry packages. The eight SPUs of the Cell/B.E. processor have a combined theoretical peak performance of 204.8 GFLOPS in single precision and 14.63 GFLOPS in double precision.
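The peak figures quoted above can be reproduced with a short back-of-the-envelope calculation. The sketch below is illustrative only and rests on assumptions that the summary does not state: a 3.2 GHz SPU clock, a fused multiply-add counted as two floating-point operations, four single-precision or two double-precision lanes per 128-bit register, and one double-precision SIMD instruction issued every seven cycles. The function name `spu_peak_gflops` and its parameters are hypothetical.

```python
# Rough theoretical peak for the eight SPUs of the Cell/B.E. processor.
# All hardware parameters below are assumptions, not taken from the summary:
#   - 3.2 GHz clock
#   - fused multiply-add counted as 2 flops per lane
#   - 4 single-precision or 2 double-precision lanes per 128-bit register
#   - double-precision SIMD instructions issue once every 7 cycles

def spu_peak_gflops(clock_ghz, lanes, flops_per_lane=2,
                    issue_interval_cycles=1, num_spus=8):
    """Return the combined peak in GFLOPS under the stated assumptions."""
    per_spu = clock_ghz * lanes * flops_per_lane / issue_interval_cycles
    return num_spus * per_spu

single = spu_peak_gflops(3.2, lanes=4)
double = spu_peak_gflops(3.2, lanes=2, issue_interval_cycles=7)

print(f"single precision: {single:.1f} GFLOPS")
print(f"double precision: {double:.2f} GFLOPS")
```

Under these assumptions the single-precision peak is roughly fourteen times the double-precision peak, which suggests why the choice of precision matters so much when porting codes to this processor.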