Abstract

We describe our experience porting the Regensburg implementation of the DD-αAMG solver from QPACE 2 to QPACE 3. We first review how the code was ported from the first-generation Intel Xeon Phi processor (Knights Corner) to its successor (Knights Landing). We then describe the modifications to the communication library necessitated by the switch from InfiniBand to Omni-Path. Finally, we present the performance of the code on a single processor as well as its scaling across many nodes; in both cases the speedup factor is close to theoretical expectations.

Highlights

  • The lattice QCD (LQCD) community has traditionally been an early adopter of new computing and network architectures

  • The subject of this contribution was the port of our existing code base for QPACE 2 to our new machine QPACE 3

  • On Knights Corner (KNC) we could achieve a significant performance gain using half precision, but on Knights Landing (KNL) half precision deteriorates performance rather than improving it, at least with our current implementation

Summary

Introduction

The lattice QCD (LQCD) community has traditionally been an early adopter of new computing and network architectures. This typically requires major effort to port simulation code or even communication libraries. The present contribution focuses on the software work needed to run the Regensburg implementation of the DD-αAMG solver efficiently on QPACE 3.
