Early experiences on Summit: Data analytics and AI applications

D E Womble, J T Johnston, W Joubert, J C Wells, M Shankar, J A Nichols

doi:10.1147/jrd.2019.2944146


Abstract


Oak Ridge National Laboratory (ORNL) installed the Summit supercomputer in 2018. Summit is an accelerated-node architecture with 4,608 nodes, each with two IBM POWER9 CPUs and six NVIDIA Volta V100 GPUs, a significant DRAM footprint, robust quantities of HBM supporting the GPUs, nonvolatile memory, and fast NVLink and InfiniBand interconnects. The machine was designed to deliver over 200 peak double-precision petaflops for scientific modeling and simulation applications and over 3 peak reduced-precision ExaOps. How these features affect application performance depends on whether a code is simulation-oriented, write-intensive, data-analysis-oriented, read-intensive, or communication-intensive. In the context of artificial intelligence (AI) and machine learning (ML), these features support data-intensive applications that infer and predict statistical relationships in complex datasets. This article presents recent experiences at ORNL using Summit for applications in AI and ML and describes example code and algorithmic changes necessary to use Summit effectively. Finally, this article discusses research directions in scalable ML, including algorithms research and the combination of data analysis with modeling and simulation in an accelerated-node, exascale environment.
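The headline peak figures can be sanity-checked from the node counts given in the abstract. The sketch below is a rough back-of-the-envelope estimate, not the authors' method. The node and GPU counts come from the abstract; the per-GPU peak rates (7.8 double-precision teraflops and 125 mixed-precision tensor teraflops for a V100 in the SXM2 form factor) are assumptions taken from NVIDIA's published datasheet figures, and CPU contributions are ignored.

```python
# Back-of-the-envelope estimate of Summit's aggregate GPU peak rates.
NODES = 4608          # from the abstract
GPUS_PER_NODE = 6     # from the abstract

# Assumed per-GPU peaks (NVIDIA V100 SXM2 datasheet values, not from the abstract).
V100_FP64_TFLOPS = 7.8       # double precision
V100_TENSOR_TFLOPS = 125.0   # reduced (mixed) precision on tensor cores


def aggregate_peak_pflops(per_gpu_tflops, nodes=NODES, gpus_per_node=GPUS_PER_NODE):
    """Return the aggregate GPU peak in petaflops (1 petaflop = 1000 teraflops)."""
    return per_gpu_tflops * nodes * gpus_per_node / 1000.0


total_gpus = NODES * GPUS_PER_NODE
fp64_pflops = aggregate_peak_pflops(V100_FP64_TFLOPS)
tensor_exaops = aggregate_peak_pflops(V100_TENSOR_TFLOPS) / 1000.0

print(f"GPUs: {total_gpus}")
print(f"Peak double precision: {fp64_pflops:.1f} petaflops")
print(f"Peak reduced precision: {tensor_exaops:.2f} ExaOps")
```

Under these assumptions, the GPUs alone account for both the "over 200 petaflops" and the "over 3 ExaOps" figures quoted above.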
