A Single-Shot Generalized Device Placement for Large Dataflow Graphs

Yanqi Zhou, James Laudon, Peter Ma, Qiumin Xu, Daniel Lin-Kit Wong, Sudip Roy, Amirali Abdolrashidi, Azalia Mirhoseini

doi:10.1109/mm.2020.3015188

Abstract

With increasingly complex neural network architectures and heterogeneous device characteristics, finding a reasonable graph partitioning and device placement strategy is challenging. There have been prior attempts at learned approaches for solving device placement, but these approaches are computationally expensive, unable to handle large graphs consisting of over 50,000 nodes, and do not generalize well to unseen graphs. To address all these limitations, we propose an efficient single-shot, generalized deep RL method (SGDP) based on a scalable sequential attention mechanism over a graph neural network that is transferable to new graphs. On a diverse set of representative deep learning models, our method achieves on average a 20% improvement over human placement and an 18% improvement over the prior art, with 15× faster convergence. We are the first to demonstrate superhuman performance on an 8-layer recurrent neural network language model and an 8-layer GNMT consisting of over 50,000 nodes, on 8 GPUs. We provide rationales for, and a sensitivity study of, our model architecture selections.

Full Text