TSAR-ILP: Tile-Based, Synchronization-AwaRe ILP Allocating Heterogeneous Platforms for Streaming Applications

Bruno Morais, Gunar Schirner, Jinghan Zhang

doi:10.1109/tcad.2023.3274050

Abstract

Automatic design space exploration (DSE) is key in hardware-software (HW/SW) co-design. To cope with the large design space, explorations are often heuristic-based and/or approximate, yielding solutions that may be only locally optimal. Without knowing the globally optimal solution, strong assertions about upper/lower performance bounds cannot be made. In contrast, integer linear programming (ILP) formulations can produce exact (optimal) solutions. Previous ILP-based formulations, however, lack support for tile-based architectures and realistic synchronization models, limiting their DSE capabilities. This work introduces a tile-based, synchronization-aware ILP (TSAR-ILP) formulation that overcomes these limitations. With TSAR-ILP, the allocation/binding problems are introduced and formalized, attaining optimal solutions for mapping streaming applications onto template platforms. Using TSAR-ILP, this work explores a hardware accelerator-rich (HWACC-rich) platform with direct HWACC-to-HWACC communication under HW area constraints for 40 OpenVX applications. To illustrate the design opportunities given by (a) the ILP formulation and (b) direct HWACC-to-HWACC communication, this paper analyzes the impact of job size. Results show that selecting smaller job sizes yields performance improvements and lower area usage at the cost of slightly increased synchronization overhead. Reducing the job size from 1 kB to 256 bytes gives a 3.51x average performance increase across the 40 applications. Finally, a scalability analysis using a set of 5000 synthetic applications of varying size (10-125 nodes) shows that DSE with TSAR-ILP is not prohibitive, with 94.3% of applications reaching optimal solutions in under 60 seconds.
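As a rough illustration of what an exact (rather than heuristic) answer to an area-constrained binding problem looks like, the following minimal sketch enumerates every software/accelerator binding of a three-node streaming pipeline and keeps the one with the smallest steady-state period. It is not the TSAR-ILP formulation: it ignores tiles, synchronization, and communication, and it uses brute-force enumeration in place of an ILP solver. The node names, timing and area numbers, the single shared processor, and the function `best_binding` are all hypothetical assumptions made for this example.

```python
from itertools import product

# Hypothetical per-node costs: (software time, accelerator time, accelerator area).
NODES = {
    "blur":   (8, 2, 3),
    "sobel":  (6, 1, 2),
    "thresh": (2, 1, 1),
}
AREA_BUDGET = 4  # hypothetical area units


def best_binding(nodes, area_budget):
    """Try every SW/HW binding; return (period, binding) with the smallest period."""
    names = list(nodes)
    best = None
    for choice in product(("sw", "hw"), repeat=len(names)):
        # Area is only consumed by nodes bound to an accelerator.
        area = sum(nodes[n][2] for n, c in zip(names, choice) if c == "hw")
        if area > area_budget:
            continue  # infeasible under the area constraint
        # Assumption: one shared processor runs all SW nodes back to back,
        # while each HW node gets its own accelerator and runs in parallel.
        sw_load = sum(nodes[n][0] for n, c in zip(names, choice) if c == "sw")
        hw_loads = [nodes[n][1] for n, c in zip(names, choice) if c == "hw"]
        period = max([sw_load] + hw_loads)  # slowest resource bounds throughput
        if best is None or period < best[0]:
            best = (period, dict(zip(names, choice)))
    return best


period, binding = best_binding(NODES, AREA_BUDGET)
print(period, binding)
```

Because every feasible binding is visited, the result is globally optimal for this toy model, but the search grows as 2^n in the number of nodes. An ILP formulation expresses the same choices as binary decision variables with linear constraints, so a solver can prune the space instead of enumerating it.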

Full Text