UFront: Toward A Unified MLIR Frontend for Deep Learning

Abstract

Automatic code generation for ML systems has gained popularity with the advent of compiler techniques such as Multi-Level Intermediate Representation (Multi-Level IR, or MLIR). State-of-the-art MLIR frontends, including IREE-TF, Torch-MLIR, and ONNX-MLIR, aim to bridge the gap between ML frameworks and low-level hardware architectures through MLIR's progressive lowering pipeline. However, existing MLIR frontends suffer from inflexible high-level IR conversion, limited higher-level optimization opportunities, and reduced compatibility and efficiency, which leads to software fragmentation and restricts their practical use within the MLIR ecosystem. To address these challenges, we introduce UFront, a unified MLIR frontend that employs a two-stage operator-to-operator compilation workflow. Unlike traditional frontends, which compile model source code into binaries step by step through a series of MLIR transform passes, UFront decouples the process into two distinct stages. In the first stage, it performs instantaneous model tracing, maps the traced computing nodes to standard Deep Neural Network (DNN) operators, and transforms models written in different frameworks into a unified high-level IR without relying on MLIR passes, which enhances conversion flexibility. In the same stage, it applies high-level graph optimizations such as constant folding and operator fusion to produce a more efficient high-level IR. In the second stage, UFront directly converts the high-level IR into standard TOSA IR using the proposed lowering patterns, eliminating transform redundancies and ensuring lower-level compatibility with existing ML compiler backends. This two-stage compilation approach enables consistent end-to-end code generation and optimization of various DNN models written in different formats within a single workflow.
Extensive experiments on popular DNN models written in various frameworks demonstrate that UFront achieves higher compatibility and faster end-to-end compilation, and produces more efficient binaries, than state-of-the-art (SOTA) works.
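
As a rough illustration of the kind of high-level graph optimization the abstract mentions, the following is a minimal sketch of constant folding on a toy operator graph. It is not UFront's implementation or API: the `Node` class, the operator names, and the helpers `fold_constants` and `prune_unused` are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
import operator


# Hypothetical toy IR node (an assumption for illustration, not UFront's IR).
@dataclass
class Node:
    name: str
    op: str                          # "input", "const", "add", or "mul"
    inputs: Tuple[str, ...] = ()
    value: Optional[float] = None    # set only for "const" nodes


# Binary operators that can be evaluated at compile time.
FOLDABLE = {"add": operator.add, "mul": operator.mul}


def fold_constants(nodes: List[Node]) -> List[Node]:
    """Replace every foldable op whose inputs are all constants with a
    single const node. Assumes `nodes` is in topological order."""
    seen: Dict[str, Node] = {}
    result: List[Node] = []
    for node in nodes:
        if node.op in FOLDABLE and all(seen[i].op == "const" for i in node.inputs):
            lhs, rhs = (seen[i].value for i in node.inputs)
            node = Node(node.name, "const", (), FOLDABLE[node.op](lhs, rhs))
        seen[node.name] = node
        result.append(node)
    return result


def prune_unused(nodes: List[Node], outputs: List[str]) -> List[Node]:
    """Drop nodes that no longer contribute to any graph output."""
    live = set(outputs)
    kept: List[Node] = []
    for node in reversed(nodes):
        if node.name in live:
            kept.append(node)
            live.update(node.inputs)
    return list(reversed(kept))


# y = x * (2.0 + 3.0): the addition depends only on constants.
graph = [
    Node("x", "input"),
    Node("c1", "const", value=2.0),
    Node("c2", "const", value=3.0),
    Node("s", "add", ("c1", "c2")),
    Node("y", "mul", ("x", "s")),
]

optimized = prune_unused(fold_constants(graph), outputs=["y"])
for node in optimized:
    print(node)
```

In this sketch the addition is evaluated once at compile time, so the optimized graph keeps only the input, one folded constant, and the multiplication. A real frontend would operate on tensors and typed operators rather than scalar floats.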
