Verified tensor-program optimization via high-level scheduling rewrites

Amanda Liu, Adam Chlipala, Gilbert Louis Bernstein, Jonathan Ragan-Kelley

doi:10.1145/3498717

Abstract

We present a lightweight Coq framework for optimizing tensor kernels written in a pure, functional array language. Optimizations rely on user scheduling using a series of verified, semantics-preserving rewrites. Unusually for compilation targeting imperative code with arrays and nested loops, all rewrites are source-to-source within a purely functional language. Our language comprises a set of core constructs for expressing high-level computation detail and a set of what we call reshape operators, which can be derived from core constructs but trigger low-level decisions about storage patterns and ordering. We demonstrate that not only is this system capable of deriving the optimizations of existing state-of-the-art languages like Halide and generating comparably performant code, it is also able to schedule a family of useful program transformations beyond what is reachable in Halide.

Highlights

In high-performance computing, a single natural algorithm over multidimensional arrays may have a bewildering variety of different code realizations, each optimized for performance on a different machine.
The transformations may be checked by compilers, so that functionality bugs can only be missed in the algorithm, not in specific optimizations of it.
Programming languages like Halide for graphics [Ragan-Kelley et al. 2013] and TVM for machine learning [Chen et al. 2018] have emerged to directly facilitate programming in this style, with compilers driven by optimization directives.

Summary

INTRODUCTION

We present a framework embedded in the Coq proof assistant, with a language of optimization commands that is simultaneously more formally assured and more flexible than in past work. We can imagine composing an algorithm soundness proof with one of our derivations of optimized code with correctness of a lower-level-language compiler or even a hardware accelerator, all of which are worthwhile future work. We return to define our language (including formal semantics) bottom-up, before proceeding through three crucial elements of our pipeline: basic scheduling rewrites, lowering to imperative code, and reshape operators. After an interlude explaining Coq encoding details, we present preliminary results from an empirical evaluation showing that we achieve competitive performance w.r.t. Halide on a small set of examples, managing to compile respectably fast versions of some algorithms beyond Halide’s applicability.
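To make the idea of a semantics-preserving scheduling rewrite concrete, the following is a minimal sketch in Python, not the paper's Coq embedding. It assumes a hypothetical generator construct `gen` standing in for a pure array language, and shows one algorithm (a 3-point sum) before and after splitting its index space into tiles. The names `gen`, `blur_spec`, and `blur_split` are illustrative assumptions. In the framework described above, the equivalence would be established by a verified rewrite rather than checked by running both versions.

```python
# Illustrative sketch only: plain Python standing in for a pure array language.
# `gen`, `blur_spec`, and `blur_split` are hypothetical names, not the paper's constructs.

def gen(n, f):
    """Build a length-n array by evaluating the pure function f at each index."""
    return [f(i) for i in range(n)]


def blur_spec(xs):
    """Reference algorithm: 3-point sum over a zero-padded input."""
    n = len(xs)
    at = lambda i: xs[i] if 0 <= i < n else 0
    return gen(n, lambda i: at(i - 1) + at(i) + at(i + 1))


def blur_split(xs, k):
    """Same algorithm after splitting the index space into tiles of width k.

    The outer generator ranges over tiles and the inner one over positions
    within a tile; the guard drops positions past the end of the array when
    k does not divide len(xs).
    """
    n = len(xs)
    at = lambda i: xs[i] if 0 <= i < n else 0
    num_tiles = -(-n // k)  # ceiling division
    tiles = gen(num_tiles, lambda t: gen(k, lambda j: t * k + j))
    return [at(i - 1) + at(i) + at(i + 1)
            for tile in tiles for i in tile if i < n]
```

Both functions compute the same array for any tile width `k >= 1`; only the iteration structure differs, which is the kind of choice a schedule is meant to control.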

OVERVIEW AND MOTIVATING EXAMPLE

CORE LANGUAGE

Specification

THE SCHEDULING-REWRITE FRAMEWORK

Scheduling Rewrites

Binders and Contexts

Rewrite Tactics and Automation

COMPILATION

Normalization

Code Generation

RESHAPE OPERATORS

Compute and Storage Order

Safe Garbage

Adjoint Introduction

IMPLEMENTATION DETAIL

EXPERIMENTAL EVALUATION

Scatter-to-Gather Optimization

RELATED WORK


Journal: Proceedings of the ACM on Programming Languages	Publication Date: Jan 12, 2022
Citations: 12	License type: cc-by
