Tensor Iterators for Flexible High-Performance Tensor Computation

John Jolly, Vishal Sahoo, Priya Goyal, Hans Johansen, Mary Hall

doi:10.1007/978-3-031-31445-2_2

Abstract

The explosive growth of machine learning applications has created a demand for high-performance implementations of tensor contractions, for both dense and sparse tensors. Compilers, code generators, and libraries often support only a limited set of sparse tensor representations. We observe that a tensor contraction can be thought of as iterating over the elements of a sparse tensor to perform an operation and an accumulation; co-iteration over multiple tensors can be implemented with iteration and lookup. We recognize that the resulting code can be restructured by specifying a computation, its data layout, and how to iterate over that layout. We illustrate the need for this generality with two different implementations of block-based data layouts for sparse matrix-vector multiplication (SpMV). We show how to generate these implementations with a tensor iterator abstraction designed to be integrated into the MLIR compiler, and we present measurements of nearby manual implementations to demonstrate the tradeoffs and complexities of these different implementations.
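As a rough illustration of the "iteration and lookup" view described in the abstract, the following Python sketch separates the computation (SpMV) from the data layout and its iterator. It is not the paper's MLIR abstraction: the function names (`coo_iterator`, `csr_iterator`, `spmv`) and the choice of COO and CSR layouts are assumptions made for this example, standing in for the block-based layouts the paper studies.

```python
from typing import Iterable, Iterator, List, Tuple

Nonzero = Tuple[int, int, float]  # (row, column, value); hypothetical alias


def coo_iterator(rows: List[int], cols: List[int],
                 vals: List[float]) -> Iterator[Nonzero]:
    """Yield each stored nonzero of a matrix held in coordinate (COO) form."""
    for i, j, v in zip(rows, cols, vals):
        yield i, j, v


def csr_iterator(row_ptr: List[int], col_idx: List[int],
                 vals: List[float]) -> Iterator[Nonzero]:
    """Yield each stored nonzero of a matrix held in compressed sparse row form."""
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            yield i, col_idx[k], vals[k]


def spmv(nonzeros: Iterable[Nonzero], x: List[float], n_rows: int) -> List[float]:
    """Compute y = A @ x by iterating over A's nonzeros and looking up x[j]."""
    y = [0.0] * n_rows
    for i, j, a_ij in nonzeros:
        y[i] += a_ij * x[j]  # operation (multiply) and accumulation (add)
    return y


# The same 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] in two layouts.
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y_coo = spmv(coo_iterator([0, 0, 1, 2, 2], [0, 2, 1, 0, 2], vals), x, 3)
y_csr = spmv(csr_iterator([0, 2, 3, 5], [0, 2, 1, 0, 2], vals), x, 3)
print(y_coo, y_csr)
```

In this sketch the `spmv` body never changes; only the iterator passed to it depends on the layout, which is the kind of separation the abstract argues for.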

Full Text