Tensor Algebra Research Articles

Tensor algebra finds applications in various domains, including machine learning and data analytics. Spatial hardware accelerators are widely used to boost the performance of tensor algebra applications, but they have a complex hardware architecture and a rich design space. Prior approaches based on manual implementation lead to low programming productivity, making it hard to explore the large design space. In this paper, we propose Tensorlib, a framework for generating spatial hardware accelerators for tensor algebra applications. Tensorlib is motivated by the observation that tensor dataflows can be expressed with linear transformations and that they share common hardware modules which can be reused across different designs. Tensorlib first uses Space-Time Transformation, which compactly represents a hardware dataflow as a transformation matrix, to explore different dataflows. Next, we identify the common structures of different dataflows and build parameterized hardware module templates. Our generation framework selects the hardware modules needed for each dataflow, connects them using a specified interconnection pattern, and automatically generates the complete hardware accelerator design. Tensorlib substantially improves productivity in the development and optimization of spatial hardware architectures, providing a rich design space with trade-offs in performance, area, and power. Experiments show that Tensorlib can automatically generate hardware designs with different dataflows for a variety of tensor algebra programs. Tensorlib achieves a frequency of 318 MHz and a throughput of 786 GFLOP/s for a matrix multiplication kernel on a Xilinx VU9P FPGA, outperforming state-of-the-art generators.
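The idea of describing a dataflow with a transformation matrix can be sketched in a few lines of NumPy. This is a minimal illustration and not Tensorlib's implementation: the matrix `T`, the `REUSE` table, the helper names, and the labels "stationary", "multicast", and "systolic" are all assumptions introduced here for a matrix multiplication `C[i, j] += A[i, k] * B[k, j]`.

```python
import numpy as np

# Assumed space-time matrix for C[i, j] += A[i, k] * B[k, j].
# Rows map an iteration (i, j, k) to PE column x, PE row y, and cycle t.
T = np.array([[1, 0, 0],   # x = i
              [0, 1, 0],   # y = j
              [1, 1, 1]])  # t = i + j + k

# Loop direction along which each tensor element is reused
# (the index that does not appear in its subscript).
REUSE = {
    "A": np.array([0, 1, 0]),  # A[i, k] is reused across j
    "B": np.array([1, 0, 0]),  # B[k, j] is reused across i
    "C": np.array([0, 0, 1]),  # C[i, j] is accumulated across k
}

def classify(T, reuse):
    """Label how a tensor moves through the array (labels are illustrative)."""
    dx, dy, dt = (int(v) for v in T @ reuse)
    if (dx, dy) == (0, 0):
        return "stationary"   # stays in one PE over time
    if dt == 0:
        return "multicast"    # reaches several PEs in the same cycle
    return "systolic"         # moves between PEs as time advances

def dataflow(T):
    """Classify every tensor of the matmul under space-time matrix T."""
    return {name: classify(T, d) for name, d in REUSE.items()}

def is_valid(T):
    """A nonzero determinant means no two iterations share a PE and a cycle."""
    return abs(np.linalg.det(T)) > 1e-9

print(dataflow(T), is_valid(T))
```

Changing only `T` yields a different dataflow; for example, setting the second row to `[0, 0, 1]` (so that `y = k`) keeps `A` in place and streams `B` and `C` instead.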

In recent times, Variational Quantum Circuits (VQC) have been widely adopted for different machine learning tasks such as Combinatorial Optimization and Supervised Learning. With the growing interest, it is pertinent to study the boundaries of the classical simulation of VQCs to effectively benchmark the algorithms. Classically simulating VQCs can also provide the quantum algorithms with a better initialization, reducing the amount of quantum resources needed to train the algorithm. Even though Matrix Product State representations have been extensively used for quantum state approximation, their capacity for simulating quantum circuits is limited by the exponential complexity in circuit depth. This manuscript proposes an algorithm that compresses the quantum state within a circuit using a noisy tensor ring representation, which allows VQC-based algorithms to be implemented on a classical simulator at a fraction of the usual storage and computational complexity. Using the tensor ring approximation of the input quantum state, we propose a method that applies the parametrized unitary operations while retaining the low-rank structure of the tensor ring corresponding to the transformed quantum state, providing an exponential improvement in storage and computational time in the number of qubits and layers. This approximation is used to implement the tensor ring VQC (TRVQC) for supervised learning on the Iris and MNIST datasets, to compare the performance of the proposed method with implementations on a classical simulator using Matrix Product States (MPS). TRVQC has a test accuracy of 82.63% compared to the benchmark of 83.68% on the Iris dataset, whereas on a reduced MNIST dataset TRVQC outperforms the benchmark with an accuracy of 83.73% compared to 81.02%, showcasing the comparable performance of the proposed algorithm with the MPS framework.
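The tensor ring format itself is easy to illustrate. The sketch below is not the TRVQC algorithm from the abstract: it only shows the standard tensor ring contraction and the simplest case of applying a single-qubit gate to one core, which leaves the ring ranks unchanged. The function names, the random (unnormalized) cores, and the fixed rank `r` are assumptions made for illustration; two-qubit gates and the rank truncation they require are not covered.

```python
import numpy as np

def random_ring(n, r, seed=0):
    """n cores of shape (r, 2, r): one per qubit, bond dimension r (assumed fixed)."""
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((r, 2, r)) for _ in range(n)]

def to_dense(cores):
    """Contract the ring into the full 2^n amplitude tensor (small n only)."""
    out = cores[0]
    for g in cores[1:]:
        # Join the trailing bond of `out` with the leading bond of the next core.
        out = np.tensordot(out, g, axes=1)
    # Close the ring by tracing over the first and last bond indices.
    return np.trace(out, axis1=0, axis2=out.ndim - 1)

def apply_1q(cores, k, U):
    """Apply a 2x2 gate U to qubit k by touching only core k."""
    new = list(cores)
    new[k] = np.einsum("pi,aib->apb", U, cores[k])
    return new

cores = random_ring(n=4, r=3)
H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # Hadamard gate
updated = apply_1q(cores, 2, H)

# Reference: apply the same gate to axis 2 of the dense tensor.
reference = np.moveaxis(np.tensordot(H, to_dense(cores), axes=([1], [2])), 0, 2)
print(np.allclose(to_dense(updated), reference))
```

With a fixed rank `r`, the cores hold `n * 2 * r**2` numbers, whereas the dense state holds `2**n`, which is where the storage saving for larger `n` comes from.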

Articles published on Tensor Algebra

Generalized iterated-sums signatures

Mosaic: An Interoperable Compiler for Tensor Algebra

Indexed Streams: A Formal Intermediate Representation for Fused Contraction Programs

Automatic Generation of Spatial Accelerator for Tensor Algebra

SDPN: A Slight Dual-Path Network With Local-Global Attention Guided for Medical Image Segmentation.

Spatiotemporal traffic data imputation by synergizing low tensor ring rank and nonlocal subspace regularization

Covariant predictions for Planck-scale features in primordial power spectra

Incomplete Multiview Clustering via Low-Rank Tensor Ring Completion

Sgap: towards efficient sparse tensor algebra compilation for GPU

Fusions of tensor powers of Johnson schemes

Graph-Regularized Non-Negative Tensor-Ring Decomposition for Multiway Representation Learning.

Robust to Rank Selection: Low-Rank Sparse Tensor-Ring Completion.

Kronecker CP Decomposition With Fast Multiplication for Compressing RNNs.

Power spectra of slow-roll inflation in the consistent D → 4 Einstein-Gauss-Bonnet gravity

Scale-invariant enhancement of gravitational waves during inflation

Separation of spherically and translationally covariant finite quantum spaces within the XXX model

Tensor decomposition for painting analysis. Part 2: spatio-temporal simulation

Classical simulation of variational quantum classifiers using tensor rings

TR-ReFloc: A TR-based Framework for Recovering Missed RSS for WiFi indoor positioning in the offline and online phase

Real-Time Equation-of-Motion Coupled-Cluster Cumulant Green's Function Method: Heterogeneous Parallel Implementation Based on the Tensor Algebra for Many-Body Methods Infrastructure.
