The probabilistic tensor decomposition toolbox

Jesper L Hinrich, Kristoffer H Madsen, Morten Mørup

doi:10.1088/2632-2153/ab8241

Abstract

This article introduces the probabilistic tensor decomposition toolbox, a MATLAB toolbox for tensor decomposition using variational Bayesian inference and Gibbs sampling. An introduction and overview of probabilistic tensor decomposition and its connection with classical tensor decomposition methods based on maximum likelihood is provided. We subsequently describe the probabilistic tensor decomposition toolbox, which encompasses the Canonical Polyadic, Tucker, and Tensor Train decomposition models. Currently, unconstrained, non-negative, orthogonal, and sparse factors are supported. Bayesian inference forms a principled way of incorporating prior knowledge, predicting held-out data, and estimating posterior probabilities. Furthermore, it facilitates automatic model order determination and automatic regularization on factors (e.g. sparsity), and it inherently penalizes model complexity, which is beneficial when inferring hierarchical models, such as heteroscedastic noise modelling. The toolbox allows researchers to easily apply Bayesian tensor decomposition methods without the need to derive or implement these methods themselves. Furthermore, it serves as a reference implementation for comparing existing and new tensor decomposition methods. The software is available from https://github.com/JesperLH/prob-tensor-toolbox/.

Highlights

Tensors, i.e. higher-order or n-way arrays, are increasingly encountered in all areas of science.
We describe the probabilistic tensor decomposition toolbox, which encompasses the Canonical Polyadic, Tucker, and Tensor Train decomposition models.
Licence: Creative Commons Attribution 4.0.
Summary of contribution: This paper introduces the probabilistic tensor decomposition toolbox, which gathers many of the existing tools for Bayesian tensor decomposition in one place; see https://github.com/JesperLH/prob-tensor-toolbox/

Summary

Introduction

Higher-order or n-way arrays are increasingly encountered in all areas of science. While standard two-way matrix analysis methods can be applied by restructuring these higher-order arrays into a matrix, such approaches fail to properly exploit the inherent multi-way structure. Several toolboxes for multi-way analysis exist, the most prominent being the MATLAB-based N-way Toolbox [10], Tensorlab [11], and Tensor Toolbox [12], which enable researchers to apply multi-way modeling across domains. These toolboxes are based solely on maximum likelihood (ML) estimation, which provides only a point estimate of the underlying parameters and does not account for parameter uncertainty. Estimating uncertainty instead requires repeated model fitting using, for instance, jackknifing [13] or bootstrapping [14], which becomes increasingly expensive as the size of the data grows.
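To make the restructuring of a higher-order array into a matrix concrete, the following is a minimal sketch in Python with NumPy. It is not part of the toolbox, which is written in MATLAB. The helper names `unfold` and `fold`, the example array, and its interpretation as subjects x channels x time points are assumptions made purely for illustration. The column ordering follows NumPy's row-major layout, which may differ from the conventions used in the cited toolboxes by a permutation of columns.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize `tensor` so that axis `mode` indexes the rows.

    Columns follow NumPy's C (row-major) ordering of the remaining axes.
    """
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def fold(matrix, mode, shape):
    """Invert `unfold` for a tensor with the given original `shape`."""
    rest = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(matrix.reshape([shape[mode]] + rest), 0, mode)


# Hypothetical 3-way array, e.g. subjects x channels x time points.
X = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

# One matrix per mode: each keeps one mode as rows and merges the others.
unfoldings = [unfold(X, mode) for mode in range(X.ndim)]
shapes = [U.shape for U in unfoldings]

# Folding back recovers the original array, so no values are lost;
# what a two-way method loses is the distinction between the merged modes.
roundtrip_ok = all(
    np.array_equal(fold(U, mode, X.shape), X)
    for mode, U in enumerate(unfoldings)
)
```

Each unfolding keeps a single mode intact and merges the remaining modes into the column index. A two-way method applied to any one of these matrices therefore treats the merged modes as a single unstructured dimension, which is the loss of multi-way structure described above.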

Methods

Results

Conclusion