Abstract

Gated networks are networks that contain gating connections, in which the outputs of at least two neurons are multiplied. The basic idea of a gated restricted Boltzmann machine (RBM) is to use binary hidden units to learn the conditional distribution of one image (the output) given another image (the input). This allows the hidden units of a gated RBM to model the transformations between two successive images. Inference in the model consists of extracting the transformations given a pair of images. However, a fully connected multiplicative network has cubically many parameters, forming a three-dimensional interaction tensor that requires substantial memory and computation for inference and training. In this paper, we parameterize the bilinear interactions in the gated RBM through a multimodal tensor-based Tucker decomposition, which factorizes a tensor into a set of matrices and one (usually smaller) core tensor. This parameterization reduces the number of model parameters, lowers the computational cost of learning, and effectively strengthens structured feature learning. We show that, when trained on affine transformations of still images, the completely unsupervised network learns explicit encodings of image transformations.
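To make the parameter savings concrete, here is a minimal NumPy sketch, not the authors' implementation: the layer sizes I, J, K and the core sizes P, Q, R are hypothetical, and the factored hidden pre-activation is shown only to illustrate that the full interaction tensor never needs to be materialized.

```python
import numpy as np

# Hypothetical sizes: input image x (I units), output image y (J units),
# hidden units h (K units); core sizes (P, Q, R) are assumed, not from the paper.
I, J, K = 1024, 1024, 256
P, Q, R = 64, 64, 32

# Full gated interaction tensor: one weight per (input, output, hidden) triple.
full_params = I * J * K                    # ~2.7e8 parameters

# Tucker parameterization: three factor matrices and one small core tensor.
A = 0.01 * np.random.randn(I, P)           # input factor
B = 0.01 * np.random.randn(J, Q)           # output factor
C = 0.01 * np.random.randn(K, R)           # hidden factor
G = 0.01 * np.random.randn(P, Q, R)        # core tensor
tucker_params = A.size + B.size + C.size + G.size

# The full tensor W[i,j,k] = sum_pqr A[i,p] B[j,q] C[k,r] G[p,q,r] is never
# built explicitly; interactions are computed factor by factor instead:
x = np.random.randn(I)
y = np.random.randn(J)
h_pre = C @ np.einsum('p,q,pqr->r', A.T @ x, B.T @ y, G)  # hidden pre-activation

print(f"full: {full_params:,} params, Tucker: {tucker_params:,} params")
```

With these illustrative sizes the full tensor would hold roughly 2.7e8 weights, while the factored form holds under 3e5, which is the memory and compute saving the abstract refers to.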

Highlights

  • Feature engineering is the process of transforming raw data into a suitable representation or feature vector that can be used to train a machine learning model for a prediction problem

  • The multimodal tensor-based Tucker decomposition presented in this research has the useful property that it keeps the independent structure in the gated restricted Boltzmann machine (RBM) model intact

  • We take advantage of this independent structure by developing a contrastive-divergence-based training procedure used for inference and learning (a minimal sketch follows this list)
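As a rough illustration of such a procedure, the sketch below performs one CD-1 step for a gated RBM with Tucker-factored interactions, conditioning on the input image. It assumes binary output and hidden units and, for brevity, updates only the core tensor and the biases; the factor matrices A, B, C, the learning rate, and all shapes are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def cd1_step(x, y, A, B, C, G, b, c, lr=1e-3):
    """One CD-1 update for a Tucker-factored gated RBM, conditioning on input x.
    Only the core tensor G and the biases are updated here; a full
    implementation would also derive gradients for the factors A, B, C."""
    fx, fy = A.T @ x, B.T @ y                         # project visibles into factor space
    # Positive phase: hidden probabilities given the data pair (x, y).
    h0 = sigmoid(C @ np.einsum('p,q,pqr->r', fx, fy, G) + c)
    h0_s = (rng.random(h0.shape) < h0).astype(float)  # sample binary hiddens
    # Negative phase: reconstruct the output image, then re-infer the hiddens.
    y1 = sigmoid(B @ np.einsum('p,r,pqr->q', fx, C.T @ h0_s, G) + b)
    h1 = sigmoid(C @ np.einsum('p,q,pqr->r', fx, B.T @ y1, G) + c)
    # Contrastive-divergence gradient for the core: <data stats> - <model stats>.
    G += lr * (np.einsum('p,q,r->pqr', fx, fy, C.T @ h0)
               - np.einsum('p,q,r->pqr', fx, B.T @ y1, C.T @ h1))
    b += lr * (y - y1)
    c += lr * (h0 - h1)

# Toy usage with random data (shapes are illustrative).
I, J, K, P, Q, R = 64, 64, 32, 8, 8, 4
A, B, C = (0.1 * rng.standard_normal(s) for s in [(I, P), (J, Q), (K, R)])
G = 0.1 * rng.standard_normal((P, Q, R))
b, c = np.zeros(J), np.zeros(K)
x, y = rng.random(I), rng.random(J)
cd1_step(x, y, A, B, C, G, b, c)
```

The key point is that both the positive and negative phases touch only factor-space quantities, so the cost of a training step scales with the core size rather than with the full I × J × K tensor.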


Introduction

Feature engineering is the process of transforming raw data into a suitable representation or feature vector that can be used to train a machine learning model for a prediction problem. The traditional approach to feature engineering was to manually build a feature extractor, which required careful engineering and considerable domain expertise. Deep learning allows computational models composed of multiple processing layers to learn representations of data with multiple levels of abstraction [1]. Deep learning improves the process of feature engineering by automatically extracting useful and interpretable features; it eliminates the need for domain expertise and hand-crafted feature extraction by learning high-level features from the data in a hierarchical manner. The main building blocks in the deep learning literature are restricted Boltzmann machines [2,3], autoencoders [4,5], convolutional neural networks [6,7], and recurrent neural networks [8].

