Abstract

Multiview learning is concerned with machine learning problems where data are represented by distinct feature sets, or views. Recently, this definition has been extended to accommodate sequential data, i.e., each view of the data takes the form of a sequence. Multiview sequential data pose major challenges for representation learning, including i) the absence of sample correspondence information between the views, ii) complex relations among samples within each view, and iii) the high complexity of handling multiple sequences. In this article, we first introduce a generalized deep learning model that can simultaneously discover sample correspondences and capture the cross-view relations among the data sequences. The model parameters can be optimized using a gradient descent-based algorithm, and the complexity of computing the gradient is at most quadratic in the sequence lengths in terms of both computational time and space. Based on this model, we propose a second model that integrates the objective with the reconstruction losses of autoencoders, which allows it to provide a better trade-off between view-specific and cross-view relations in the data. Finally, to handle multiple (more than two) data sequences, we develop a third model along with a convergence-guaranteed optimization algorithm. Extensive experiments on public datasets demonstrate the superior performance of our models over competing methods.

Highlights

  • In many real-world applications, data are often collected from various perspectives, each of which presents a view of the same data and has its own representation space and relation characteristics

  • The smooth distance DTW_{Ω=entropy} (DTW with entropic regularization) is equal to DTW_β, the smooth approximation of dynamic time warping (DTW) employed in the preliminary work

  • We evaluate the performances of the two-view methods on the Wisconsin X-ray microbeam (XRMB) corpus [54], which consists of 2537 utterances recorded from 47 American English speakers
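The entropy-smoothed DTW mentioned in the highlights can be illustrated with a short sketch. This is a generic soft-DTW dynamic program using a soft-minimum in place of the hard minimum, not the paper's exact formulation; the `gamma` smoothing parameter and squared Euclidean ground cost are illustrative assumptions. Note that the recursion fills an (n+1)×(m+1) table, which is the source of the quadratic time and space complexity in the sequence lengths.

```python
import numpy as np

def softmin(values, gamma):
    # Smoothed minimum: -gamma * log(sum(exp(-v / gamma))),
    # computed stably by shifting with the maximum exponent.
    v = np.asarray(values) / -gamma
    m = v.max()
    return -gamma * (m + np.log(np.exp(v - m).sum()))

def soft_dtw(X, Y, gamma=1.0):
    """Entropy-smoothed DTW between sequences X (n, d) and Y (m, d).

    As gamma -> 0 this approaches the classical (hard-min) DTW distance.
    """
    n, m = len(X), len(Y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.sum((X[i - 1] - Y[j - 1]) ** 2)  # squared Euclidean ground cost
            D[i, j] = cost + softmin(
                [D[i - 1, j], D[i, j - 1], D[i - 1, j - 1]], gamma
            )
    return D[n, m]
```

For identical sequences and small `gamma`, the value is close to zero, while misaligned sequences incur a positive cost, mirroring the behavior of hard DTW.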


Summary

INTRODUCTION

In many real-world applications, data are often collected from various perspectives, each of which presents a view of the same data and has its own representation space and relation characteristics. Our method minimizes the generalized smooth DTW distance between projections of the two views subject to soft whitening constraints, which allows GSCA to discover the sample correspondences and capture the relations between the views simultaneously. To provide a better balance between view-specific and cross-view relations, we combine our objective function with the reconstruction losses of autoencoders [6]; this forms generalized sequentially correlated autoencoders (GSCAE), a new variant of the proposed model. We further propose the GMSA model to handle multiple data sequences simultaneously, which was not considered in [7], and to learn a more interpretable representation. Experiments using both two-view and multiview datasets were designed carefully to provide a fair comparison with existing competitors.
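The trade-off described above, a cross-view alignment term combined with view-specific autoencoder reconstruction losses, can be sketched as follows. This is a minimal illustration only: the linear tied-weight encoders/decoders, the simple equal-length alignment surrogate standing in for the smooth DTW distance, and the trade-off weight `lam` are all assumptions for brevity, not the paper's GSCAE formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def alignment_loss(Px, Py):
    # Placeholder for the smooth DTW distance between projected sequences;
    # here simply the mean squared distance between equal-length projections
    # (DTW would additionally handle sequences of different lengths).
    return np.mean((Px - Py) ** 2)

def gscae_style_objective(X, Y, Wx, Wy, lam=0.1):
    """Sketch of an alignment-plus-reconstruction objective:
    cross-view alignment loss plus view-specific autoencoder
    reconstruction losses (linear, tied weights, illustration only)."""
    Px, Py = X @ Wx, Y @ Wy        # project each view into a shared space
    recon_x = Px @ Wx.T            # linear decoder with tied weights
    recon_y = Py @ Wy.T
    rec = np.mean((X - recon_x) ** 2) + np.mean((Y - recon_y) ** 2)
    return alignment_loss(Px, Py) + lam * rec

X = rng.normal(size=(20, 5))       # view 1: 20 time steps, 5 features
Y = rng.normal(size=(20, 4))       # view 2: 20 time steps, 4 features
Wx = rng.normal(size=(5, 3))       # encoder into a 3-dim shared space
Wy = rng.normal(size=(4, 3))
loss = gscae_style_objective(X, Y, Wx, Wy)
```

Increasing `lam` weights the view-specific reconstruction terms more heavily relative to the cross-view alignment term, which is the balance the summary refers to.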

More discussions are given in Section III-C
GENERALIZED SMOOTH DTW
GENERALIZED SEQUENTIAL CORRELATION
OBJECTIVE
OPTIMIZATION
COMPARISON WITH DSCA
GENERALIZED SEQUENTIALLY CORRELATED
GENERALIZED MULTIPLE SEQUENCES ANALYSIS
RELATED WORKS
COMPARED METHODS
EVALUATION MEASUREMENTS
PARAMETER TUNING
TWO-VIEW DATA I
Method
TWO-VIEW DATA II
Objective function value
MULTIVIEW DATA II
ABLATION ANALYSIS OF GMSA
Findings
VIII. CONCLUSION
