Abstract

The wealth of sensory data coming from different modalities has opened numerous opportunities for data analysis. The data are of increasing volume, complexity and dimensionality, calling for methodological innovations in multimodal data processing. However, multimodal architectures must rely on models able to adapt to changes in the data distribution. Differences in the density functions can be due to changes in acquisition conditions (pose, illumination), sensor characteristics (number of channels, resolution) or different views (e.g. street-level vs. aerial views of the same building). We call these different acquisition modes domains, and refer to the adaptation problem as domain adaptation. In this paper, instead of adapting the trained models themselves, we focus on finding mappings of the data sources into a common, semantically meaningful representation domain. This field of manifold alignment extends traditional techniques in statistics such as canonical correlation analysis (CCA) to deal with nonlinear adaptation and possibly non-corresponding data pairs between the domains. We introduce a kernel method for manifold alignment (KEMA) that can match an arbitrary number of data sources without needing corresponding pairs, only a few labeled examples in all domains. KEMA has interesting properties: 1) it generalizes other manifold alignment methods, 2) it can align manifolds of very different complexities, performing a discriminative alignment that preserves each manifold's inner structure, 3) it can define a domain-specific metric to cope with multimodal specificities, 4) it can align data spaces of different dimensionality, 5) it is robust to strong nonlinear feature deformations, and 6) it is closed-form invertible, which allows transfer across domains and data synthesis. To the authors’ knowledge, this is the first method addressing all these important issues at once.
We also present a reduced-rank version of KEMA for computational efficiency, and discuss the generalization performance of KEMA under Rademacher principles of stability. Aligning multimodal data with KEMA yields outstanding benefits when used as a data pre-conditioning step in the standard data analysis processing chain. KEMA exhibits very good performance over competing methods in controlled synthetic examples, visual object recognition and facial expression recognition tasks. KEMA is especially well suited to high-dimensional problems, such as images and videos, and to complicated distortions, twists and warpings of the data manifolds. A fully functional toolbox is available at https://github.com/dtuia/KEMA.git.

Highlights

  • Domain adaptation constitutes a field of high interest in pattern analysis and machine learning

  • We focus on the concentration of sums of eigenvalues of the generalized Kernel Manifold Alignment (KEMA) eigenproblem solved using a finite number of samples, where new points are projected into the m-dimensional space spanned by the m eigenvectors corresponding to the largest m eigenvalues

  • KEMA can align an arbitrary number of domains of different dimensionality without needing corresponding pairs, only a few labeled examples in all domains
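
The projection step mentioned in the highlights can be illustrated with a minimal sketch. It assumes a generic symmetric generalized eigenproblem A v = λ B v with B positive definite; the matrices A, B and K_new below are random placeholders, not the actual KEMA matrices, and the function name is hypothetical. In a kernel setting, each row of K_new would hold kernel evaluations between a new point and the training samples.

```python
import numpy as np

def top_m_generalized_eigvecs(A, B, m):
    """Solve A v = lambda B v (A symmetric, B symmetric positive definite)
    and return the m largest eigenvalues with their eigenvectors as columns."""
    L = np.linalg.cholesky(B)                # B = L L^T
    Y = np.linalg.solve(L, A)                # Y = L^{-1} A
    C = np.linalg.solve(L, Y.T)              # C = L^{-1} A L^{-T}
    C = (C + C.T) / 2.0                      # remove round-off asymmetry
    w, W = np.linalg.eigh(C)                 # eigenvalues in ascending order
    order = np.argsort(w)[::-1][:m]          # indices of the m largest
    V = np.linalg.solve(L.T, W[:, order])    # map back: v = L^{-T} w
    return w[order], V

# Placeholder matrices (assumptions for illustration only).
rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
N = rng.standard_normal((n, n))
A = M @ M.T                                  # symmetric
B = N @ N.T + n * np.eye(n)                  # symmetric positive definite

eigvals, V = top_m_generalized_eigvecs(A, B, m=2)

# Project three hypothetical new points into the 2-dimensional space.
K_new = rng.standard_normal((3, n))
Z = K_new @ V
```

Reducing the problem to a standard symmetric eigenproblem through a Cholesky factor of B is one common choice; it is used here only to keep the sketch dependent on NumPy alone.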


Summary

Introduction

Domain adaptation constitutes a field of high interest in pattern analysis and machine learning. Classification algorithms developed with data from one domain cannot be directly used in another related domain, so either the classifier or the data representation must be adapted [1]. Adapting (modifying) the classifier for each new incoming situation requires computationally demanding retraining, passive-aggressive strategies, online filtering, or sample-relevance estimation and weighting. These approaches are algorithm-dependent, often resort to heuristic parameters, and require good estimates of sample relevance and information content. Alternatively, one may adapt the domain representations to a single latent space and apply a single classifier in that semantically meaningful feature space. Adapting the representation space has been referred to in the literature as feature representation transfer [6] or feature transformation learning [7].
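
The idea of mapping every domain to one latent space and training a single classifier there can be sketched with a toy example. Everything below is an assumption made for illustration: the two domains are linear distortions (D1, D2) of shared 2-D latent data, and the projections P1 and P2 are built by hand from those known distortions, whereas a manifold alignment method would have to learn such mappings from data. The nearest-centroid classifier is likewise a stand-in for any classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(n_per_class):
    """Draw two well-separated classes in a shared 2-D latent space."""
    z0 = rng.normal(loc=(-2.0, 0.0), scale=0.5, size=(n_per_class, 2))
    z1 = rng.normal(loc=(2.0, 0.0), scale=0.5, size=(n_per_class, 2))
    return np.vstack([z0, z1]), np.repeat([0, 1], n_per_class)

# Hypothetical domains: each observes the latent data through its own map.
D1 = np.array([[1.0, 0.5],
               [0.0, 1.0]])                  # source domain, 2-D features
D2 = np.array([[0.0, 1.0, 1.0],
               [-1.0, 0.0, 2.0]])            # target domain, 3-D features

z_src, y_src = sample_latent(50)
z_tgt, y_tgt = sample_latent(50)
X_src = z_src @ D1                           # shape (100, 2)
X_tgt = z_tgt @ D2                           # shape (100, 3)

# Projections into the common space (assumed known here, not learned).
P1 = np.linalg.inv(D1)
P2 = np.linalg.pinv(D2)                      # D2 has full row rank

F_src = X_src @ P1
F_tgt = X_tgt @ P2

# Single classifier, trained on source labels only, in the common space.
centroids = np.stack([F_src[y_src == c].mean(axis=0) for c in (0, 1)])

def predict(F):
    """Assign each row of F to the class of its nearest centroid."""
    d = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)

acc_tgt = (predict(F_tgt) == y_tgt).mean()
```

Because both domains land in the same 2-D space, the classifier never needs to know which domain a sample came from, even though the original feature spaces have different dimensionality.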
