Multimodal Architecture Research Articles

The wealth of sensory data coming from different modalities has opened numerous opportunities for data analysis. The data are of increasing volume, complexity and dimensionality, thus calling for new methodological innovations in multimodal data processing. However, multimodal architectures must rely on models able to adapt to changes in the data distribution. Differences in the density functions can be due to changes in acquisition conditions (pose, illumination), sensor characteristics (number of channels, resolution) or different views (e.g. street-level vs. aerial views of the same building). We call these different acquisition modes domains, and refer to the adaptation problem as domain adaptation. In this paper, instead of adapting the trained models themselves, we focus on finding mappings of the data sources into a common, semantically meaningful representation domain. This field of manifold alignment extends traditional techniques in statistics such as canonical correlation analysis (CCA) to deal with nonlinear adaptation and possibly non-corresponding data pairs between the domains. We introduce a kernel method for manifold alignment (KEMA) that can match an arbitrary number of data sources without needing corresponding pairs, only a few labeled examples in all domains. KEMA has interesting properties: 1) it generalizes other manifold alignment methods, 2) it can align manifolds of very different complexities, performing a discriminative alignment that preserves each manifold's inner structure, 3) it can define a domain-specific metric to cope with multimodal specificities, 4) it can align data spaces of different dimensionality, 5) it is robust to strong nonlinear feature deformations, and 6) it is invertible in closed form, which allows transfer across domains and data synthesis. To the authors' knowledge, this is the first method addressing all these important issues at once.
We also present a reduced-rank version of KEMA for computational efficiency, and discuss the generalization performance of KEMA under Rademacher principles of stability. Aligning multimodal data with KEMA brings outstanding benefits when used as a data preconditioning step in the standard data analysis processing chain. KEMA exhibits very good performance over competing methods in synthetic controlled examples, visual object recognition and facial expression recognition tasks. KEMA is especially well-suited to high-dimensional problems, such as images and videos, and to complicated distortions, twists and warpings of the data manifolds. A fully functional toolbox is available at https://github.com/dtuia/KEMA.git.

Electronic Health Record (EHR) phenotyping utilizes patient data captured through normal medical practice to identify features that may represent computational medical phenotypes. These features may be used to identify at-risk patients and improve prediction of patient morbidity and mortality. We present a novel deep multi-modality architecture for EHR analysis (applicable to joint analysis of multiple forms of EHR data), based on Poisson Factor Analysis (PFA) modules. Each modality, composed of observed counts, is represented as a Poisson distribution, parameterized in terms of hidden binary units. Information from different modalities is shared via a deep hierarchy of common hidden units. These binary units are activated with probabilities given by Bernoulli-Poisson link functions, instead of the more traditional logistic link functions. In addition, we demonstrate that PFA modules can be adapted to discriminative modalities. To compute model parameters, we derive Markov Chain Monte Carlo (MCMC) inference that scales efficiently, with significant computational gains when compared to related models based on logistic link functions. To explore the utility of these models, we apply them to a subset of patients from the Duke-Durham patient cohort. We identified a cohort of over 16,000 patients with Type 2 Diabetes Mellitus (T2DM), based on diagnosis codes and laboratory tests, out of our patient population of over 240,000. Examining the common hidden units uniting the PFA modules, we identify patient features that represent medical concepts. Experiments indicate that our learned features are better able to predict mortality and morbidity than clinical features identified previously in a large-scale clinical trial.
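
The abstract above names the Bernoulli-Poisson link without stating it. As a rough illustration, the sketch below assumes the standard definition, in which a binary unit is 1 exactly when a latent Poisson count with a non-negative rate is at least 1, so that P(h = 1) = 1 - exp(-rate). The function names and the example rate are hypothetical and are not taken from the paper; the logistic function is included only for comparison.

```python
import math
import random


def bernoulli_poisson_prob(rate):
    """P(h = 1) under the assumed Bernoulli-Poisson link: 1 - exp(-rate)."""
    if rate < 0:
        raise ValueError("rate must be non-negative")
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for small rates.
    return -math.expm1(-rate)


def logistic_prob(activation):
    """Traditional logistic link, shown only for comparison."""
    return 1.0 / (1.0 + math.exp(-activation))


def sample_poisson(rate, rng):
    """Draw a Poisson count with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_hidden_unit(rate, rng):
    """Sample h by thresholding a latent Poisson count at 1."""
    return 1 if sample_poisson(rate, rng) >= 1 else 0


if __name__ == "__main__":
    rng = random.Random(0)
    rate = 0.7  # hypothetical rate, chosen only for illustration
    draws = [sample_hidden_unit(rate, rng) for _ in range(20000)]
    print("closed form :", bernoulli_poisson_prob(rate))
    print("monte carlo :", sum(draws) / len(draws))
```

Under this assumed form, a unit with rate zero is never active and the activation probability depends only on a non-negative rate, whereas the logistic link takes any real-valued activation. This is one plausible reading of why the two links lead to different inference schemes, not a statement of the paper's derivation.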

Articles published on Multimodal Architecture

Channel cloning by multi-mode phase-sensitive parametric mixer.

Multi-step self-guided pathways for shape-changing metamaterials.

Is a Picture Worth a Thousand Words? A Deep Multi-Modal Architecture for Product Classification in E-Commerce

Bayesian optimization on graph-structured search spaces: Optimizing deep multimodal fusion architectures

Multimodal architecture for video captioning with memory networks and an attention mechanism

Dynamic configuration management of a multi-standard and multi-mode reconfigurable multi-ASIP architecture for turbo decoding

A multi-modal architecture for non-intrusive analysis of performance in the workplace

Topology Generation for Hybrid Electric Vehicle Architecture Design

A novel Li-ion battery charger using multi-mode LDO configuration based on 350 nm HV-CMOS

Kernel Manifold Alignment for Domain Adaptation.

Electronic health record analysis via deep Poisson factor models

Slot-Mode Optomechanical Crystals: A Versatile Platform for Multimode Optomechanics.

Discovery and registration of components in multimodal systems distributed on the IoT

Configurable Architectures for Multi-Mode Floating Point Adders

Efficient multi-Gb/s multi-mode LDPC decoder architecture for IEEE 802.11ad applications

Principles of Construction of Polymodal Info-Communication Systems based on Multimodal Architectures of Subscriber’s Terminals

Prototyping using a Pattern Technique and a Context-Based Bayesian Network in Multimodal Systems

On the development of high-throughput and area-efficient multi-mode cryptographic hash designs in FPGAs

Speech-centric Multimodal Interaction for Easy-to-access Online Services – A Personal Life Assistant for the Elderly

Area-Efficient Multimode Encoding Architecture for Long BCH Codes
