Deeply fusing multimodal features in hypergraph

Wei Yao, Ying Jiang, Wenda Lu, Jun Chen, Linchao Xie

doi:10.1016/j.jvcir.2019.02.025

Abstract

Utilizing multimodal features to describe multimedia data is a natural way to improve recognition accuracy. However, how to optimally cluster the raw features into different modalities in order to alleviate the curse of dimensionality, and how to exploit relationships between and within the feature modalities, remain two difficult issues. In this paper, we propose a new deep feature fusion framework, hypergraph feature fusion (HFF), to handle these two issues. First, we extract a collection of deep features from multiple images; HFF then constructs a features’ relationships hypergraph (FRH) to reveal relationships among the raw features. Next, HFF conducts generalized community learning by graph approximation (GCLGA) in the FRH to cluster the raw features into k modalities and obtain the inter- and intra-modality structure matrices. These matrices reveal the relationships between and within modalities and help to build graph kernels that optimize kernel-based classification. Finally, HFF applies a two-level classifier to classify the fused feature vectors. The dimension of the input feature vector at each classifier level is much lower than that of the raw feature vector. We evaluate kernel-based classification in two experiments: (1) using a kernel SVM to classify the ETH-80 image dataset by fusing 2 kinds of raw image features; (2) using features extracted by kernel LDA for speech emotion recognition by fusing 6 kinds of raw speech features. The experimental results show that HFF can effectively address these two issues and improve class-prediction accuracy over state-of-the-art feature fusion techniques.
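The abstract does not say how the FRH is constructed or how GCLGA operates, so the following is only a minimal sketch of the general idea of treating raw feature dimensions as hypergraph vertices. It assumes (our choice, not the authors') that each feature seeds one hyperedge containing itself and its most correlated features, and it computes the normalized hypergraph Laplacian of Zhou, Huang and Schölkopf (2006), a common starting point for spectrally grouping vertices. The function names, the `n_neighbors` parameter, and the random data are all hypothetical.

```python
import numpy as np

def knn_hypergraph_incidence(X, n_neighbors=3):
    """Incidence matrix H (features x hyperedges) for data X (samples x features).

    Assumption for illustration: hyperedge e joins feature e with the
    n_neighbors features most correlated with it (absolute Pearson correlation).
    """
    sim = np.abs(np.corrcoef(X, rowvar=False))  # feature-to-feature similarity
    d = sim.shape[0]
    H = np.zeros((d, d))
    for e in range(d):
        order = np.argsort(-sim[e])                       # most similar first
        neighbors = [j for j in order if j != e][:n_neighbors]
        H[e, e] = 1.0                                     # seed feature is always a member
        H[neighbors, e] = 1.0
    return H

def hypergraph_laplacian(H, edge_weights=None):
    """Normalized Laplacian I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}."""
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if edge_weights is None else np.asarray(edge_weights, dtype=float)
    dv = H @ w                                # weighted vertex degrees
    de = H.sum(axis=0)                        # number of vertices per hyperedge
    scaled = H / np.sqrt(dv)[:, None]         # Dv^{-1/2} H
    theta = (scaled * (w / de)) @ scaled.T    # Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    return np.eye(n_vertices) - theta

# Placeholder data: 200 samples of 8 raw features (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
H = knn_hypergraph_incidence(X, n_neighbors=3)
L = hypergraph_laplacian(H)
```

The eigenvectors of `L` associated with its smallest eigenvalues could then be clustered to assign features to k groups; that spectral step is a generic stand-in and should not be read as the GCLGA procedure named in the abstract.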


More From: Journal of Visual Communication and Image Representation

