Abstract
Describing multimedia data with multimodal features is a natural way to improve recognition accuracy. However, two issues remain difficult: how to optimally cluster the raw features into modalities so as to alleviate the curse of dimensionality, and how to exploit the relationships between and within those feature modalities. In this paper, we propose a new deep feature fusion framework, hypergraph feature fusion (HFF), to handle both issues. First, we extract a collection of deep features from multiple images, and HFF constructs a features’ relationships hypergraph (FRH) to reveal the relationships among the raw features. HFF then conducts generalized community learning by graph approximation (GCLGA) on the FRH to cluster the raw features into k modalities and to obtain the inter-modality and intra-modality structure matrices. These matrices capture the relationships between and within modalities and are used to build graph kernels that optimize kernel-based classification. Finally, HFF applies a two-level classifier to the fused feature vectors; the input feature vector of each level has a much lower dimension than the raw feature vector. We evaluate kernel-based classification in two experiments: (1) classifying the ETH-80 image dataset with a kernel SVM by fusing 2 kinds of raw image features; (2) speech emotion recognition using features extracted by kernel LDA, fusing 6 kinds of raw speech features. The experimental results show that HFF effectively addresses both issues and improves class-prediction accuracy over state-of-the-art feature fusion techniques.
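The general idea of grouping raw features into k modalities and then fusing them at the kernel level can be illustrated with a minimal sketch. This is not the HFF method itself: spectral partitioning of a feature correlation graph stands in for the FRH and GCLGA steps, and an average of per-group RBF kernels stands in for the graph kernels built from the structure matrices. All function names (`feature_affinity`, `group_features`, `fused_kernel`) and the choice of `gamma` are assumptions made for illustration.

```python
import numpy as np

def feature_affinity(X):
    """Absolute Pearson correlation between feature columns of X (n_samples, n_features).
    Assumption: an ordinary pairwise graph, used here in place of a hypergraph."""
    A = np.abs(np.nan_to_num(np.corrcoef(X, rowvar=False)))
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A

def group_features(A, k, iters=50):
    """Partition features into k groups by spectral clustering (a stand-in for GCLGA)."""
    d = np.maximum(A.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, V = np.linalg.eigh(L)          # eigenvalues in ascending order
    U = V[:, :k]                      # embedding from the k smallest eigenvectors
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

    # Plain Lloyd's k-means with deterministic farthest-point initialization.
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for g in range(k):
            if np.any(labels == g):
                centers[g] = U[labels == g].mean(axis=0)
    return labels

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(D, 0.0))

def fused_kernel(X, labels, k):
    """Average of one RBF kernel per feature group; each kernel sees only
    its group's columns, so its input dimension is lower than that of X."""
    K = np.zeros((X.shape[0], X.shape[0]))
    for g in range(k):
        cols = labels == g
        if cols.any():
            K += rbf_kernel(X[:, cols], gamma=1.0 / cols.sum())
    return K / k
```

A matrix produced this way could be passed to any classifier that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.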
More From: Journal of Visual Communication and Image Representation