Abstract

This paper presents an overview of coding methods used to encode a set of covariance matrices. Starting from a Gaussian mixture model (GMM) adapted to the Log-Euclidean (LE) or affine invariant Riemannian metric, we propose a Fisher Vector (FV) descriptor adapted to each of these metrics: the Log-Euclidean Fisher Vectors (LE FV) and the Riemannian Fisher Vectors (RFV). Experiments on texture and head pose image classification compare the two metrics and illustrate the potential of these FV-based descriptors relative to state-of-the-art BoW and VLAD-based descriptors. We also illustrate the advantage of using the Fisher information matrix in the derivation of the FV. Finally, additional experiments provide a fairer comparison between the different coding strategies, including comparisons between anisotropic and isotropic models and an analysis of the estimation performance of the GMM dispersion parameter for covariance matrices of large dimension.

Highlights

  • In supervised classification, the goal is to tag an image with one class name based on its content. In the early 2000s, the leading approaches were based on feature coding

  • Starting from the Gaussian mixture model and the Riemannian Gaussian mixture model, we have proposed a unified view of coding methods

  • The experimental results show that: (i) using the Fisher information matrix (FIM) in the derivation of the Fisher vectors (FV) improves classification accuracy, (ii) the proposed FV descriptors outperform the state-of-the-art bag of words (BoW) and vector of locally aggregated descriptors (VLAD)-based descriptors, and (iii) the descriptors based on the LE metric lead to better classification results than those based on the affine invariant Riemannian metric


Summary

Introduction

In supervised classification, the goal is to tag an image with one class name based on its content. In the early 2000s, the leading approaches were based on feature coding. The most widely used coding-based methods include the bag of words model (BoW) [1], the vector of locally aggregated descriptors (VLAD) [2,3], the Fisher score (FS) [4] and the Fisher vectors (FV) [5,6,7]. Part of their success comes from robustness: combined with powerful local handcrafted features, such as SIFT, they are robust to transformations like scaling, translation, or occlusion [11].
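
To make the coding step concrete, the following is a minimal sketch, not the authors' implementation. It assumes NumPy, a codebook that has already been learned, and hard nearest-codeword assignment (the descriptors studied in the paper are derived from a GMM instead). All function names are hypothetical. The helper `log_euclidean_vector` shows one common way to bring a covariance matrix into a Euclidean space under the Log-Euclidean metric; the sqrt(2) weighting of off-diagonal terms is an assumed convention that preserves the Frobenius norm.

```python
import numpy as np

def log_euclidean_vector(cov):
    """Map a symmetric positive definite matrix to a Euclidean vector.

    The matrix logarithm is computed from the eigendecomposition, then the
    upper triangle is vectorized. Off-diagonal entries are scaled by sqrt(2)
    (an assumed convention) so the vector norm equals the Frobenius norm.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)
    log_cov = (eigvecs * np.log(eigvals)) @ eigvecs.T
    rows, cols = np.triu_indices(cov.shape[0])
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_cov[rows, cols]

def assign_to_codewords(descriptors, codebook):
    """Return, for each descriptor, the index of its nearest codeword."""
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    return np.argmin(np.linalg.norm(diffs, axis=2), axis=1)

def bow_encode(descriptors, codebook):
    """Bag of words: normalized histogram of codeword assignments."""
    labels = assign_to_codewords(descriptors, codebook)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def vlad_encode(descriptors, codebook):
    """VLAD: per-codeword sum of residuals, concatenated and L2-normalized."""
    labels = assign_to_codewords(descriptors, codebook)
    vlad = np.zeros_like(codebook, dtype=float)
    for k in range(len(codebook)):
        members = descriptors[labels == k]
        if len(members):
            vlad[k] = (members - codebook[k]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad
```

In this sketch, a set of covariance matrices would first be mapped with `log_euclidean_vector`, and the resulting vectors passed as `descriptors` to either encoder. BoW keeps only counts per codeword, whereas VLAD also keeps first-order residuals, which is the kind of additional information that FV-style descriptors extend further.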

