Learning a robust representation via a deep network on symmetric positive definite manifolds

Zhi Gao, Yuwei Wu, Xingyuan Bu, Tan Yu, Junsong Yuan, Yunde Jia

doi:10.1016/j.patcog.2019.03.007

Abstract

Recent studies have shown that aggregating the convolutional features of a Convolutional Neural Network (CNN) can yield impressive performance on a variety of computer vision tasks. The Symmetric Positive Definite (SPD) matrix has become a powerful tool for this aggregation because of its remarkable ability to learn an appropriate statistical representation that characterizes the underlying structure of visual features. In this paper, we propose a method that aggregates deep convolutional features into a robust representation through SPD generation and SPD transformation within an end-to-end deep network. To this end, we introduce several new layers: a nonlinear kernel generation layer, a matrix transformation layer, and a vector transformation layer. The nonlinear kernel generation layer aggregates convolutional features into a kernel matrix that is guaranteed to be an SPD matrix. The matrix transformation layer projects the original SPD representation onto a more compact and discriminative SPD manifold. The vector transformation layer performs vectorization and normalization: it takes the upper triangular elements of the SPD representation and applies power normalization followed by l2 normalization to reduce redundancy and accelerate convergence. The SPD matrix in our network can be regarded as a mid-level representation that bridges convolutional features and high-level semantic features. Results of extensive experiments show that our method notably outperforms state-of-the-art methods.
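
As an illustration of how the three layers fit together, the following is a minimal NumPy sketch of a forward pass only, not the authors' implementation. The abstract does not specify the kernel, the form of the projection, or the normalization exponent, so the sketch assumes a Gaussian (RBF) kernel computed between feature channels, a bilinear projection W^T K W with a full-column-rank W, and a power-normalization exponent of 0.5. The function names, `gamma`, `alpha`, and the toy shapes are all hypothetical.

```python
import numpy as np


def kernel_generation(features, gamma=None):
    """Aggregate (d, n) features (d channels, n locations) into a (d, d) kernel matrix.

    Assumption: a Gaussian (RBF) kernel between channels, which is positive
    definite whenever the channel vectors are pairwise distinct.
    """
    sq_norms = np.sum(features ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clamp tiny negatives from round-off
    if gamma is None:
        gamma = 1.0 / features.shape[1]  # assumed bandwidth heuristic
    return np.exp(-gamma * sq_dists)


def matrix_transformation(spd, weight):
    """Project a (d, d) SPD matrix to a smaller (k, k) one via W^T K W.

    Assumption: weight has shape (d, k) with full column rank, which keeps
    the output symmetric positive definite.
    """
    return weight.T @ spd @ weight


def vector_transformation(spd, alpha=0.5, eps=1e-12):
    """Take the upper triangle, then apply power and l2 normalization."""
    rows, cols = np.triu_indices(spd.shape[0])
    vec = spd[rows, cols]
    vec = np.sign(vec) * np.abs(vec) ** alpha  # signed power normalization
    return vec / (np.linalg.norm(vec) + eps)   # l2 normalization


# Toy input: 8 channels over 50 spatial locations (random stand-in for CNN features).
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 50))
# Orthonormal columns guarantee full column rank for the projection.
weight, _ = np.linalg.qr(rng.standard_normal((8, 4)))

kernel = kernel_generation(features)
compact = matrix_transformation(kernel, weight)
descriptor = vector_transformation(compact)
print(descriptor.shape)  # (10,) because a 4 x 4 upper triangle has 4 * 5 / 2 entries
```

In a trained network the projection matrix would be learned jointly with the CNN rather than drawn at random, and the resulting vector would feed a classifier; the sketch only shows how the SPD property is preserved from the kernel matrix through the projection before vectorization.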
