Multi-Layer Neural Network Auto Encoders Learning Method, using Regularization for Invariant Image Recognition

Skribtsov Pavel Vyacheslavovich, Kazantsev Pavel Aleksandrovich

doi:10.17485/ijst/2016/v9i27/97704

Open Access

https://doi.org/10.17485/ijst/2016/v9i27/97704

Abstract

Background/Objectives: This paper proposes a new type of regularization for deep neural networks that explicitly separates the lower-dimensional hidden-layer representation of an input pattern into two components: a class information component and a transform component. Methods: Researchers working on pattern recognition problems are actively seeking to replace deterministic feature extraction algorithms with unsupervised methods that generate optimal domain-specific image features while training auto-associative multilayer neural networks. Training a deep neural network with a “bottleneck” hidden layer yields a task-oriented encoder capable of efficient dimensionality reduction of the input signal. Findings: Many useful properties of the encoder, including the degree to which feature extraction is invariant to input signal transformations (perturbations), depend strongly on the particular form of regularization applied. In addition to the usual weight decay smoothing component, the suggested regularization has two further components: the first minimizes the spread of the class-describing features under different pattern transforms, and the second minimizes the spread of the transformation-describing features for objects with the same perturbations but from different classes. Class-membership information from the training sequence is used, together with the introduced estimator of pattern transform similarity, to compute the regularization terms.
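The three regularization components described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' formulation: the function names, the `lam_*` coefficients, the choice of the first `n_class` bottleneck units as the class component, and the use of discrete `transform_ids` in place of the paper's transform-similarity estimator are all hypothetical.

```python
import numpy as np

def split_code(h, n_class):
    """Split bottleneck activations (rows = samples) into class and transform parts.

    Assumption: the first n_class units carry class information,
    the remaining units describe the transform.
    """
    return h[:, :n_class], h[:, n_class:]

def group_spread(features, group_ids):
    """Mean squared distance of each feature vector to the mean of its group."""
    total = 0.0
    for g in np.unique(group_ids):
        members = features[group_ids == g]
        total += np.sum((members - members.mean(axis=0)) ** 2)
    return total / len(features)

def regularizer(h, class_ids, transform_ids, weights, n_class,
                lam_decay=1e-4, lam_class=1.0, lam_transform=1.0):
    """Weight decay + class-feature spread + transform-feature spread."""
    c, t = split_code(h, n_class)
    # Standard weight decay over all weight matrices.
    decay = sum(np.sum(w ** 2) for w in weights)
    # Class features should not vary when the same class is transformed.
    class_term = group_spread(c, class_ids)
    # Transform features should not vary across classes under the same transform.
    # Assumption: hard transform labels stand in for a similarity estimator.
    transform_term = group_spread(t, transform_ids)
    return lam_decay * decay + lam_class * class_term + lam_transform * transform_term

# Toy bottleneck codes: 2 classes x 2 transforms, perfectly factorized.
h_factorized = np.array([[1.0, 0.0, 5.0],
                         [1.0, 0.0, 7.0],
                         [0.0, 1.0, 5.0],
                         [0.0, 1.0, 7.0]])
class_ids = np.array([0, 0, 1, 1])
transform_ids = np.array([0, 1, 0, 1])

# Same codes, but the class part of sample 1 now leaks transform information.
h_entangled = h_factorized.copy()
h_entangled[1, 0] = 3.0

penalty_factorized = regularizer(h_factorized, class_ids, transform_ids, [], n_class=2)
penalty_entangled = regularizer(h_entangled, class_ids, transform_ids, [], n_class=2)
```

In this toy setup the penalty vanishes only when the class units ignore the transform and the transform units ignore the class, which is the separation the regularization is meant to encourage.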
The research shows that a special case of the suggested regularization corresponds to the well-known Frobenius norm of the Jacobian matrix of the encoder activations. The contribution of this paper can therefore be seen as a non-local extension of the Jacobian-based family of deep neural network regularizers, one that embeds invariance to non-local input pattern transformations into the deep neural network feature extraction pipeline. Experiments carried out on synthetic and real pattern datasets show promising results and encourage further investigation of the proposed approach. Improvements/Applications: This method can be used for recognition of aerial images that is invariant to lighting, weather and orientation, for example for recognizing vehicles and landmarks in images obtained by unmanned aerial vehicles (UAVs).
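The link between a spread-based penalty and the Frobenius norm of the encoder Jacobian can be illustrated numerically. The sketch below is one plausible reading of that relationship, not the paper's derivation: it assumes a single-layer tanh encoder (the names `encode`, `W`, `b` and `delta` are hypothetical) and compares the closed-form squared Frobenius norm of its Jacobian with the spread of the codes under small axis-aligned input perturbations.

```python
import numpy as np

def encode(x, W, b):
    """Hypothetical one-layer encoder: h = tanh(W x + b)."""
    return np.tanh(W @ x + b)

def jacobian_frobenius_sq(x, W, b):
    """Closed-form ||dh/dx||_F^2 for the tanh encoder.

    The Jacobian is diag(1 - h^2) @ W, so its squared Frobenius norm is
    sum_j (1 - h_j^2)^2 * sum_i W_ji^2.
    """
    h = encode(x, W, b)
    return float(np.sum(((1.0 - h ** 2) ** 2)[:, None] * W ** 2))

def local_spread(x, W, b, delta=1e-5):
    """Spread of the code under small perturbations along each input axis,
    rescaled by delta^2 so it is comparable to the Jacobian norm."""
    h = encode(x, W, b)
    total = 0.0
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = delta
        total += np.sum((encode(x + step, W, b) - h) ** 2)
    return float(total / delta ** 2)

# Random toy encoder: 5 inputs, 3 bottleneck units.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
x = rng.normal(size=5)

analytic = jacobian_frobenius_sq(x, W, b)
empirical = local_spread(x, W, b)
```

For infinitesimal perturbations the two quantities agree up to finite-difference error, while larger, non-local transforms (rotation, lighting changes) are not captured by the Jacobian and would require a spread-style term.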

Full Text