Abstract

Deep Convolutional Neural Networks (DCNNs) are currently the predominant technique used to learn visual features from images. However, the complex structure of most recent DCNNs imposes two major requirements, namely a huge labeled dataset and high computational resources. In this paper, we develop a new, efficient deep unsupervised network to learn invariant image representations from unlabeled visual data. The proposed Deep Convolutional Self-Organizing Map (DCSOM) network comprises a cascade of convolutional SOM layers trained sequentially to represent multiple levels of features. The 2D SOM grid is commonly used for either data visualization or feature extraction; this work, however, employs a high-dimensional map to create a new deep network. The N-dimensional SOM (ND-SOM) grid is trained to extract abstract visual features using the classical competitive learning algorithm. The topological order of the features learned by the ND-SOM helps to absorb the local transformations and deformations exhibited in the visual data. The input image is divided into overlapping local patches, and each patch is represented by the N coordinates of the winning neuron in the ND-SOM grid. Each dimension of the ND-SOM can be considered a non-linear principal component, so the input image can be represented by a bank of N Feature Index Images (FIIs). Multiple convolutional SOM layers can be cascaded to create a deep network structure. The output layer of the DCSOM network computes local histograms over each FII bank of the final convolutional SOM layer. A set of experiments using the MNIST handwritten digit database and all its variants is conducted to evaluate the robustness of the representation learned by the proposed DCSOM network. Experimental results reveal that DCSOM outperforms state-of-the-art methods on noisy digits and achieves performance comparable to more complex deep learning architectures on the other image variations.
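The patch-encoding step described above can be sketched in a few lines: each local patch is assigned to its best-matching SOM neuron, and the neuron's N grid coordinates become the patch's feature indices, one per FII channel. This is a minimal illustration only; the function name `encode_patches`, the toy sizes, and the use of Euclidean distance for winner selection are assumptions for the sketch, not details taken from the paper.

```python
import numpy as np

def encode_patches(patches, weights, grid_shape):
    """Map each local patch to the N coordinates of its winning neuron.

    patches:    (num_patches, patch_dim) flattened local patches
    weights:    (num_neurons, patch_dim) SOM codebook, num_neurons = prod(grid_shape)
    grid_shape: size of each SOM dimension, e.g. (3, 3, 3) for a 3D map
    Returns an (num_patches, N) array of winner coordinates; column i feeds
    the i-th Feature Index Image (FII) channel.
    """
    # Winner = neuron with minimum Euclidean distance to the patch.
    d = np.linalg.norm(patches[:, None, :] - weights[None, :, :], axis=2)
    winners = np.argmin(d, axis=1)                  # flat neuron indices
    coords = np.unravel_index(winners, grid_shape)  # N coordinate arrays
    return np.stack(coords, axis=1)

# Toy usage: 4 patches of dimension 9, a 3x3x3 SOM grid (27 neurons).
rng = np.random.default_rng(0)
patches = rng.standard_normal((4, 9))
weights = rng.standard_normal((27, 9))
codes = encode_patches(patches, weights, (3, 3, 3))
print(codes.shape)  # (4, 3): each patch becomes 3 feature indices
```

Because each output column is a coordinate along one SOM axis, nearby patches tend to receive nearby indices, which is the topological-order property the abstract relies on.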

Highlights

  • Learning hierarchies of features for image representation is a major goal for many computer vision and pattern recognition applications [1]

  • A set of experiments evaluates various invariance aspects of the Deep Convolutional Self-Organizing Map (DCSOM) network using the MNIST handwritten digit database [36] and the MNIST variations [37]

  • We use two convolutional Self-Organizing Map (SOM) feature extraction layers to examine the effectiveness of the DCSOM and to analyze the effect of changing its hyper-parameters


Summary

INTRODUCTION

Learning hierarchies of features for image representation is a major goal for many computer vision and pattern recognition applications [1]. The proposed Deep Convolutional Self-Organizing Map (DCSOM) network is a new deep learning architecture that uses multiple convolutional SOM layers and has a structure similar to common convolutional neural networks. In contrast to other methods [20], [23], [24], which train multiple region-specific SOMs, our model trains only one convolutional SOM layer in each stage. The second convolutional SOM layer is trained in a similar way, but without applying Z-score normalization to the local patches derived from each feature index image of the previous layer. The key idea is to exploit the topological order of the SOM feature map to efficiently represent local patches of the input image by SOM neuron coordinates.
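The training recipe in this summary (Z-score normalization of first-layer patches, then classical competitive learning of the SOM codebook) can be illustrated with a minimal sketch. The function names, the Gaussian neighborhood kernel, and the fixed learning rate and sigma are assumptions chosen to show the standard SOM update, not the paper's exact batch algorithm.

```python
import numpy as np

def zscore(patches, eps=1e-8):
    # Z-score normalize each local patch (zero mean, unit variance per patch),
    # as applied to first-layer patches; assumed per-patch normalization.
    mu = patches.mean(axis=1, keepdims=True)
    sd = patches.std(axis=1, keepdims=True)
    return (patches - mu) / (sd + eps)

def som_epoch(weights, grid_coords, patches, lr=0.1, sigma=1.0):
    """One classical competitive-learning pass over the patches.

    weights:     (num_neurons, patch_dim) SOM codebook
    grid_coords: (num_neurons, N) coordinates of each neuron in the ND grid
    """
    for x in patches:
        # Best-matching unit: nearest codebook vector to the patch.
        win = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Gaussian neighborhood around the winner, measured in grid space,
        # so neighboring neurons are pulled toward x as well (topological order).
        g = np.linalg.norm(grid_coords - grid_coords[win], axis=1)
        h = np.exp(-(g ** 2) / (2 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

# Toy run: a 4x4 (2D) grid of 16 neurons learning 8-dimensional patches.
rng = np.random.default_rng(1)
grid = np.stack(np.meshgrid(np.arange(4), np.arange(4), indexing="ij"),
                axis=-1).reshape(-1, 2).astype(float)
W = rng.standard_normal((16, 8))
X = zscore(rng.standard_normal((50, 8)))
W = som_epoch(W, grid, X)
```

In practice the learning rate and neighborhood width are decayed over epochs; a second-layer SOM would be trained the same way on patches taken from the first layer's feature index images, skipping the `zscore` step as the summary notes.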

RELATED WORKS
LOCAL CONTRAST NORMALIZATION LAYER
BATCH LEARNING ALGORITHM
SOM MAPPING
LEARNING SECOND CONVOLUTIONAL SOM LAYER
EXPERIMENTAL RESULTS
CONCLUSIONS