Abstract

Recent advances in Convolutional Neural Networks (CNNs) have produced promising results on difficult deep learning tasks. However, the success of a CNN depends on finding an architecture that fits a given problem. Hand-crafting an architecture is a challenging, time-consuming process that requires expert knowledge and effort, owing to the large number of architectural design choices. In this article, we present an efficient framework that automatically designs a high-performing CNN architecture for a given problem. Within this framework, we introduce a new optimization objective function that combines the error rate with the information learnt by a set of feature maps, measured using deconvolutional networks (deconvnet). The new objective function allows the hyperparameters of the CNN architecture to be optimized so as to enhance performance, guiding the CNN towards better visualization of the learnt features via deconvnet. The optimization of the objective function is carried out by the Nelder-Mead Method (NMM). Moreover, the new objective function leads to much faster convergence towards a better architecture. The proposed framework explores a CNN architecture's numerous design choices efficiently, and also supports effective, distributed execution and synchronization via web services. Empirically, we demonstrate that the CNN architecture designed with our approach outperforms several existing approaches in terms of error rate. Our results are also competitive with state-of-the-art results on the MNIST dataset and compare reasonably with the state of the art on the CIFAR-10 and CIFAR-100 datasets. Our approach plays a significant role in increasing depth, reducing stride sizes, and constraining some convolutional layers not to be followed by pooling layers in order to find a CNN architecture that yields high recognition performance.
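The optimization scheme in the abstract can be illustrated with a small sketch: a scalar objective that combines an error term with a deconvnet-style visualization score, minimized by the Nelder-Mead Method. Everything concrete below is a hypothetical placeholder — the trade-off weight LAMBDA, the stand-in error_rate and deconv_quality functions, and the two toy hyperparameters are not specified in the paper text; only the standard reflection/expansion/contraction/shrink structure of NMM is real.

```python
def nelder_mead(f, x0, step=0.5, iters=200,
                alpha=1.0, gamma=2.0, rho=0.5, sigma=0.5):
    """Minimal derivative-free Nelder-Mead simplex minimizer."""
    n = len(x0)
    # Initial simplex: x0 plus one perturbed point per dimension.
    simplex = [list(x0)]
    for i in range(n):
        p = list(x0)
        p[i] += step
        simplex.append(p)
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Centroid of all vertices except the worst.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        # Reflection of the worst vertex through the centroid.
        xr = [centroid[i] + alpha * (centroid[i] - worst[i]) for i in range(n)]
        if f(best) <= f(xr) < f(simplex[-2]):
            simplex[-1] = xr
        elif f(xr) < f(best):
            # Expansion: push further in the promising direction.
            xe = [centroid[i] + gamma * (xr[i] - centroid[i]) for i in range(n)]
            simplex[-1] = xe if f(xe) < f(xr) else xr
        else:
            # Contraction towards the worst vertex.
            xc = [centroid[i] + rho * (worst[i] - centroid[i]) for i in range(n)]
            if f(xc) < f(worst):
                simplex[-1] = xc
            else:
                # Shrink the whole simplex towards the best vertex.
                simplex = [best] + [
                    [best[i] + sigma * (p[i] - best[i]) for i in range(n)]
                    for p in simplex[1:]
                ]
    simplex.sort(key=f)
    return simplex[0]

LAMBDA = 0.5  # hypothetical trade-off weight between the two terms

def error_rate(h):
    # Stand-in for a validation error; pretend it is minimized at h = (3, 1).
    return (h[0] - 3.0) ** 2 + (h[1] - 1.0) ** 2

def deconv_quality(h):
    # Stand-in for a deconvnet feature-visualization score (higher is better).
    return -((h[0] - 3.5) ** 2)

def objective(h):
    # Combined objective: error rate penalized by poor feature visualization.
    return error_rate(h) - LAMBDA * deconv_quality(h)

best = nelder_mead(objective, [0.0, 0.0])
```

The minimizer settles between the two terms' optima: the visualization term pulls the first (toy) hyperparameter slightly above the pure error-rate optimum, which is exactly the kind of trade-off the combined objective is meant to express.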

Highlights

  • Deep convolutional neural networks (CNNs) have recently shown remarkable success in a variety of areas such as computer vision [1,2,3] and natural language processing [4,5,6].

  • We validate the effectiveness of the proposed objective function against the plain error-rate objective function.

  • The architecture hyperparameters are optimized via the Nelder-Mead Method (NMM) applied to the proposed objective function.


Introduction

Deep convolutional neural networks (CNNs) have recently shown remarkable success in a variety of areas such as computer vision [1,2,3] and natural language processing [4,5,6]. A CNN [54] is a subclass of neural networks that takes advantage of the spatial structure of its inputs. A convolutional layer comprises a set of learnable kernels, or filters, that extract local features from the input. Each unit of a feature map connects only to a small region of the input, called its receptive field. A new feature map is typically generated by sliding a filter over the input and computing the dot product at each position (an operation similar to convolution), followed by a non-linear activation function that introduces non-linearity into the model. Within each feature map, all units share the same weights (filter). The advantages of weight sharing are a reduced number of parameters and the ability to detect the same feature regardless of its location in the input [55].
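The sliding-filter computation described above can be sketched in a few lines. This is a minimal single-channel "valid" convolution (strictly, cross-correlation, as in most CNN libraries) with a ReLU activation; the image, kernel, and their sizes are made-up examples, not taken from the paper.

```python
def conv2d_relu(image, kernel):
    # Slide one shared filter over the input; each output unit is the
    # dot product between the filter and its receptive field, passed
    # through a ReLU non-linearity.
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    feature_map = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            s = sum(image[r + i][c + j] * kernel[i][j]
                    for i in range(kh) for j in range(kw))
            row.append(max(0.0, s))  # ReLU activation
        feature_map.append(row)
    return feature_map

# Made-up example: a 4x4 image containing a vertical edge, and a single
# 2x2 edge-detecting filter shared across all positions (weight sharing).
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1],
          [-1, 1]]
fmap = conv2d_relu(image, kernel)
```

Because every position uses the same filter, the edge produces a strong response at the same column of every output row: the shared weights detect the feature wherever it appears in the input, which is exactly the location-invariance property noted above.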
