Abstract

Recent advances in deep Convolutional Neural Networks (CNNs) have led to impressive progress in computer vision, especially in image classification. CNNs involve numerous hyperparameters that define the network's structure, such as network depth, kernel size, number of feature maps, stride, pooling size, and pooling regions. These hyperparameters have a significant impact on the classification accuracy of a CNN, and the appropriate architecture varies from one dataset to another. Near-optimal values of these hyperparameters are often determined empirically, although some recent works have applied optimization techniques to hyperparameter selection as well. In this paper, we develop a framework for hyperparameter optimization based on a new objective function that combines information from the visualization of learned feature maps via deconvolutional networks with the accuracy of the trained CNN model. The Nelder-Mead Algorithm (NMA) is used to guide the CNN architecture toward near-optimal hyperparameters. Our proposed approach is evaluated on the CIFAR-10 and Caltech-101 benchmarks. The experimental results indicate that the final CNN architecture obtained with our objective function outperforms other approaches in terms of accuracy. We show that our optimization framework increases the depth of the network and shrinks the stride and pooling sizes to obtain the best CNN architecture.
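The search loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `surrogate_objective` is a hypothetical stand-in for the real objective (training a CNN on a candidate architecture and combining its validation error with a feature-map-quality term), and the hyperparameter vector is reduced to two toy dimensions.

```python
import numpy as np
from scipy.optimize import minimize

def surrogate_objective(x):
    """Hypothetical placeholder for the paper's objective function.

    In the actual framework, each evaluation would train a CNN with the
    candidate hyperparameters (depth, kernel size, stride, pooling size,
    ...) and return a score combining model accuracy with information
    from deconvolutional-network visualizations of the learned feature
    maps. Here we use a simple bowl with its minimum at depth=5,
    kernel=3 so the example runs quickly.
    """
    depth, kernel = x
    return (depth - 5.0) ** 2 + (kernel - 3.0) ** 2

def optimize_hyperparams(x0):
    # Nelder-Mead is derivative-free, which suits objectives of the form
    # "train a network and report its error": no gradients with respect
    # to the hyperparameters are available.
    res = minimize(
        surrogate_objective,
        x0,
        method="Nelder-Mead",
        options={"xatol": 1e-4, "fatol": 1e-4},
    )
    return res.x

# Start the simplex from a deliberately poor initial architecture guess.
best = optimize_hyperparams(np.array([2.0, 7.0]))
```

In practice the continuous values returned by the simplex would be rounded to valid integer hyperparameters (e.g. a whole number of layers) before each training run.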
