Abstract

The performance of a convolutional neural network (CNN) depends heavily on its hyperparameters. However, finding a suitable hyperparameter configuration is difficult and computationally expensive because of three issues: 1) the mixed-variable problem posed by different types of hyperparameters; 2) the large-scale search space of candidate hyperparameters; and 3) the expensive computational cost of evaluating each candidate hyperparameter configuration. This article therefore addresses these three issues and proposes a novel estimation of distribution algorithm (EDA) for efficient hyperparameter optimization, with three major contributions in the algorithm design. First, a hybrid-model EDA is proposed to deal with the mixed-variable difficulty efficiently. The proposed algorithm uses a mixed-variable encoding scheme to encode the mixed-variable hyperparameters and adopts an adaptive hybrid-model learning (AHL) strategy to optimize the mixed variables efficiently. Second, an orthogonal initialization (OI) strategy is proposed to deal with the challenge of the large-scale search space. Third, a surrogate-assisted multi-level evaluation (SME) method is proposed to reduce the expensive computational cost. Based on the above, the proposed algorithm is named surrogate-assisted hybrid-model EDA (SHEDA). In the experimental studies, SHEDA is verified on widely used classification benchmark problems and compared with various state-of-the-art methods. Moreover, a case study on aortic dissection (AD) diagnosis is carried out to evaluate its performance. Experimental results show that SHEDA is effective and efficient for hyperparameter optimization: it finds a satisfactory hyperparameter configuration for CIFAR10, CIFAR100, and AD diagnosis with only 0.58, 0.97, and 1.18 GPU days, respectively.
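To make the idea of an EDA over mixed variables concrete, the following is a minimal sketch, not the authors' SHEDA. It assumes one continuous hyperparameter (the base-10 log of the learning rate, modeled with a Gaussian) and one categorical hyperparameter (a kernel size, modeled with a smoothed histogram). The names `KERNEL_SIZES`, `toy_fitness`, and `run_eda` are hypothetical, and the toy fitness function merely stands in for the validation accuracy of a trained CNN. The AHL, OI, and SME components described above are not implemented here.

```python
import math
import random

KERNEL_SIZES = [3, 5, 7]  # hypothetical categorical hyperparameter values


def toy_fitness(log_lr, kernel):
    # Stand-in for CNN validation accuracy (an assumption for illustration):
    # highest at learning rate 1e-3 and kernel size 5.
    return -(log_lr + 3.0) ** 2 - 0.5 * abs(kernel - 5)


def run_eda(generations=30, pop_size=40, elite=10, seed=0):
    rng = random.Random(seed)
    mu, sigma = -2.0, 1.5  # Gaussian model over log10(learning rate)
    probs = [1.0 / len(KERNEL_SIZES)] * len(KERNEL_SIZES)  # histogram model
    best = None
    for _ in range(generations):
        # Sample a population from the current probabilistic models.
        pop = []
        for _ in range(pop_size):
            log_lr = min(0.0, max(-6.0, rng.gauss(mu, sigma)))
            kernel = rng.choices(KERNEL_SIZES, weights=probs)[0]
            pop.append((toy_fitness(log_lr, kernel), log_lr, kernel))
        # Select the fittest individuals.
        pop.sort(reverse=True)
        top = pop[:elite]
        if best is None or top[0][0] > best[0]:
            best = top[0]
        # Re-estimate the Gaussian model from the selected individuals.
        xs = [p[1] for p in top]
        mu = sum(xs) / elite
        sigma = max(1e-3, math.sqrt(sum((x - mu) ** 2 for x in xs) / elite))
        # Re-estimate the histogram model with Laplace smoothing so that
        # no category's probability collapses to zero.
        counts = [sum(1 for p in top if p[2] == k) for k in KERNEL_SIZES]
        probs = [(c + 1) / (elite + len(KERNEL_SIZES)) for c in counts]
    return best  # (fitness, log10 learning rate, kernel size)
```

Calling `run_eda()` returns the best `(fitness, log_lr, kernel)` tuple seen during the search. In a real setting the call to `toy_fitness` would be replaced by training and validating a CNN, which is exactly the expensive step that the surrogate-assisted evaluation in the abstract is meant to reduce.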

Highlights

  • The convolutional neural network (CNN), as one of the most efficient deep learning models [1], plays an important role in various artificial intelligence applications such as Go playing [2]

  • Because the resolution of computerized tomography (CT) images and the variety of aortic dissection (AD) shapes make classification difficult, this classification problem is suitable for testing the CNNs obtained by surrogate-assisted hybrid-model EDA (SHEDA)

  • The total data are randomly split into the training dataset with 3486 images and the test dataset with 387 images, which are about 90% and 10% of the total data size, respectively
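The split sizes in the last highlight can be checked with a short sketch. The source does not say how the random split was implemented, so the function name `split_indices` and the fixed seed are assumptions made for illustration only.

```python
import random


def split_indices(n_total, n_test, seed=0):
    # Shuffle all image indices, then hold out the first n_test as the test set.
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return idx[n_test:], idx[:n_test]


# 3486 training images + 387 test images = 3873 images in total.
train_idx, test_idx = split_indices(3486 + 387, 387)
train_frac = len(train_idx) / (len(train_idx) + len(test_idx))
```

Here `train_frac` is 3486 / 3873, which is about 0.90, matching the stated 90% and 10% proportions.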


Summary

INTRODUCTION

The convolutional neural network (CNN), as one of the most efficient deep learning models [1], plays an important role in various artificial intelligence applications such as Go playing [2]. Recent studies have started to consider more intelligent, automatic, and efficient ways of obtaining better CNN models, which has given rise to CNN optimization research [11], i.e., treating the search for the best CNN hyperparameters as an optimization problem and designing powerful algorithms to solve it. In this direction, many algorithms have been proposed and have obtained promising results [12], such as those using reinforcement learning [11] and Bayesian optimization [12]. However, solving the CNN hyperparameter optimization problem is still challenging because of three difficulties: the mixed-variable problem, the large-scale search space, and the expensive computational cost.

AND RELATED WORK
Estimation of Distribution Algorithm
Related Work
PROPOSED ALGORITHM
Mixed-Variable Encoding Scheme
AHL Strategy
OI Strategy
SME Method
Evaluation
MLDG Method
Complete Algorithm
Time Complexity of the Complete Algorithm
EXPERIMENTAL STUDIES
Benchmark Datasets and Evaluation Metrics
Compared State-of-the-Art Methods
Algorithm Settings
Comparisons With State-of-the-Art Methods
Ablation Experiments for Contribution Analysis
Influence of Sampling Number
Influence of Individual Number for Learning Probabilistic Models
Influence of the Training Epochs
CONCLUSION