Abstract

The design of deep neural networks (DNNs), which in essence concerns the choice of specific values for their hyperparameters, is a very involved process that poses significant challenges to researchers and designers. Because it depends strongly on the problem at hand, and because designers have typically relied on intuition and experience, it has been regarded more as an art than as a well-structured and standardized procedure. The aim of this work is to introduce some structure into this design process by treating it as a function to be optimized, one that takes a specific set of hyperparameter values as input and returns the accuracy of the resulting DNN. Bayesian optimization, with Gaussian processes as surrogate models, is employed to fine-tune the hyperparameters of DNNs. The results are very promising: compared with choosing hyperparameters at random from a specific set, the proposed process achieves much better accuracy at no significant extra time cost. It also produces neural network architectures that, within the given constraints, closely mimic the best-known architectures for the specific problem sets.
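To make the idea concrete, the sketch below (not the authors' code) shows GP-based Bayesian optimization of DNN hyperparameters using the scikit-optimize library. The search space, the `train_and_evaluate` stub, and all numeric settings are illustrative assumptions; only the overall technique matches what the abstract describes.

```python
import math

from skopt import gp_minimize
from skopt.space import Integer, Real

# Assumed search space; the actual hyperparameters and ranges used in the
# paper are not specified in the abstract.
search_space = [
    Integer(1, 5, name="num_layers"),            # number of hidden layers
    Integer(16, 512, name="units_per_layer"),    # units per hidden layer
    Real(1e-4, 1e-1, prior="log-uniform", name="learning_rate"),
]

def train_and_evaluate(num_layers, units_per_layer, learning_rate):
    """Hypothetical stand-in for training a DNN with the given
    hyperparameters and returning its validation accuracy.  A smooth
    synthetic function is used here so the sketch runs end to end;
    replace it with real training and evaluation code."""
    return 1.0 / (1.0
                  + 0.1 * abs(num_layers - 3)
                  + 0.001 * abs(units_per_layer - 128)
                  + abs(math.log10(learning_rate) + 2.0))

def objective(params):
    # gp_minimize passes values in the order the dimensions were declared.
    num_layers, units_per_layer, learning_rate = params
    accuracy = train_and_evaluate(num_layers, units_per_layer, learning_rate)
    return -accuracy  # gp_minimize minimizes, so negate the accuracy

# By default, gp_minimize fits a Gaussian process to the observed
# (hyperparameters, accuracy) pairs and selects the next point to evaluate
# via an acquisition function, as in standard Bayesian optimization.
result = gp_minimize(objective, search_space, n_calls=30, random_state=0)
print("best hyperparameters:", result.x)
print("best accuracy:", -result.fun)
```

Negating the accuracy turns the maximization of accuracy into the minimization that `gp_minimize` expects; each call to the objective corresponds to training one candidate network.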
