Abstract
Deep convolutional neural networks (DCNNs) play a significant role in many application domains, such as computer vision, medical imaging, and image processing. Nonetheless, designing a DCNN that outperforms the state of the art is a manual, challenging, and time-consuming task because the design space is extremely large, a consequence of the large number of layers and their corresponding hyperparameters. In this work, we address the challenge of hyperparameter optimization of DCNNs through a novel multiagent reinforcement learning (MARL)-based approach that eliminates the human effort. In particular, we adapt $Q$-learning and define learning agents per layer to split the design space into independent, smaller design subspaces, such that each agent fine-tunes the hyperparameters of its assigned layer with respect to a global reward. Moreover, we provide a novel formation of $Q$-tables along with a new update rule that facilitates communication among agents. Our MARL-based approach is data driven and can consider an arbitrary set of design objectives and constraints. We apply our MARL-based solution to different well-known DCNNs, including GoogLeNet, VGG, and U-Net, and to various datasets for image classification and semantic segmentation. Our results show that, compared to the original CNNs, the MARL-based approach can reduce the model size, training time, and inference time by up to $83\times$, 52%, and 54%, respectively, without any degradation in accuracy.
Moreover, our approach is highly competitive with state-of-the-art neural architecture search methods in terms of the accuracy and number of parameters of the designed CNN, while significantly reducing the optimization cost.
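The per-layer decomposition described in the abstract can be illustrated with a minimal sketch. It assumes one tabular agent per layer, a single hyperparameter per layer (a filter count drawn from the hypothetical list `FILTER_CHOICES`), and the textbook $Q$-learning update; it does not reproduce the paper's formation of $Q$-tables or its new update rule, which the abstract does not specify. The function `global_reward` is a made-up stand-in for training and scoring a network, and all names here are illustrative assumptions.

```python
import random

# Hypothetical search space: each layer picks a filter count from this list.
FILTER_CHOICES = [16, 32, 64, 128]
ACTIONS = [-1, 0, 1]  # move to the previous choice, stay, move to the next choice


class LayerAgent:
    """One tabular Q-learning agent that tunes a single layer's hyperparameter."""

    def __init__(self, rng, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.rng = rng
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.state = rng.randrange(len(FILTER_CHOICES))  # index into FILTER_CHOICES
        # Q-table: one row per state, one column per action.
        self.q = [[0.0] * len(ACTIONS) for _ in FILTER_CHOICES]

    def act(self):
        """Epsilon-greedy action selection; returns an index into ACTIONS."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        row = self.q[self.state]
        return row.index(max(row))

    def step(self, action):
        """Return the state reached by the action, clamped to the valid range."""
        return min(max(self.state + ACTIONS[action], 0), len(FILTER_CHOICES) - 1)

    def update(self, action, reward, next_state):
        """Textbook Q-learning update (an assumption, not the paper's rule)."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[self.state][action] += self.alpha * (target - self.q[self.state][action])
        self.state = next_state


def global_reward(filters):
    """Toy stand-in for training and evaluating the network.

    Favors configurations near a made-up target and penalizes total size.
    """
    target = [32, 64, 32]
    error = sum(abs(f - t) for f, t in zip(filters, target))
    size = sum(filters)
    return -(error + 0.1 * size) / 100.0


def search(num_layers=3, episodes=500, seed=0):
    """Run all layer agents in lockstep against one shared reward."""
    rng = random.Random(seed)
    agents = [LayerAgent(rng) for _ in range(num_layers)]
    best_cfg, best_reward = None, float("-inf")
    for _ in range(episodes):
        actions = [agent.act() for agent in agents]
        next_states = [agent.step(a) for agent, a in zip(agents, actions)]
        cfg = [FILTER_CHOICES[s] for s in next_states]
        reward = global_reward(cfg)  # a single reward shared by every agent
        for agent, a, s in zip(agents, actions, next_states):
            agent.update(a, reward, s)
        if reward > best_reward:
            best_cfg, best_reward = cfg, reward
    return best_cfg, best_reward
```

Calling `search()` returns the best configuration seen and its reward. Each agent only ever indexes its own small table, so the joint design space is never enumerated; the shared reward is the only coupling between agents in this sketch.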