Learning Policies for Neural Network Architecture Optimization Using Reinforcement Learning

Raghav Vadhera, Manfred Huber

doi:10.32473/flairs.36.133380

Abstract

Deep learning systems tend to be very sensitive to the specific network architecture, both in terms of learning ability and in the performance of the learned solution. This, together with the difficulty of tuning neural network architectures, leads to a need for automatic network optimization. Previous work largely optimizes a network for one specific problem using architecture search, which requires significant amounts of time spent training different architectures during optimization. To address this and to open up the potential for transfer across tasks, this paper presents a novel approach that uses Reinforcement Learning to learn a policy for network optimization. The policy operates in a derived architecture embedding space and incrementally optimizes the network for the given problem. By utilizing policy learning and an abstract problem embedding, this approach brings the promise of transferring the policy across problems, and thus the potential to optimize networks for new problems without excessive additional training. As an initial evaluation of the base capabilities, this paper performs experiments on a standard classification problem. The results show the ability of the approach to optimize the architecture for a specific problem within a given range of fully connected networks, and indicate its potential for learning effective policies to automatically improve network architectures.

Full Text