Abstract

Architecture search is the automatic process of designing the model or cell structure that is optimal for a given dataset or task. Recently, a weight-sharing-based approach called Efficient Neural Architecture Search (ENAS) has shown good performance improvements (tested on language modeling and image classification) with reasonable training speed. In this work, we propose a novel architecture search algorithm called Flexible and Expressible Neural Architecture Search (FENAS), with a more flexible and expressible search space than ENAS in terms of more activation functions, input edges, and atomic operations. Moreover, our FENAS approach is able to reproduce the well-known LSTM and GRU architectures (unlike ENAS), and can also be initialized with them to find architectures more efficiently. We explore this extended search space via evolutionary search and show that FENAS performs significantly better on several popular text classification tasks and similarly to ENAS on the standard language modeling benchmark. Further, we present ablations and analyses of our FENAS approach.
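
As a minimal sketch of the kind of flexible cell search space described above, the code below lets each node of a cell DAG choose its input edges, an elementwise combiner (add or multiply), and an activation, which is enough to express gated, GRU-like structures. The genotype encoding and all names (ACTIVATIONS, Node, Cell) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

# Candidate activations a node may choose from (an assumed, illustrative set).
ACTIVATIONS = {"tanh": torch.tanh, "sigmoid": torch.sigmoid,
               "relu": torch.relu, "identity": lambda x: x}

class Node(nn.Module):
    """One computational node of the cell DAG."""
    def __init__(self, hidden_size, inputs, activation, combine="add"):
        super().__init__()
        self.inputs, self.combine = inputs, combine
        self.act = ACTIVATIONS[activation]
        # One learned linear transform per incoming edge.
        self.edges = nn.ModuleList(nn.Linear(hidden_size, hidden_size)
                                   for _ in inputs)

    def forward(self, states):
        outs = [edge(states[i]) for edge, i in zip(self.edges, self.inputs)]
        h = outs[0]
        for o in outs[1:]:
            h = h + o if self.combine == "add" else h * o
        return self.act(h)

class Cell(nn.Module):
    """Applies a genotype (a list of node specs) to (x_t, h_{t-1})."""
    def __init__(self, genotype, hidden_size):
        super().__init__()
        self.nodes = nn.ModuleList(Node(hidden_size, **spec) for spec in genotype)

    def forward(self, x, h_prev):
        states = [x, h_prev]          # states[0] = x_t, states[1] = h_{t-1}
        for node in self.nodes:
            states.append(node(states))
        return states[-1]             # new hidden state h_t

# An illustrative GRU-flavoured genotype: a sigmoid gate, a tanh candidate,
# and a multiplicative node that gates the candidate state.
genotype = [
    {"inputs": [0, 1], "activation": "sigmoid",  "combine": "add"},
    {"inputs": [0, 1], "activation": "tanh",     "combine": "add"},
    {"inputs": [2, 3], "activation": "identity", "combine": "mul"},
]
cell = Cell(genotype, hidden_size=32)
h_t = cell(torch.randn(4, 32), torch.zeros(4, 32))   # (batch, hidden)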

Highlights

  • Architecture search enables automatic ways of finding the best model architecture and cell structures for the given task or dataset, as opposed to the traditional approach of manually tuning among different architecture choices

  • Compared with previous neural architecture search (NAS) approaches, our Flexible and Expressible Neural Architecture Search (FENAS) approach performs comparably on Penn Treebank (PTB) and significantly better on several downstream GLUE tasks

  • The FENAS search space is larger than that of Efficient Neural Architecture Search (ENAS) because it allows more activation functions and more inputs to the computational nodes, as illustrated in the sketch after this list
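
As a rough, back-of-the-envelope illustration of this size difference, the sketch below counts the per-node choices under a simplified model of the two search spaces. The ENAS side follows its published RNN-cell setup (each node picks one earlier node as its input and one of four activations); the FENAS-side numbers (six activations, up to two input edges per node) are assumptions chosen purely for illustration.

from math import comb

def enas_choices(n_prev, n_activations=4):
    # One incoming edge from any earlier node, times one of the activations.
    return n_prev * n_activations

def flexible_choices(n_prev, n_activations=6, max_in_edges=2):
    # Any non-empty set of up to max_in_edges earlier nodes, times one activation.
    edge_sets = sum(comb(n_prev, k) for k in range(1, max_in_edges + 1))
    return edge_sets * n_activations

for n_prev in range(1, 5):
    print(n_prev, enas_choices(n_prev), flexible_choices(n_prev))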

Summary

Introduction

Architecture search enables automatic ways of finding the best model architecture and cell structures for the given task or dataset, as opposed to the traditional approach of manually tuning among different architecture choices. This idea has been successfully applied to the tasks of language modeling and image classification (Zoph and Le, 2017; Zoph et al., 2018; Cai et al., 2018; Liu et al., 2018a,b). The first approach to architecture search involved an RNN controller which samples a model architecture and uses the validation performance of this architecture, trained on the given dataset, as feedback (or reward) to sample better architectures.

[Figure: example recurrent cell structure with inputs x[t] and h[t-1], tanh and ReLU activations, and an add operation producing the output h[t].]
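
To make this controller-based loop concrete, below is a minimal REINFORCE-style sketch: the controller samples an architecture, a child model with that architecture is (notionally) trained and evaluated, and the validation score is used as the reward for updating the controller. For brevity the RNN controller is replaced by simple per-node logits and train_and_evaluate is a dummy stub; these are assumptions for illustration, not the actual NAS or ENAS implementation.

import torch
import torch.nn as nn

# A toy search space: each of the cell's nodes picks one activation.
ACTIVATIONS = ["tanh", "relu", "sigmoid", "identity"]

class Controller(nn.Module):
    """Samples one activation per node (stand-in for an RNN controller)."""
    def __init__(self, n_nodes=3):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(n_nodes, len(ACTIVATIONS)))

    def sample(self):
        dist = torch.distributions.Categorical(logits=self.logits)
        choices = dist.sample()                    # one index per node
        log_prob = dist.log_prob(choices).sum()
        arch = [ACTIVATIONS[i] for i in choices.tolist()]
        return arch, log_prob

def train_and_evaluate(arch):
    # Placeholder: train a child model with this architecture and return its
    # validation score. Here, a dummy score that simply prefers tanh nodes.
    return sum(a == "tanh" for a in arch) / len(arch)

controller = Controller()
optimizer = torch.optim.Adam(controller.parameters(), lr=0.05)
baseline = 0.0
for step in range(200):
    arch, log_prob = controller.sample()
    reward = train_and_evaluate(arch)
    baseline = 0.95 * baseline + 0.05 * reward     # moving-average baseline
    loss = -(reward - baseline) * log_prob         # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()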
