Abstract

Neural architecture search (NAS) has significantly advanced the automatic design of convolutional neural architectures. However, existing NAS methods are difficult to extend directly to attention networks because of the uniform structure of their search spaces and the lack of long-range feature extraction. To address these issues, we construct a hierarchical search space that allows different attention operations to be adopted at different layers of a network. To reduce the complexity of the search, we propose a low-cost search space compression method that automatically removes unpromising candidate operations for each layer. Furthermore, we propose a novel search strategy that combines a self-supervised search with a supervised one to capture long-range and short-range dependencies simultaneously. To verify the effectiveness of the proposed methods, we conduct extensive experiments on various learning tasks, including image classification, fine-grained image recognition, and zero-shot image retrieval. The empirical results provide strong evidence that our method can discover high-performance full-attention architectures while maintaining the required search efficiency.

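The following is a minimal, illustrative sketch (not the authors' implementation) of the three ideas summarized above: a hierarchical per-layer search space of attention operations, a low-cost compression step that drops unpromising candidates for each layer, and a joint objective mixing a supervised and a self-supervised loss. All names here (CANDIDATE_OPS, score_op, compress_search_space, the lambda weighting, etc.) are assumptions introduced for illustration only.

```python
import random

# Hierarchical search space: each layer keeps its own pool of candidate
# attention operations, so different layers may end up with different ops.
CANDIDATE_OPS = ["local_window_attn", "global_attn", "axial_attn", "conv_like_attn"]
NUM_LAYERS = 4
search_space = {layer: list(CANDIDATE_OPS) for layer in range(NUM_LAYERS)}


def score_op(layer, op):
    """Proxy score of a candidate op for a layer.

    Stand-in for the paper's low-cost estimate (e.g. learned architecture
    weights); here it is just a deterministic pseudo-random value.
    """
    rng = random.Random(hash((layer, op)) % 2**32)
    return rng.random()


def compress_search_space(space, keep_k=2):
    """Search space compression: keep only the top-k candidates per layer."""
    compressed = {}
    for layer, ops in space.items():
        ranked = sorted(ops, key=lambda op: score_op(layer, op), reverse=True)
        compressed[layer] = ranked[:keep_k]
    return compressed


def combined_loss(supervised_loss, self_supervised_loss, lam=0.5):
    """Joint objective: a supervised term for short-range cues plus a
    self-supervised term intended to encourage long-range dependencies."""
    return supervised_loss + lam * self_supervised_loss


if __name__ == "__main__":
    compressed = compress_search_space(search_space, keep_k=2)
    for layer, ops in compressed.items():
        print(f"layer {layer}: {ops}")
    print("example objective:", combined_loss(0.8, 0.3, lam=0.5))
```

In this toy version the per-layer candidate pools are pruned independently, which mirrors the hierarchical (layer-wise) nature of the search space described in the abstract; the actual scoring and training procedure in the paper is of course more involved.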