Improving Differentiable Architecture Search via self-distillation

Xunyu Zhu, Jian Li, Yong Liu, Weiping Wang

doi:10.1016/j.neunet.2023.08.062

Abstract

Differentiable Architecture Search (DARTS) is a simple yet efficient Neural Architecture Search (NAS) method. During the search stage, DARTS trains a supernet by jointly optimizing architecture parameters and network parameters. During the evaluation stage, DARTS discretizes the supernet to derive the optimal architecture based on architecture parameters. However, recent research has shown that during the training process, the supernet tends to converge towards sharp minima rather than flat minima. This is evidenced by the higher sharpness of the loss landscape of the supernet, which ultimately leads to a performance gap between the supernet and the optimal architecture. In this paper, we propose Self-Distillation Differentiable Neural Architecture Search (SD-DARTS) to alleviate the discretization gap. We utilize self-distillation to distill knowledge from previous steps of the supernet to guide its training in the current step, effectively reducing the sharpness of the supernet’s loss and bridging the performance gap between the supernet and the optimal architecture. Furthermore, we introduce the concept of voting teachers, where multiple previous supernets are selected as teachers, and their output probabilities are aggregated through voting to obtain the final teacher prediction. Experimental results on real datasets demonstrate the advantages of our novel self-distillation-based NAS method compared to state-of-the-art alternatives.
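The voting-teacher idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact voting rule or loss, so the code below rests on assumptions: voting is taken to be a plain average of the teachers' softened probabilities, and the training objective is taken to be cross-entropy plus a weighted KL term. The names `vote_teachers`, `sd_loss`, `lam`, and `temperature` are hypothetical and do not come from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def vote_teachers(teacher_logits, temperature=1.0):
    """Aggregate several teacher snapshots into one prediction.

    Assumption: "voting" is modeled as averaging the class probabilities
    of the earlier supernet snapshots (soft voting).
    """
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    count = len(probs)
    return [sum(p[k] for p in probs) / count for k in range(len(probs[0]))]

def cross_entropy(label, probs, eps=1e-12):
    """Cross-entropy of the predicted distribution against an integer label."""
    return -math.log(probs[label] + eps)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def sd_loss(student_logits, teacher_logits, label, lam=0.5, temperature=1.0):
    """Hypothetical self-distillation objective for one sample.

    student_logits: output of the current supernet step.
    teacher_logits: outputs of earlier supernet snapshots on the same input.
    lam: assumed weight of the distillation term.
    """
    hard_term = cross_entropy(label, softmax(student_logits))
    teacher = vote_teachers(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    return hard_term + lam * kl_divergence(teacher, student_soft)

# Example: three earlier snapshots act as teachers for the current step.
current = [2.0, 0.5, -1.0]
snapshots = [[1.8, 0.7, -0.9], [1.5, 0.9, -0.6], [2.1, 0.4, -1.2]]
loss = sd_loss(current, snapshots, label=0, lam=0.5, temperature=2.0)
```

In this sketch the distillation term vanishes when the current supernet agrees with the averaged teachers, so it only penalizes drift away from earlier predictions. That is one plausible way to read the smoothing effect the abstract attributes to self-distillation.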

Journal: Neural Networks
Publication Date: Sep 9, 2023
Citations: 1

Similar Papers

Efficient Neural Architecture Search for End-to-End Speech Recognition Via Straight-Through Gradients
Huahuan Zheng ... Keyu An
19 Jan 2021

Comparative Analysis of Neural Architecture Search Methods for Classification of Cultural Heritage Sites
Sunil V Gurlahosur ... Uma Mudenagudi
01 Jan 2021

Neural architecture search based on dual attention mechanism for image classification.
Cong Jin ... Tianshu Wei
01 Jan 2021
Mathematical biosciences and engineering : MBE | VOL. 20

Automatic Design of CNNs via Differentiable Neural Architecture Search for PolSAR Image Classification
Hongwei Dong ... Siyu Zhang
05 Feb 2020
IEEE Transactions on Geoscience and Remote Sensing | VOL. 58
