Abstract

As the performance and popularity of deep neural networks have increased, so too has their computational cost. There are many effective techniques for reducing a network's computational footprint (quantisation, pruning, knowledge distillation), but these produce models whose computational cost is the same regardless of their input. Human reaction times vary with the complexity of the tasks we perform: easier tasks (e.g. telling dogs apart from boats) are completed much faster than harder ones (e.g. telling apart two similar-looking breeds of dog). Motivated by this observation, we develop a method for adaptive network complexity by attaching a small classification layer, which we call SideNet, to a large pretrained network, which we call MainNet. Given an input, the SideNet returns a classification if its confidence, obtained via softmax, exceeds a user-specified threshold, and passes the input along to the large MainNet for further processing only if its confidence is too low. This allows us to flexibly trade off the network's performance against its computational cost. Experimental results show that simple single-hidden-layer perceptron SideNets added to pretrained ResNet and BERT MainNets allow for substantial decreases in compute with minimal drops in performance on image and text classification tasks.
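The early-exit mechanism described above can be sketched in a few lines of PyTorch. The code below is not the authors' implementation: it assumes a frozen torchvision ResNet-50 as the MainNet, attaches a hypothetical single-hidden-layer SideNet after `layer2`, and uses an illustrative confidence threshold of 0.9; the split point, hidden size, and threshold are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights


class EarlyExitResNet(nn.Module):
    """Sketch of a confidence-thresholded early exit (SideNet + MainNet).

    A small single-hidden-layer perceptron (the SideNet) reads pooled
    intermediate features of a pretrained ResNet-50 (the MainNet). At
    inference time, the SideNet's softmax confidence decides whether to
    return its own prediction or to continue through the rest of the MainNet.
    Split point, hidden size, and threshold are illustrative assumptions.
    """

    def __init__(self, num_classes: int, hidden_dim: int = 512, threshold: float = 0.9):
        super().__init__()
        self.threshold = threshold
        backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
        # Split the MainNet at an intermediate block (here: after layer2).
        self.early_layers = nn.Sequential(
            backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool,
            backbone.layer1, backbone.layer2,
        )
        self.late_layers = nn.Sequential(backbone.layer3, backbone.layer4)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # SideNet: a single-hidden-layer MLP on pooled intermediate features
        # (layer2 of ResNet-50 outputs 512 channels).
        self.sidenet = nn.Sequential(
            nn.Linear(512, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, num_classes)
        )
        # MainNet classification head on the final 2048-channel features.
        self.main_head = nn.Linear(2048, num_classes)

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.early_layers(x)
        side_logits = self.sidenet(self.pool(feats).flatten(1))
        confidence = F.softmax(side_logits, dim=1).max(dim=1).values
        # For simplicity, exit early only when every sample in the batch is confident.
        if bool((confidence >= self.threshold).all()):
            return side_logits  # early exit: the late layers are never run
        main_feats = self.pool(self.late_layers(feats)).flatten(1)
        return self.main_head(main_feats)  # fall back to the full MainNet
```

In this sketch, lowering the threshold routes more inputs through the cheap SideNet (less compute, potentially lower accuracy), while raising it sends more inputs through the full MainNet, which is how the performance/compute trade-off is exposed to the user.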
