Abstract

Deep neural networks are the state of the art in a large number of machine learning challenges. However, to reach the best performance they require a huge number of parameters. Indeed, typical deep convolutional architectures present an increasing number of feature maps as we go deeper in the network, while the spatial resolution of the inputs is reduced through downsampling operations. This means that most of the parameters lie in the final layers, while a large portion of the computations is performed by a small fraction of the total parameters in the first layers. In an effort to use every parameter of a network to its fullest, we propose a new convolutional neural network architecture, called ThriftyNet. In ThriftyNet, only one convolutional layer is defined and used recursively, leading to a maximal parameter factorization. In addition, normalization, non-linearities, downsamplings and shortcuts ensure sufficient expressivity of the model. ThriftyNet achieves competitive performance on a tiny parameter budget, exceeding 91% accuracy on CIFAR-10 with fewer than 40 k parameters in total, 74.3% on CIFAR-100 with fewer than 600 k parameters, and 67.1% on ImageNet ILSVRC 2012 with no more than 4.15 M parameters. However, the proposed method typically requires more computations than existing counterparts.
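Below is a minimal sketch of the kind of recursive architecture described above, assuming a PyTorch-style implementation. The class name, filter count, number of iterations, shortcut mixing and downsampling schedule are illustrative assumptions, not the authors' exact configuration; the sketch only shows how a single convolution can be reused at every iteration while per-iteration normalization, a non-linearity, a shortcut and occasional pooling provide expressivity.

# Minimal sketch of a ThriftyNet-style model (illustrative, not the authors' code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ThriftySketch(nn.Module):
    def __init__(self, n_filters=128, n_iters=15, n_classes=10, downsample_at=(5, 10)):
        super().__init__()
        self.embed = nn.Conv2d(3, n_filters, kernel_size=3, padding=1)          # project input to n_filters channels
        self.conv = nn.Conv2d(n_filters, n_filters, kernel_size=3, padding=1)   # the ONE shared convolution
        self.bns = nn.ModuleList(nn.BatchNorm2d(n_filters) for _ in range(n_iters))  # per-iteration normalization
        self.alpha = nn.Parameter(torch.full((n_iters,), 0.5))                  # shortcut mixing coefficients
        self.fc = nn.Linear(n_filters, n_classes)
        self.n_iters = n_iters
        self.downsample_at = set(downsample_at)

    def forward(self, x):
        h = self.embed(x)
        for t in range(self.n_iters):
            # the same convolutional parameters are applied at every iteration (maximal factorization)
            h = self.alpha[t] * self.bns[t](F.relu(self.conv(h))) + (1 - self.alpha[t]) * h
            if t in self.downsample_at:
                h = F.max_pool2d(h, 2)                                           # reduce spatial resolution
        h = F.adaptive_avg_pool2d(h, 1).flatten(1)                               # global average pooling
        return self.fc(h)

# Usage (hypothetical): model = ThriftySketch(); logits = model(torch.randn(8, 3, 32, 32))

In such a sketch the parameter budget is dominated by the single shared convolution, regardless of how many iterations are performed, which is what keeps the total count so low.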

Highlights

  • Distillation techniques consist of training a deep neural network, termed the ‘student’, to reproduce the outputs of another model, termed the ‘teacher’, with the student typically being smaller than the teacher (a generic sketch of this objective follows this list)

  • In an effort to reduce the number of parameters in deep convolutional neural networks, it is common to target the deepest layers first

  • One could believe that ThriftyNets are unlikely to reach top performance, as generic deep neural networks are believed to produce increasingly abstract features as we go deeper in their architectures, whereas ThriftyNets use the same features at every depth
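For context on the first highlight, here is a generic sketch of a knowledge-distillation objective, written in PyTorch; the temperature and weighting values are arbitrary assumptions and are not specific to this paper.

# Generic knowledge-distillation loss sketch (illustrative; not specific to ThriftyNet).
# The student is trained to match the teacher's softened outputs in addition to the true labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # soft targets: KL divergence between temperature-scaled output distributions
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # hard targets: usual cross-entropy with the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard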


Summary

Introduction

We focus on reducing the number of parameters of architectures, which is usually strongly connected to the memory usage of the model. In this area, factorizing methods, which identify similar sets of parameters and merge them [4], are effective, in that they considerably reduce the number of parameters while maintaining the same global structure and number of FLOPs. We propose to introduce a new factorized deep learning model, in which the factorization is not learned during training, but rather imposed at the creation of the model. We call these models ThriftyNets, as they typically contain a very constrained number of parameters, while achieving top-tier results on standard classification vision datasets.
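As a toy illustration of how imposing the factorization at creation constrains the parameter budget, the following sketch compares an independent stack of convolutions with a single shared convolution reused the same number of times; the filter count and depth are arbitrary assumptions chosen only for the example.

# Parameter budget of independent convolution layers vs. a single shared (factorized-at-creation) one.
import torch.nn as nn

def count_params(module):
    return sum(p.numel() for p in module.parameters())

n_filters, depth = 128, 15

# standard approach: one independent 3x3 convolution per layer
stack = nn.Sequential(*[nn.Conv2d(n_filters, n_filters, 3, padding=1) for _ in range(depth)])

# factorization imposed at creation: a single 3x3 convolution reused at every iteration
shared = nn.Conv2d(n_filters, n_filters, 3, padding=1)

print(f"independent layers: {count_params(stack):,} parameters")   # ~2.2 M
print(f"shared layer:       {count_params(shared):,} parameters")  # ~148 k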

Related Work
Pruning
Quantization
Distillation
Efficient Scaling
Factorization
Recurrent Residual Networks as ODE
Context
Thrifty Networks
Augmented Thrifty Networks
Pooling Strategy
Grouped Convolutions
Hyperparameters and Size of the Model
Depth and Abstraction
Experiments
Impact of Data Augmentation
Comparison with Standard Architectures
Factorization and Filter Usage
Efficient ThriftyNets
Effect of the Number of Iterations
Effect of the Number of Filters
Effect of the Number of Downsamplings
Freezing the Shortcut Parameters in an Augmented ThriftyNet
Findings
Conclusions
