Abstract

The widespread practice of pre-training Convolutional Neural Networks (CNNs) on large natural image datasets such as ImageNet leads networks to automatically learn invariance to variations in object scale. This, however, can be detrimental in medical imaging, where pixel spacing has a known physical correspondence and size is crucial to the diagnosis, for example, the size of lesions, tumors or cell nuclei. In this paper, we use deep learning interpretability to identify at which intermediate layers such invariance is learned. We train and evaluate different regression models on the PASCAL-VOC (Pattern Analysis, Statistical modeling and ComputAtional Learning-Visual Object Classes) annotated data to (i) separate the effects of the closely related yet distinct notions of image size and object scale, (ii) quantify the presence of scale information in the CNN in terms of the layer-wise correlation between input scale and feature maps in InceptionV3 and ResNet50, and (iii) develop a pruning strategy that reduces the invariance of the learned features to object scale. Results indicate that scale information peaks at central CNN layers and drops close to the softmax, where the invariance is reached. Our pruning strategy exploits this to obtain features that preserve scale information. We show that the pruning significantly improves performance on medical tasks where scale is a relevant factor, for example the regression of breast histology image magnification. These results show that the presence of scale information at intermediate layers legitimates transfer learning in applications that require scale covariance rather than invariance, and that performance on these tasks can be improved by pruning off the layers where the invariance is learned. All experiments are performed on publicly available data and the code is available on GitHub.
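
As a concrete illustration of the layer-wise analysis described above, the sketch below pools intermediate ResNet50 activations and fits a ridge regression probe per layer to estimate how much scale information each stage retains. This is a minimal sketch: the choice of layers, the ridge probe, and the train/test split are illustrative assumptions, not the authors' released code.

import torch
from torchvision.models import resnet50, ResNet50_Weights
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1).eval()

# Capture spatially pooled activations at a few representative stages.
feats = {}
def make_hook(name):
    def hook(module, inp, out):
        feats[name] = out.mean(dim=(2, 3)).detach()  # global average pool -> (N, C)
    return hook

for name in ["layer1", "layer2", "layer3", "layer4"]:
    getattr(model, name).register_forward_hook(make_hook(name))

def probe_scale_information(images, scales):
    """Fit a ridge regressor per layer to predict object scale from
    pooled feature maps; returns layer-wise R^2 on a held-out half."""
    with torch.no_grad():
        model(images)  # fills `feats` via the hooks
    scores = {}
    for name, x in feats.items():
        x = x.cpu().numpy()
        n = len(x) // 2  # simple split: first half train, second half test
        reg = Ridge(alpha=1.0).fit(x[:n], scales[:n])
        scores[name] = r2_score(scales[n:], reg.predict(x[n:]))
    return scores

A layer whose probe reaches a high R^2 still carries scale information; in the paper's terms, scores that peak at central layers and drop near the softmax indicate where the invariance is learned.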

Highlights

  • Computer vision algorithms trained on natural images must achieve scale invariance for optimal robustness to viewpoint changes

  • We report the Mean Absolute Error (MAE) across ten repetitions and the relative standard deviation for the prediction of the average area

  • By introducing the corrected Global Average Pooling (GAP), we show that the regression of image scale in noise images is mostly due to padding effects at early convolution layers that encode information about the input size (a minimal sketch of this correction follows this list)
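
The corrected GAP referenced in the last highlight can be sketched as follows, under the assumption that the correction simply excludes a border of padding-affected positions before averaging; the margin width is a hypothetical parameter, not the paper's exact rule.

import torch

def corrected_gap(feature_map: torch.Tensor, margin: int = 2) -> torch.Tensor:
    """Global average pooling over the central region of a feature map,
    discarding a border of `margin` positions whose activations are
    affected by zero padding (and thus encode input size, not content).

    feature_map: (N, C, H, W) tensor; returns (N, C).
    """
    h, w = feature_map.shape[-2:]
    if h <= 2 * margin or w <= 2 * margin:
        # Feature map too small to crop; fall back to plain GAP.
        return feature_map.mean(dim=(2, 3))
    inner = feature_map[..., margin:h - margin, margin:w - margin]
    return inner.mean(dim=(2, 3))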

Introduction

Computer vision algorithms trained on natural images must achieve scale invariance for optimal robustness to viewpoint changes. Convolutional Neural Networks (CNNs) [6,7] achieve state-of-the-art performance in object recognition tasks with scale variations (e.g., ImageNet [8]) by implicitly learning scale invariance even without a pre-defined invariant design [9]. Such invariance, together with other learned features of color, edges and textures [10,11], is transferred to other tasks when pretrained models are used to learn from limited training data [12]. Training from scratch is instead adopted by scale-covariant [4] and multi-scale designs [15,16,17,18].
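
To illustrate the kind of pruning motivated here and in the abstract, one possible setup truncates a pretrained backbone at an intermediate stage, before the invariance is learned, and attaches a small regression head for a scale-sensitive task such as magnification regression. The cut point and head below are illustrative assumptions, not the paper's exact configuration.

import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

def truncated_regressor(cut: str = "layer3") -> nn.Module:
    """Pretrained ResNet50 pruned after `cut`, with a GAP + linear head
    for a scale-sensitive regression target (e.g., image magnification)."""
    backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
    stages = ["conv1", "bn1", "relu", "maxpool",
              "layer1", "layer2", "layer3", "layer4"]
    keep = stages[:stages.index(cut) + 1]
    trunk = nn.Sequential(*[getattr(backbone, s) for s in keep])
    # Output channels of each ResNet50 stage.
    out_channels = {"layer1": 256, "layer2": 512,
                    "layer3": 1024, "layer4": 2048}[cut]
    return nn.Sequential(trunk,
                         nn.AdaptiveAvgPool2d(1),
                         nn.Flatten(),
                         nn.Linear(out_channels, 1))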
