Abstract

Batch size is an essential hyper-parameter in deep learning. Different batch sizes can lead to different training and testing accuracies and different runtimes, so choosing an appropriate batch size is crucial when training a neural network. The purpose of this paper is to identify a suitable range of batch sizes for convolutional neural networks. The study is conducted by varying the batch size and observing its influence when training convolutional neural networks on commonly used datasets (MNIST, Fashion-MNIST, and CIFAR-10). The experimental results suggest that the most accurate models are most likely obtained with mini-batch sizes between 16 and 64. In addition, the experiments examine how the dataset size, the network depth, and whether the batch size is a power of 2 affect these conclusions. Therefore, when training a CNN, one could start with a batch size of 32 and decrease it for accuracy or increase it for efficiency.
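The mini-batch sweep described above can be sketched in a minimal, self-contained way. This is not the paper's code: it uses a toy logistic-regression task in NumPy as a stand-in for the CNNs, but the batch-size loop structure (shuffle, slice into mini-batches, one gradient step per batch) is the same idea the experiments vary. All names, the learning rate, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def train_logreg(X, y, batch_size, epochs=20, lr=0.1, seed=0):
    """Mini-batch SGD on logistic regression (toy stand-in for a CNN)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        idx = rng.permutation(n)                 # reshuffle each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            z = X[batch] @ w + b
            p = 1.0 / (1.0 + np.exp(-z))         # sigmoid
            grad = p - y[batch]                  # dL/dz for log loss
            w -= lr * X[batch].T @ grad / len(batch)
            b -= lr * grad.mean()
    return w, b

# Toy linearly separable data (assumed, not from the paper's datasets).
rng = np.random.default_rng(42)
X = rng.normal(size=(512, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Sweep the batch-size range the paper recommends.
for bs in (16, 32, 64):
    w, b = train_logreg(X, y, batch_size=bs)
    acc = (((X @ w + b) > 0).astype(float) == y).mean()
    print(f"batch_size={bs}  train_acc={acc:.3f}")
```

The same sweep pattern applies unchanged to a real CNN framework: only the model and gradient step change, while the batch-size loop and the accuracy comparison stay the same.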
