Abstract

This study explores the performance of several deep learning models, including ResNet152, VGG16, VGG19, and ResNet256, on the dog breed classification task. During training, we observe the loss and accuracy trends: the loss gradually decreases, showing that the model fits the training data better, and as model capacity improves, accuracy increases steadily. These models converge after about 20 epochs and fluctuate little thereafter. The initial learning rate, adjustment factor, and patience parameters play key roles in the convergence process. However, the achieved accuracy remains below 90%, suggesting that further optimization or more complex architectures may be beneficial. Among all models, ResNet512 has the highest overall accuracy (83%), followed by ResNet256 (83%), VGG19 256 (79%), and VGG16 256 (78%). The ResNet models outperform the VGG models in most cases, probably because their network structure reduces computational complexity while maintaining accuracy. Increasing the input size can improve the accuracy of the same network structure (for example, from ResNet 256 to ResNet 512), as can modifying the network structure by adding more layers. Learning rate decay scheduling methods, such as ReduceLROnPlateau and CosineAnnealingLR, and optimizers such as SGD, Adam, and Adagrad, are explored as well.
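The abstract does not include the authors' training code, but the setup it describes (a ResNet backbone fine-tuned for breed classification, an SGD-style optimizer, and a ReduceLROnPlateau scheduler governed by an initial learning rate, reduction factor, and patience) can be sketched in PyTorch as follows. This is a minimal illustrative sketch, not the paper's implementation; the class count, learning rate, factor, and patience values are assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models
from torch.optim import SGD
from torch.optim.lr_scheduler import ReduceLROnPlateau

# Assumed number of dog breeds (e.g. 120 for Stanford Dogs); the paper's dataset is not specified here.
NUM_CLASSES = 120

# Load a ResNet backbone and replace the classifier head for breed classification.
model = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Optimizer and learning-rate scheduler; the initial learning rate, reduction
# factor, and patience values below are illustrative, not the paper's settings.
optimizer = SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=5)

criterion = nn.CrossEntropyLoss()

def train_one_epoch(loader, device="cuda"):
    """Run one training epoch and return the mean training loss."""
    model.to(device).train()
    running_loss = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    return running_loss / len(loader)

# After each epoch, step the scheduler on the validation loss so the learning
# rate is reduced when the loss plateaus, roughly the convergence behaviour
# the abstract describes (convergence after ~20 epochs with little fluctuation):
#   val_loss = evaluate(model, val_loader)   # hypothetical evaluation helper
#   scheduler.step(val_loss)
```

Swapping the scheduler for CosineAnnealingLR, or the optimizer for Adam or Adagrad, only changes the two constructor calls above, which is how the abstract's comparison of schedulers and optimizers would typically be organised.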
