Abstract

In this study, a Convolutional Neural Network (CNN) was optimized on the Fruits 360 dataset, with particular emphasis on the effects of the Spatial Transformer Network (STN) and Stochastic Gradient Descent (SGD) optimization methods. First, a baseline CNN model was built, achieving 97.84% accuracy with a loss of 0.0999 after 50 epochs. The impact of integrating STN and SGD into the CNN was then investigated separately. Adding an STN slightly increased the accuracy to 97.92% and reduced the loss to 0.0994, but lowered the validation accuracy, suggesting that while the STN enhances the model's generalization ability, it may slightly reduce the maximum accuracy achievable on the validation set. With SGD optimization, the accuracy rose to 98.19%, the loss fell to 0.0537, and the validation accuracy reached 98.40%. These results highlight the effectiveness of SGD in fine-tuning model parameters, yielding a more accurate model with improved generalization. A comparative analysis underscores the respective advantages of the two methods: the STN's strength lies in improving generalization and mitigating overfitting, which is particularly valuable when robustness to varied data is required, whereas SGD stands out for substantially improving accuracy and reducing loss, making it a balanced choice for overall model optimization. Future work includes exploring these optimization techniques on other datasets and investigating whether combining STN and SGD can yield further performance gains in CNN models.
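
As a rough illustration of the kind of pipeline the abstract describes, the sketch below shows a small CNN with an optional STN front-end trained with SGD. The framework (PyTorch), layer sizes, class count, and hyperparameters are all assumptions for illustration; the paper does not specify its implementation details, and this is not the authors' model.

```python
# Hypothetical sketch: CNN with an optional Spatial Transformer front-end,
# trained with SGD. Input shape (3, 100, 100) matches Fruits 360 images;
# the 131-class count and all hyperparameters are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class STN(nn.Module):
    """Spatial Transformer: predicts an affine warp and resamples the input."""
    def __init__(self):
        super().__init__()
        self.localization = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=7), nn.MaxPool2d(2), nn.ReLU(True),
            nn.Conv2d(8, 10, kernel_size=5), nn.MaxPool2d(2), nn.ReLU(True),
        )
        # 100x100 input -> 47x47 after the first conv+pool -> 21x21 after the second
        self.fc_loc = nn.Sequential(
            nn.Linear(10 * 21 * 21, 32), nn.ReLU(True), nn.Linear(32, 6)
        )
        # Initialise the regressor to the identity transform so training starts unwarped.
        self.fc_loc[2].weight.data.zero_()
        self.fc_loc[2].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float)
        )

    def forward(self, x):
        theta = self.fc_loc(self.localization(x).flatten(1)).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)


class FruitCNN(nn.Module):
    def __init__(self, num_classes=131, use_stn=True):
        super().__init__()
        self.stn = STN() if use_stn else nn.Identity()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(True), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(True), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 25 * 25, num_classes)

    def forward(self, x):
        x = self.stn(x)           # optionally rectify the input geometry
        x = self.features(x)      # convolutional feature extraction
        return self.classifier(x.flatten(1))


model = FruitCNN()
# SGD with momentum; the learning rate is illustrative, not the paper's setting.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on random tensors standing in for a batch.
images, labels = torch.randn(8, 3, 100, 100), torch.randint(0, 131, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

Setting `use_stn=False` and swapping the optimizer would reproduce the kind of baseline-versus-variant comparison the study reports, with the actual architectures and training schedules taken from the full text.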
