Imbalanced Data Parameter Optimization of Convolutional Neural Networks Based on Analysis of Variance

Ruiao Zou, Nan Wang

doi:10.3390/app14199071

https://doi.org/10.3390/app14199071

Journal: Applied Sciences
Publication Date: Oct 8, 2024
License type: CC BY 4.0

Abstract

Classifying imbalanced data has attracted considerable interest in many scientific domains because accurately categorizing minority class samples has significant practical value. This study primarily uses analysis of variance (ANOVA) to investigate the main and interaction effects of convolutional neural network (CNN) parameters on imbalanced data, with the aim of optimizing those parameters to improve minority class sample recognition. Samples with imbalance ratios of 25:1, 15:1, and 1:1 are extracted from the CIFAR-10 and Fashion-MNIST datasets. To thoroughly assess model performance on imbalanced data, we employ various evaluation metrics, including accuracy, recall, F1 score, P-mean, and G-mean. In highly imbalanced datasets, optimizing the learning rate significantly affects all performance metrics. In moderately imbalanced datasets, the interaction between the learning rate and kernel size significantly affects the recognition of minority class samples. Through parameter optimization, the accuracy of the CNN model on the 25:1 highly imbalanced CIFAR-10 and Fashion-MNIST datasets improves by 14.20% and 5.19%, respectively, compared to the default model, and by 8.21% and 3.87%, respectively, compared to the undersampling model. Other evaluation metrics for minority classes also improve.
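As a rough illustration of why the abstract reports recall, F1 score, and G-mean alongside accuracy, the following minimal sketch builds a 25:1 imbalanced label set and scores a trivial classifier on it. It assumes a binary setting with the minority class as the positive class and the common definition of G-mean as the square root of recall times specificity; the abstract does not give the paper's exact sampling procedure or metric definitions, so the helper names `subsample_to_ratio` and `minority_metrics` are hypothetical and not taken from the paper. P-mean is left out because its definition is not stated in the abstract.

```python
import math
import random


def subsample_to_ratio(labels, minority_label, ratio, seed=0):
    """Return sorted indices that keep every majority sample and only enough
    minority samples to give roughly ratio:1 (majority:minority).
    Assumption: binary labels; this is not the paper's sampling procedure."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    keep = min(len(minority), max(1, len(majority) // ratio))
    return sorted(majority + rng.sample(minority, keep))


def minority_metrics(y_true, y_pred, minority_label=1):
    """Accuracy, recall, precision, F1 and G-mean, treating the minority
    class as positive. Undefined ratios (zero denominators) are reported as 0.0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == minority_label and p == minority_label for t, p in pairs)
    fn = sum(t == minority_label and p != minority_label for t, p in pairs)
    fp = sum(t != minority_label and p == minority_label for t, p in pairs)
    tn = sum(t != minority_label and p != minority_label for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "g_mean": math.sqrt(recall * specificity),
    }


# Toy data: 50 majority samples (label 0) and 10 minority samples (label 1).
labels = [0] * 50 + [1] * 10
kept = subsample_to_ratio(labels, minority_label=1, ratio=25)
y_true = [labels[i] for i in kept]

# A classifier that always predicts the majority class.
y_pred = [0] * len(y_true)
print(minority_metrics(y_true, y_pred))
```

On this toy data the always-majority classifier reaches high accuracy while its recall and G-mean are zero, which is the kind of failure that minority-focused metrics are meant to expose.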

Full Text