Abstract

Model soups combine multiple models after fine-tuning them with different hyperparameters, selecting them by validation accuracy; the constituent models are trained on the same training and validation splits. In this study, we maximized fine-tuning accuracy while keeping the inference time and memory cost of a single model. We extended model soups by creating k subsets of the training and validation data, in a manner similar to k-fold cross-validation, and training models on these subsets. First, we showed that validation accuracy remains correlated with test accuracy when models are combined such that their training data contain the validation data. We then showed that averaging the k resulting models, after first averaging the models built on each shared training-validation split, yields a single model with high test accuracy. This study provides a method for training models with both high accuracy and reliability on small datasets, such as medical images.
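As a rough illustration of the two-level averaging described above, the sketch below soups the models that share each train/validation split and then soups the k per-fold results. It assumes PyTorch and a user-supplied `fine_tune(train_data, val_data, hp)` routine; `uniform_soup`, `split_into_folds`, and `kfold_soup` are hypothetical names, and the paper's actual recipe (e.g., greedy versus uniform souping) may differ.

```python
import copy
from typing import List

import torch
import torch.nn as nn


def uniform_soup(models: List[nn.Module]) -> nn.Module:
    """Average the weights of same-architecture models into a single model."""
    soup = copy.deepcopy(models[0])
    with torch.no_grad():
        for name, param in soup.named_parameters():
            # Stack the corresponding parameter from every model and average.
            stacked = torch.stack([dict(m.named_parameters())[name] for m in models])
            param.copy_(stacked.mean(dim=0))
    return soup


def split_into_folds(items, k):
    # Simple round-robin k-fold split (illustrative only).
    return [items[i::k] for i in range(k)]


def kfold_soup(dataset, k, hyperparam_grid, fine_tune):
    """Soup per-split models, then soup the k per-fold soups into one model."""
    folds = split_into_folds(dataset, k)
    fold_soups = []
    for i in range(k):
        # Fold i serves as validation; the remaining folds form the training data.
        train_data = [x for j, fold in enumerate(folds) if j != i for x in fold]
        val_data = folds[i]
        # One fine-tuned model per hyperparameter setting on this split.
        models = [fine_tune(train_data, val_data, hp) for hp in hyperparam_grid]
        # First-level soup: models sharing the same train/validation split.
        fold_soups.append(uniform_soup(models))
    # Second-level soup: average the k per-fold soups; the result retains
    # the inference time and memory cost of a single model.
    return uniform_soup(fold_soups)
```

Because only the averaged weights are kept at each level, the final model is deployed and evaluated exactly like any single fine-tuned model.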
