Abstract

An imbalanced dataset is a significant challenge when training a deep neural network (DNN) model for deep learning problems such as weed classification. A model trained on an imbalanced dataset may perform reliably on the majority classes while remaining overly sensitive to the minority classes. This article proposes a yielding multi-fold training (YMufT) strategy to train a DNN model on an imbalanced dataset. The strategy reduces bias in training through a min-class-max-bound procedure (MCMB), which divides the samples in the training set into multiple folds; the model is then trained consecutively on these folds. In practice, we evaluate the proposed strategy on two small (PlantSeedlings, small PlantVillage) and two large (Chonnam National University (CNU), large PlantVillage) weed datasets. With the same training configuration and approximately the same number of training steps as conventional training methods, YMufT helps the DNN model converge faster, thus requiring less training time. Despite a slight decrease in accuracy on the large dataset, YMufT increases the F1 score to 0.9708 with the NASNet model on the CNU dataset and to 0.9928 with the Mobilenet model on the large PlantVillage dataset. YMufT shows outstanding performance in both accuracy and F1 score on the small datasets, with values of (0.9981, 0.9970) using the Mobilenet model on the small PlantVillage dataset and (0.9718, 0.9689) using the Resnet model on the PlantSeedlings dataset. Grad-CAM visualization shows that conventional training methods mainly concentrate on high-level features and may capture insignificant features, whereas YMufT guides the model to capture essential features on the leaf surface and to properly localize the weed targets.
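
The abstract describes MCMB only at a high level. As a reading aid, here is a minimal Python sketch of one plausible min-class-max-bound fold split, in which each fold holds at most as many samples per class as the smallest class contains; the function and variable names (e.g., mcmb_folds) are illustrative assumptions, not the paper's code:

    import random
    from collections import defaultdict

    def mcmb_folds(samples, labels, seed=0):
        """Sketch of a min-class-max-bound split: every fold draws at most
        `bound` samples per class, where `bound` is the smallest class size,
        so each fold is far more balanced than the full training set."""
        rng = random.Random(seed)
        by_class = defaultdict(list)
        for s, y in zip(samples, labels):
            by_class[y].append(s)
        for pool in by_class.values():
            rng.shuffle(pool)

        bound = min(len(pool) for pool in by_class.values())
        # Enough folds so that the largest class is fully covered.
        n_folds = max(-(-len(pool) // bound) for pool in by_class.values())

        folds = [[] for _ in range(n_folds)]
        for y, pool in by_class.items():
            for i in range(n_folds):
                chunk = pool[i * bound:(i + 1) * bound]
                if not chunk:
                    # Smaller classes are re-sampled so every fold sees them.
                    chunk = rng.sample(pool, min(bound, len(pool)))
                folds[i].extend((s, y) for s in chunk)
        return folds

Under this sketch, training would proceed consecutively over the folds, e.g., for fold in mcmb_folds(images, labels): train_one_pass(model, fold), matching the abstract's description of consecutive per-fold training.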

Highlights

  • Using the yielding multi-fold training (YMufT) strategy, both overall model performance and performance on minor species improved significantly compared with models trained using the conventional training method

  • We presented YMufT, a strategy for training DNN models on imbalanced datasets

Introduction

With the application of deep neural network (DNN) models, many computer vision problems have achieved tremendous performance in tasks such as object classification [1], object segmentation [2], object detection [3], and object localization [4]. The success of DNN models may depend on the quality and distribution of the labels in the dataset. As shown in [5,6,7,8,9], DNN models (and machine learning techniques in general) trained on an imbalanced dataset may perform poorly on minority labels. Prior work has noted that large-scale datasets may have long-tailed label distributions, meaning that dataset imbalance can become problematic when samples are collected in a large-scale domain. Other studies have chosen to combine multiple sampling strategies [13,14,16].
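
For context on the sampling strategies cited above, a common building block is inverse-class-frequency oversampling, which draws minority-class samples more often so that each batch is roughly balanced. A minimal PyTorch sketch (balanced_loader is a hypothetical helper, not taken from the cited works):

    import torch
    from torch.utils.data import DataLoader, WeightedRandomSampler

    def balanced_loader(dataset, labels, batch_size=32):
        """Oversample minority classes via inverse-frequency weights so
        that batches are approximately class-balanced."""
        labels = torch.as_tensor(labels)
        class_counts = torch.bincount(labels)
        # Rarer classes get proportionally larger sampling weights.
        sample_weights = (1.0 / class_counts.float())[labels]
        sampler = WeightedRandomSampler(sample_weights,
                                        num_samples=len(labels),
                                        replacement=True)
        return DataLoader(dataset, batch_size=batch_size, sampler=sampler)

Unlike such per-sample resampling, the YMufT strategy described above restructures training itself into more balanced folds.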
