Abstract

Classification algorithms have shown exceptional prediction results in supervised learning. However, they are not always effective on real-life datasets, because datasets for real-life applications generally have imbalanced class distributions. Several methods have been proposed to solve the class imbalance problem. In this paper, we propose a hybrid method that combines preprocessing techniques with ensemble learning. The original training set is undersampled by evaluating the samples with a stochastic measurement (SM) and then training a Multilayer Perceptron on the selected samples, which yields a balanced training set. The MLPUS (Multilayer Perceptron undersampling) balanced training set is then aggregated using the bagging ensemble method. We applied our method to the real-life Niger_Rice dataset and to forty-four other imbalanced datasets from the KEEL repository. We also compared our method with six existing methods from the literature: the MLP classifier on the original imbalanced dataset, MLPUS, UnderBagging (random undersampling combined with bagging), RUSBoost, SMOTEBagging (Synthetic Minority Oversampling Technique combined with bagging), and SMOTEBoost. The results show that our method is competitive with the other methods. On the real-life Niger_Rice dataset, our proposed method achieves 75.6, 0.73, 0.76, and 0.86 for accuracy, F-measure, G-mean, and ROC, respectively. In contrast, the MLP classifier on the original imbalanced Niger_Rice dataset achieves 72.44, 0.82, 0.59, and 0.76 for the same measures.
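The undersample-then-bag pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the SM-based sample selection and the MLP base learner are not specified here, so the sketch assumes plain random undersampling as a stand-in and leaves the base classifier out. The function names `undersample_balanced`, `bootstrap_bags`, and `majority_vote` are hypothetical.

```python
import random
from collections import Counter


def undersample_balanced(samples, labels, rng):
    """Randomly keep as many samples per class as the smallest class has.

    Assumption: random selection replaces the SM-based ranking used by
    MLPUS, which is not described in this section.
    """
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(xs) for xs in by_class.values())
    balanced = []
    for y, xs in by_class.items():
        # rng.sample draws without replacement
        balanced.extend((x, y) for x in rng.sample(xs, n_min))
    return balanced


def bootstrap_bags(balanced, n_bags, rng):
    """Draw n_bags bootstrap replicates (sampling with replacement)."""
    return [[rng.choice(balanced) for _ in balanced] for _ in range(n_bags)]


def majority_vote(predictions):
    """Combine one predicted label per bag into a single label."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data: 10 minority samples (label 1) and 90 majority samples (label 0).
rng = random.Random(0)
samples = list(range(100))
labels = [1] * 10 + [0] * 90
balanced = undersample_balanced(samples, labels, rng)
bags = bootstrap_bags(balanced, n_bags=5, rng=rng)
```

In a full pipeline, one base classifier would be trained per bag and `majority_vote` would combine their predictions for each test sample.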

Highlights

  • Demographic growth in West Africa in general, and in Mali in particular, calls for abundant agricultural production

  • We study the prediction of rice production using climate data in Mali in the irrigated area called the Niger office

  • The experimental results of our method and of six other methods on the 45 imbalanced datasets of Table 2 are summarized


Summary

Introduction

Demographic growth in West Africa in general, and in Mali in particular, calls for abundant agricultural production. Exploring machine learning technologies to predict agricultural production is an exciting challenge in this climatically unstable region [2]. We study the prediction of rice production from climate data in the irrigated area of Mali called the Niger office, using classification algorithms. Classification methods are usually applied to qualitative problems, whereas rice production is quantitative. We adopt a classification formulation because the Niger office company uses a threshold of 6.2 tonnes per hectare to qualify rice production as good or bad. Once constructed, our real-life Niger_Rice dataset, which qualifies rice production using climatic data, turns out to be imbalanced according to [5].
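As a small illustration of how a quantitative yield becomes a classification target, the sketch below applies the 6.2 threshold quoted above and measures how imbalanced the resulting labels are. The yield values are invented for the example, the names `label_yield` and `imbalance_ratio` are hypothetical, and treating a yield exactly at the threshold as "good" is an assumption, since the text does not say which side the boundary falls on.

```python
from collections import Counter

THRESHOLD_T_PER_HA = 6.2  # threshold quoted in the text


def label_yield(yield_t_per_ha, threshold=THRESHOLD_T_PER_HA):
    """Qualify a yield as 'good' or 'bad'.

    Assumption: a yield equal to the threshold counts as 'good'.
    """
    return "good" if yield_t_per_ha >= threshold else "bad"


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Invented yields in tonnes per hectare, for illustration only.
yields = [5.1, 5.8, 6.0, 6.5, 4.9, 5.5, 7.0, 5.9]
labels = [label_yield(v) for v in yields]
ratio = imbalance_ratio(labels)
```

A ratio well above 1 signals that one class dominates, which is the situation that undersampling is meant to correct.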
