A hybrid approach for noise reduction-based optimal classifier using genetic algorithm: A case study in plant disease prediction

Anshul Bhatia,Amit Prakash Singh,Anuradha Chug,Dinesh Singh

doi:10.3233/ida-216011

Abstract

Plant diseases can cause significant losses to agricultural productivity; therefore, their early prediction is much needed. So far, many machine learning-based plant disease prediction models have been recommended, but these models face a problem of noisy class label dataset that degrades the performance. Noisy class label dataset results from the improper assignment of positive class labels into negative class data samples or vice versa. Hence, a precise and noise-free plant disease model is required for a better prediction. The current study proposes noise reduction-based hybridized classifiers for plant disease prediction. One tomato and four soybean disease datasets have been selected to conduct the proposed research. The Adaptive Sampling-based Class Label Noise Reduction (AS-CLNR) method has been used along with the Support Vector Machine (SVM) approach for noise reduction. The noise-minimized datasets have been fed into the Extreme Learning Machine (ELM), Decision Tree (DT), and Random Forest (RF) classifiers whose parameters are optimized using Genetic Algorithm (GA) for developing plant disease prediction models. The performances of all these models viz. Hybrid SVM-GA-ELM, Hybrid SVM-GA-DT, and Hybrid SVM-GA-RF have been evaluated using Accuracy, Area under ROC Curve, and F1-Score metrics. Further, these classifiers have been ranked using the statistical Friedman Test in which the Hybrid SVM-GA-RF classifier performed the best. Lastly, the Nemenyi test has also been performed to find out if significant differences exist between various classifiers or not. It was found that 33.33% of the total pairs of hybrid classifiers show a remarkably different performance from one another.

Full Text