Selection of Important Features for Optimizing Crop Yield Prediction

Maya Gopal P S, Bhargavi R

doi:10.4018/ijaeis.2019070104

Abstract

In agriculture, crop yield prediction is critical. Crop yield depends on various features, including geographic, climatic and biological factors. This research article discusses five Feature Selection (FS) algorithms: Sequential Forward FS, Sequential Backward Elimination FS, Correlation-based FS, Random Forest Variable Importance and the Variance Inflation Factor algorithm. The data used for the analysis was drawn from secondary sources of the Tamil Nadu state Agriculture Department and covers a period of 30 years. 75% of the data was used for training and 25% for testing. The performance of the feature selection algorithms is evaluated with Multiple Linear Regression (MLR). The RMSE, MAE, R and RRMSE metrics are calculated for each feature selection algorithm. The adjusted R² was used to find the optimum feature subset. The time complexity of the algorithms was also considered. The selected features are applied to MLR, an Artificial Neural Network and M5Prime. MLR gives 85% accuracy using the features selected by the SFFS algorithm.
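The abstract names its evaluation metrics without defining them. The sketch below shows one common way to compute them, and it rests on assumptions not confirmed by the source: R is read as the Pearson correlation between observed and predicted yield, RRMSE as RMSE divided by the mean observed value, and adjusted R² as the usual penalised form. The function names are hypothetical.

```python
import math

def regression_metrics(actual, predicted):
    """Return RMSE, MAE, R and RRMSE for paired observed/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Assumption: R is the Pearson correlation coefficient.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)
    # Assumption: RRMSE is RMSE relative to the mean observed value.
    rrmse = rmse / mean_a
    return {"rmse": rmse, "mae": mae, "r": r, "rrmse": rrmse}

def adjusted_r2(r2, n_samples, n_features):
    """Penalise R^2 for the number of features used in the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)
```

Because adjusted R² only rises when a feature improves the fit by more than chance would suggest, it is a natural criterion for comparing feature subsets of different sizes.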

Highlights

Data mining is a process of discovering previously unknown and potentially interesting patterns in large datasets (Frawley et al., 1991).
Our study investigates the behaviour of five feature selection algorithms with sixteen features; the outcome is given as input to a multiple linear regression model, an artificial neural network and M5Prime to find the accuracy.
The Akaike Information Criterion (AIC) value is calculated using the formula $\mathrm{AIC} = N \ln(SS_{error}/N) + 2K$, where N is the number of observations and K is the number of parameters + 1.
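As a minimal sketch of the AIC highlight, the function below reads the formula in its standard least-squares form, AIC = N ln(SS_error / N) + 2K, where SS_error is the residual sum of squares. That reading is an assumption, since the formula is garbled on the source page. The function name and arguments are hypothetical.

```python
import math

def aic_least_squares(residuals, n_parameters):
    """AIC for a least-squares fit, given residuals and fitted parameter count."""
    n = len(residuals)                      # N: number of observations
    ss_error = sum(r * r for r in residuals)
    k = n_parameters + 1                    # K: number of parameters + 1
    return n * math.log(ss_error / n) + 2 * k
```

A lower AIC indicates a better trade-off between fit and model size, so candidate feature subsets can be ranked by this value.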

Summary

INTRODUCTION & RELATED WORK

Data mining is a process of discovering previously unknown and potentially interesting patterns in large datasets (Frawley et al., 1991). Feature selection optimizes the performance of the data mining algorithm and makes it easier for the analyst to interpret the outcome of the modeling. This procedure can reduce the cost of recognition by reducing the number of features to be collected, and in some cases it can provide better classification or prediction accuracy due to finite sample size effects. High-ranked features exceeding a threshold value were selected. They evaluated their system using the knowledge discovery data dataset and the Naïve Bayes algorithm. The aim of this research work is to identify important paddy field conditions (features) using feature selection algorithms, to provide a comprehensive view of paddy crop yield. Our study investigates the behaviour of five feature selection algorithms with sixteen features; the outcome is given as input to a multiple linear regression model, an artificial neural network and M5Prime to find the accuracy.
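To make the wrapper idea concrete, here is a minimal sketch of sequential forward selection scored by the adjusted R² of a multiple linear regression. It is an illustration under stated assumptions, not the authors' implementation: the fit uses plain normal equations, the score is computed on the same data used for fitting, and the data set is synthetic and invented for this example. All function and variable names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def adjusted_r2_of_fit(rows, y, cols):
    """Fit least squares with an intercept on `cols`; return adjusted R^2."""
    x = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(p)]
    beta = solve(xtx, xty)  # assumes the chosen features are not collinear
    pred = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - q) ** 2 for t, q in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    n, k = len(y), len(cols)  # requires n > k + 1
    return 1 - (ss_res / ss_tot) * (n - 1) / (n - k - 1)

def forward_select(rows, y):
    """Greedily add the feature that most improves adjusted R^2."""
    selected, best = [], float("-inf")
    remaining = list(range(len(rows[0])))
    while remaining:
        score, col = max(
            (adjusted_r2_of_fit(rows, y, selected + [c]), c) for c in remaining
        )
        if score <= best:
            break  # no remaining feature improves the criterion
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best

# Synthetic data: y depends on columns 0 and 2; column 1 is a distractor.
x0 = [1, 2, 3, 4, 5, 6, 7, 8]
x1 = [5, 3, 8, 1, 9, 2, 7, 4]
x2 = [1, 0, 1, 0, 1, 0, 1, 0]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
rows = [list(t) for t in zip(x0, x1, x2)]
y = [2 * a + 3 * c + e for a, c, e in zip(x0, x2, noise)]
selected, best = forward_select(rows, y)
print(selected, round(best, 4))
```

Sequential backward elimination follows the same pattern in reverse: start from all features and repeatedly drop the one whose removal most improves the criterion.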

DATA SOURCE

DATA PRE-PROCESSING

Feature Selection and Evaluation

Sequential Forward Feature Selection Algorithm

Sequential Backward Elimination Feature Selection Algorithm

Correlation Based Feature Selection Algorithm

Variance Inflation Factor

Random Forest Variable Importance

Multiple Linear Regression Model

MLR Model for Crop yield Prediction

M5 Prime

Accuracy Metrics

RESULTS AND DISCUSSIONS

Selection Procedure

Selection procedure

MODEL VALIDATION

CONCLUSION

Journal: International Journal of Agricultural and Environmental Information Systems	Publication Date: Jul 1, 2019
Citations: 15	License type: other-oa

Similar Papers

A hybrid data mining model of feature selection algorithms and ensemble learning classifiers for credit scoring
Fatemeh Nemati Koutanaei ... Mohammad Khanbabaei
Journal of Retailing and Consumer Services | VOL. 27 | 16 Jul 2015

Optimum Feature Subset for Optimizing Crop Yield Prediction Using Filter and Wrapper Approaches
P S Maya Gopal ... R Bhargavi
Applied Engineering in Agriculture | VOL. 35 | 01 Jan 2019

Survey on Novel Approach for Crop Yield Prediction Using Machine Learning
Aditya Kamble ... Poonam Hake
International Journal for Research in Applied Science and Engineering Technology | VOL. 11 | 28 Feb 2023

Application of multi-class support vector machines for power system on-line static security assessment using DT - based feature and data selection algorithms
M Mohammadi ... G.B Gharehpetian
Journal of Intelligent & Fuzzy Systems | VOL. 20 | 01 Jan 2009
