Abstract

Data mining techniques for classification are now widely used in agriculture. Seed classification groups seed varieties according to their morphological characteristics. We used Weka Explorer to classify wheat varieties typical of the Charotar region (generally comprising the Anand and Kheda districts of Gujarat State, India): the Gujarat Wheat (GW) varieties GW 273, GW 496, GW 322, and LOK-1 (Triticum aestivum), and GDW 1255 (Triticum durum). The features used are the area, perimeter, solidity, aspect ratio, and major and minor axes of the seed kernel, together with Hue, Saturation, Value, and SF1 (empirical). Feature reduction was performed using Information Gain (IG) and its modified version, Gain Ratio (GR). This paper compares the performance of tree-based data mining algorithms in classifying these wheat varieties. For classification we used purely tree-based machine learning algorithms: J48, Random Forest, Hoeffding Tree, Logistic Model Tree (LMT), and REPTree. LMT, which combines tree induction with logistic regression, achieved the highest accuracy (96.4%) among the classifiers. The Hoeffding Tree classifier ranked second with 96% accuracy. Results were validated with 10-fold cross-validation. Reducing the number of folds lowered the performance of most algorithms, with J48 the exception. When features were selected by GR, the percentage of correctly classified instances increased for all algorithms except J48.
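
To illustrate the two feature-ranking criteria named above, the following is a minimal sketch of how Information Gain and Gain Ratio can be computed for a single feature. It assumes the feature has already been discretized into bins; the function names (`entropy`, `info_gain_and_ratio`) and the toy values are hypothetical and are not taken from the study's data or from Weka's implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def info_gain_and_ratio(feature_values, labels):
    """Return (information gain, gain ratio) of a discrete feature.

    IG = H(class) - H(class | feature)
    GR = IG / H(feature), where H(feature) is the split information.
    """
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A feature with a single value carries no split information.
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Hypothetical toy data: four kernels, two varieties, two binned features.
labels = ["GW 273", "GW 273", "GW 496", "GW 496"]
area_bin = ["small", "small", "large", "large"]  # separates the varieties
hue_bin = ["low", "high", "low", "high"]         # unrelated to the variety

area_gain, area_ratio = info_gain_and_ratio(area_bin, labels)
hue_gain, hue_ratio = info_gain_and_ratio(hue_bin, labels)
print(area_gain, area_ratio)
print(hue_gain, hue_ratio)
```

In this toy case the binned area feature separates the two varieties perfectly and the binned hue feature carries no information about them, so a ranking by either criterion would keep area and drop hue. Gain Ratio divides the gain by the feature's own entropy, which penalizes features that split the data into many small bins.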
