Adulteration is a significant concern throughout agriculture, from seed selection to final agricultural outputs. The practice involves substitution not only between distinct plant species but also between varieties within the same species, as observed in chili cultivation. To address this issue, the current study develops a machine learning-based classification model that discriminates cayenne pepper genotypes of IPB varieties from other varieties. The model uses plant physiological parameters monitored from the growth phase through harvest: plant and dichotomous heights, age at flowering, age at harvest, fruit stalk length, chili dimensions, individual chili weight, total number of chilies per plant, and overall yield productivity. Using a dataset of 45 plant samples (30 from IPB varieties and 15 from non-IPB varieties), this research evaluates four machine learning algorithms: Linear Discriminant Analysis (LDA), k-Nearest Neighbors (kNN), Decision Tree (DT), and Random Forest (RF). To assess the robustness of the proposed model, the study also investigates the effect of varying data split ratios (90/10, 75/25, and 50/50) on model performance. Preliminary results indicate that the DT classifier differentiates IPB cayenne pepper genotypes from other varieties with an accuracy exceeding 80%. This conceptual model warrants further validation and refinement on a larger dataset before it can be applied in practice to verify the authenticity of agricultural produce.
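The evaluation protocol described in the abstract (four classifiers compared across three data split ratios) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature matrix is synthetic placeholder data with the same class sizes (30 IPB, 15 non-IPB), the feature count of 10 is a guess, the hyperparameters and the use of stratified splitting are arbitrary choices, and the function name `evaluate_splits` is hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic stand-in for the real measurements, which are not
# given in the abstract. Only the class sizes (30 IPB, 15 non-IPB) match.
rng = np.random.default_rng(0)
N_FEATURES = 10  # placeholder; the actual number of parameters may differ
X = np.vstack([
    rng.normal(0.0, 1.0, size=(30, N_FEATURES)),  # "IPB" samples
    rng.normal(0.7, 1.0, size=(15, N_FEATURES)),  # "non-IPB" samples
])
y = np.array([1] * 30 + [0] * 15)  # 1 = IPB, 0 = non-IPB


def evaluate_splits(X, y, test_sizes=(0.10, 0.25, 0.50), seed=0):
    """Return {(test_size, model_name): test accuracy} for each combination."""
    # ASSUMPTION: hyperparameters are illustrative defaults, not the paper's.
    models = {
        "LDA": LinearDiscriminantAnalysis(),
        "kNN": KNeighborsClassifier(n_neighbors=3),
        "DT": DecisionTreeClassifier(random_state=seed),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for test_size in test_sizes:
        # Stratify so both classes appear in every train and test partition.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed
        )
        for name, model in models.items():
            fitted = clone(model).fit(X_tr, y_tr)  # fresh copy per split
            results[(test_size, name)] = fitted.score(X_te, y_te)
    return results


if __name__ == "__main__":
    for (test_size, name), acc in evaluate_splits(X, y).items():
        print(f"test fraction {test_size:.2f}  {name:>3}  accuracy {acc:.2f}")
```

With only 45 samples, a 90/10 split leaves about five test samples, so a single split gives a coarse accuracy estimate. Repeating each split with different seeds and averaging is one way to reduce that variance.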