Using Machine Learning and Multi-Element Analysis to Evaluate the Authenticity of Organic and Conventional Vegetables

Luís Reynaldo Ferracciú Alleoni, Márcio Dias de Lima, Rommel Barbosa, Eloá Moura Araújo

doi:10.1007/s12161-019-01597-2

Abstract

Interest in the consumption of organic vegetables is growing worldwide. We evaluated the effectiveness of machine learning techniques for classifying vegetables produced under organic and conventional systems in the state of Pernambuco, Brazil. The contents of 25 elements (Al, As, B, Ba, Ca, Cd, Co, Cr, Cu, Fe, Hg, K, Mg, Mn, Mo, Na, Ni, P, Pb, S, Se, Si, Ti, V, Zn) were determined in 364 vegetable samples. Principal component analysis (PCA) was used to obtain an initial overview of the sample distribution. Data mining techniques such as linear discriminant analysis (LDA) were used to develop discrimination models for organic vegetable samples, and feature selection (F-score and chi-squared) combined with classification algorithms (support vector machine, SVM; multilayer perceptron, MLP; and random forest, RF) was applied to these samples. LDA reached 100% in the discrimination models for tomato and bell pepper samples, while SVM combined with chi-squared feature selection outperformed the other algorithms, achieving an accuracy of 100% for bell pepper (Capsicum annuum) and onion (Allium cepa Hysam) samples and 97% for tomato (Solanum lycopersicum) samples, with a 95% hit rate for organic tomato samples. For lettuce (Lactuca sativa) samples, the accuracy was 92%, with a 90% hit rate for samples grown in organic systems. This high success rate highlights the potential of elemental quantification combined with classification algorithms as a support technique for the authentication and inspection of organic products.
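To make the feature-selection step more concrete, the following is a minimal sketch of ranking elements by an F-score before classification. The abstract does not give the formula, so this assumes the commonly used two-class Fisher-style F-score (between-class separation of the means divided by the summed within-class variances). The function names `f_score` and `rank_features` are hypothetical, and the toy concentrations are invented for illustration only; they are not data from the study.

```python
from statistics import mean


def f_score(values, labels):
    """Two-class Fisher-style F-score for one feature (assumed formula).

    values: measured contents of one element, one entry per sample
    labels: 1 for organic, 0 for conventional (hypothetical encoding)
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall, mean_pos, mean_neg = mean(values), mean(pos), mean(neg)
    # Between-class term: distance of each class mean from the overall mean.
    between = (mean_pos - overall) ** 2 + (mean_neg - overall) ** 2
    # Within-class term: sum of the two sample variances (n - 1 denominator).
    within = (
        sum((v - mean_pos) ** 2 for v in pos) / (len(pos) - 1)
        + sum((v - mean_neg) ** 2 for v in neg) / (len(neg) - 1)
    )
    return between / within if within > 0 else float("inf")


def rank_features(table, labels):
    """Return element names ordered from most to least discriminative."""
    scores = {name: f_score(column, labels) for name, column in table.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Invented toy values: "Cu" separates the two classes, "Zn" does not.
toy = {
    "Cu": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],
    "Zn": [5.0, 4.0, 6.0, 5.1, 4.2, 5.9],
}
labels = [1, 1, 1, 0, 0, 0]

ranking = rank_features(toy, labels)
print(ranking)
```

In a workflow like the one described, the top-ranked elements would then be passed to a classifier such as an SVM; how many features to keep is a tuning choice that the abstract does not specify.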
