Geographical recognition of Syrah wines by combining feature selection with Extreme Learning Machine

Nattane Luíza Da Costa, Laura Andrea García Llobodanin, Márcio Dias De Lima, Inar Alves Castro, Rommel Barbosa

doi:10.1016/j.measurement.2018.01.052

Abstract

Data mining techniques have been used to classify many types of products. To classify Syrah wines from Argentina (Mendoza) and Chile (Central Valley) according to their origin, we combined two feature selection methods with three classification algorithms: Support Vector Machines (SVM) and two types of artificial neural network, Multilayer Perceptron (MLP) and Extreme Learning Machine (ELM), each evaluated with 10-fold cross-validation. The two feature selection methods take different approaches and therefore produce different subsets of the most important features. The best model used the variables peon-3-glu, malv-3-glu and pet-3-acetylglu, selected by Random Forest Importance, and reached 98.33% accuracy with ELM, outperforming SVM and MLP. The results obtained across the classifiers and feature subsets confirm the importance of anthocyanins for classifying Syrah wines by geographic region. ELM was the best-performing algorithm for this task.
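As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains a basic Extreme Learning Machine and scores it with 10-fold cross-validation. It is not the authors' implementation: the data are synthetic (two Gaussian clusters standing in for the two origins, with three columns standing in for three selected features), the Random Forest Importance step is not reproduced, and the names `ELM`, `make_toy_data`, and `cross_val_accuracy`, as well as the hidden-layer size and `tanh` activation, are assumptions made for the example.

```python
import numpy as np


class ELM:
    """Basic ELM: random, fixed hidden layer; output weights by least squares."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden  # assumed size, not taken from the paper
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer activations; W and b are never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Encode labels {0, 1} as {-1, +1} and solve for the output
        # weights with the Moore-Penrose pseudo-inverse of H.
        self.beta = np.linalg.pinv(H) @ (2.0 * y - 1.0)
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta > 0).astype(int)


def make_toy_data(n_per_class=40, n_features=3, seed=0):
    """Synthetic two-class data; NOT the wine measurements from the paper."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, n_features))
    X1 = rng.normal(loc=4.0, scale=1.0, size=(n_per_class, n_features))
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def cross_val_accuracy(X, y, k=10, seed=0):
    """Accuracy pooled over k folds of a shuffled split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    correct = 0
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Standardise using training-fold statistics only.
        mu, sd = X[train].mean(axis=0), X[train].std(axis=0)
        model = ELM(seed=seed).fit((X[train] - mu) / sd, y[train])
        pred = model.predict((X[test] - mu) / sd)
        correct += int((pred == y[test]).sum())
    return correct / len(y)


X, y = make_toy_data()
acc = cross_val_accuracy(X, y, k=10)
print(f"10-fold accuracy on synthetic data: {acc:.3f}")
```

Only the output weights `beta` are fitted, in closed form, while the hidden weights stay at their random initial values. This is the property that distinguishes an ELM from an MLP trained by backpropagation. In a fuller pipeline, a feature-ranking step would choose the columns of `X` before this evaluation.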

Full Text