A machine learning application in wine quality prediction

Piyush Bhardwaj, Parul Tiwari, Kenneth Olejar, Wendy Parr, Don Kulasiri

doi:10.1016/j.mlwa.2022.100261

Open Access

https://doi.org/10.1016/j.mlwa.2022.100261

Abstract

The wine business relies heavily on wine quality certification, and New Zealand Pinot noir wines are known worldwide for their excellence. The main goal of this research is to predict wine quality by generating synthetic data and constructing a machine learning model from this synthetic data together with available experimental data collected from diverse regions across New Zealand. We used 18 Pinot noir wine samples with 54 characteristics (7 physicochemical and 47 chemical features). Six samples were held out for model testing, and 1381 samples were generated from the remaining 12 original samples using the SMOTE method. Results were compared across four feature selection approaches. Attributes found relevant by at least three feature selection methods (referred to as essential variables) were used to predict wine quality. Seven machine learning algorithms were trained and then tested on the held-out original samples. The Adaptive Boosting (AdaBoost) classifier reached 100% accuracy in three settings: without feature selection, with feature selection (XGB), and with essential variables. The Random Forest (RF) classifier's performance improved when essential variables were used.
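
Two steps from the abstract can be illustrated with a minimal sketch: SMOTE-style oversampling, which creates a synthetic sample by interpolating between an original sample and one of its nearest neighbours in the same class, and the "at least three of four methods" vote that defines essential variables. The function names `smote_like` and `essential_variables`, the toy data, and the feature names are all hypothetical and are not taken from the paper. The oversampler is a simplified stand-in, not the authors' implementation.

```python
import math
import random
from collections import Counter


def smote_like(samples, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between same-class neighbours.

    Simplified illustration of the SMOTE idea; `samples` must hold at least
    two feature vectors from a single class.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # k nearest neighbours of `base` among the other samples (Euclidean distance)
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(base, s))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def essential_variables(selections, min_votes=3):
    """Keep features chosen by at least `min_votes` feature selection methods."""
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, count in votes.items() if count >= min_votes)


# Hypothetical toy data: three samples of one quality class, two features each.
one_class = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0]]
new_points = smote_like(one_class, n_new=5)

# Hypothetical outputs of four feature selection methods.
selected = [
    ["feature_a", "feature_b", "feature_c"],
    ["feature_a", "feature_b"],
    ["feature_b", "feature_c", "feature_a"],
    ["feature_d", "feature_b"],
]
print(essential_variables(selected))  # → ['feature_a', 'feature_b']
```

Because every synthetic point lies on a segment between two original samples, the generated data never leaves the range spanned by the originals. That property is worth keeping in mind when interpreting accuracy on a small held-out set.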

Full Text