Abstract

Wineinformatics is a field that uses machine-learning and data-mining techniques to glean useful information from wine. In this work, attributes extracted from a large dataset of over 100,000 wine reviews are used to predict two variables: quality on a 100-point scale, and price per 750 mL bottle. The predictions were made using support vector regression, and the models were assessed with several evaluation metrics. The regression models were also compared against the classification accuracies achieved in a prior work. When regression was used for classification, the results were somewhat poor; this was expected, since the main purpose of the regression was not to classify the wines. This paper therefore also compares the advantages and disadvantages of classification and regression. Regression models can successfully predict within a few points of the correct grade of a wine: on average, the model was only 1.6 points away from the actual grade and off by about $13 per bottle. To the best of our knowledge, this is the first work to use a large-scale dataset of wine reviews to perform regression predictions on grade and price.

Highlights

  • Wine is one of the most popular drinks in the world, with over twenty-eight billion liters produced across 63 countries in the year 2015 alone [1]

  • Considering only the results for running Support Vector Regression (SVR) on grade, shown in Table 2, all three models reported similar levels of error, with the two Radial Basis Function (RBF) models tying for first place

  • The zero value for mean error (ME) suggested that the RBF/Laplace models were able to find a perfect balance for where to draw the hyperplane
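
The zero mean error noted above is easier to interpret next to an absolute-error measure. The following is a minimal sketch, not the paper's evaluation code: the grades are made-up illustrative values, the function names are hypothetical, and the sign convention (predicted minus actual) is an assumption. It shows that ME can be exactly zero when over- and under-predictions cancel, even though individual predictions are still off.

```python
from statistics import mean


def mean_error(actual, predicted):
    """Signed average of (predicted - actual); positive and negative errors cancel."""
    return mean(p - a for a, p in zip(actual, predicted))


def mean_absolute_error(actual, predicted):
    """Average magnitude of the error, ignoring its direction."""
    return mean(abs(p - a) for a, p in zip(actual, predicted))


# Hypothetical grades on the 100-point scale (illustrative values only,
# not taken from the paper's dataset or results).
actual_grades = [88, 90, 92, 94]
predicted_grades = [90, 88, 93, 93]

me = mean_error(actual_grades, predicted_grades)
mae = mean_absolute_error(actual_grades, predicted_grades)
print(f"ME = {me}, MAE = {mae}")
```

In this toy case the signed errors sum to zero while every prediction misses by one or two points, so a zero ME indicates balance between over- and under-prediction rather than error-free predictions.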


Summary

Introduction

Wine is one of the most popular drinks in the world, with over twenty-eight billion liters produced across 63 countries in the year 2015 alone [1]. To be this popular, wine must have several interesting characteristics that humans enjoy: aroma, color, and flavor. Many computer-based techniques have been applied or developed for use in the wine field, such as software for wine-making [2] and classification of the wines’ characteristics [3]. Wineinformatics studies how these characteristics can be used to make inferences about the wine, such as its quality and how expensive it may be. Comparing a chemical description of a wine to a qualitative description based on its review, as shown in Figure 1, demonstrates how a sensory description may be more readily understood than a chemical one.

