Abstract

Wine has become a common household drink, yet opinions on how to evaluate its quality vary widely. Artificial intelligence can provide a relatively impartial assessment and help practitioners focus on the features that matter most for improving wine quality. This study trains decision tree and random forest models on wine datasets and examines feature importance to identify the features with the greatest impact on wine quality. First, the original data are preprocessed: outliers are removed with an IQR-style quantile rule, discarding the data in the lowest 0.09 and the highest 0.09 of the distribution. Second, because the correlation between density and residual sugar is as high as 0.84, density is removed to improve the final prediction accuracy. The parameters of both the decision tree and the random forest models are tuned over multiple runs, and three results are retained in this paper. Finally, feature importance is analysed on the basis of the random forest, and the features are ranked and plotted as a bar graph. The random forest achieves higher prediction accuracy than the decision tree, because a random forest, as an ensemble of decision trees, improves on a single tree to some extent. In the feature importance analysis, alcohol has the greatest impact on the quality of white wine, while citric acid has the smallest. In summary, this study cleans the original data set, compares the accuracy of the two models, and focuses on feature importance derived from the random forest model.
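
The two preprocessing steps described above can be illustrated with a minimal sketch. It is not the study's code: the function names (`quantile_bounds`, `trim_outliers`, `drop_if_correlated`), the synthetic three-column data, and the 0.8 correlation threshold are all assumptions made for illustration. The sketch also assumes that "the first 0.09 and the last 0.09" means cut points at the 0.09 and 0.91 quantiles, applied to each column in turn. Only the Python standard library is used.

```python
import random
from statistics import quantiles


def quantile_bounds(values, tail=0.09):
    """Return the cut points at the `tail` and `1 - tail` quantiles."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 percentile cut points
    k = round(tail * 100)
    return cuts[k - 1], cuts[99 - k]


def trim_outliers(rows, columns, tail=0.09):
    """Keep only rows whose value lies inside the bounds for every column."""
    bounds = {c: quantile_bounds([r[c] for r in rows], tail) for c in columns}
    return [r for r in rows
            if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def drop_if_correlated(rows, keep, drop, threshold=0.8):
    """Remove column `drop` when it is strongly correlated with `keep`."""
    r = pearson([row[keep] for row in rows], [row[drop] for row in rows])
    if abs(r) < threshold:
        return rows, r
    return [{k: v for k, v in row.items() if k != drop} for row in rows], r


# Synthetic stand-in for the wine data (assumption: not the real data set).
rng = random.Random(0)
rows = []
for _ in range(500):
    sugar = rng.uniform(1.0, 20.0)
    rows.append({
        "residual_sugar": sugar,
        "density": 0.99 + 0.0004 * sugar + rng.gauss(0.0, 0.0005),
        "alcohol": rng.uniform(8.0, 14.0),
    })

trimmed = trim_outliers(rows, ["residual_sugar", "density", "alcohol"])
cleaned, r = drop_if_correlated(trimmed, keep="residual_sugar", drop="density")
print(len(rows), len(trimmed), round(r, 2), sorted(cleaned[0]))
```

Trimming every column at the same quantiles discards a row as soon as any one of its values falls in a tail, so the retained fraction shrinks as more columns are included. Which columns the study actually trimmed is not stated in the abstract.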
