Abstract

There has been considerable research on discovering functional dependencies algorithmically from databases. However, when a database contains numerical attributes, some of the discovered functional dependencies may not be genuine, because numerical attributes can take a wide variety of values. Regression analysis, on the other hand, is a method that fits a model to observed continuous or numerical variables and measures its goodness of fit. In this paper, we show how to determine whether functional dependencies discovered among numerical attributes have explanatory power by performing multivariate linear regression tests. Explanatory power is checked with the adjusted R-squared, together with other diagnostics: checks for multicollinearity, the Durbin-Watson test for the independence of residuals, and the F value for the suitability of the regression models. For the experiments, we used the Vinho Verde wine quality data sets from the UCI Machine Learning Repository and found that only 48.7% and 30.7% of the functional dependencies discovered by the FDtool algorithm have explanatory power for the red wine and white wine data sets, respectively. We therefore conclude that functional dependencies found by such algorithms should be applied with care. In addition, as a possible application of the functional dependencies discovered among the conditional attributes of the data sets, we built a series of random forests after dropping the redundant attributes that appear on the right-hand side of the explanatory functional dependencies, and obtained good results. Hence, for mass-produced wines such as Vinho Verde, the effort of quality checking can be reduced by not collecting data for the redundant attributes, since samples with as few attribute values as possible are sufficient.
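To make the regression check concrete, the following is a minimal sketch, assuming Python with pandas and statsmodels, and assuming a local semicolon-separated copy of the UCI red-wine file winequality-red.csv. The left-hand-side and right-hand-side attributes chosen here are purely illustrative and are not one of the dependencies reported by FDtool.

```python
# Minimal sketch: regression diagnostics for a candidate functional
# dependency LHS -> RHS over numerical attributes (assumptions above).
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson

# Assumed local copy of the UCI red-wine file (semicolon-separated).
df = pd.read_csv("winequality-red.csv", sep=";")

lhs = ["fixed acidity", "citric acid", "residual sugar"]  # hypothetical LHS
rhs = "density"                                           # hypothetical RHS

X = sm.add_constant(df[lhs])   # design matrix with intercept
y = df[rhs]
fit = sm.OLS(y, X).fit()

print("adjusted R-squared:", fit.rsquared_adj)           # explanatory power
print("F value / p-value :", fit.fvalue, fit.f_pvalue)   # model suitability
print("Durbin-Watson     :", durbin_watson(fit.resid))   # residual independence

# Variance inflation factors (multicollinearity among LHS attributes);
# column 0 of the design matrix is the intercept, so it is skipped.
for i, name in enumerate(lhs, start=1):
    print(f"VIF({name}) = {variance_inflation_factor(X.values, i):.2f}")
```

Under this kind of check, a dependency would be kept as explanatory only if the adjusted R-squared is high enough and the multicollinearity, independence, and F-value diagnostics are acceptable; the exact thresholds are those chosen in the paper, not fixed by the sketch.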
