Comparing the Linear and Quadratic Discriminant Analysis of Diabetes Disease Classification Based on Data Multicollinearity

Autcha Araveeporn

doi:10.1155/2022/7829795

Abstract

Linear and quadratic discriminant analysis are two fundamental classification methods in statistical learning. The method of moments (MM), maximum likelihood (ML), minimum volume ellipsoid (MVE), and t-distribution methods are used to estimate the parameters of the multivariate normal distribution of the independent variables in order to classify a binary dependent variable. The MM and ML methods are popular and effective: they approximate the distribution parameters directly from the observed data. The MVE and t-distribution methods, by contrast, rely on a resampling algorithm, a reliable tool for obtaining highly resistant estimates. This paper first explains the concepts of linear and quadratic discriminant analysis and then presents the four estimation methods used to construct the decision boundary. In our simulation study, the independent variables were generated from a multivariate normal distribution with the correlation coefficient set to induce multicollinearity, and the binary dependent variable was constructed through a basic logistic regression model. In the application to the Pima Indian diabetes dataset, the diabetes classification served as the dependent variable, with eight independent variables. The methods were compared by their average percentage of accuracy, with the highest value indicating the best method. Our results showed that the MM and ML methods performed well for linear discriminant analysis (LDA) with a large number of independent variables, whereas the t-distribution method for quadratic discriminant analysis (QDA) performed better with a small number of independent variables.
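The simulation design described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes only two predictors, a hypothetical correlation of 0.9, hypothetical logistic coefficients (0, 1, 1), a sample size of 500, and plain sample means and covariances (divisor n) in place of the four estimators compared in the paper. LDA uses one pooled covariance matrix for both classes, while QDA uses a separate covariance matrix per class.

```python
import math
import random

def simulate(n, rho, beta=(0.0, 1.0, 1.0), seed=0):
    """Draw two standard-normal predictors with correlation rho, then a
    binary response from a logistic model (beta values are hypothetical)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2
        p = 1 / (1 + math.exp(-(beta[0] + beta[1] * x1 + beta[2] * x2)))
        X.append((x1, x2))
        y.append(1 if rng.random() < p else 0)
    return X, y

def mean_cov(rows):
    """Sample mean and 2x2 covariance matrix (divisor n) of a list of pairs."""
    n = len(rows)
    m = [sum(r[j] for r in rows) / n for j in (0, 1)]
    c = [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / n
          for j in (0, 1)] for i in (0, 1)]
    return m, c

def log_density(x, m, c):
    """Log bivariate normal density, dropping the constant shared by classes."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    d0, d1 = x[0] - m[0], x[1] - m[1]
    quad = (c[1][1] * d0 * d0 - 2 * c[0][1] * d0 * d1 + c[0][0] * d1 * d1) / det
    return -0.5 * math.log(det) - 0.5 * quad

def fit(X, y, pooled):
    """Estimate class means, covariances, and priors.
    pooled=True gives LDA (shared covariance); pooled=False gives QDA."""
    groups = {k: [x for x, lab in zip(X, y) if lab == k] for k in (0, 1)}
    params = {k: mean_cov(g) for k, g in groups.items()}
    if pooled:
        n = len(X)
        S = [[sum(len(groups[k]) * params[k][1][i][j] for k in (0, 1)) / n
              for j in (0, 1)] for i in (0, 1)]
        params = {k: (params[k][0], S) for k in (0, 1)}
    priors = {k: len(groups[k]) / len(X) for k in (0, 1)}
    return params, priors

def predict(model, x):
    """Assign x to the class with the larger log posterior score."""
    params, priors = model
    score = {k: math.log(priors[k]) + log_density(x, *params[k]) for k in (0, 1)}
    return max(score, key=score.get)

def accuracy(model, X, y):
    """Percentage of observations classified correctly."""
    return 100 * sum(predict(model, x) == lab for x, lab in zip(X, y)) / len(y)

X, y = simulate(n=500, rho=0.9)
lda = fit(X, y, pooled=True)
qda = fit(X, y, pooled=False)
lda_acc = accuracy(lda, X, y)
qda_acc = accuracy(qda, X, y)
print(f"LDA accuracy: {lda_acc:.1f}%  QDA accuracy: {qda_acc:.1f}%")
```

Repeating the run over many seeds and averaging the two accuracies would mirror the "average percentage of accuracy" criterion. The accuracies here are computed on the training data, which is a simplification of this sketch.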

Full Text