Exact Misclassification Probabilities for Plug-In Normal Quadratic Discriminant Functions: II. The Heterogeneous Case

H. Richard McFarland, Donald St. P. Richards

doi:10.1006/jmva.2001.2034


Open Access


Journal: Journal of Multivariate Analysis	Publication Date: Jul 25, 2002
Citations: 20	License type: publisher-specific-oa

Affiliation: University of Virginia

Abstract

We consider the problem of discriminating between two independent multivariate normal populations, $N_p(\mu_1, \Sigma_1)$ and $N_p(\mu_2, \Sigma_2)$, having distinct mean vectors $\mu_1$ and $\mu_2$ and distinct covariance matrices $\Sigma_1$ and $\Sigma_2$. The parameters $\mu_1$, $\mu_2$, $\Sigma_1$, and $\Sigma_2$ are unknown and are estimated by means of independent random training samples from each population. We derive a stochastic representation for the exact distribution of the “plug-in” quadratic discriminant function for classifying a new observation between the two populations. The stochastic representation involves only the classical standard normal, chi-square, and F distributions and is easily implemented for simulation purposes. Using Monte Carlo simulation of the stochastic representation, we provide applications to the estimation of misclassification probabilities for the well-known iris data studied by Fisher (Ann. Eugen. 7 (1936), 179–188); a data set on corporate financial ratios provided by Johnson and Wichern (Applied Multivariate Statistical Analysis, 4th ed., Prentice–Hall, Englewood Cliffs, NJ, 1998); and a data set analyzed by Reaven and Miller (Diabetologia 16 (1979), 17–24) in a classification of diabetic status.
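The abstract does not state the stochastic representation itself, so the following is only a minimal sketch of the quantity being estimated, not the authors' method. It uses brute-force Monte Carlo: it repeatedly draws training samples from both populations, forms the plug-in quadratic discriminant function from the sample means and sample covariance matrices, and classifies a fresh observation from population 1. The function names, the equal-priors cutoff at zero, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def plug_in_qdf(x, xbar1, S1, xbar2, S2):
    """Plug-in quadratic discriminant score for x.

    Equals log f1(x) - log f2(x) with the unknown parameters replaced by
    sample estimates; positive values favour population 1 (equal priors
    are an assumption of this sketch).
    """
    d1 = x - xbar1
    d2 = x - xbar2
    q1 = d1 @ np.linalg.solve(S1, d1)  # squared Mahalanobis distance to sample 1
    q2 = d2 @ np.linalg.solve(S2, d2)  # squared Mahalanobis distance to sample 2
    _, logdet1 = np.linalg.slogdet(S1)
    _, logdet2 = np.linalg.slogdet(S2)
    return 0.5 * (q2 - q1) + 0.5 * (logdet2 - logdet1)


def estimate_misclassification(mu1, Sigma1, mu2, Sigma2, n1, n2,
                               n_rep=2000, seed=0):
    """Estimate P(an observation from population 1 is assigned to population 2).

    Each replication redraws both training samples, so the estimate averages
    over the sampling variability of the plug-in estimates.
    """
    rng = np.random.default_rng(seed)
    errors = 0
    for _ in range(n_rep):
        X1 = rng.multivariate_normal(mu1, Sigma1, size=n1)
        X2 = rng.multivariate_normal(mu2, Sigma2, size=n2)
        x_new = rng.multivariate_normal(mu1, Sigma1)  # truly from population 1
        score = plug_in_qdf(
            x_new,
            X1.mean(axis=0), np.cov(X1, rowvar=False),
            X2.mean(axis=0), np.cov(X2, rowvar=False),
        )
        errors += int(score <= 0)  # misclassified into population 2
    return errors / n_rep


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the code.
    p_hat = estimate_misclassification(
        mu1=np.zeros(2), Sigma1=np.eye(2),
        mu2=np.array([2.0, 0.0]), Sigma2=np.array([[2.0, 0.5], [0.5, 1.0]]),
        n1=25, n2=25,
    )
    print(f"estimated misclassification probability: {p_hat:.3f}")
```

Each training sample size must exceed the dimension p so that the sample covariance matrices are invertible. A representation in terms of standard normal, chi-square, and F variates, as described in the abstract, would replace the inner resampling step with draws from those scalar distributions.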
