Abstract

We consider the problem of discriminating between two independent multivariate normal populations, N_p(μ₁, Σ₁) and N_p(μ₂, Σ₂), having distinct mean vectors μ₁ and μ₂ and distinct covariance matrices Σ₁ and Σ₂. The parameters μ₁, μ₂, Σ₁, and Σ₂ are unknown and are estimated from independent random training samples drawn from each population. We derive a stochastic representation for the exact distribution of the “plug-in” quadratic discriminant function used to classify a new observation into one of the two populations. The stochastic representation involves only the classical standard normal, chi-square, and F distributions and is easily implemented for simulation purposes. Using Monte Carlo simulation of the stochastic representation, we provide applications to the estimation of misclassification probabilities for the well-known iris data studied by Fisher (Ann. Eugen. 7 (1936), 179–188); a data set on corporate financial ratios provided by Johnson and Wichern (Applied Multivariate Statistical Analysis, 4th ed., Prentice–Hall, Englewood Cliffs, NJ, 1998); and a data set analyzed by Reaven and Miller (Diabetologia 16 (1979), 17–24) in a classification of diabetic status.
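To make the "plug-in" idea concrete, the following is a minimal sketch, not the stochastic representation derived in the paper (which the abstract does not state). It builds a quadratic discriminant function from sample means and sample covariance matrices, then estimates the two misclassification probabilities by brute-force Monte Carlo, redrawing the training samples on every replication. The function names, the equal-priors and equal-costs decision rule (classify into population 1 when Q(x) > 0), and the parameter values in the usage note are all assumptions made for illustration.

```python
import numpy as np


def fit_plugin_qdf(sample1, sample2):
    """Build a plug-in quadratic discriminant function Q from two training samples.

    Each sample is an (n_i, p) array with p >= 2 and n_i > p, so that the
    sample covariance matrices are invertible. Hypothetical helper.
    """
    m1, m2 = sample1.mean(axis=0), sample2.mean(axis=0)
    s1 = np.cov(sample1, rowvar=False)  # unbiased sample covariance
    s2 = np.cov(sample2, rowvar=False)
    s1_inv, s2_inv = np.linalg.inv(s1), np.linalg.inv(s2)
    _, logdet1 = np.linalg.slogdet(s1)
    _, logdet2 = np.linalg.slogdet(s2)

    def q(x):
        x = np.atleast_2d(x)
        d1, d2 = x - m1, x - m2
        # Squared Mahalanobis distances to each estimated population.
        maha1 = np.einsum("ij,jk,ik->i", d1, s1_inv, d1)
        maha2 = np.einsum("ij,jk,ik->i", d2, s2_inv, d2)
        # Assumed rule: Q(x) > 0 assigns x to population 1.
        return 0.5 * (maha2 - maha1) + 0.5 * (logdet2 - logdet1)

    return q


def estimate_error_rates(mu1, cov1, mu2, cov2, n1, n2, n_rep=2000, seed=0):
    """Monte Carlo estimate of the two unconditional misclassification rates.

    Every replication draws fresh training samples and one new observation
    from each population, so the estimate averages over training-set noise.
    """
    rng = np.random.default_rng(seed)
    err1 = 0  # observations from population 1 assigned to population 2
    err2 = 0  # observations from population 2 assigned to population 1
    for _ in range(n_rep):
        q = fit_plugin_qdf(
            rng.multivariate_normal(mu1, cov1, size=n1),
            rng.multivariate_normal(mu2, cov2, size=n2),
        )
        err1 += int(q(rng.multivariate_normal(mu1, cov1))[0] <= 0)
        err2 += int(q(rng.multivariate_normal(mu2, cov2))[0] > 0)
    return err1 / n_rep, err2 / n_rep
```

A call such as `estimate_error_rates([0, 0], np.eye(2), [2, 2], 2 * np.eye(2), n1=30, n2=30)` returns the pair of estimated error rates for those illustrative parameters. Direct simulation of this kind refits the rule on every replication; a representation in terms of standard normal, chi-square, and F variates, as described in the abstract, would avoid that refitting.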
