Classification performance of mathematical programming techniques in discriminant analysis: Results for small and medium sample sizes

Antonie Stam, Dennis G Jones

doi:10.1002/mde.4090110406

Abstract

The performance on small and medium‐size samples of several techniques for solving the classification problem in discriminant analysis is investigated. The techniques considered are two widely used parametric statistical techniques (Fisher's linear discriminant function and Smith's quadratic function) and a class of recently proposed nonparametric estimation techniques based on mathematical programming (linear and mixed‐integer programming). A simulation study analyzes the relative performance of these techniques in the two‐group case, for various small sample sizes, moderate group overlap, and six different data conditions. Both training samples and validation samples are used to assess the classificatory performance of the techniques. The degree of group overlap and the sample sizes selected for analysis in this paper are of practical interest because they closely reflect the conditions of many real data sets. The results of the experiment show that Smith's quadratic function tends to be superior on both the training samples and the validation samples when the variances–covariances across groups are heterogeneous, while the mixed‐integer technique performs best on the training samples when the variances–covariances are equal, and on validation samples with equal variances and discrete uniform independent variables. The mixed‐integer technique and the quadratic discriminant function are also found to be more sensitive to sample size than the other techniques, giving disproportionately inaccurate results on small samples.
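As a rough illustration of what assessing a technique on training and validation samples involves, the following minimal sketch fits Fisher's linear discriminant function to one small two‐group training sample and reports the proportion of correctly classified observations on that sample and on a separate validation sample. It is not the authors' experimental design: the two normally distributed variables, the group centres, the sample sizes, the random seed, and all function names are assumptions chosen only for illustration, and equal prior probabilities are assumed for the cutoff.

```python
import random

def mean_vec(rows):
    # Column means of a list of 2-variable observations.
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def scatter(rows, m):
    # 2x2 matrix of summed cross-products of deviations about the mean m.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_fisher(g1, g2):
    # Fisher's rule: w = S_pooled^{-1} (m1 - m2), cutoff at the midpoint
    # of the projected group means (equal priors assumed).
    m1, m2 = mean_vec(g1), mean_vec(g2)
    s1, s2 = scatter(g1, m1), scatter(g2, m2)
    dof = len(g1) + len(g2) - 2
    p = [[(s1[i][j] + s2[i][j]) / dof for j in range(2)] for i in range(2)]
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    inv = [[p[1][1] / det, -p[0][1] / det],
           [-p[1][0] / det, p[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(m1[0] + m2[0]) / 2, (m1[1] + m2[1]) / 2]
    c = w[0] * mid[0] + w[1] * mid[1]
    return w, c

def hit_rate(w, c, g1, g2):
    # Proportion of observations assigned to their true group.
    hits = sum(1 for x in g1 if w[0] * x[0] + w[1] * x[1] > c)
    hits += sum(1 for x in g2 if w[0] * x[0] + w[1] * x[1] <= c)
    return hits / (len(g1) + len(g2))

def draw(rng, n, centre):
    # Hypothetical data condition: independent unit-variance normal variables.
    return [[rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)]
            for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
train1, train2 = draw(rng, 10, (0.0, 0.0)), draw(rng, 10, (1.5, 1.5))
valid1, valid2 = draw(rng, 200, (0.0, 0.0)), draw(rng, 200, (1.5, 1.5))

w, c = fit_fisher(train1, train2)
train_rate = hit_rate(w, c, train1, train2)
valid_rate = hit_rate(w, c, valid1, valid2)
print("training hit rate:", train_rate)
print("validation hit rate:", valid_rate)
```

A simulation study of the kind described would repeat such a run many times for each sample size and data condition, and for each competing technique, before comparing average hit rates.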