Abstract

After a set of variables that best discriminates between two populations has been identified, there may still be nonstatistical reasons to explore the relative discriminating ability of a specific subset. This requires a more flexible criterion than the usual one based on tests of equality of the Mahalanobis distances. A measure of the extent to which a set and a specific subset of variables agree in the discriminant analysis problem, under multivariate normal assumptions, is proposed. This measure is the probability of concordance in classification, and its value is shown to be high when the subset of variables suggests large separation between the two populations. A justification for the use of the subset of variables can be derived from a high probability of concordance and the nonstatistical considerations discussed in the text.

In the two-population (Π_i; i = 1, 2) discriminant analysis problem based on a p-variate normal density, the hypothesis testing techniques intended to result in possible variable reduction are presented in texts such as Kshirsagar (1972). The methods involve the comparison of the Mahalanobis distance based on all p variables with that based on a specific subset of q variables (q < p). Rejection of the hypothesis of equal distances dictates the use of all p variables in the classification procedure. However, there might be nonstatistical reasons for further exploration of the subset in terms of its relative discriminating ability. For instance, in critical care medical research, the variables that best differentiate between survivors and non-survivors are often obtained using both invasive and non-invasive techniques, but the invasive techniques frequently interfere with the condition of the patient (Weil and Shubin 1967). In other research areas, some variables might be prohibitively expensive to measure on a routine basis, although their inclusion might increase the probabilities of correct classification.
Hence, there is a real need for a flexible criterion, based on relative discriminating ability, that can be used to compare a set of variables with a specific subset. In this article we present a measure of the degree to which a set and a specific subset of variables agree in the classification of an individual (object) into one of two populations.
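
To make the idea of concordance concrete, the following is a minimal simulation sketch, not the article's derivation. It assumes a bivariate case (p = 2) in which the subset is the first variable alone (q = 1), known parameters, equal prior probabilities, unit variances, and a common correlation rho. The function name `concordance_probability` and all parameter values are hypothetical and chosen only for illustration. The sketch estimates, by Monte Carlo, how often the linear discriminant rule built on both variables and the rule built on the subset assign an individual to the same population.

```python
import math
import random


def concordance_probability(delta, rho, n_draws=50_000, seed=0):
    """Monte Carlo estimate of P(full rule and subset rule agree).

    Illustrative assumptions (not taken from the article):
    - p = 2 variables; the subset is the first variable only (q = 1)
    - population 1 is N(+delta/2, Sigma), population 2 is N(-delta/2, Sigma)
    - Sigma has unit variances and correlation rho; priors are equal
    """
    rng = random.Random(seed)
    d1, d2 = delta
    det = 1.0 - rho * rho

    # Full-set discriminant coefficients: Sigma^{-1} (mu1 - mu2).
    a1 = (d1 - rho * d2) / det
    a2 = (d2 - rho * d1) / det

    agree = 0
    for _ in range(n_draws):
        # Draw the true population with equal prior probability.
        sign = 1.0 if rng.random() < 0.5 else -1.0
        # Draw a correlated bivariate normal observation.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x1 = sign * d1 / 2.0 + z1
        x2 = sign * d2 / 2.0 + rho * z1 + math.sqrt(det) * z2

        # The midpoint of the two means is the origin, so each rule
        # assigns to population 1 when its discriminant score is >= 0.
        full_says_pop1 = (a1 * x1 + a2 * x2) >= 0.0
        subset_says_pop1 = (d1 * x1) >= 0.0
        agree += full_says_pop1 == subset_says_pop1

    return agree / n_draws


if __name__ == "__main__":
    # Subset variable carries most of the separation.
    print(concordance_probability((3.0, 1.0), rho=0.3))
    # Subset variable carries little of the separation.
    print(concordance_probability((0.5, 1.0), rho=0.3))
```

Under these assumptions, the estimate should be larger in the first call than in the second, in line with the statement that the probability of concordance is high when the subset suggests large separation between the two populations. In practice the population parameters would be estimated from data, which this sketch does not address.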
