Fisher's Hypergeometric Test for a Comparison in a Finite Population

Stuart Beal

doi:10.1080/00031305.1976.10479168

Abstract

Consider a finite population, called here the target population, and two conditions, each of which we may imagine applying to all the units of this population. Suppose that under each condition, each population unit will exhibit some binary (0-1) response. The problem is to compare the set of responses of all the units under one condition with the set of responses of all the units under the second condition. We call these two sets the two population responses. In order to make the comparison, an experiment in which some responses are actually obtained must be performed. The population may be large enough to make it impractical to obtain an entire population response, and consequently part of the experiment may entail determining a sample of units whose responses under one or both conditions will be obtained. Also, in a real experiment it may be impossible in general to actually apply both conditions to a unit. (The actual application of a condition to a particular unit may alter the unit so that the other condition cannot be applied to it later.) Consequently, part of the experiment may entail choosing the conditions to be applied to particular experimental units. A common experimental design in these situations involves 1) obtaining a random sample from the target population, 2) randomly allocating the sample units to two experimental groups, and 3) applying one condition to the units of one group, while applying the other condition to the units of the other group. Even when both conditions may actually be applied to all units, this design may still be helpful, since both groups may undergo experimentation during the same time period and total experiment time may therefore be reduced. Fisher's well-known hypergeometric test (2,3,8), or a test that approximates Fisher's test, such as the chi-squared test with Yates' correction (1,2,3,8), is usually used to analyze the responses obtained from an experiment with the above design.
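The three-step design and the hypergeometric test can be sketched in code. The following is a minimal illustration, not the article's own procedure: the function names, the equal split into two groups, the two-sided rule (summing all tables no more probable than the observed one), and the example population with M = 200, m1 = 120 and m2 = 80 are all assumptions made for the sake of the example.

```python
import random
from math import comb


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two experimental groups; columns are response 1 and response 0.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x ones in group 1, given all margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over tables no more probable than the observed one
    # (the small tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def run_experiment(population, sample_size, rng):
    """population: list of (response under condition 1, response under condition 2)."""
    sample = rng.sample(population, sample_size)      # 1) random sample
    rng.shuffle(sample)                               # 2) random allocation
    half = sample_size // 2
    group1, group2 = sample[:half], sample[half:]
    ones1 = sum(y1 for y1, _ in group1)               # 3) condition 1 on group 1
    ones2 = sum(y2 for _, y2 in group2)               #    condition 2 on group 2
    return ones1, len(group1) - ones1, ones2, len(group2) - ones2


# Hypothetical target population: M = 200 units, m1 = 120, m2 = 80.
rng = random.Random(0)
population = [(1 if i < 120 else 0, 1 if i % 5 < 2 else 0) for i in range(200)]
table = run_experiment(population, 40, rng)
print(table, fisher_two_sided(*table))
```

Note that each unit carries both potential responses in the simulated population, but the experiment observes only one of them per sampled unit, which mirrors the situation in which both conditions cannot be applied to the same unit.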
When the sample size, N, is sufficiently small compared to the size, M, of the target population, one may regard the responses of the first (second) group as being approximately a sequence of Bernoulli 0-1 outcomes, with probability m1/M (m2/M) for outcome 1, where mi is the number of 1's in the ith population response. Thus Fisher's test may be regarded as approximately testing the null hypothesis m1/M = m2/M (or m1 = m2). In this article we explain that the appropriateness of using Fisher's test to test m1 = m2 may be examined without recourse to the Bernoulli approximation. Such examination is especially important when N/M is so large that it is not really meaningful to talk about Bernoulli outcomes.
An important requirement of the inferential procedure is that the sample be random. The usual assumption that the sample is indeed random is often rather shaky in application. This is often the case in medical experiments where an accessible group of patients is used as the sample, rather than a group of patients chosen from a well-defined target population or sampling frame, and where the inference is about some population of interest which is characterized after the sample is determined and which should be identified here with the target population. For a proper interpretation of the experimental result, one may need to redefine the target population in such a way that the randomness assumption becomes more tenable (or, as it is said, in such a way that the sample becomes more representative of the population). In so doing, one may well add restrictions to the original definition of the target population, thereby decreasing the population size. In fact, it may be necessary to define the target population to be, in effect, the set of units comprising the sample; the sample is then undoubtedly random. Of course, when the population is made less encompassing, it may lose some interest value with respect to the conclusion to be drawn about it. However, this loss is often necessary for a valid inference to result. As the population is constricted, the ratio of sample size to population size may increase to the point where the inferential argument should not depend on the use of the Bernoulli approximation. These considerations motivated this author's inquiry into the role of Fisher's test in the inference problem of comparing the population responses.
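To see why the Bernoulli approximation degrades as the sample grows relative to the population, one can compare the exact distribution of the number of 1's in a group drawn without replacement (hypergeometric) with its binomial stand-in. The sketch below is only illustrative: the function names, the use of total variation distance as the yardstick, and the numbers (a group of n = 20 units and a 30 percent rate of 1's) are assumptions of this example, not quantities taken from the article.

```python
from math import comb


def hypergeom_pmf(k, M, m, n):
    """P(k ones) when n units are drawn without replacement from M units, m of them ones."""
    return comb(m, k) * comb(M - m, n - k) / comb(M, n)


def binom_pmf(k, n, p):
    """P(k ones) in n independent Bernoulli trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def total_variation(M, m, n):
    """Total variation distance between the exact and the Bernoulli-based distribution."""
    p = m / M
    return 0.5 * sum(
        abs(hypergeom_pmf(k, M, m, n) - binom_pmf(k, n, p)) for k in range(n + 1)
    )


# Shrinking the population while keeping the group size fixed raises n/M.
for M in (10000, 1000, 100, 40, 20):
    m = 3 * M // 10
    print(M, 20 / M, total_variation(M, m, 20))
```

In the extreme case where the target population is the sample itself (M = n), the count of 1's is fixed at m, and the binomial description no longer resembles it.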

Full Text