Abstract

Generalizability Theory

Classical test theory (Gulliksen, 1950) has been shown to be limited in handling a wide variety of measurement problems (Cronbach, Gleser, Nanda, & Rajaratnam, 1972). To overcome some of these problems, Cronbach et al. (1972) developed a theory of the generalizability of behavioral measurements. This theory does not rely on the restrictive assumptions of classical test theory. Furthermore, it isolates more than one source of error variation affecting the measurement.
Generalizability theory distinguishes two types of studies: a G study and a D study. A G study is a pilot study carried out according to an elaborate design that includes all possible sources that may influence the generalizability of the measurement. A D study collects data for the purpose of making decisions or drawing conclusions. One can choose the design for the D study from information provided by the G study, considering, of course, the D study's purposes and the cost of collecting information.
In contrast to classical test theory, generalizability theory distinguishes between absolute decisions, which are made on an individual's absolute score (e.g., 80% correct constitutes the criterion on a written driver's examination), and relative decisions, which are based on the rank ordering of individuals (e.g., correlational studies). This distinction is used to determine which sources of error variability affect the measurement.
In generalizability theory, the sources of error affecting the measurement (e.g., judges, observation occasions, items) are called facets (cf. factors in the analysis of variance). The conditions (cf. levels of a factor in the analysis of variance) representing these facets either constitute a random sample from the universe of conditions (a random facet) or exhaust the universe of conditions (a fixed facet). The persons on whom the measurements are taken are considered a random sample from a well-defined target population.
Variability in the measurement due to a facet or an interaction among facets is defined as error variance, while variability among persons is defined as universe score variance. Observed score variance is the sum of the universe score variance and the sampling error variances. A sampling error variance is defined as an average error variance, where the average is taken over the number of conditions in which the source of error is sampled. The coefficient of generalizability (the counterpart of the reliability coefficient) is defined as the ratio of universe score variance to observed score variance.
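
The definitions above can be illustrated with a minimal sketch. It assumes the variance components have already been estimated in a G study and that the caller has already decided which components count as error for the decision at hand (relative or absolute). The function name, argument names, and the numbers in the usage lines are hypothetical and are not taken from the source.

```python
def generalizability_coefficient(universe_var, error_components):
    """Ratio of universe score variance to observed score variance.

    universe_var: estimated variance among persons (universe score variance).
    error_components: iterable of (error_variance, n_conditions) pairs, one per
        source of error; n_conditions is the number of conditions over which
        that source is sampled in the D study (hypothetical input layout).
    """
    if universe_var < 0:
        raise ValueError("universe_var must be non-negative")

    # Each sampling error variance is the error variance averaged over
    # the number of conditions in which that source of error is sampled.
    sampling_error = 0.0
    for error_var, n_conditions in error_components:
        if error_var < 0 or n_conditions <= 0:
            raise ValueError("need error_var >= 0 and n_conditions > 0")
        sampling_error += error_var / n_conditions

    # Observed score variance = universe score variance + sampling error variances.
    observed_var = universe_var + sampling_error
    if observed_var == 0:
        raise ValueError("observed score variance is zero; coefficient undefined")
    return universe_var / observed_var


# Illustrative (made-up) values: one error source sampled over 4, then 8, conditions.
g_four = generalizability_coefficient(4.0, [(2.0, 4)])
g_eight = generalizability_coefficient(4.0, [(2.0, 8)])
```

With these made-up values, sampling the error source over more conditions shrinks its sampling error variance, so `g_eight` is larger than `g_four`. This is the kind of trade-off a D study design weighs against the cost of collecting information.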
