Ben Armstrong (London School of Hygiene and Tropical Medicine)

An ecological study is defined in epidemiology as 'A study in which the units of analysis are populations or groups of people rather than individuals' (Last, 1988). Ecological studies have long been introduced to students of epidemiology as occupying bottom place in the hierarchy of designs, barely above case reports. Most damning of all, students learn that whereas other designs might suffer from 'biases' and 'confounding', which can be controlled, ecological studies are said to suffer from the 'ecological fallacy', which appears to have no remedy. This is usually the only context in which students of epidemiology encounter this especially disapproving term. The ecological fallacy is usually demonstrated with a reference to Durkheim's example (cited by Greenland and Robins (1994)) of an association between suicide rates and proportions of Protestants in Swiss Cantons, possibly due to Catholics committing suicide in Cantons dominated by Protestants. This, together with a few examples of obviously misleading associations between average levels of risk factors and outcomes across countries, generally suffices to fix firmly in students' minds a deep-rooted, lifelong suspicion of ecological studies.

Published methodological work, beginning as a trickle in the early 1980s and reaching a steady stream by the early 1990s, has sought to clarify more carefully the conditions under which ecological studies of disease yield useful results, and to improve methods of design and analysis to make this more likely. Emphasis has generally remained on the limitations, but some strengths have also been noted, for example a greater robustness to measurement error bias.

Several epidemiological developments have encouraged methodological work in this area. First, it has been observed that some putative disease risk factors, notably diet, vary much more between communities than between individuals within the same community (Rose, 1985).
Could the advantages of this greater contrast outweigh the disadvantages of the ecological design? Second, hybrid or 'semiecological' designs have become more common. In these, some explanatory variables can be linked to the outcome at an individual level, and others are available only at the group level. An example is the 'six-cities' study of air pollution and health, in which air pollution was measured at the city level, but smoking, other personal risk factors and health outcomes were measured in individuals (Dockery et al., 1993). Finally, small scale geographical referencing has enabled ecological studies in which the units of analysis are very small areas (Elliott et al., 1992). To what extent do semiecological and small area studies escape the limitations that are expected of ecological studies? Useful recent reviews, from an epidemiological perspective, of the methodological work before this issue include Walter (1991), Richardson (1992), Greenland (1992), Greenland and Robins (1994) and Morgenstern (1998). Although ecological studies often suffer from problems that are not intrinsic to the ecological design (aggregated data), such as omitted or poorly measured confounders, most of the literature is concerned with biases directly due to aggregation, usually called ecological bias. Ecological bias is a particular problem in epidemiology because, under the non-linearity of most epidemiological models, conditions for the absence of ecological bias are much rarer than under the linear models that are popular in other contexts (Greenland and Robins, 1994). This set of papers extends the understanding of ecological analyses in several complementary ways. Guthrie and Sheppard explore the potential of design and analysis techniques introduced as 'aggregate data' methods by Prentice and Sheppard to overcome limitations of ecological studies.
In this approach models at an individual level are used to induce models at the group level, which can then be fitted to group level data to estimate individual level parameters. For non-linear models the group level models are typically functions of variances and covariances of explanatory variables, rather than the means used in conventional ecological analysis. As these are rarely available from routine sources, additional data collection is required, at least on a subsample of subjects. By comparing aggregate data estimators with conventional ecological methods in simulations, Guthrie and Sheppard confirm that these methods
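
The idea that a non-linear individual-level model induces a group-level model involving variances, not only means, can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration, not taken from the papers: a log-linear individual risk model, exposure that is normally distributed within each group, and hypothetical values for `ALPHA`, `BETA`, the group means and the group variances. Under those assumptions the expected group rate is exp(alpha + beta * mean + beta^2 * variance / 2). The linearised fit below is not the Prentice and Sheppard estimator; it only shows why a regression on group means alone can be biased when within-group variance changes with the mean.

```python
import numpy as np

# Hypothetical individual-level model: risk_i = exp(ALPHA + BETA * x_i).
ALPHA, BETA = -5.0, 0.4


def group_rate(mu, var, alpha=ALPHA, beta=BETA):
    """Expected rate in a group, assuming x ~ Normal(mu, var) within it."""
    return np.exp(alpha + beta * mu + 0.5 * beta**2 * var)


def naive_ecological_slope(mu, rate):
    """Conventional ecological analysis: regress log(rate) on group means."""
    design = np.column_stack([np.ones_like(mu), mu])
    coef, *_ = np.linalg.lstsq(design, np.log(rate), rcond=None)
    return coef[1]


def aggregate_slope(mu, var, rate):
    """Induced model: log(rate) = alpha + beta * mu + (beta**2 / 2) * var.

    The coefficient on var / 2 is left free here, a simplification that
    keeps the fit linear; the coefficient on mu estimates beta.
    """
    design = np.column_stack([np.ones_like(mu), mu, 0.5 * var])
    coef, *_ = np.linalg.lstsq(design, np.log(rate), rcond=None)
    return coef[1]


# Eight hypothetical groups whose within-group variance rises with the mean.
mu = np.linspace(1.0, 3.0, 8)
var = 0.2 + 1.5 * mu + 0.3 * np.cos(np.arange(8))
rate = group_rate(mu, var)

naive = naive_ecological_slope(mu, rate)
aggregate = aggregate_slope(mu, var, rate)
print(f"true beta = {BETA}, naive = {naive:.4f}, aggregate = {aggregate:.4f}")
```

The group rates are computed without sampling noise, so any gap between the naive slope and `BETA` is due to aggregation alone. In practice the within-group variances would have to be estimated, for example from a subsample of subjects, as the text notes.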