IN SEVERAL books, I have encouraged readers to see through graphs. To see through a graph is to ignore the picture and ascertain the meaning. A picture might be worth a thousand words, but pictures also lie on occasion. More often, they simply mislead.

In the November 2002 issue of the NCTM News Bulletin, Johnny Lott, president of the National Council of Teachers of Mathematics, takes Brookings Institution fellow Tom Loveless to task for using misleading graphs concerning improvements on the NAEP mathematics assessments. The graphs appear in the Brookings publication The 2002 Brown Center Report on American Education: How Well Are American Students Learning?

Lott begins by invoking the statistician's bible of sarcasm: Darrell Huff's 1954 How to Lie with Statistics, which is still in print. One way to make a graph look more impressive, Huff advised, is to change the proportion between the units used for the ordinate (vertical axis) and the abscissa (horizontal axis). Lott accuses Loveless of using this deception, which is shown in Figure 1. The first two points designating the NAEP (National Assessment of Educational Progress) testing years are two years apart -- 1990 and 1992 -- while the remaining intervals are four years apart -- 1992 to 1996 and 1996 to 2000. Yet all three intervals are given the same space on the graph. Loveless should have made the distance between 1990 and 1992 half the size of the distances between 1992 and 1996 and between 1996 and 2000.

In correcting Loveless, though, it seems to me that Lott has created a graph that is just as arbitrary as the original one -- and perhaps even more misleading. Lott converts Loveless' rectangles to squares, with each unit on the abscissa representing two years and each unit on the ordinate representing two points on the NAEP scale. Lott's version is shown in Figure 2. Lott claims that this is a more honest graph, and it does fix the problem of unequal intervals on the abscissa.
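The abscissa problem can be sketched in a few lines: treating the testing years as evenly spaced categories equalizes intervals that are really unequal, while placing them on a numeric scale spaces them in proportion to elapsed time. A minimal Python sketch of the two treatments; the positions are illustrative and not taken from either graph.

```python
# NAEP testing years discussed in the column
years = [1990, 1992, 1996, 2000]

# Categorical spacing (what the criticized graph effectively does):
# every interval gets the same width, regardless of elapsed time.
categorical_positions = list(range(len(years)))        # [0, 1, 2, 3]

# Numeric spacing (what an honest abscissa requires):
# positions are proportional to the years themselves.
numeric_positions = [y - years[0] for y in years]      # [0, 2, 6, 10]

# On the numeric axis, the 1990-1992 gap is half the 1992-1996 gap --
# exactly the correction the column says Loveless should have made.
gap_ratio = (numeric_positions[1] - numeric_positions[0]) / \
            (numeric_positions[2] - numeric_positions[1])   # 0.5
```

Most plotting libraries apply the numeric treatment automatically if the years are passed as numbers rather than as labels, which is the simplest guard against this particular distortion.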
But the choice of how to proportion the units in this case (and in most any graph) is arbitrary. Certainly there is nothing inherently meaningful in plotting every two NAEP points. We could plot every single NAEP point or every fifth NAEP point with equal justification. Similarly, there is no theoretical justification whatsoever for making the space between two points on the NAEP scale equal in size to two years of time, but that is what Lott's squares do. I repeat: the choice of proportions is arbitrary, not in the sense of being capricious but in the sense that judgment is required. If the two graphs represented the rising costs of something and you were a superintendent, you might choose one or the other depending on the message you wished to convey to your school board.

Moreover, the gains are the same in both graphs. They show changes in average scores over time. The honesty -- or at least the meaningfulness -- of both graphs would be greatly improved by an indication of the variability of the scores around these averages. Remember the rule from basic statistics: no measure of central tendency without a measure of dispersion.

The scores rise from 213 in 1990 to 228 in 2000. Is this a meaningful gain? An important gain? Given what psychometricians have been saying lately about the volatility of test scores, we might also ask whether the gain is due mostly to chance or to factors that have nothing to do with the quality of instruction. We would interpret the gain differently if 228 were a full standard deviation above 213 than if it were only one-fourth of a standard deviation above. We can't tell from the graphs. If we knew the standard deviation, we could calculate an effect size, a statistic that leads more directly to practical interpretation than a test of whether the gain is statistically significant. …
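The effect-size arithmetic alluded to here is simple: divide the gain by the standard deviation of the scores. A minimal sketch of that calculation; the two standard deviations (15 and 60) are hypothetical values chosen only to reproduce the column's "full standard deviation" and "one-fourth of a standard deviation" readings, not figures from the actual NAEP data.

```python
def effect_size(mean_before: float, mean_after: float, sd: float) -> float:
    """Express a gain in standard-deviation units (a Cohen's-d-style statistic)."""
    return (mean_after - mean_before) / sd

# Two hypothetical readings of the 213 -> 228 gain:
full_sd = effect_size(213, 228, 15)      # if SD were 15, d = 1.0 -- a large effect
quarter_sd = effect_size(213, 228, 60)   # if SD were 60, d = 0.25 -- a modest effect
```

The same 15-point gain thus looks dramatic or modest depending entirely on the dispersion of the scores, which is why the graphs cannot be interpreted without it.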