The U.S. Department of Education, the National Science Foundation (NSF), the Council of Chief State School Officers, and nearly all individual states have efforts under way to create educational indicators. Federal agencies view indicators as essential for monitoring the status of the nation's educational system and for tracking how it changes over time. At the state level, policy makers hope that indicators will provide information they can use to hold local districts and schools accountable for their performance. State policy makers also seek indicator data that can inform new improvement efforts. At the heart of each of these national and state efforts is a fundamental question: What indicators will be most valid and useful [22]?

Introduction

Indicators come in many forms, but all are descriptive data (for example, frequencies, percentages, ranks, means, rates, ratings) serving as proxies for conditions in education that policy makers wish to monitor. While imprecise and not without flaws, they are powerful numbers, capable of triggering important decisions relating to resource allocation, student admissions, staffing, curriculum, and other matters. What began in 1867 as simple record-keeping of enrollment statistics by the federal government [20] has since become big business [24], as evidenced by the multitude of studies and projects being conducted. Today's activity results in part from twenty-five years of growing concern for student performance and the effectiveness of our schools.

This study was prompted by the author's concern for the rigor with which indicators are developed or selected and for the validity of certain indicators that are being used today to evaluate institutions of higher education. This concern stems from the observation that indicators have the potential to be both technically inaccurate (for example, the numbers are calculated incorrectly or contain various sources of error) and conceptually invalid (that is, they are poor proxies for the concepts or conditions in education we wish to measure). The statistics we identify as indicators of quality or excellence require special justification because the inferences we draw from them extend beyond recorded facts (for example, student enrollment, cost-per-student) to constructs which are difficult to define or isolate.

Additionally, many of us in the research and policy communities have, at best, a fuzzy understanding of what it means to have a valid indicator. How does one go about testing the validity of an indicator or indicator system? Unlike tests, for which there are established standards for conducting validity studies, indicators have no agreed-upon procedures for assuring that they measure, or represent, that which we assume they measure. This lack of agreement is not limited to policy users, as de Neufville [13] wrote in her treatise on social indicators, but extends to developers of indicator systems, as a review of the indicator literature revealed. This lack of a systematic method or conceptual framework leads to confusion as to which types of evidence might be gathered to support the interpretation and use of indicators. The same lack explains, perhaps, why virtually all indicator systems proposed or in place in education today have no self-evaluative component.
Although the practical requirements of good indicators (for example, cost effective, understood by lay audiences, feasible to collect) have been amply listed [for example, 21, 23, 26], most current developers fail to explain how their indicators should be tested or assessed other than by a largely subjective process of literature review, review by committee, and affirmation over time by the public [13, 26, 27]. The validity of current indicators might be less in question had they been developed systematically, beginning with careful definition of an underlying construct (for example, curriculum quality, ethnic discrimination) and proceeding to the development and testing of standardized measures. …