
Educational Measurement: Issues and Practice, Volume 28, Issue 3, pp. 1-4
Editorial (Open Access). First published: 02 September 2009. https://doi.org/10.1111/j.1745-3992.2009.00148.x

Special Issue on the Validity of Formative and Interim Assessment

Measurement community, be careful what you wish for. We now have "formative assessment" in our toolkit, and we are claiming it is great stuff. As all of the authors of articles in this issue point out, a now-famous literature review by Black and Wiliam (1998) concluded that formative assessment can have powerful effects on student learning. As the authors in this issue also point out, there are some—both test developers and test users—who have focused on the wish (the positive effect sizes) and not the warrant (theories of learning and motivation that emphasize active student construction of knowledge and student self-regulation of learning). In many cases, developing, marketing, or selecting tests, creating item banks, scheduling common assessments, and other such things have overshadowed using information for student learning. In other words, work in formative assessment is out of balance. There is too much emphasis on "assessment" (tests and assessments, schedules, data reports) and not enough on "formation" (learning). However, as this issue demonstrates, formative assessment is as much about learning as it is about assessment.

In my view, the major contribution this special issue makes is to point out some implications of this point for validity, both in theory and in practice. We have wished for formative assessment, and now we've got it. As the articles in this issue tell us, this means we must expand our notion of the validity of formative assessment to include evidence from learning theory, curriculum theory, and the content domain. That means that our validity studies need to change as well, to include investigations of whether and how assessment information is used formatively and whether such use results in effects that can be predicted from underlying theories of learning and motivation, including theories of learning in the content area.

In the remainder of this editorial, I will first explain how the articles in this issue coalesce around the theme that formative assessment is as much about learning as it is about assessment. Second, I will demonstrate, with two examples, that others have recently felt the need to make this point as well. Finally, I will relate this point to the issue's theme, the validity of formative and interim assessment.

In This Issue

Upon a quick first reading, the three articles in this issue may not seem to have much in common, other than that they all discuss the term "formative assessment" in some way. Perie, Marion, and Gong promulgate a definition of interim assessment and its place in a comprehensive assessment system. In so doing, they reclaim some territory for formative assessment that had been preempted by assessments that are, by their definition, interim assessments.
They also propose some criteria for evaluating interim assessments and advance the argument that unless interim assessments demonstrate substantial value, a district might consider using its resources for true formative assessment instead. I want to emphasize a crucial distinction between formative and interim assessment highlighted by Perie and her colleagues' definition. One of the critical aspects of formative assessment is that both students and teachers participate in generating and using the assessment information. However, Perie and colleagues' definition of interim assessment (p. 6) is all about teacher use, either for instructional planning within the classroom or, via aggregation at the class and school level, for curriculum planning in schools and districts. Students are not mentioned as the agents of assessment or interpretation; rather, by their definition interim assessment is for policy and educator decisions. Student learning, then, will be only indirectly affected by changes in policies, curriculum, or instructional plans. According to their definition, a corollary of "formative assessment is about learning" might be that interim assessment is about planning for learning.

Nichols, Meyers, and Burling describe a framework for organizing validity arguments for formative assessments or, more specifically, the arguments needed to validate formative claims made for the use of an assessment. Their article explains the framework and provides graphic organizers that could be used in a practical way to collect, organize, and interpret various pieces of evidence for a formative claim. The relationship of this article to our theme might be to point out that, if formative assessment is about learning, then evidence that learning occurred as a result of using formative assessment is necessary for validity. In my view, this places Nichols and colleagues as firm supporters of the use of consequential evidence for evaluating validity arguments, and I would agree.

Heritage, Kim, Vendlinski, and Herman want to move "from evidence to action." Specifically, they have evidence that interpreting assessment information is easier for teachers than deciding on the next instructional steps. This matters because it is in selecting the next appropriate instructional steps that teachers affect student learning. And if formative assessment is about learning, a stumbling block at such an important place in the instructional process is a problem that could keep formative assessment intentions from being formative in actuality.

This article provoked a question in my mind, which I raise here but do not solve. The question is how to differentiate between the validity of formative assessment information and the validity of decisions based on formative assessment information. I have already gone on record as agreeing with those who think that consequential evidence for validity is important for formative assessment. If no learning is formed, there was nothing "formative" about an assessment. However, the Heritage et al. article also reminds me that teachers' instructional decisions or students' studying decisions are often based on information beyond a single formative assessment, and are carried out with varying levels of background knowledge and skill. I can, for example, envision two teachers, one of whom makes effective decisions about next steps in instruction and one of whom does not, based on the same piece of information for the same student.
Or I can envision one teacher who makes a better next-step decision than another because she has some additional information beyond the one assessment. Therefore, it seems to me that the validity of formative assessment information is a necessary, but not sufficient, condition for valid decision making. But the line is blurry, and I think some work could be done here. A future special issue, perhaps?

This special issue also brings to my mind, but does not address, another aspect of formative assessment. Because teacher knowledge was the focus of their investigation, Heritage and her colleagues did not have measures of the degree to which students are able to use assessment information for self-regulation, to monitor and adjust their own learning. There is an as-yet small literature on student self-assessment that suggests students can be taught to work out next steps and act on them (Andrade, Du, & Wang, 2008; Ross, Hogaboam-Gray, & Rolheiser, 2002). In a perfect world, I would have been able to include in this special issue an article about student use of information, in the quest for consequential evidence for the validity of formative assessment.

Discussant Shepard found the thread that ties these articles together. She reminds readers that formative assessment is a powerful tool for student learning and reviews some of the evidence from Black and Wiliam's (1998) review to demonstrate the salience of student learning as a theme in their work. After a critique of each article, she returns to this point (p. 36): "I believe that the validity research that will tell us the most about how formative assessment can be used to improve student learning must be embedded in rich curriculum and must at the same time attempt to foster instructional practices consistent with research on learning."

Other Learning Perspectives on Formative Assessment and Learning

Our special issue is not the only recent work on assessment that has emphasized that formative assessment is as much about learning as it is about assessment. Others have described how effective assessment requires a closer consideration of the relationship between assessment and learning than is typical in practice, which developed around heretofore mostly summative measures of achievement and more behaviorist models of learning (Shepard, 2006). In this section, I illustrate this point by discussing two other places where a major theme is that assessment and learning need to be considered together. There are certainly additional places where the connection between assessment and learning is plumbed (e.g., Wilson, 2004), and readers may be thinking of those as well. The polemical tone in parts of these works is evidence that the authors, like the authors in this special issue, consider that an opposing point of view exists that needs to be countered.

One such work is the National Research Council report Knowing What Students Know (Pellegrino, Chudowsky, & Glaser, 2001), which may be familiar to readers. This work gave recommendations for research, policy, and practice regarding both classroom and large-scale assessment. Its focus was on both formative and summative assessment, but in both cases it considered mostly formal assessments. The theme of this special issue may be considered a subset of a broader theme in the NRC report, that all assessment (formative and summative) should be based on sound cognitive science. The NRC report conceptualized the foundations of assessment as a triangle (p. 44), the vertices of which represent the three elements on which reasoning from evidence is based: Cognition, Observation, and Interpretation. The element that relates most obviously to learning is Cognition, or "a theory or set of beliefs about how students represent knowledge and develop competence in a subject domain" (p. 44). However, the other two elements are related to it. What counts as appropriate tasks for students to respond to (Observation) and what to do with those responses (Interpretation) are decisions that must be linked to the model of learning. If assessment is reasoning from evidence, it must be evidence of something. In the title and throughout their report, it is clear that the NRC is interested in evidence of learning, for both summative and formative assessments.

A second work that emphasizes the theme that formative assessment is as much about learning as it is about assessment is a position paper written by participants (including myself) in the Third International Conference on Assessment FOR Learning. Participants, organized into six international teams, assembled in New Zealand to discuss classroom assessment. Our host was Dr. Terry Crooks, whose own interest in the relationship between assessment and student learning is longstanding (Crooks, 1988). Our deliberations showed us that the problem, namely that many things done in the name of "formative assessment" are in fact not primarily about student learning, was international in scope and greater in magnitude than I, for one, had realized. We produced a brief position paper, which is contained in its entirety in the Appendix to this editorial. The intent of the position paper was to help clarify the meaning of assessment for learning (called formative assessment in this special issue) and help reclaim a for-learning stance for formative assessment. Thus the United States is not the only country that has the problem that Perie and her colleagues (p. 5) sum up in the U.S. context: "Many so called formative assessments are not at all similar to the types of assessments and strategies studied by Black and Wiliam (1998) but instead are interim assessments." Shepard's commentary expands on that point as well.

The Validity of Formative and Interim Assessment

The contribution this issue makes to the field, then, is to investigate some implications for validity, both in theory and in practice, of this point that is resurfacing: formative assessment is as much about learning as it is about assessment. With this foundation, a validity argument for a formative assessment can be mounted, starting by clarifying its purpose. Is the intention or purpose of the assessment to inform student understanding, or solely teacher planning (Perie et al.)? Does the assessment information inform both student behavior and teacher planning, and in fact result in student learning (Nichols et al.)? What might be some reasons that formative assessment information does not result in student learning (Heritage et al.)? As the previous section was meant to show, this special issue doesn't exactly kindle the debate, which is smoldering already. What I hope this special issue does accomplish is to provide some specific frameworks and examples of research—to fan the flames. I hope that it inspires more work along the lines of the three articles here, and inspires expanded work into the validity of formative assessment as well—for example, the research into student interpretation and use of information for learning I alluded to earlier.
The ultimate goal is to restore the balance between "assessment" and "formation" in formative assessment. Formative assessment is as much about learning as it is about assessment.

Susan M. Brookhart
Editor

From the Visuals Editor

Readers may find a helpful advance organizer in the cover graphic for this special issue on formative assessment. Reproduced from Figure 1 of the article by Perie, Marion, and Gong (this issue), this figure represents "classes" of assessment as characterized jointly by two dimensions—"frequency of administration" and "scope and duration of cycle" (representing curricular breadth as well as the time load associated with administration). Perie et al. use these dimensions to refocus the use of the "formative assessment" label back to that put forth by Black and Wiliam (1998); they do so by distinguishing a class of "interim" assessments that may share similarities with formative assessments (at least relative to summative assessments) but ultimately provide the greatest insight at school and district levels rather than for individual students or classrooms. As part of this special issue, the Perie et al. piece provides one of several perspectives on a topic of great interest among those in measurement, assessment, and accountability. As always, if you would like to share a visual from your own work, or would like to nominate one you've seen elsewhere, please contact me at Ed.Wiley@Colorado.EDU.

Ed Wiley
Visuals Editor

Appendix: Position Paper on Assessment for Learning from the Third International Conference on Assessment for Learning, Dunedin, New Zealand, March 2009

"Assessment for learning" and "formative assessment" are phrases that are widely used in educational discourse in the United States, Canada, New Zealand, Australia, the United Kingdom, and Europe.[1] A number of definitions, some originally generated by members of this Conference,[2] are often referred to. However, the ways in which the words are interpreted and made manifest in educational policy and practice often reveal misunderstanding of the principles, and distortion of the practices, that the original ideals sought to promote. Some of these misunderstandings and challenges derive from residual ambiguity in the definitions. Others have stemmed from a desire to be seen to be embracing the concept—but in reality implementing a set of practices that are mechanical or superficial, without the teacher's, and, most importantly, the students', active engagement with learning as the focal point. While such practice observes the letter of Assessment for Learning (AFL), it does violence to its spirit. Yet others have arisen from deliberate appropriation, for political ends, of principles that have won significant support from educators. For example, "deciding where the learners are in their learning, where they need to go and how best to get there" has sometimes been (mis)interpreted as an exhortation to teachers to (summatively) test their students frequently to assess the levels they attain on prescribed national/state scales in order to fix their failings and target the next level. In this scenario, scores, which are intended to be indicators of, or proxies for, learning, become the goals themselves. Real and sustained learning is sacrificed to performance on a test.

In contrast, the primary aim of AFL is to contribute to learning itself. This follows from the logic that when true learning has occurred, it will manifest itself in performance.
The converse does not hold: mere performance on a test does not necessarily mean that learning has occurred. Learners can be taught how to score well on tests without much underlying learning. AFL is the process of identifying aspects of learning as it is developing, using whatever informal and formal processes best help that identification, primarily so that learning itself can be enhanced. This focuses directly on the learner's developing capabilities, while these are in the process of being developed. AFL seeks out, analyses, and reflects on information from students themselves, teachers, and the learner's peers as it is expressed in dialogue, learner responses to tasks and questions, and observation.

AFL is part of everyday teaching, in everyday classrooms. A great deal of it occurs in real time, but some of it is derived through more formal assessment events or episodes. What is distinctive about AFL is not the form of the information or the circumstances in which it is generated, but the positive effect it has for the learner. Properly embedded into teaching-learning contexts, AFL sets learners up for wide, lifelong learning.

These ideas are summed up in a short second-generation definition of AFL generated by the Conference in March 2009. This is intended to make clear the central focus on learning by students. The definition is followed by some elaboration of it.

Definition

AFL is part of everyday practice by students, teachers, and peers that seeks, reflects upon, and responds to information from dialogue, demonstration, and observation in ways that enhance ongoing learning.

Elaboration

1. "Everyday practice"—this refers to teaching and learning, pedagogy, and instruction (different terms are used in different regions of the world, but the emphasis is on the interactive, dialogic, contingent relationships of teaching and learning).

2. "By students, teachers, and peers"—students are deliberately listed first because only learners can learn. AFL should be student centred. All AFL practices carried out by teachers (such as giving feedback, clarifying criteria, rich questioning) can eventually be "given away" to students so that they take on these practices to help themselves, and one another, become autonomous learners. This should be a prime objective.

3. "Seeks, reflects upon, and responds to"—these words emphasize the nature of AFL as an enquiry process involving the active search for evidence of capability and understanding, making sense of such evidence, and exercising judgement for wise decision making about next steps for students and teachers.

4. "Information from dialogue, demonstration, and observation"—verbal (oral and written) and nonverbal behaviors during both planned and unplanned events can be sources of evidence. Observation of these during ongoing teaching and learning activity is an important basis for AFL. Special assessment tasks and tests can be used formatively but are not essential; there is a risk of them becoming frequent mini-summative assessments. Everyday learning tasks and activities, as well as routine observation and dialogue, are equally, if not more, appropriate for the formative purpose.

5. "In ways that enhance ongoing learning"—sources of evidence are formative if, and only if, students and teachers use the information they provide to enhance learning. Providing students with the help they need to know what to do next is vital; it is not sufficient to tell them only that they need to do better.
However, such help does not need to provide a complete solution. Research suggests that what works best is an indication of how to improve, so that students engage in mindful problem solving.

Notes

[1] All of these regions were represented at the conference:

• Australia
  ○ Val Klenowski—Queensland University of Technology
  ○ Juliette Mendelovits—Australian Council for Educational Research
  ○ Royce Sadler—Griffith University
  ○ Claire Wyatt-Smith—Griffith University
• Canada
  ○ Geoff Cainen—Halifax Regional School Board
  ○ Anne Davies—Education consultant, British Columbia
  ○ Lorna Earl—OISE, University of Toronto
  ○ Dany Laveault—University of Ottawa
  ○ Anne Longston—Province of Manitoba
  ○ Ken O'Connor—Education consultant, Ontario
• Europe
  ○ Linda Allal—University of Geneva
  ○ Menucha Birenbaum—Tel Aviv University
  ○ Filip Dochy—University of Leuven
  ○ Mien Segers—University of Leiden and University of Maastricht
  ○ Kari Smith—University of Bergen
• New Zealand
  ○ Sandie Aikin—New Zealand Educational Institute
  ○ Mary Chamberlain—New Zealand Ministry of Education
  ○ Terry Crooks—University of Otago
  ○ Lester Flockton—University of Otago
  ○ Alison Gilmore—University of Canterbury
  ○ Jeff Smith—University of Otago
• United Kingdom
  ○ Richard Daugherty—Cardiff University
  ○ Carolyn Hutchinson—Learning and Teaching Scotland
  ○ Mary James—University of Cambridge
  ○ Gordon Stobart—Institute of Education, University of London
  ○ Ruth Sutton—Education consultant, UK
• United States of America
  ○ Susan Brookhart—Education consultant, Montana
  ○ Peter Johnston—University at Albany, SUNY
  ○ Frank Philip—Council of Chief State School Officers
  ○ W. James (Jim) Popham—University of California, Los Angeles
  ○ Rick Stiggins—ETS Assessment Training Institute, Oregon

[2] For example:

(1) "Assessment for Learning is the process of seeking and interpreting evidence for use by learners and their teachers to decide where the learners are in their learning, where they need to go and how best to get there." In Assessment Reform Group (2002), Assessment is for Learning: 10 principles. Downloadable from http://www.assessment-reform-group.org.

(2) "Practice in a classroom is formative to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited." In Black and Wiliam (2009), Developing the theory of formative assessment. Educational Assessment, Evaluation, and Accountability, 21, 5–31.

(3) "Formative assessment is a process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning to improve students' achievement of intended instructional outcomes." State Collaborative on Assessment and Student Standards, Council of Chief State School Officers, USA. In J. Popham (2008), Transformative Assessment. Alexandria, VA: Association for Supervision and Curriculum Development.

(4) "Formative assessment is a planned process in which assessment-elicited evidence of students' status is used by teachers to adjust their ongoing instructional procedures or by students to adjust their current learning tactics." In J. Popham (2008), Transformative Assessment. Alexandria, VA: Association for Supervision and Curriculum Development.

Note: This position paper is in the public domain. It may be copied in its entirety and used for educational purposes.
References

Andrade, H. L., Du, Y., & Wang, X. (2008). Putting rubrics to the test: The effect of a model, criteria generation, and rubric-referenced self-assessment on elementary school students' writing. Educational Measurement: Issues and Practice, 27(2), 3–13.

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education: Principles, Policy and Practice, 5(1), 7–74.

Crooks, T. J. (1988). The impact of classroom evaluation practices on students. Review of Educational Research, 58, 438–481.

Pellegrino, J. W., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: National Academies Press.

Ross, J. A., Hogaboam-Gray, A., & Rolheiser, C. (2002). Student self-evaluation in grade 5–6 mathematics: Effects on problem-solving achievement. Educational Assessment, 8(1), 43–58.

Shepard, L. A. (2006). Classroom assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 623–646). Westport, CT: Praeger.

Wilson, M. (Ed.). (2004). Towards coherence between classroom assessment and accountability. 103rd Yearbook of the National Society for the Study of Education, Part II. Chicago: University of Chicago Press.
