Why do we assign numerical ratings when assessing complex performances? And what is the meaning and usefulness of those ratings given the nature of these performances and the multiplicity of assessment purposes? These were the key questions we grappled with in developing an assessment system to be used for national certification of accomplished teachers. Along the way, our work became entangled with philosophical, political, technical, and practical problems that led us into relatively uncharted territories. We worked with definitions of accomplished teaching, for example, as a domain of assessment in a profession influenced by competing and often radically different ideologies. Our goal was the creation of an assessment procedure to support certification decisions, to change teaching practice, and to evaluate complex performances in professionally, technically, and administratively acceptable ways. In the process, we uncovered some of the tensions that exist between the various stakeholders involved and were provided with opportunities to examine the broader context of assessment and its instrumental nature. This work also led us to examine some key assumptions that are held in the field of measurement and their implications for the multiple assessment purposes that we had to consider.

The system we were developing was to be used by the National Board for Professional Teaching Standards (NBPTS) for the Early Adolescence / English Language Arts (EA/ELA) certification of exemplary teachers. The assessment included two components: (1) a portfolio, for which teachers documented three different teaching activities over several months, and (2) an assessment center, for which teachers participated in a number of tasks, including semistructured interviews, analyses of teaching, and essays on instructional issues. While the particulars of the assessment are not critical here and can be found elsewhere (Delandshere & Petrosky, 1993, 1994), it is important to understand its complexity. For example, just one of the portfolio tasks—the Post-Reading Interpretive Discussion Exercise (PRIDE)—required teachers to conduct and videotape a 20-minute interpretive discussion of a literature selection, to write a 3- to 10-page commentary analyzing the discussion and their understanding of interpretation, and to include the instructional artifacts used or referred to in the videotape or the commentary.

Overall, the candidates' performances involved a range of evidence: (1) sets of responses to tasks that required extensive written commentaries, lengthy videotape segments of their teaching, and videotaped oral interviews with candidates; (2) documents that were produced by the candidates and their students and documents acquired from other sources (e.g., books, instructional materials); and (3) a process through which candidates could integrate and reflect on these different perspectives, which resulted in another written document.

The assumptions we worked from were mostly related to the technical aspects of assessment and the necessity for evidence of reliability and validity. These assumptions were based on our prior experience and grounded in the measurement tradition. For most of this century, educational achievement or the status of an individual's knowledge has been judged through measurement—that is, by the assigning of numbers to test responses. The resulting scores are used to make value judgments about the quality of performances.
After working for several years to develop evaluation schemes, we considered an alternative to the practice of assigning numerical ratings, which was to formulate judgments based directly on the characteristics of the performance. Such an alternative may be unnecessary when there is a one-to-one correspondence between the assignment of points and the number of correct responses, but the complexity and breadth of responses for this assessment appeared to defy such correspondence. To further complicate the matter, the tasks developed for this assessment were grounded in a professional ideology that values knowledge as individually and socially constructed and as reflected in particular discourses and contexts. This conception of performance is quite different from those implied in many assessment contexts and seemed to require an evaluation scheme more consistent with this representation of knowledge than with traditional numerical scoring schemes. To this end, our procedure for judging used what we called interpretive summaries of performance, written records that document the salient characteristics of the performance and the judges' interpretations of those as evi-