For practical, psychometric, and pedagogical reasons, strong interest exists in developing multiple-measure constructed-response items for use in large-scale performance assessments. Items that can be scored for evidence of proficiency in 2 or more content areas raise questions, however, about the "fit" between various content areas and the possibility of sending mixed messages or confounding different content demands. To determine the factors that contribute to or compromise the effectiveness of multiscored items, in this study we combine analysis of statewide score data from the 1996 Maryland School Performance Assessment Program tests, administered at Grades 3, 5, and 8, with a systematic analysis of 60 activities that provide measures of writing, language usage (LU), or both, as well as of one or more content areas. Although test developers to date have had greater success in creating writing/LU items that can also be scored for reading and social studies than for mathematics and science, we argue for the validity of multiple-measure items across all content areas and suggest that successful multiple-measure items (a) make information sources explicit and allow students to draw on both text-based and personal knowledge; (b) identify the specific content-area skills or concepts being assessed; (c) permit open-ended development; (d) maintain a good fit between content demands and the rhetorical situation, creating an authentic writer-audience relationship; and (e) are uncluttered, focused, and direct, with good recall capability. Thus, we suggest that test developers reorient their concerns from the difficulty of items to the elements of multiple-measure activities that facilitate or impede students' ability to demonstrate what they know and can do in different content areas.