Integrated tasks are often used in higher education (HE) for diagnostic purposes, and they are gaining popularity in lingua franca contexts such as German HE, where English-medium courses are gaining ground. In this context, we report the validation of a new rating scale for assessing reading-into-writing tasks. To examine scoring validity, we employed Weir’s (2005) socio-cognitive framework in an explanatory mixed-methods design. We collected 679 integrated performances on four summary and opinion tasks, which were rated by six trained student raters who were preparing to become writing tutors for first-year students. We used a many-facet Rasch model to investigate rater severity, reliability, consistency, and scale functioning. Using thematic analysis, we analyzed the raters’ think-aloud protocols as well as retrospective and focus group interviews with them. Findings showed that the rating scale overall functioned as intended and was perceived by the raters as a valid operationalization of the integrated construct. FACETS analyses revealed reasonable reliabilities yet exposed localized issues with certain criteria and band levels. These issues were corroborated by the challenges the raters reported, which they mainly attributed to the complexities inherent in such an assessment. Applying Weir’s (2005) framework in a mixed-methods approach facilitated the interpretation of the quantitative findings and yielded insights into potential validity threats.
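For readers less familiar with many-facet Rasch measurement, a standard three-facet formulation (a sketch following Linacre’s rating-scale parameterization; the abstract does not specify the authors’ exact facets, so examinee, criterion, and rater are assumed here for illustration) is:

\[
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k,
\]

where \(B_n\) is the ability of examinee \(n\), \(D_i\) the difficulty of criterion \(i\), \(C_j\) the severity of rater \(j\), and \(F_k\) the threshold at which band \(k\) becomes more probable than band \(k-1\). Under such a model, rater severity, criterion difficulty, and scale-step functioning can be estimated on a common logit scale, which is what permits the reliability and band-level diagnostics reported above.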