Abstract

Candidate integrated and independent tasks were pilot tested for the writing section of a new generation of the TOEFL® (Test of English as a Foreign Language™). From the perspective of generalizability theory (G-theory), this study examines how various rating designs, the number of tasks, and the number of raters affect the reliability of writing scores based on integrated and independent tasks. Both univariate and multivariate G-theory analyses were conducted. It was found that (a) to maximize score dependability, it would be more efficient to increase the number of tasks than the number of raters per essay; (b) two single-rating designs in which different tasks for the same examinee are rated by different raters [p × (R:T), R:(p × T)] achieved relatively higher score dependability than the other single-rating designs; and (c) a somewhat larger gain in composite score reliability was achieved when the number of listening–writing tasks was larger than the number of reading–writing tasks.
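Finding (a) can be illustrated with the standard G-theory dependability index (Φ) for a fully crossed persons × tasks × raters design. The sketch below is only an illustration: the variance components are hypothetical placeholders, not estimates from this study, and the function name `phi_coefficient` is an assumption introduced here. It shows that when the person-by-task component is large relative to the rater-related components, adding a task raises Φ more than adding a rater.

```python
def phi_coefficient(var, n_tasks, n_raters):
    """Dependability index (Phi) for a crossed p x T x R design.

    `var` maps effect names to variance components:
    p, t, r, pt, pr, tr, and ptr_e (residual).
    """
    # Absolute error variance: every non-person component contributes,
    # each divided by the number of conditions it is averaged over.
    abs_error = (
        (var["t"] + var["pt"]) / n_tasks
        + (var["r"] + var["pr"]) / n_raters
        + (var["tr"] + var["ptr_e"]) / (n_tasks * n_raters)
    )
    return var["p"] / (var["p"] + abs_error)


# HYPOTHETICAL variance components, chosen only to mimic a common pattern
# in writing assessment (large person-by-task effect, small rater effects).
HYPOTHETICAL_VAR = {
    "p": 0.50, "t": 0.02, "r": 0.01,
    "pt": 0.30, "pr": 0.02, "tr": 0.01, "ptr_e": 0.20,
}

if __name__ == "__main__":
    for n_tasks, n_raters in [(1, 1), (1, 2), (2, 1), (2, 2)]:
        phi = phi_coefficient(HYPOTHETICAL_VAR, n_tasks, n_raters)
        print(f"tasks={n_tasks} raters={n_raters} phi={phi:.3f}")
```

The nested single-rating designs named in (b), p × (R:T) and R:(p × T), have different error-variance terms and are not covered by this crossed-design formula.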
