Dependability of New ESL Writing Test Scores: Evaluating Prototype Tasks and Alternative Rating Schemes

Author(s):
Lee, Yong-Won; Kantor, Robert
Publication Year:
2005
Report Number:
RR-05-14, TOEFL-MS-31
Subject/Key Words:
Absolute error; dependability index; ESL (English as a second language); generalizability coefficient; generalizability theory; integrated task; rating design; relative error; score dependability; task generalizability; variance components; writing assessment

Abstract

Possible integrated and independent tasks were pilot tested for the writing section of a new generation of the TOEFL® (Test of English as a Foreign Language™) examination. This study examines, from the perspective of generalizability theory (G-theory), the impact of various rating designs, as well as of the number of tasks and raters, on the reliability of writing scores based on integrated and independent tasks. Both univariate and multivariate G-theory analyses were conducted. It was found that (a) in terms of maximizing score reliability, it would be more efficient to increase the number of tasks than the number of ratings per essay; (b) two particular single-rating designs, in which different tasks for the same examinee are rated by different raters [p x (R:T), R:(p x T)], achieved relatively higher score reliabilities than other single-rating designs; and (c) a somewhat larger gain in composite score reliability was achieved when the number of listening-writing tasks was larger than the number of reading-writing tasks.
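Finding (a) follows from the structure of the generalizability coefficient for relative decisions: increasing the number of tasks divides both the person-by-task and the residual variance components, while adding raters divides only the rater-related components. The sketch below is a minimal, hypothetical D-study projection for a fully crossed p x T x R design; the variance-component values are illustrative assumptions chosen so that task-related variance dominates rater-related variance (a pattern consistent with the study's conclusion), not the estimates reported in the report.

```python
# Hypothetical D-study projection for a fully crossed p x T x R design.
# The variance components below are illustrative ASSUMPTIONS, not the
# estimates from Lee & Kantor (2005).

VAR = {
    "p": 0.40,      # person (universe-score) variance
    "pt": 0.25,     # person-by-task interaction
    "pr": 0.02,     # person-by-rater interaction
    "ptr_e": 0.30,  # person x task x rater residual, confounded with error
}

def g_coefficient(n_tasks: int, n_raters: int, var=VAR) -> float:
    """Generalizability (Ep^2) coefficient for relative decisions."""
    rel_error = (var["pt"] / n_tasks
                 + var["pr"] / n_raters
                 + var["ptr_e"] / (n_tasks * n_raters))
    return var["p"] / (var["p"] + rel_error)

if __name__ == "__main__":
    # Doubling tasks raises Ep^2 more than doubling raters does,
    # because 1/n_tasks scales both "pt" and "ptr_e".
    for nt, nr in [(1, 1), (2, 1), (1, 2), (2, 2)]:
        print(f"tasks={nt}, raters={nr}: Ep^2 = {g_coefficient(nt, nr):.3f}")
```

Under these assumed components, moving from one task to two yields a larger reliability gain than moving from one rater to two, mirroring the study's recommendation to invest in additional tasks rather than additional ratings per essay.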
