A Preliminary Study of Raters for the TSE®
- Bejar, Isaac I.
- Publication Year:
- Report Number: RR-85-05, TOEFL-RR-18
- Document Type:
- Subject/Key Words: Evaluation Methods, Evaluators, Interrater Reliability, Scoring, Speech Skills
The investigation was undertaken to provide information about the feasibility of reducing scoring costs by using one rater instead of the two that are now used for the TSE® (Test of Spoken English™). It was concluded that because of the possibility of different standards among potential raters, it does not appear feasible to use a single rater as the sole determiner of speaking proficiency under the current system. Other possible alternatives to a single rating, relying on psychometric methodology and technology, are discussed. The approach was to first examine the possibility of developing a "quality control" index that would predict the extent of the disagreement between two raters. The index that was developed for this purpose could not be validated. It was found that the best predictors of rater disagreement were the identities of the raters. The disagreements, however, resulted from the differing standards used by different raters. That is, raters agree substantially about the ordering of examinees but vary slightly in the severity of their ratings.
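The distinction drawn in the final sentence, agreement on ordering versus difference in severity, can be illustrated with a minimal sketch. The scores below are hypothetical and are not data from the study; the `pearson` helper and the variable names are likewise assumptions made for illustration, not the report's method. The sketch shows only that two raters can rank examinees identically (correlation of 1.0) while one is consistently harsher by a fixed amount.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical ratings for five examinees (illustrative values only).
rater_a = [1.0, 1.5, 2.0, 2.5, 3.0]
# Rater B orders the examinees the same way but is uniformly more severe.
rater_b = [score - 0.25 for score in rater_a]

# Agreement about ordering: how closely the two sets of scores track each other.
consistency = pearson(rater_a, rater_b)

# Difference in standards: the average gap between the two raters' scores.
severity_gap = mean(a - b for a, b in zip(rater_a, rater_b))

print(f"correlation: {consistency:.2f}, mean severity gap: {severity_gap:.2f}")
```

Under these assumed scores, a correlation-based reliability check alone would show perfect agreement, yet an examinee scored only by the second rater would receive a lower score. That is the kind of risk the abstract cites against relying on a single rater.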