Investigating the Utility of Analytic Scoring for the TOEFL® Academic Speaking Test (TAST)

Author(s):
Xi, Xiaoming; Mollaun, Pam
Publication Year:
2006
Report Number:
RR-06-07, TOEFLiBT-01
Subject/Key Words:
TOEFL® iBT speaking, analytic scoring, score dependability, dimension distinctness, score profile, G theory

Abstract

This study explores the utility of analytic scoring for the TOEFL® Academic Speaking Test (TAST) in providing useful and reliable diagnostic information on three aspects of candidates’ performance: delivery, language use, and topic development. Generalizability (G) studies were used to investigate the dependability of the analytic scores, the distinctness of the analytic dimensions, and the variability of analytic score profiles. Raters’ perceptions of dimension separability were also obtained. Based on the phi coefficients and standard errors of measurement (SEMs), the dependability of analytic scores averaged across six tasks and double ratings was acceptable for both operational and practice settings. However, scores averaged across two tasks and double ratings were not reliable enough for operational use. Correlations among the analytic scores by task were high, but those between delivery and topic development were lower, and these results were corroborated by raters’ perceptions. When averaged across tasks or task types (two or more tasks), correlations among the analytic scores were very high, and the profiles of scores were flat. The utility of analytic scoring is discussed in terms of both score dependability and whether analytic scores provide diagnostic information beyond that provided by holistic scores.
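
The comparison between six-task and two-task averages can be illustrated with a small decision-study calculation. The sketch below assumes a fully crossed persons x tasks x ratings design, which the abstract does not confirm, and uses the standard G-theory formulas for absolute error variance, the phi coefficient, and the SEM. The function name `d_study` and every variance component value are hypothetical and chosen only for illustration; they are not results from the report.

```python
from math import sqrt

def d_study(vc, n_tasks, n_ratings):
    """Return (phi, absolute SEM) for scores averaged over tasks and ratings.

    Assumes a fully crossed persons x tasks x ratings (p x t x r) design.
    vc maps each effect to its estimated variance component.
    """
    # Absolute error variance: every effect except persons, each divided
    # by the number of conditions it is averaged over.
    abs_error = (
        vc["t"] / n_tasks
        + vc["r"] / n_ratings
        + vc["pt"] / n_tasks
        + vc["pr"] / n_ratings
        + vc["tr"] / (n_tasks * n_ratings)
        + vc["ptr_e"] / (n_tasks * n_ratings)
    )
    # Phi: person (universe-score) variance over itself plus absolute error.
    phi = vc["p"] / (vc["p"] + abs_error)
    return phi, sqrt(abs_error)

# Hypothetical variance components, NOT taken from the report.
hypothetical_vc = {
    "p": 0.50,      # persons
    "t": 0.02,      # tasks
    "r": 0.01,      # ratings
    "pt": 0.15,     # person-by-task interaction
    "pr": 0.02,     # person-by-rating interaction
    "tr": 0.005,    # task-by-rating interaction
    "ptr_e": 0.20,  # three-way interaction confounded with residual error
}

phi_six, sem_six = d_study(hypothetical_vc, n_tasks=6, n_ratings=2)
phi_two, sem_two = d_study(hypothetical_vc, n_tasks=2, n_ratings=2)
print(f"6 tasks, 2 ratings: phi={phi_six:.3f}, SEM={sem_six:.3f}")
print(f"2 tasks, 2 ratings: phi={phi_two:.3f}, SEM={sem_two:.3f}")
```

Under these assumed components, averaging over more tasks shrinks the person-by-task and residual terms, so phi rises and the SEM falls. This is the general mechanism behind the abstract's contrast between six-task and two-task scores, though the actual magnitudes depend on the variance components estimated in the study.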
