
Multimethod Construct Validation of the Test of Spoken English (TSE)

Boldt, Robert F.; Oltman, Philip K.
Publication Year:
Report Number:
RR-93-58, TOEFL-RR-46
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
Construct Validity, Factor Analysis, Multidimensional Scaling, Rating Scales, Test Validity, Test of Spoken English (TSE)


Administration of the TSE® test (Test of Spoken English™) yields tapes of oral performance on items within six sections of the test. Trained scorers subsequently rate responses using four proficiency scales: pronunciation, grammar, fluency, and overall comprehensibility. This project examined whether the statistical relations among TSE scores are consistent with the measurement constructs these scores purport to reflect. Analyses included factor analysis and multidimensional scaling, which examined dimensions underlying the scores. These dimensional methods were applied to the 18 scores yielded by the combinations of section and scale. Another analysis was applied to scale scores averaged over sections. This analysis compared the ranking of pairs of scale scores obtained during the original scoring of selected taped performances with the ranking resulting from a rescoring by different raters. Multidimensional scaling analyses revealed that three-dimensional solutions fit the scale intercorrelations with low stress values and that the coordinates of the scales fell into three clusters in three-dimensional space, those clusters being defined primarily by test section rather than by proficiency scale. An exploratory factor analysis revealed that a single dimension dominated the variation of the 18 section-scale scores. However, when tapes for scale scores with substantial discrepancies were rerated, agreement between the order of original and rerated scale scores far exceeded chance. This indicated that raters were able to modify their judgments according to the scale being rated. Subsequent confirmatory factor analysis indicated that both section and scale factors contribute to score variation. The factors were highly correlated, and the predominance of the single factor in the exploratory analysis was seen as arising from those correlations.
