Statistical Analyses for the Expanded TOEIC Speaking Test
- Qu, Yanxuan; Cid, Jaime; Chan, Eric; Huo, Yan
- Publication Year:
- Report Number:
- ETS Research Memorandum
- Document Type:
- Page Count:
- Subject/Key Words:
- Analysis of Covariance (ANCOVA); Item Difficulty; Item Formats; Pilot Study; Test of English for International Communication (TOEIC); Test Reliability; TOEIC Speaking Test
Testing programs should periodically review their assessments to ensure that their items or tasks are well aligned with real-world activities. For this reason, and to better support communicative language learning and discourage reliance on memorization and test-taking strategies, ETS expanded the existing format of some items of the TOEIC Speaking test in May 2015. The primary objective of the item format expansion was to benefit both test takers and score users. It was therefore also important to ensure that the new expanded item formats were comparable to the existing formats.
This report presents the results of a pilot study conducted in November 2013 to evaluate the comparability of items with new and existing formats in terms of difficulty, score consistency, and overall test reliability. It also summarizes the operational trends observed after the implementation of the expanded item formats.
The results of the pilot study suggest that modifications to existing item formats had a slight effect on item difficulty, with some items becoming more difficult and others less difficult, but that the effects were within the range of variation typically observed across different forms of the test. Further monitoring of the difficulties of the new item formats based on real testing results also indicates that items with the new formats have performed similarly to items with existing formats.
This report shows that the expansion of the TOEIC Speaking item formats did not have any significant unwanted effects on item difficulty or test score reliability, indicating that TOEIC Speaking test scores remain consistent and reliable with the more authentic real-world tasks.