Monitoring TOEIC Listening and Reading Test Performance Across Administrations Using Examinees’ Background Information
- Publication Year: 2013
Powers, Donald E. (ed.) The Research Foundation for the TOEIC Tests: A Compendium of Studies: Volume II. Princeton, NJ: Educational Testing Service, Sep 2013, p11.1-11.28
- Document Type:
- Page Count:
- Subject/Key Words:
Test of English for International Communication (TOEIC),
English Language Proficiency,
English Language Assessment (ELA),
English Language Skills,
English as a Foreign Language (EFL),
English as a Second Language (ESL)
SUMMARY: The scoring process for the TOEIC Listening and Reading test includes monitoring procedures that help ensure that scores are consistent across different test forms (https://www.ets.org/understanding_testing/glossary/#test_forms) and test administrations, and that skill interpretations are fair. This study explores the possibility of using information about test takers' backgrounds in order to enhance several types of monitoring procedures in the scoring process. Results of the analyses suggested that some background variables may facilitate the monitoring of test performance across administrations. This has the potential to further enhance quality control procedures for the TOEIC Listening and Reading test and strengthen evidence of score consistency.
For this large-scale study, the researchers collected TOEIC Listening and Reading examinees’ scale scores and background information (i.e., educational and work experience, English-language experience, test-taking experience) for 1,499,313 examinees across 71 test administrations. Researchers used a variety of statistical techniques to examine the relationships between scores and the various background variables in order to determine the extent to which background variables predict scores. One of the techniques used was multilevel modeling, which enabled researchers to examine the extent to which background variables at the individual level (i.e., as related to individual examinees) and administration level (i.e., as related to groups of examinees) predicted scores. The researchers concluded that background variables were more precise predictors of scores at the administration level than at the individual level. The researchers therefore argue that the application of the statistical techniques identified in this research to examinees’ backgrounds at the administration level could be used to continuously evaluate the performance of the TOEIC tests across administrations.
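The two-level structure described above can be illustrated with a generic random-intercept model. This is a sketch only: the notation is assumed for illustration and is not taken from the report, and the study may have used a different specification. Here $Y_{ij}$ is the scale score of examinee $i$ in administration $j$, $x_{ij}$ is a single background variable, and $\bar{x}_j$ is its mean within administration $j$.

```latex
% Level 1 (individual examinees within an administration)
Y_{ij} = \beta_{0j} + \beta_{1}\,(x_{ij} - \bar{x}_{j}) + e_{ij},
\qquad e_{ij} \sim N(0, \sigma^{2})

% Level 2 (administrations)
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\bar{x}_{j} + u_{0j},
\qquad u_{0j} \sim N(0, \tau^{2})
```

In this form, $\beta_{1}$ captures how the background variable relates to scores among examinees within the same administration, while $\gamma_{01}$ captures how the administration's average on that variable relates to the administration's mean score. The finding reported above corresponds to the administration-level relationship being the more precise of the two.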
Test developers need to monitor the stability of scores across test administrations. This is important for two reasons:
1. If group mean scores fluctuate substantially, this could indicate an issue with a particular administration or that something is different about the group of examinees for that administration.
2. If group mean scores drift over time, this could indicate a more fundamental change in the population of examinees that needs to be understood.
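One way such monitoring might look in practice is sketched below. It is a minimal illustration rather than the procedure used in the study. The data are synthetic, and the function name, the single background variable (a hypothetical proportion per administration), and the threshold are all assumptions. The sketch fits a least-squares line to administration-level means and flags administrations whose mean score is far from what their background composition would predict.

```python
import random
import statistics


def flag_unusual_administrations(mean_scores, background, z_threshold=3.0):
    """Flag administrations whose mean score departs from the value
    predicted by one administration-level background variable.

    Hypothetical helper for illustration; not the study's procedure.
    Returns (indices of flagged administrations, robust z-scores).
    """
    x_bar = statistics.fmean(background)
    y_bar = statistics.fmean(mean_scores)

    # Ordinary least-squares fit of mean score on the background variable.
    sxx = sum((x - x_bar) ** 2 for x in background)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(background, mean_scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [y - (intercept + slope * x)
                 for x, y in zip(background, mean_scores)]

    # Robust standardization: median and median absolute deviation (MAD),
    # so that one unusual administration does not inflate the scale.
    center = statistics.median(residuals)
    mad = statistics.median([abs(r - center) for r in residuals])
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    z_scores = [(r - center) / scale for r in residuals]

    flagged = [i for i, z in enumerate(z_scores) if abs(z) > z_threshold]
    return flagged, z_scores


# Synthetic example: 71 administrations (the count matches the study,
# but every number generated here is made up).
rng = random.Random(0)
n_admins = 71
background = [rng.uniform(0.2, 0.6) for _ in range(n_admins)]
mean_scores = [500 + 200 * p + rng.gauss(0, 3) for p in background]

# Inject an unexplained 30-point shift into one administration.
mean_scores[40] += 30

flagged, z_scores = flag_unusual_administrations(mean_scores, background)
print("Flagged administrations:", flagged)
```

A flagged administration in this sketch corresponds to the first case above: a fluctuation that the group's background does not account for. Gradual drift, the second case, would instead show up as a trend in the residuals over successive administrations.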
Additionally, the application of this method may help further improve equating procedures, which are routinely used to help ensure the consistency of TOEIC Listening and Reading test scores across administrations. As a result, this research study provides supporting evidence for the consistency and reliability of TOEIC Listening and Reading test scores.
ABSTRACT: The scoring process for the TOEIC Listening and Reading test includes monitoring procedures that help ensure that scores are consistent across different forms and test administrations, and that skill interpretations are fair. This study explores the possibility of using information about test takers' backgrounds in order to enhance several types of monitoring procedures. Results of the analyses suggested that some background variables may facilitate the monitoring of test performance across administrations, thereby strengthening quality control procedures for the TOEIC Listening and Reading test as well as strengthening evidence of score consistency.