
The Feasibility of Modeling Secondary TOEFL Ability Dimensions Using Multidimensional IRT Models

McKinley, Robert L.; Way, Walter D.
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
TOEFL Research Committee, Item Response Theory (IRT), Models, Multidimensional Item Response Theory (MIRT), Performance Factors, Test of English as a Foreign Language (TOEFL)


An analysis of the skills necessary for performance on the TOEFL® test tends to support the view that there are important, although perhaps subtle, secondary dimensions present in the test. Given that these secondary ability dimensions may be present in examinee response data and that they represent meaningful psychological variables, the purpose of this research was to explore the feasibility of an IRT-based method of modeling examinee performance on them. The procedure investigated is based on a multidimensional extension of the IRT model currently used for equating the TOEFL test. Both exploratory multidimensional IRT (MIRT) and confirmatory multidimensional IRT (CMIRT) models were investigated. The work included applying unidimensional IRT, MIRT, and CMIRT models to two TOEFL forms in order to evaluate the extent to which a multidimensional model improves model fit and to determine to what extent the additional fitted ability dimensions correspond to meaningful cognitive processes or content areas. The results indicate that the MIRT and CMIRT procedures were successful in modeling secondary ability dimensions on the TOEFL test. The two procedures provided corroborating evidence about the structure of the test, and that evidence was consistent with previous interpretations of the test's structure. The data presented in this study also illustrate how a particular criterion for assessing model fit, the consistent Akaike information criterion, can be used to identify the best of several competing models of test structure.
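The model-selection step mentioned at the end of the abstract can be sketched in a few lines. The example below is a minimal illustration, not the report's procedure: it assumes the standard definition of the consistent Akaike information criterion, CAIC = -2 ln L + k (ln n + 1), where L is the maximized likelihood, k the number of estimated parameters, and n the number of examinees. The model names, log-likelihoods, parameter counts, and sample size are all hypothetical placeholders and are not taken from the report.

```python
import math


def caic(log_likelihood, n_params, n_obs):
    """Consistent AIC under its standard definition: -2 ln L + k (ln n + 1)."""
    return -2.0 * log_likelihood + n_params * (math.log(n_obs) + 1.0)


# Hypothetical fit summaries for three competing models of test structure.
# These numbers are invented for illustration; they are NOT results from the report.
candidates = {
    "unidimensional": {"log_likelihood": -51000.0, "n_params": 150},
    "two_dimensional": {"log_likelihood": -50500.0, "n_params": 200},
    "three_dimensional": {"log_likelihood": -50450.0, "n_params": 250},
}
n_examinees = 2000  # hypothetical sample size

# Compute CAIC for each candidate; the smallest value marks the preferred model.
scores = {
    name: caic(fit["log_likelihood"], fit["n_params"], n_examinees)
    for name, fit in candidates.items()
}
best_model = min(scores, key=scores.get)

for name, value in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: CAIC = {value:.1f}")
print("Preferred model:", best_model)
```

Because the penalty term grows with both the parameter count and the log of the sample size, a richer model is preferred only when its gain in likelihood outweighs the added parameters, which is what lets the criterion arbitrate between unidimensional and multidimensional fits.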
