
Cross-Validation of a Proportional Item Response Curve Model

Boldt, Robert F.
Publication Year:
Report Number:
RR-91-33, TOEFL-TR-04
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
TOEFL Research Committee, Equated Scores, Factor Analysis, Item Characteristic Curves, Item Response Theory (IRT), Models, Test of English as a Foreign Language (TOEFL)


In a previous research study, factor analyses of TOEFL® section item correlations (phi-coefficients) yielded a single factor in each section. This result is consistent with the assumption that, within sections, the item response curves are proportional, a possibility not previously suggested or explored in item response theory studies. One potential benefit to the TOEFL program is that this assumption could serve as a basis for equating methods simpler than those now in use: it requires estimating fewer parameters, and the computations could be simpler than with current methods. Other anticipated benefits are that it could reduce the chance of error in calibrating items and would permit the use of smaller equating samples, which would in turn allow pretest evaluation of substantially more items for possible operational use. To increase the body of data evaluating the model, a cross-validation study was undertaken in which the proportional item response curve (PIRC) model, the three-parameter logistic (3PL) model, and a modified Rasch model were used to predict item scores of selected examinees on selected items. The study also compared predicted half-test scores with actual scores for each method. These comparisons were repeated with different estimation sample sizes, that is, with varying amounts of data used to calibrate the items and to calculate scores for the examinees. Surprisingly, the accuracy of the predictions made by the three models was approximately the same, and the estimation sample size appeared to make little difference. Further research is discussed, including randomization trials on the current data set and self-equatings similar to those conducted by Hicks (1984).
