A Comparison of the Properties of IRT Parameter Estimates Using Two Different Calibration Designs

DeChamplain, Andre F.; Wightman, Lawrence E.
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
Item Analysis, Item Response Theory (IRT), LOGIST (Computer Program), Parameter Estimation, Pretests


(29pp.) The purpose of this study was to compare two methods of obtaining three-parameter logistic (3PL) IRT parameter estimates for pretest items in the Graduate Management Admissions Testing Program. The first method calibrated pretest and operational items simultaneously in a single LOGIST run, that is, a concurrent calibration design. The second method calibrated the pretest items separately from the operational items while holding examinee ability estimates fixed at the values obtained from a previous run on the operational items, that is, a two-stage calibration design. Results show that the means of the item difficulty (b-parameter) estimates were very similar regardless of the method employed. However, under the two-stage method (i.e., holding ability fixed and excluding the studied items from the criterion), the higher b-parameter values were slightly overestimated and the lower b-parameter values were slightly underestimated. The item discrimination (a-parameter) estimates were consistently underestimated under the two-stage procedure. Finally, the slopes of the item-ability regressions were steeper under concurrent calibration (which includes the studied items) for nearly all of the items. These preliminary results are consistent with those reported in past studies (Stocking & Eignor, 1986) and suggest that nonoperational (pretest) items should be calibrated concurrently with operational items for item banking purposes.
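
For readers unfamiliar with the 3PL model named in the abstract, the following is a minimal sketch of its item response function, not the report's or LOGIST's implementation. The function names `p_3pl` and `slope_at_b` are hypothetical, and the scaling constant D = 1.7 is an assumption (a common convention that the abstract does not state). It illustrates why an underestimated a-parameter corresponds to a flatter item-ability curve.

```python
import math

def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the 3PL model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing). D = 1.7 is an assumed
    scaling constant, not a value taken from the report.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def slope_at_b(a, c, D=1.7):
    """Slope of the item response curve at theta = b.

    Differentiating p_3pl with respect to theta and evaluating at
    theta = b gives D * a * (1 - c) / 4, so the slope is
    proportional to the a-parameter.
    """
    return D * a * (1.0 - c) / 4.0

if __name__ == "__main__":
    # Illustrative parameter values only; they are not from the study.
    print(p_3pl(theta=0.0, a=1.0, b=0.0, c=0.2))
    print(slope_at_b(a=1.0, c=0.2), slope_at_b(a=0.8, c=0.2))
```

Because the slope at theta = b scales linearly with a, a smaller a-parameter estimate yields a less steep curve for the same item, all else being equal.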

