
Fit of Item Response Theory Models: A Survey of Data from Several Operational Tests

Sinharay, Sandip; Haberman, Shelby J.; Jia, Helena
ETS Research Report
Subject/Key Words:
Generalized Residual, Item Fit, Residual Analysis, Two-Parameter Logistic Model, Three-Parameter Logistic Model


Standard 3.9 of the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999) demands evidence of model fit when an item response theory (IRT) model is used to make inferences from a data set. We applied two recently suggested methods for assessing goodness of fit of IRT models, generalized residual analysis (Haberman, 2009) and residual analysis for assessing item fit (Bock & Haberman, 2009), to several operational data sets. We assessed the practical significance of misfit whenever possible. This report summarizes our findings. Though evidence of misfit of the IRT model was found for all the data sets, the misfit was not always practically significant.
