Fit of Item Response Theory Models: A Survey of Data from Several Operational Tests

Author(s): Sinharay, Sandip; Haberman, Shelby J.; Jia, Helena
Publication Year: 2011
Report Number: RR-11-29
Source: ETS Research Report
Document Type: Report
Page Count: 80
Subject/Key Words: Generalized Residual, Item Fit, Residual Analysis, Two-Parameter Logistic Model, Three Parameter Logistic Model

Abstract

Standard 3.9 of the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council for Measurement in Education, 1999) demands evidence of model fit when an item response theory (IRT) model is used to make inferences from a data set. We applied two recently suggested methods for assessing goodness of fit of IRT models—generalized residual analysis (Haberman, 2009) and residual analysis for assessing item fit (Bock & Haberman, 2009)—to several operational data sets. We assessed the practical significance of misfit whenever possible. This report summarizes our findings. Though evidence of misfit of the IRT model was found for all the data sets, the misfit was not always practically significant.

http://dx.doi.org/10.1002/j.2333-8504.2011.tb02265.x
