The Effects of Rater Severity and Rater Distribution on Examinees’ Ability Estimation for Constructed-Response Items

Wang, Zhen; Yao, Lihua
ETS Research Report
Subject/Key Words:
IRT-Based Rater Model; Item Response Theory (IRT); Markov Chain Monte Carlo (MCMC); Rater Distribution; Rater Severity


This study used simulated data to investigate the properties of a newly proposed method (Yao’s rater model) for modeling rater severity and its distribution under different conditions. We examined the effects of rater severity, the effects of the distribution of rater severity, the difference between item response theory (IRT) models with and without rater effects, and the difference in the precision of ability estimates between tests composed only of constructed-response (CR) items and tests combining multiple-choice (MC) and CR items. Our results indicate that rater severity and its distribution can increase the bias of examinees’ ability estimates and lower test reliability. Moreover, using an IRT model with rater effects can substantially increase the precision of examinees’ ability estimates, especially when the test is composed only of CR items. We also compared Yao’s rater model with Muraki’s (1993) rater effect model in terms of ability estimation accuracy and rater parameter recovery. The estimation results from Yao’s rater model using Markov chain Monte Carlo (MCMC) were better than those from Muraki’s rater effect model using marginal maximum likelihood.
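To make concrete why ignoring rater severity can bias ability estimates, the following is a minimal sketch and not the specification of either Yao’s or Muraki’s model. It assumes a partial-credit-style item with an additive rater severity shift; the function names, the parameter values, and the grid-search estimator are all hypothetical and chosen only for illustration.

```python
import math


def category_probs(theta, a, b, steps, severity=0.0):
    """Score-category probabilities for one CR item (categories 0..len(steps)).

    Assumed form (illustrative only): each step logit is
    a * (theta - b - step - severity), so a more severe rater
    (larger severity) makes higher score categories less likely.
    """
    logits = [0.0]
    for d in steps:
        logits.append(logits[-1] + a * (theta - b - d - severity))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, a, b, steps, severity=0.0):
    """Expected item score for an examinee with ability theta."""
    probs = category_probs(theta, a, b, steps, severity)
    return sum(k * p for k, p in enumerate(probs))


def estimate_theta(scores, items, severity=0.0):
    """Grid-search maximum likelihood estimate of ability.

    items is a list of (a, b, steps) tuples; severity is the value the
    scoring model assumes for the rater (0.0 means the rater is ignored).
    """
    grid = [i / 10 for i in range(-40, 41)]

    def log_lik(theta):
        return sum(
            math.log(category_probs(theta, a, b, steps, severity)[x])
            for x, (a, b, steps) in zip(scores, items)
        )

    return max(grid, key=log_lik)


if __name__ == "__main__":
    # Hypothetical test: four identical CR items scored 0..3 by one rater.
    items = [(1.0, 0.0, [-1.0, 0.0, 1.0])] * 4
    scores = [2, 1, 2, 1]
    print("ignoring rater severity:", estimate_theta(scores, items))
    print("modeling severity = 0.5:", estimate_theta(scores, items, severity=0.5))
```

In this toy setup, the same observed scores yield a higher ability estimate once the rater’s assumed severity is modeled, which is the kind of bias the abstract attributes to unmodeled rater effects. The sketch uses a simple grid search rather than MCMC or marginal maximum likelihood, so it says nothing about the relative performance of those estimation methods.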
