An Empirical Investigation of Impact Moderation in Test Construction DIF

Author(s): Stocking, Martha L.; Lawrence, Ida M.; Feigenbaum, Miriam; Jirele, Thomas J.; Lewis, Charles; Van Essen, Thomas
Publication Year: 2001
Report Number: RR-01-04
Source: ETS Research Report
Document Type: Report
Page Count: 33
Subject/Key Words: Bias, Gender Bias, Ethnic Bias, Test Construction, Differential Item Functioning (DIF), Expert Systems, Test Selection, Test Reliability

Abstract

This investigation constructed four different kinds of test sections using three methods of test assembly that incorporate the goals of simultaneous moderation of three different kinds of impact--gender impact, African American impact, and Hispanic American impact. The test sections were administered undetectably to random samples from the appropriate population. The results were evaluated by comparison of the characteristics of moderated sections with those of parallel operational sections. Almost all methods of test assembly produced either moderation of impact in the appropriate direction or no change in impact. Taking impact into account in test assembly tended to lower reliability slightly, reduce the relative efficiency for test takers in the middle score range while increasing efficiency for those with more extreme scores, raise concurrent validity slightly, and maintain the construct measured by the parallel operational section.

DOI: http://dx.doi.org/10.1002/j.2333-8504.2001.tb01846.x
