
An Empirical Investigation of Impact Moderation in Test Construction DIF

Stocking, Martha L.; Lawrence, Ida M.; Feigenbaum, Miriam; Jirele, Thomas J.; Lewis, Charles; Van Essen, Thomas
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
Bias, Gender Bias, Ethnic Bias, Test Construction, Differential Item Functioning (DIF), Expert Systems, Test Selection, Test Reliability


This investigation constructed four different kinds of test sections using three methods of test assembly that incorporate the goal of simultaneously moderating three kinds of impact: gender impact, African American impact, and Hispanic American impact. The test sections were administered to random samples from the appropriate population in a manner undetectable to test takers. The results were evaluated by comparing the characteristics of the moderated sections with those of parallel operational sections. Almost all methods of test assembly produced either moderation of impact in the appropriate direction or no change in impact. Taking impact into account in test assembly tended to lower reliability slightly, reduce relative efficiency for test takers in the middle score range while increasing it for those with more extreme scores, raise concurrent validity slightly, and maintain the construct measured by the parallel operational section.
