Using Confusion Infusion and Confusion Reduction Indices to Compare Alternative Essay Scoring Rules

Dorans, Neil J.; Patsula, Liane N.
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words: Electronic Essay Rater (E-rater), Confusion Reduction, Essay Scoring, Confusion Infusion


Observed proportion agreement as a measure of association between two ratings of essay performance can be inflated when the number of rating categories is small. Cohen's Kappa adjusts observed agreement by subtracting out the agreement one would expect if ratings were assigned independently of each other. The matrix of proportion agreements between two sets of assignment rules can be recast as a confusion matrix in which zero confusion is equivalent to perfect agreement. Kappa can then be viewed as a measure of confusion reduction. A complementary measure, confusion infusion, is defined. Its usefulness is illustrated with live data from a large-scale testing program in which e-rater, an automatic essay-scoring algorithm, is used in place of a second reader. The confusion reduction and confusion infusion indices support comparisons of the relative efficacy of two versions of e-rater and two other methods of assigning scores: a second human reader, and assigning all candidates the mode of the first reading.
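The kappa adjustment described above can be sketched in code. The following is a minimal illustration, not the report's method: it computes Cohen's kappa from a confusion matrix of two raters' scores, showing how chance agreement is subtracted out (the "confusion reduction" reading). The confusion infusion index is defined in the report itself and is not reproduced here; the counts below are invented toy data, not the program's live data.

```python
def cohens_kappa(matrix):
    """Cohen's kappa from a square confusion matrix, where
    matrix[i][j] counts essays scored i by rater 1 and j by rater 2."""
    total = sum(sum(row) for row in matrix)
    k = len(matrix)
    # observed proportion agreement: mass on the diagonal
    p_o = sum(matrix[i][i] for i in range(k)) / total
    # chance agreement: sum over categories of the product of marginals
    row_margins = [sum(row) / total for row in matrix]
    col_margins = [sum(matrix[i][j] for i in range(k)) / total
                   for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_margins, col_margins))
    # kappa subtracts out agreement expected under independent rating,
    # i.e. measures how much of the removable confusion is removed
    return (p_o - p_e) / (1 - p_e)

# hypothetical counts for two readers scoring essays on a 4-point scale
counts = [
    [20,  5,  1,  0],
    [ 4, 30,  6,  1],
    [ 1,  5, 25,  4],
    [ 0,  1,  3, 14],
]
print(round(cohens_kappa(counts), 3))  # → 0.646
```

With few categories, the diagonal mass (here 89/120 ≈ 0.74) overstates agreement; kappa discounts the 0.27 expected by chance, which is the inflation the abstract warns about.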
