Observed proportion agreement, as a measure of association between two ratings of essay performance, can be inflated when the number of rating categories is small. Cohen's Kappa adjusts observed agreement by subtracting out the agreement one would expect if ratings were assigned independently of each other. The matrix of proportion agreements between two sets of assignment rules can be recast as a confusion matrix in which zero confusion is the equivalent of perfect agreement. Kappa can then be viewed as a measure of confusion reduction. A complementary measure, confusion infusion, is defined. Its usefulness is illustrated with live data from a large-scale testing program in which e-rater, an automatic essay-scoring algorithm, is used in place of a second reader. The confusion reduction and confusion infusion indices support comparisons of the relative efficacy of two versions of e-rater and of two other methods of assigning scores: a second reader and assigning all candidates the mode of the first reading.
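As background for the abstract's framing of Kappa as confusion reduction, the following is a minimal sketch of the standard Cohen's Kappa computation from a reader-by-reader confusion matrix. The matrix values and function name are illustrative assumptions, not data or code from the paper, and the confusion infusion index defined in the paper is not reproduced here.

```python
def cohens_kappa(confusion):
    """Cohen's Kappa from a square confusion matrix of rating counts.

    confusion[i][j] is the number of essays assigned category i by the
    first rater and category j by the second rater (or by e-rater).
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed proportion agreement: mass on the diagonal.
    p_o = sum(confusion[i][i] for i in range(k)) / total

    # Chance agreement: product of the marginal proportions, summed
    # over categories, as if the two ratings were independent.
    row_marg = [sum(confusion[i]) / total for i in range(k)]
    col_marg = [sum(confusion[i][j] for i in range(k)) / total for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_marg, col_marg))

    # Kappa: observed agreement corrected for chance agreement.
    return (p_o - p_e) / (1 - p_e)


# Hypothetical two-category example: 70% observed agreement,
# 50% expected by chance, giving Kappa = 0.4.
print(cohens_kappa([[20, 5], [10, 15]]))
```

A Kappa of 1 corresponds to zero confusion (all counts on the diagonal), while a Kappa of 0 means the observed agreement is no better than chance, which motivates reading Kappa as the proportion of chance-expected confusion that the two ratings eliminate.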