Using Automated Scoring as a Trend Score: The Implications of Score Separation Over Time
- Author(s):
- Trapani, Catherine S.; Bridgeman, Brent; Breyer, F. Jay
- Publication Year:
- 2014
- Source:
- Wendler, Cathy; Bridgeman, Brent (eds.) with assistance from Chelsea Ezzo. The Research Foundation for the GRE revised General Test: A Compendium of Studies. Princeton, NJ: Educational Testing Service, 2014, p2.4.1-2.4.4
- Document Type:
- Chapter
- Page Count:
- 4
- Subject/Key Words:
- Graduate Record Examination (GRE), Revised GRE, Test Revision, Score Scale, Analytical Writing, Writing Tasks, e-rater, Essay Prompts, Human Raters, Automated Essay Scoring (AES)
Abstract
Describes an evaluation of the ETS automated scoring engine, e-rater, as a way to ensure the stability of the scale used with the Analytical Writing measure. Analyses examined the agreement of e-rater with human scores in terms of percent agreement, correlation, and mean differences, as well as the relationship of external variables to the scores produced by e-rater. The study also examined whether changes in agreement between human and e-rater scores provide a plausible method for monitoring changes in human scoring over time. Results indicated that exact and adjacent agreement rates between human raters and e-rater were acceptable and on par with previous research. In addition, results indicated that monitoring discrepancies between scores generated by human raters and e-rater over time helps to ensure consistency in the meaning of Analytical Writing scores.
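To make the agreement statistics named in the abstract concrete, the sketch below shows one common way such measures are computed for paired human and automated essay scores. It is an illustrative example only, not the authors' analysis code; the scores, the function name agreement_stats, and the 1-point definition of adjacent agreement are assumptions for the purpose of the example.

```python
# Illustrative sketch (not the study's code): exact and adjacent agreement,
# Pearson correlation, and mean difference for paired human / e-rater scores.
from statistics import mean
from math import sqrt

def agreement_stats(human, erater):
    """Return exact agreement, adjacent agreement (within 1 point),
    Pearson correlation, and mean difference (automated minus human)."""
    n = len(human)
    exact = sum(h == e for h, e in zip(human, erater)) / n
    adjacent = sum(abs(h - e) <= 1 for h, e in zip(human, erater)) / n
    mh, me = mean(human), mean(erater)
    cov = sum((h - mh) * (e - me) for h, e in zip(human, erater))
    var_h = sum((h - mh) ** 2 for h in human)
    var_e = sum((e - me) ** 2 for e in erater)
    corr = cov / sqrt(var_h * var_e)
    return exact, adjacent, corr, me - mh

# Hypothetical scores on a 0-6 essay scale (made-up data for illustration)
human_scores  = [4, 3, 5, 4, 2, 6, 3, 4]
erater_scores = [4, 3, 4, 5, 2, 5, 3, 4]
print(agreement_stats(human_scores, erater_scores))
```

In practice such statistics would be tracked across administrations, so that a drift in the human-automated mean difference or agreement rate could flag a possible change in human scoring over time.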