Understanding Average Score Differences Between e-rater and Humans for Demographic-Based Groups in the GRE General Test
- Author(s):
- Ramineni, Chaitanya; Williamson, David M.; Weng, Vincent Z.
- Publication Year:
- 2014
- Source:
- Wendler, Cathy; Bridgeman, Brent (eds.) with assistance from Chelsea Ezzo. The Research Foundation for the GRE revised General Test: A Compendium of Studies. Princeton, NJ: Educational Testing Service, 2014, pp. 4.9.1-4.9.5
- Document Type:
- Chapter
- Page Count:
- 5
- Subject/Key Words:
- Graduate Record Examination (GRE), General Test (GRE), e-rater, Revised GRE, Human Raters, Automated Essay Scoring (AES)
Abstract
Examined possible root causes of the discrepancies, observed in Chapter 4.8 of the same compendium, between scores generated by human raters and those generated by e-rater across various demographic subgroups. The research suggested that, compared to human raters, e-rater is not severe enough in penalizing grammatical and language errors, tends to overvalue essay length, and occasionally undervalues content.
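The comparison at the heart of the chapter is where e-rater and human scores diverge, on average, for different demographic subgroups. As a rough illustration of that kind of comparison (not the authors' actual analysis), the sketch below computes per-subgroup mean and standardized differences between human and e-rater scores; all column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical data: one row per essay, with a human score, an e-rater score,
# and a demographic subgroup label. Column names and values are illustrative only.
scores = pd.DataFrame({
    "subgroup": ["A", "A", "B", "B", "C", "C"],
    "human":    [3.5, 4.0, 3.0, 3.5, 4.5, 4.0],
    "e_rater":  [3.8, 4.2, 3.6, 3.9, 4.3, 4.1],
})

# Per-essay discrepancy: positive values mean e-rater scored higher than the human rater.
scores["diff"] = scores["e_rater"] - scores["human"]

# Mean discrepancy per subgroup, plus a standardized version
# (subgroup mean divided by the overall SD of the discrepancies),
# to flag subgroups where the two scoring sources diverge most.
summary = scores.groupby("subgroup")["diff"].agg(["mean", "std", "count"])
summary["standardized"] = summary["mean"] / scores["diff"].std(ddof=1)

print(summary)
```

In practice a standardized-difference summary like this only locates the divergence; attributing it to specific causes (such as leniency on grammatical errors or sensitivity to essay length) requires examining the scoring features themselves, which is what the chapter undertakes.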