Evaluating the Advisory Flags and Machine Scoring Difficulty in the e-rater Automated Scoring Engine

Author(s): Zhang, Mo; Chen, Jing; Ruan, Chunyi
Publication Year: 2016
Report Number: RR-16-30
Source: ETS Research Report
Document Type: Report
Page Count: 16
Subject/Key Words: Electronic Essay Rater (E-rater), Automated Scoring, Response Analysis, Machine Scoring, Human Raters

Abstract

The results suggested that some advisory flags were more consistent than others, across measures and tasks, in detecting responses that the machine was likely to score differently from human raters. Relatively little scoring difficulty was found for three of the four tasks examined in this study, with the relationship between machine and human scores being reasonably strong. Limitations and future studies are also discussed.

http://dx.doi.org/10.1002/ets2.12116
