skip to main content skip to footer

GE FRST Evaluation Report: How Well Does a Statistically-Based Natural Language Processing System Score Natural Language Constructed Responses? GE FRST

Burstein, Jill; Kaplan, Randy M.
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
Automation, Constructed Responses, Evaluation, General Electric Free-Response Scoring Tool (GE FRST), Automated Scoring and Natural Language Processing


There is a considerable interest at Educational Testing Service (ETS) to include performance-based, natural language constructed-response items on standardized tests. Such items can be developed, but the projected time and costs required to have these items scored by human graders would be prohibitive. In order for ETS to include these types of items on standardized tests, automated scoring systems need to be developed and evaluated. Automated scoring systems could decrease the time and costs required for human graders to score these items. This report details the evaluation of a statistically based scoring system, the General Electric Free-Response Scoring Tool (GE FRST). GE FRST was designed to score short-answer, constructed- responses of up to 17 words. The report describes how the system performs for responses on three different item types. For the sake of efficiency, it is important to evaluate systems on a number of item types to see if the system's scoring method can generalize to a number of item types. (30pp.)

Read More