Evaluating a Prototype Essay Scoring Procedure Using Off-the-Shelf Software

Author(s):
Kaplan, Randy M.; Burstein, Jill; Trenholm, Harriet; Lu, Chi; Rock, Donald A.; Kaplan, Bruce A.; Wolff, Susanne
Publication Year:
1995
Report Number:
RR-95-21
Source:
Document Type:
Subject/Key Words:
Automation; Computer Software; Constructed Responses; Essay Tests; Models; Scoring

Abstract

Constructed-response items, whose responses consist of words, phrases, sentences, paragraphs, and essays, are among the most difficult and costly to score. The increased use of constructed-response items like essays creates a need for tools that score these responses partially or fully automatically. This study explores one approach to analyzing essay-length natural language constructed responses. In this study we develop and evaluate a decision model for scoring essays. The decision model uses off-the-shelf software for grammar and style checking of the English language. The first part of the study consists of an evaluation of several commercial grammar checking programs. From this evaluation we select the best-performing grammar checking programs to construct a decision model for scoring the essays. The second part of the study uses data produced by the selected grammar checking program(s) to make a decision about the score for an essay. Through statistical and linguistic methods, we analyze the performance of the decision model in an effort to understand its usefulness and practicality in a production scoring setting. (80pp.)