
Agreement Between Expert System and Human Ratings of Constructed Responses to Computer Science Problems

Author(s):
Bennett, Randy Elliot; Gong, Brian; Kershaw, Roger C.; Rock, Donald A.; Soloway, Elliot; Macalalad, Alex
Publication Year:
1988
Report Number:
RR-88-20
Source:
ETS Research Report
Document Type:
Report
Page Count:
57
Subject/Key Words:
Computer Assisted Testing, Computer Science, Constructed Responses, Interrater Reliability, MicroPROUST, Scoring

Abstract

If computers can be programmed to score complex constructed-response items, substantial savings might be realized in selected ETS programs, and the development of mastery assessment systems that incorporate "real-world" tasks might be facilitated. This study investigated the extent of agreement between MicroPROUST, a prototype microcomputer-based expert scoring system, and human readers for two Advanced Placement Computer Science free-response items. To assess agreement, a balanced incomplete block design was used, with two groups of four readers grading 43 student solutions to the first problem and 45 solutions to the second. Readers assigned numeric grades and diagnostic comments in separate readings. Results showed that MicroPROUST was unable to grade a substantial portion of the solutions but performed impressively on those it could analyze. MicroPROUST's interchangeability with human readers on one problem suggests that there are conditions under which ETS could implement automated scoring of complex constructed responses.
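The abstract does not name the agreement statistic the study used. As an illustration only, the sketch below computes Cohen's kappa, one common measure of machine-human rater agreement; the grades in it are invented, not taken from the report, and RR-88-20 itself should be consulted for the actual design and results.

    # Hypothetical sketch: Cohen's kappa between an automated scorer and one
    # human reader. Scores are invented for illustration, not from the study.
    from collections import Counter

    def cohens_kappa(rater_a, rater_b):
        """Unweighted Cohen's kappa for two equal-length lists of grades."""
        assert len(rater_a) == len(rater_b)
        n = len(rater_a)
        # Observed agreement: proportion of solutions graded identically.
        observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
        # Chance agreement expected from each rater's marginal grade frequencies.
        freq_a, freq_b = Counter(rater_a), Counter(rater_b)
        expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
        return (observed - expected) / (1 - expected)

    # Invented 0-9 grades on ten solutions (placeholder data).
    machine = [9, 7, 7, 5, 9, 3, 7, 5, 9, 7]
    human   = [9, 7, 6, 5, 9, 3, 7, 4, 9, 7]
    print(f"kappa = {cohens_kappa(machine, human):.3f}")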
