Automated Scoring of Math Responses
ETS's m-rater scoring engine scores open-ended mathematical responses, such as those that take the form of mathematical expressions, equations or graphs. Dating from the late 1990s, the m-rater scoring engine is one of the longest-standing ETS automated scoring capabilities. The scores generated by the m-rater engine demonstrate very strong agreement with human ratings.
The m-rater scoring engine evaluates the correctness of a mathematical expression by determining symbolically, using a computer algebra system, whether the expression is equivalent to the correct response. This enables the m-rater scoring engine to identify expressions equivalent to the key no matter what form they are found in, and to assign credit as appropriate. For instance, partial credit may be assigned if a linear equation was supposed to be provided in slope-intercept form, but was instead provided in a different, equivalent form. Scoring of mathematical responses based on string matching or text-based patterns is much more limited and error-prone than the m-rater scoring engine's capabilities for establishing equivalence symbolically.
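The idea of symbolic equivalence checking with partial credit can be sketched in a few lines of Python. This is only an illustration, not the m-rater implementation: it assumes the open-source SymPy library as a stand-in computer algebra system, and the function names and the 2/1/0 point scale are hypothetical.

```python
import sympy as sp

x, y = sp.symbols("x y")

def parse_equation(text):
    """Split 'lhs = rhs' into two SymPy expressions."""
    lhs, rhs = text.split("=")
    return sp.sympify(lhs.strip()), sp.sympify(rhs.strip())

def solve_for_y(text):
    """Solve the equation for y; a non-vertical line yields one solution."""
    lhs, rhs = parse_equation(text)
    return sp.solve(lhs - rhs, y)

def score(response, key):
    """Hypothetical rubric: 2 = equivalent and in slope-intercept form,
    1 = equivalent but in another form, 0 = not equivalent."""
    r, k = solve_for_y(response), solve_for_y(key)
    if len(r) != 1 or len(k) != 1 or sp.simplify(r[0] - k[0]) != 0:
        return 0
    lhs, rhs = parse_equation(response)
    # Slope-intercept form here means: y alone on the left, no y on the right.
    in_form = lhs == y and y not in rhs.free_symbols
    return 2 if in_form else 1

key = "y = 2*x + 3"
print(score("y = 3 + 2*x", key))       # same line, slope-intercept form
print(score("2*x - y + 3 = 0", key))   # same line, standard form
print(score("y = 2*x + 4", key))       # different line
```

A string comparison would reject both "y = 3 + 2*x" and "2*x - y + 3 = 0" because neither matches the key character for character, whereas the symbolic check recognizes both as the same line and distinguishes them only by form.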
Similarly, graph items can be scored based on a key which specifies constraints on the response entered with the graph editor. For some items, many different graphs may constitute valid answers, and the m-rater scoring engine can allow all of these variants to be scored using an elegant specification of the key.
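A constraint-based key for a graph item might look something like the following sketch. It is an assumption-laden illustration rather than the m-rater key format: the item ("draw any line with positive slope that crosses the y-axis at 2"), the two-point response format, and all names are hypothetical.

```python
from fractions import Fraction

def line_through(p, q):
    """Return (slope, intercept) of the line through two integer points,
    or None if the line is vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = Fraction(y2 - y1, x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

# Hypothetical key: each constraint is a predicate on (slope, intercept).
KEY = [
    lambda m, b: m > 0,   # the line must rise from left to right
    lambda m, b: b == 2,  # the line must cross the y-axis at 2
]

def score_graph(p, q, constraints=KEY):
    """Score 1 if the plotted line satisfies every constraint, else 0."""
    line = line_through(p, q)
    return int(line is not None and all(c(*line) for c in constraints))

print(score_graph((0, 2), (1, 3)))    # slope 1, intercept 2
print(score_graph((-2, 1), (2, 3)))   # slope 1/2, intercept 2
print(score_graph((0, 2), (1, 1)))    # negative slope
```

Because the key states properties of the answer rather than listing particular graphs, the infinitely many valid lines are all accepted by the same two constraints.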
Of course, many math items are written to elicit short, text-based responses and may be more suitable for the c-rater™ engine. Written responses with embedded equations can even be handled using a hybrid of the m-rater and c-rater scoring engines.
Below are some recent or significant publications that our researchers have authored on the subject of automated scoring of mathematical content.
Difficulty Modeling and Automatic Generation of Quantitative Items: Recent Advances and Possible Next Steps
E. A. Graf & J. H. Fife
Chapter in Automatic Item Generation: Theory and Practice, pp. 157–180
Editors: M. Gierl & T. Haladyna
This ETS-authored chapter is part of a book volume that aims to summarize current knowledge about the field of automatic item generation. The chapter appears in Part III of the volume, which covers psychological and substantive characteristics of generated items. Read more on the publisher’s website.
Automated Scoring of Constructed-Response Literacy and Mathematics Items
R. E. Bennett
Publisher: White paper published by Arabella Philanthropic Investment Advisors
The Race to the Top assessment consortia have indicated an interest in using "automated scoring" to more efficiently grade student answers. This white paper identifies potential uses and challenges around automated scoring to help the consortia make better-informed planning and implementation decisions. Download the full report.
Automated Scoring of CBAL Mathematics Tasks with m-rater
J. H. Fife
ETS Research Memorandum RM-11-12
The goal of the CBAL™ research initiative is to develop a research-based assessment system that provides accountability testing and formative testing in an environment that is a worthwhile learning experience in and of itself. This report describes the m-rater-related automated scoring work done in CBAL Mathematics in 2009. View the full abstract or download this report.
Online Assessment in Mathematics
B. Sandene, R. E. Bennett, J. Braswell, & A. Oranje
Online Assessment in Mathematics and Writing: Reports from the NAEP Technology-based Assessment Project (NCES 2005–457)
U.S. Department of Education, National Center for Education Statistics
The Math Online (MOL) study is one of three field investigations in the National Assessment of Educational Progress (NAEP) Technology-Based Assessment Project, which explores the use of new technology in administering NAEP. Learn more or download the full report.