
An Investigation of the e-rater Automated Scoring Engine’s Grammar, Usage, Mechanics, and Style Microfeatures and Their Aggregation Model GUMS

Chen, Jing; Zhang, Mo; Bejar, Isaac I.
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
e-rater, Automated Scoring and Natural Language Processing, Aggregation, Alternative Feature Weighting Schemes, Linear Regression, Machine Scores, Generalized User Modelling System (GUMS), Test Score Validity, Automated Essay Scoring (AES), Grammar, Language Usage, Language Styles, Writing Processes


Automated essay scoring (AES) generally computes essay scores as a function of macrofeatures derived from a set of microfeatures extracted from the text using natural language processing (NLP). In the e-rater automated scoring engine, developed at Educational Testing Service (ETS) for the automated scoring of essays, each of four macrofeatures (grammar, usage, mechanics, and style [GUMS]) is computed from a set of microfeatures. Statistical analyses reveal that some of these microfeatures might not explain much of the variance in human scores, regardless of the writing task. Currently, the microfeatures in the same macrofeature group are equally weighted to produce the macrofeature score. We propose an alternative weighting scheme that gives higher weights to the microfeatures that are more predictive of human scores in each macrofeature group. Our results suggest that, although the proposed scheme differs negligibly from the current equal weighting scheme in terms of the prediction of human scores and the correlation with external measures, it improves the consistency of the resultant macrofeature scores across writing tasks to a considerable extent.
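
To make the contrast between the two weighting schemes concrete, the following is a minimal sketch and not the report's actual procedure. The abstract does not say how the alternative weights are derived, so this illustration assumes that each microfeature's weight is proportional to the absolute Pearson correlation between that microfeature and the human scores. The function names (`equal_weights`, `predictive_weights`, `macro_scores`) and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no predictive signal
    return cov / sqrt(vx * vy)


def equal_weights(k):
    """Current scheme as described in the abstract: every microfeature counts the same."""
    return [1.0 / k] * k


def predictive_weights(micro, human):
    """ASSUMED alternative: weight proportional to |correlation with human scores|.

    micro is a list of columns, one list of values per microfeature.
    """
    strengths = [abs(pearson(col, human)) for col in micro]
    total = sum(strengths)
    if total == 0:
        return equal_weights(len(micro))  # fall back when nothing is predictive
    return [s / total for s in strengths]


def macro_scores(micro, weights):
    """Weighted sum of the microfeature values for each essay."""
    n_essays = len(micro[0])
    return [
        sum(w * col[i] for w, col in zip(weights, micro))
        for i in range(n_essays)
    ]


# Toy data (hypothetical): five essays, two microfeatures in one macrofeature group.
human = [1, 2, 3, 4, 5]
micro = [
    [0.1, 0.2, 0.3, 0.4, 0.5],  # tracks the human scores closely
    [0.3, 0.1, 0.4, 0.1, 0.5],  # only weakly related to the human scores
]

w_equal = equal_weights(len(micro))
w_pred = predictive_weights(micro, human)
scores_equal = macro_scores(micro, w_equal)
scores_pred = macro_scores(micro, w_pred)
```

Under this assumption, the first microfeature receives the larger weight because it is more strongly associated with the human scores. In practice the weights might instead come from regression coefficients or another criterion; the sketch only illustrates the idea of replacing equal weights with predictiveness-based ones.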
