How the e-rater® Engine Works

The e-rater® engine scores essays by extracting a set of features representing important aspects of writing quality from each essay. These features must not only be predictive of readers' scores, but must also have some logical correspondence to the features that readers are instructed to consider when they award scores. These scoring features are then combined in a statistical model to produce a final score estimate, with the weight of each feature determined by a statistical process designed to maximize the agreement with human scoring.
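
As a rough illustration of the weighting step described above, the sketch below fits one weight per feature against human scores and then uses those weights to score a new essay. Ordinary least squares stands in here for the "statistical process"; the text does not say which procedure the e-rater engine actually uses. The two feature names, the training data, and the function names fit_weights and predict are all hypothetical.

```python
# Minimal sketch: fit feature weights to human scores with ordinary least
# squares (an assumed stand-in; the actual e-rater procedure is not specified).

def fit_weights(features, human_scores):
    """Solve the normal equations (X^T X) w = X^T y by Gauss-Jordan elimination.

    Returns [intercept, w_1, ..., w_k].
    """
    rows = [[1.0] + list(f) for f in features]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, human_scores))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(weights, feature_vector):
    """Weighted sum of features plus intercept: the final score estimate."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_vector))


# Hypothetical training data: (grammar_error_rate, development_score) per essay
TRAINING_FEATURES = [(0.0, 4.0), (0.1, 4.0), (0.0, 6.0), (0.2, 2.0), (0.1, 6.0)]
HUMAN_SCORES = [4.0, 3.0, 5.0, 1.0, 4.0]

weights = fit_weights(TRAINING_FEATURES, HUMAN_SCORES)
new_essay_score = predict(weights, (0.05, 5.0))
```

In this toy data the error-rate feature receives a negative weight and the development feature a positive one, which mirrors the requirement that features correspond logically to what human readers are told to consider.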

The e-rater scoring engine currently uses the following features:

  • content analysis based on vocabulary measures
  • lexical complexity/diction
  • proportion of grammar errors
  • proportion of usage errors
  • proportion of mechanics errors
  • proportion of style comments
  • organization and development scores
  • features rewarding idiomatic phraseology
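
Several of the features listed above are proportions rather than raw counts. The sketch below shows one plausible way to turn raw error counts into such proportions. Dividing by the essay's word count is an assumption, since the text does not state the denominator, and the function name error_proportions and the sample counts are hypothetical.

```python
# Minimal sketch: convert raw error counts into per-word proportions.
# The word-count denominator is an assumption, not a documented e-rater detail.

def error_proportions(essay_text, error_counts):
    """Return each error category's count divided by the number of words."""
    n_words = len(essay_text.split())
    if n_words == 0:
        # Avoid division by zero for an empty response
        return {kind: 0.0 for kind in error_counts}
    return {kind: count / n_words for kind, count in error_counts.items()}


# Hypothetical ten-word essay with counts supplied by upstream detectors
essay = "one two three four five six seven eight nine ten"
counts = {"grammar": 2, "usage": 1, "mechanics": 0, "style": 1}
proportions = error_proportions(essay, counts)
```

Normalizing in this way keeps a long essay from being penalized simply for offering more opportunities to make errors.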

The weighting of features to assign a total score to an essay can be done in a way tailored to a particular prompt, or in a "generic" fashion, so that the same e-rater model can be used to score responses to a variety of prompts. Work has also been done to establish a vertically linked scale of K–12 writing scores across grades based on the e-rater engine, known as the Developmental Writing Scale.
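
The difference between a prompt-specific and a generic model can be pictured as a choice of weight set. The sketch below is only an illustration of that idea: the prompt identifier, feature names, weight values, and the fallback rule are all hypothetical and are not taken from the e-rater engine.

```python
# Minimal sketch: use prompt-specific weights when they exist, otherwise fall
# back to a generic model. All names and numbers here are hypothetical.

GENERIC_WEIGHTS = {"intercept": 2.0, "grammar_error_rate": -10.0, "development": 0.5}

PROMPT_WEIGHTS = {
    # Weights tuned on responses to one particular prompt
    "prompt_A": {"intercept": 1.5, "grammar_error_rate": -8.0, "development": 0.75},
}


def select_weights(prompt_id):
    """Return the tailored weights for a prompt, or the generic set."""
    return PROMPT_WEIGHTS.get(prompt_id, GENERIC_WEIGHTS)


def score(prompt_id, features):
    """Weighted sum of the essay's features under the selected model."""
    w = select_weights(prompt_id)
    return w["intercept"] + sum(w[name] * value for name, value in features.items())


essay_features = {"grammar_error_rate": 0.1, "development": 4.0}
generic_score = score("unseen_prompt", essay_features)
tailored_score = score("prompt_A", essay_features)
```

The same feature values can yield different scores under the two models, which is the practical trade-off: a tailored model reflects one prompt's scoring patterns, while a generic model can be applied to prompts it was not trained on.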

The features used for e-rater scoring are the result of nearly two decades of natural language processing research at ETS, and each feature may itself be composed of several independent sub-features. For instance, the grammatical error feature includes modules for detecting preposition errors, run-on sentences, and subject-verb agreement errors. The e-rater engine is continually updated to reflect advances in natural language processing that can be applied to student texts.
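
To make the idea of a feature built from independent sub-features concrete, the sketch below combines three toy detector modules into one grammar error count. The detectors are deliberately crude regular-expression stand-ins written for illustration; they are hypothetical and bear no relation to the NLP modules ETS actually uses.

```python
import re

# Minimal sketch: a grammar feature assembled from independent detector modules.
# Every detector below is a toy, hypothetical stand-in for a real NLP module.


def count_run_ons(text, max_words=40):
    """Toy rule: flag any sentence longer than max_words words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(1 for s in sentences if len(s.split()) > max_words)


def count_agreement_errors(text):
    """Toy rule: a few pronoun-verb pairs that disagree in number."""
    pattern = r"\b(?:they|we|you) (?:is|was)\b|\b(?:he|she|it) are\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def count_preposition_errors(text):
    """Toy rule: two hard-coded wrong verb-preposition pairings."""
    pattern = r"\bdepend(?:s|ed)? of\b|\binterested on\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


GRAMMAR_MODULES = {
    "preposition": count_preposition_errors,
    "run_on": count_run_ons,
    "agreement": count_agreement_errors,
}


def grammar_error_counts(text):
    """Run every module independently and report its count by name."""
    return {name: module(text) for name, module in GRAMMAR_MODULES.items()}


sample = "They is happy. It depends of the weather."
per_module = grammar_error_counts(sample)
total_grammar_errors = sum(per_module.values())
```

Because each module is a separate function registered in a table, a detector can be added or improved without touching the others, which is one way a modular design could absorb the kind of ongoing updates described above.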