Frequently Asked Questions About the e-rater® Technology

What is the technology used in the e-rater® engine?

The e-rater technology is an application of Natural Language Processing (NLP), a field of computer science and linguistics that uses computational methods to analyze characteristics of a text. Natural language processing methods support such burgeoning application areas as machine translation, speech recognition and information retrieval.

The e-rater engine uses NLP to identify features relevant to writing proficiency in training essays and to model their relationship with human scores. The resulting scoring model, which assigns a weight to each observed feature, is stored offline in a database and can then be used to score new essays according to the same formula. The e-rater engine cannot read, so it cannot evaluate essays the same way that human raters do. However, the features used in e-rater scoring have been developed to be as substantively meaningful as they can be, given the state of the art in natural language processing. They have also been developed to demonstrate strong reliability, often greater reliability than human raters themselves.
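
As a rough illustration of what "assigns weights to each observed feature" can mean, the sketch below applies a stored linear model to the feature values of a new essay. It is a minimal example under stated assumptions, not the e-rater formula: the feature names, weights, intercept and the 1 to 6 score scale are all hypothetical.

```python
# Hypothetical feature names and weights, for illustration only.
# They are NOT e-rater's actual features or values.
MODEL_WEIGHTS = {
    "grammar_errors_per_100_words": -0.5,
    "log_word_count": 1.0,
    "vocabulary_level": 0.5,
}
MODEL_INTERCEPT = -2.0


def score_essay(features, weights=MODEL_WEIGHTS, intercept=MODEL_INTERCEPT,
                min_score=1, max_score=6):
    """Apply a stored linear model: intercept + sum(weight * feature value)."""
    raw = intercept + sum(weights[name] * features[name] for name in weights)
    # Round to a whole score point and clamp to the (assumed) reporting scale.
    return max(min_score, min(max_score, round(raw)))


if __name__ == "__main__":
    new_essay = {
        "grammar_errors_per_100_words": 2.0,
        "log_word_count": 6.0,
        "vocabulary_level": 4.0,
    }
    print(score_essay(new_essay))
```

Because the weights are fixed once the model is built, every new essay is scored by the same formula, which is one reason such a model can be highly consistent.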

How often does the computer's score agree with the score of a faculty reader?

For tasks that are appropriate for the e-rater engine (essay-length writing tasks that are scored for writing quality rather than for the correctness of claims made in the response), agreement with human raters can be very strong. Attali, Bridgeman & Trapani (2010) found that the e-rater engine's agreement with a human rater on the TOEFL® Independent and GRE® Issue tasks was higher than the agreement between two independent human raters.
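
To make "agreement" concrete, the sketch below computes two statistics commonly used to compare two sets of scores for the same essays: the exact agreement rate and quadratic weighted kappa. These are offered as typical measures, not necessarily the ones reported in the study cited above, and the function names and sample scores are made up for illustration.

```python
def exact_agreement(scores_a, scores_b):
    """Fraction of essays on which the two raters gave the same score."""
    matches = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return matches / len(scores_a)


def quadratic_weighted_kappa(scores_a, scores_b, min_score, max_score):
    """Chance-corrected agreement that penalizes large disagreements more."""
    n = max_score - min_score + 1
    # observed[i][j] counts essays scored i by rater A and j by rater B.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(scores_a, scores_b):
        observed[a - min_score][b - min_score] += 1
    total = len(scores_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    human = [3, 4, 5, 2]    # made-up scores
    machine = [3, 4, 4, 2]  # made-up scores
    print(exact_agreement(human, machine))
    print(quadratic_weighted_kappa(human, machine, 1, 6))
```

The same functions can be applied to a human-machine pair and to a human-human pair, which is the kind of comparison the finding above describes.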

Can the e-rater engine score hand-written essays?

No. The e-rater engine can only score essays that have been entered into the computer electronically.

What is an e-rater advisory?

The e-rater engine will generate an advisory if it has difficulty scoring or identifying some, or all, of the writing sample. In that case, scores are reported with low confidence and summarized in a flagged message to the writer. Currently, the e-rater technology will flag essays that exhibit any of a wide range of anomalous conditions, including excessive brevity, excessive length, significant repetition of material and responses determined to be off-topic.
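
The conditions listed above can be pictured as simple checks that run alongside scoring. The sketch below is a deliberately naive illustration: the function name, flag labels, thresholds and the keyword-overlap test for off-topic responses are all assumptions made for this example and do not describe how the e-rater engine detects these conditions.

```python
import re


def find_advisories(essay, prompt_keywords, min_words=50, max_words=1000,
                    max_repeat_ratio=0.3):
    """Return a list of advisory flags for one essay (hypothetical rules)."""
    words = re.findall(r"[a-z']+", essay.lower())
    flags = []
    if len(words) < min_words:
        flags.append("too_brief")
    if len(words) > max_words:
        flags.append("too_long")
    # Repetition: share of sentences that duplicate an earlier sentence.
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", essay)
                 if s.strip()]
    if sentences:
        repeated = len(sentences) - len(set(sentences))
        if repeated / len(sentences) > max_repeat_ratio:
            flags.append("repetitive")
    # Off-topic: no word overlap with lowercase keywords from the prompt.
    if not set(words) & set(prompt_keywords):
        flags.append("off_topic")
    return flags


if __name__ == "__main__":
    essay = "Cats are great. Cats are great. Cats are great."
    print(find_advisories(essay, {"technology"}, min_words=5))
```

A production system would need far more robust tests than these, but the structure is the same: each anomalous condition produces a flag that can accompany the score.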

What ETS products use the e-rater technology?

The e-rater engine is currently used in high-stakes assessments, low-stakes practice tests and formative applications. The engine is used in conjunction with human raters to score the essay portions of the TOEFL and GRE tests. The low-stakes practice tests where it is used without human ratings include TOEFL Practice Online and GRE ScoreItNow!™. The Criterion® classroom writing evaluation software is the primary example of formative use, in which feedback is provided in addition to essay scores in order to help students improve their essays and deepen their understanding of writing fundamentals.