The e-rater® automated writing evaluation engine is ETS's patented capability for automated evaluation of expository, persuasive and summary essays. Multiple assessment programs use the engine. The engine is used in combination with human raters to score the writing sections of the TOEFL iBT® and GRE® tests.
The e-rater engine also provides the sole score in learning contexts, such as formative use in a classroom setting with ETS's Criterion® online essay evaluation system. In the Criterion application, the engine generates individualized feedback for students, addressing an increasingly important need for automated essay evaluation that is reliable, valid, fast and flexible.
The e-rater engine's features related to writing quality include the following (illustrated in the sketch after this list):
- errors in grammar (e.g., subject-verb agreement)
- errors in usage (e.g., preposition selection)
- errors in mechanics (e.g., capitalization)
- errors in style (e.g., repetitious word use)
- discourse structure (e.g., presence of a thesis statement, main points)
- vocabulary usage (e.g., relative sophistication of vocabulary)
- sentence variety
- source use
- discourse coherence quality
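To make the feature-based design concrete, here is a minimal sketch, in Python, of how transparent features extracted from an essay can be combined in a weighted linear model, in the spirit of the e-rater v.2 approach described in Attali & Burstein (2006), cited below. Every feature, weight, and proxy here is invented for illustration; the production engine derives its features from full NLP pipelines, not these toy heuristics.

```python
import math
import re
import statistics

def extract_features(essay: str) -> dict:
    """Toy stand-ins for e-rater-style features (all hypothetical):
    length/fluency, vocabulary range, and sentence variety."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", essay) if s.strip()]
    words = re.findall(r"[A-Za-z']+", essay.lower())
    sentence_lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return {
        # Log-scaled length so sheer word count does not dominate the score.
        "log_word_count": math.log(max(len(words), 1)),
        # Type-token ratio as a crude proxy for vocabulary range.
        "type_token_ratio": len(set(words)) / max(len(words), 1),
        # Spread of sentence lengths as a crude proxy for sentence variety.
        "sentence_variety": statistics.pstdev(sentence_lengths)
        if len(sentence_lengths) > 1 else 0.0,
    }

# Hypothetical weights; a real system estimates weights by regressing
# feature values against human scores on a large training corpus.
WEIGHTS = {"log_word_count": 0.9, "type_token_ratio": 2.5, "sentence_variety": 0.1}

def score(essay: str) -> float:
    """Weighted linear combination of the extracted features."""
    features = extract_features(essay)
    return sum(WEIGHTS[name] * value for name, value in features.items())

sample = ("Technology changes how students write. Some tools give instant, "
          "detailed feedback. Teachers can then focus on higher-level concerns.")
print(round(score(sample), 2))
```

In practice, the weights are estimated against human scores on training data, and the resulting machine scores are validated against independent human ratings.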
The e-rater engine can also automatically detect responses that are off-topic or otherwise anomalous and, therefore, should not be scored.
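Anomaly screening can be illustrated in the same spirit. The sketch below flags a response whose vocabulary barely overlaps the prompt's, using a simple bag-of-words cosine similarity; the prompt and threshold are hypothetical, and the operational engine's detection methods are considerably more sophisticated.

```python
import math
import re
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Lowercased word counts as a minimal text representation."""
    return Counter(re.findall(r"[A-Za-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

PROMPT = "Discuss how technology has changed the way students learn to write."
OFF_TOPIC_THRESHOLD = 0.2  # hypothetical cutoff; tuned on scored data in practice

def flag_if_anomalous(response: str) -> bool:
    """Flag a response whose vocabulary barely overlaps the prompt's."""
    return cosine(bag_of_words(response), bag_of_words(PROMPT)) < OFF_TOPIC_THRESHOLD

print(flag_if_anomalous("My favorite recipe for banana bread starts with ripe fruit."))  # True
print(flag_if_anomalous("Technology has changed how students write and learn."))  # False
```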
ETS maintains an active research agenda investigating new automated scoring features for genres beyond the traditional essay; this work now includes source-based and argumentative writing tasks found on assessments, as well as lab reports and social science papers.
Featured Publications
Below is a selection of recent and significant publications by our researchers that highlight research in automated writing evaluation.
2017
- Exploring Relationships Between Writing & Broader Outcomes with Automated Writing Evaluation
J. Burstein, D. McCaffrey, B. Beigman Klebanov, & G. Ling
Paper in Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, pp. 101–108
This exploratory study used test-taker essays from a standardized writing assessment of postsecondary student learning outcomes. Findings showed that automated writing evaluation (AWE) features predicted broader outcome measures: college success indicators and learning outcomes measures.
- Detecting Good Arguments in a Non-Topic-Specific Way: An Oxymoron?
B. Beigman Klebanov, B. Gyawali, & Y. Song
Paper in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Volume 2: Short Papers, pp. 244–249
The authors investigate the extent to which it is possible to close the performance gap between topic-specific and across-topics models for identifying good arguments.
2016
- Informing Automated Writing Evaluation Using the Lens of Genre: Two Studies
J. Burstein, N. Elliot, & H. Molloy
CALICO Journal, Vol. 33, No. 1
To inform the design of construct-relevant systems used for writing instruction and assessment, researchers conducted two investigations of postsecondary writing requirements and faculty perceptions of student writing proficiency. Study results suggested ways that the role of automated writing evaluation might be expanded and aligned with instruction in higher education.
2015
- Supervised Word-Level Metaphor Detection: Experiments with Concreteness and Reweighting of Examples
B. Beigman Klebanov, C. W. Leong, & M. Flor
Paper in Proceedings of the Third Workshop on Metaphor in NLP, pp. 11–20
The authors discuss a supervised machine learning system that classifies all content words in a running text as either metaphorical or nonmetaphorical.
- Automated Scoring of Picture-based Story Narration
S. Somasundaran, C.-M. Lee, M. Chodorow, & X. Wang
Paper in Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 42–48
This paper describes an investigation of linguistically motivated features for automatically scoring a spoken picture-based narration task.
- Scoring Persuasive Essays Using Opinions and their Targets
N. Farra, S. Somasundaran, & J. Burstein
Paper in Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 64–74
In this work, researchers investigate whether the analysis of opinion expressions can help in scoring persuasive essays. Experiments on test-taker essays show that essay scores produced using opinion features are indeed correlated with human scores.
- Automated Writing Evaluation: A Growing Body of Knowledge
M. Shermis, J. Burstein, N. Elliot, S. Miel, & P. Foltz
In C. MacArthur, S. Graham, & J. Fitzgerald (Eds.), Handbook of Writing Research, 2nd Edition. New York: Guilford Press
The authors present automated writing evaluation in terms of the categories of evidence used to demonstrate that these systems are useful in teaching and assessing writing.
- Automated Analysis of Text in Graduate School Recommendations
M. Heilman, F. J. Breyer, F. Williams, D. Klieger, & M. Flor
ETS Research Report No. RR-15-23
This report explores the evaluation of sentiment in letters of recommendation. Researchers developed and evaluated an approach to analyzing recommendations that involves (a) identifying which sentences are actually about the student; (b) measuring specificity; (c) measuring sentiment; and (d) predicting recommender ratings.
- Patterns of Misspellings in L2 and L1 English: A View from the ETS Spelling Corpus
M. Flor, Y. Futagi, M. Lopez, & M. Mulholland
Bergen Language and Linguistics Studies, Vol. 6
This paper presents a study of misspellings based on annotated data from ETS's spelling corpus. Researchers examined data from the TOEFL® and GRE® tests and found that the rate of misspellings decreased as writing proficiency (essay score) increased for test takers in both testing programs.
2014
- Content Importance Models for Scoring Writing From Sources
B. Beigman Klebanov, N. Madnani, J. Burstein, & S. Somasundaran
Paper in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Short Papers), pp. 247–252
This paper describes an integrative summarization task used in an assessment of English proficiency for nonnative speakers applying to higher education institutions in the United States. Researchers evaluate a variety of content importance models that help predict which parts of the source material the test taker would need to include in a successful response.
- Using Writing Process and Product Features to Assess Writing Quality and Explore How Those Features Relate to Other Literacy Tasks
P. Deane
ETS Research Report No. RR-14-03
This report explores automated methods for measuring features of student writing and determining their relationship to writing quality and to other features of literacy, such as reading test scores. The e-rater automated essay-scoring system and keystroke logging are a central focus.
- Predicting Grammaticality on an Ordinal Scale
M. Heilman, A. Cahill, N. Madnani, M. Lopez, M. Mulholland, & J. Tetreault
Paper in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Short Papers), pp. 174–180
This paper describes a system for predicting the grammaticality of sentences on an ordinal scale. Such a system could be used in educational applications such as essay scoring.
- An Explicit Feedback System for Preposition Errors Based on Wikipedia Revisions
N. Madnani & A. Cahill
Paper in Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 79–88
In this paper, the authors describe a novel tool they developed to provide automated explicit feedback to language learners based on data mined from Wikipedia revisions. They demonstrate how the tool works for the task of identifying preposition selection errors.
- Difficult Cases: From Data to Learning and Back
B. Beigman Klebanov & E. Beigman
Paper in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Short Papers), pp. 390–396
This paper addresses cases in annotated datasets that are difficult to annotate reliably. Using a semantic annotation task, the authors provide empirical evidence that difficult cases can thwart supervised machine learning on the one hand and, on the other, provide valuable insights into the characteristics of the data representation chosen for the task.
- Different Texts, Same Metaphors: Unigrams and Beyond
B. Beigman Klebanov, C. Leong, M. Heilman, & M. Flor
Paper in Proceedings of the Second Workshop on Metaphor in NLP, pp. 11–17
This paper describes the development of a supervised learning system that classifies all content words in a running text as either being used metaphorically or not.
- Lexical Chaining for Measuring Discourse Coherence Quality in Test-taker Essays
S. Somasundaran, J. Burstein, & M. Chodorow
Paper in Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, Dublin, Ireland, pp. 950–961
Researchers investigated a technique known as lexical chaining for measuring discourse coherence quality in test-taker essays. In this paper, they describe the contexts in which they achieved the best system performance.
- Applying Argumentation Schemes for Essay Scoring
Y. Song, M. Heilman, B. Beigman Klebanov, & P. Deane
Paper in Proceedings of the First Workshop on Argumentation Mining, pp. 69–78
In this paper, the authors develop an annotation approach based on the theory of argumentation schemes to analyze the structure of arguments, and implement an NLP system for automatically predicting where critical questions are raised in essays.
2013
- Handbook of Automated Essay Evaluation: Current Applications and New Directions
M. D. Shermis & J. Burstein (Eds.)
New York: Routledge
This comprehensive, interdisciplinary handbook reviews the latest methods and technologies used in automated essay evaluation (AEE).
- Word Association Profiles and their Use for Automated Scoring of Essays
B. Beigman Klebanov & M. Flor
Paper in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1148–1158
The authors describe a new representation of the content vocabulary in a text, which they refer to as a "word association profile." The paper presents a study of the relationship between writing quality and word association profiles.
- Robust Systems for Preposition Error Correction Using Wikipedia Revisions
A. Cahill, N. Madnani, J. Tetreault, & D. Napolitano
Paper in Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Atlanta, Ga., pp. 507–517
This paper addresses the lack of generalizability in preposition error correction systems across different test sets. The authors present a large new annotated corpus for training such systems and illustrate its use by training systems evaluated across three separate test sets.
- Detecting Missing Hyphens in Learner Text
A. Cahill, M. Chodorow, S. Wolff, & N. Madnani
Paper in Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, Atlanta, Ga., pp. 300–305
This paper presents a method for automatically detecting missing hyphens in English text.
- The e-rater® Automated Essay Scoring System
J. Burstein, J. Tetreault, & N. Madnani
In M. D. Shermis & J. Burstein (Eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions. New York: Routledge
This handbook chapter describes the e-rater automated essay scoring system and its NLP-centered approach, and discusses the system's applications and development efforts for current and future educational settings.
2012
- A Fast and Flexible Architecture for Very Large Word n-gram Datasets
M. Flor
Natural Language Engineering, FirstView online publication, pp. 1–33
This paper presents a novel, versatile architecture for storing and querying very large word n-gram datasets that features lossless compression and optimizes both speed and memory use.
- Correcting Comma Errors in Learner Essays, and Restoring Commas in Newswire Text
R. Israel, J. Tetreault, & M. Chodorow
Paper in Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 284–294
The authors present a system that detects and corrects comma placement errors in learner essays; the same system can also restore commas in well-edited newswire text.
- On Using Context for Automatic Correction of Non-Word Misspellings in Student Essays
M. Flor & Y. Futagi
Paper in Proceedings of the 7th Workshop on Innovative Use of Natural Language Processing for Building Educational Applications (BEA), pp. 105–115
The authors discuss a new spell-checking system that uses contextual information to automatically correct non-word misspellings. The system was evaluated against a large corpus of TOEFL® and GRE® essays written by both native and nonnative English speakers.
2010
- Using Parse Features for Preposition Selection and Error Detection
J. Tetreault, J. Foster, & M. Chodorow
Paper in Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010)
This paper evaluates the effect of adding parse features to improve the detection of preposition errors in the writing of speakers of English as a second language.
- Progress and New Directions in Technology for Automated Essay Evaluation
J. Burstein & M. Chodorow
The Oxford Handbook of Applied Linguistics, 2nd Edition, pp. 487–497. Oxford University Press
This ETS-authored chapter is part of a 39-chapter volume that surveys the field of applied linguistics, shows the many connections among its subdisciplines, and explores likely directions of its future development.
- Using Entity-Based Features to Model Coherence in Student Essays
J. Burstein, J. Tetreault, & S. Andreyev
Paper in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL, pp. 681–684
This paper describes a study in which researchers combined an algorithm for tracking entities (nouns and pronouns, in computational linguistics terms) with natural language processing features related to grammar errors and word usage, with the aim of creating applications that can evaluate evidence of coherence in essays.
2008
- A Developmental Writing Scale
Y. Attali & D. Powers
ETS Research Report No. RR-08-19
This report describes the development of grade norms for timed-writing performance in two modes of writing: persuasive and descriptive.
2006
- Automated Essay Scoring With e-rater v.2.0
Y. Attali & J. Burstein
Journal of Technology, Learning, and Assessment, Vol. 4, No. 3
This article describes Version 2 of ETS's e-rater essay scoring engine. The authors present evidence on the validity and reliability of the scores that the system generates.
2003
- Finding the WRITE Stuff: Automatic Identification of Discourse Structure in Student Essays
J. Burstein, D. Marcu, & K. Knight
IEEE Intelligent Systems: Special Issue on Advances in Natural Language Processing, Vol. 18, No. 1, pp. 32–39
In this article, the authors discuss the use of automated essay-scoring applications at elementary through university levels for large-scale assessment and classroom instruction.
Find More Articles
View more research publications related to automated scoring of writing quality.