From Biology to Education: Scoring and Clustering Multilingual Text Sequences and Other Sequential Tasks

Sukkarieh, Jana Z.; von Davier, Matthias; Yamamoto, Kentaro
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
Bio-NLP; Large-Scale Assessment; Multilingual Automated Scoring; Multilingual Character or Grapheme Sequences; Natural Language Processing (NLP); Prefix-Infix Omission and Insertion; Sequence Alignment and Clustering; Sequential Tasks


This document describes a solution to a problem in the automatic content scoring of the multilingual character-by-character highlighting item type. The solution is language independent and represents a significant enhancement. It not only facilitates automatic scoring but also plays an important role in clustering students’ responses; consequently, it has a nontrivial impact on the refinement of the items and/or their scoring guidelines. Furthermore, though designed for a specific problem, the proposed solution is general enough for any educational task that can be transformed into a sequential one. For example, it can be applied to a set of actions expected from a student in simulations or learning trajectories as projected by a teacher, inside an intelligent tutoring system, or even in a game; it can also be applied simply to a set of student clicks, button selections, or keyboard hits expected to reach a correct answer. The solution adds flexibility to existing automatic-scoring techniques and could potentially provide more if coupled with statistical data-mining techniques.
