
Computer Analysis of the TOEFL Test of Written English (TWE)

Frase, Lawrence T.; Faletti, Joseph; Ginther, April; Grant, Leslie
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
Test of Written English (TWE), Test of English as a Foreign Language (TOEFL), Essay Tests, Computer Assisted Testing, Chinese, Arabic, Analysis of Variance (ANOVA), Sentence Structure, Predictor Variables, Vocabulary, Comparative Analysis, Factor Analysis, Databases, Discriminant Analysis, Writing Evaluation, Spanish Speaking, English as a Second Language (ESL)


This project had three main objectives: 1) to establish a database of essays written by different language groups on a variety of topics for the Test of Written English (TWE) that can be used in future research; 2) to summarize, analyze, and compare the linguistic properties of those essays; and 3) to determine how the TWE performance of language groups relates to essay styles. As part of the first objective of this project, we created a database of 1,737 essays, a data matrix of essay variables, files containing sorted phrases and vocabulary of different language groups, and files of the common and unique vocabulary items for each pair of language groups. The essay sample consisted of TWE essays from five language groups, including Arabic, Chinese, English (including native-English and nonnative-English), and Spanish speakers. Essays from English-speaking students in the United States were collected and scored to provide a baseline against which to compare essays by students who speak English as a second language (ESL). This report includes the analysis of 1006 variables for each essay along with summary analyses, such as correlation, analysis of variance, discriminant analysis, and factor analysis. Results show that topic differences affected some essay variables, but these effects were generally felt equally by the different language groups. Discriminant analysis suggests three features that might distinguish the performance of different language groups -- directness, expressiveness, and academic stance. Nevertheless, a linguistic analysis of the accuracy of the underlying text analysis programs shows that, for some word classes implicated in defining "academic" or "expressive" styles, caution is needed in interpreting program outputs. Two variables that can be measured unambiguously by computer -- number of words and average word length -- taken together are quite predictive of the TWE essay scores of non-English speakers.
