Reliability and Generalizability

To help institutions gauge the English proficiency of prospective students, TOEFL iBT® test scores must be reliable and must carry the same meaning regardless of which institution uses them.

For this reason, part of the TOEFL® program's ongoing research examines the criteria, tools and methods used to score the test. (See also our section on Constructed-Response Scoring and Technology.)

Featured Publications

These are some publications related to the reliability and generalizability of TOEFL iBT scores:

Reliability and Comparability of TOEFL iBT® Scores
ETS (2011)
TOEFL iBT Research Insight Series, Vol. 3 (2011)

Repeater Analyses for the TOEFL iBT® Test
Y. Zhang (2008)
ETS Research Memorandum No. RM-08-05

Dependability of Scores for a New ESL Speaking Test: Evaluating Prototype Tasks
Y.-W. Lee (2005)
TOEFL Monograph No. MS-28

Dependability of New ESL Writing Test Scores: Evaluating Prototype Tasks and Alternative Rating Schemes
Y.-W. Lee & R. Kantor (2005)
TOEFL Monograph No. MS-31