TOEIC® Research

Advancing English-language assessment, teaching and learning

Select a topic below to learn more about the TOEIC® Research Program.

 

Validity and Fairness of TOEIC Score Interpretations

TOEIC test design and development processes are systematic and rigorous, so score users can be confident that score interpretations are meaningful, fair and relevant to the real world. The substantiating research examines: 1) whether scores mean what they are intended to mean and 2) whether TOEIC score interpretations about English skills are unbiased, fair and reflective of real-world ability.

Justifying the Construct Definition for a New Language Proficiency Assessment: The Redesigned TOEIC Bridge® Tests — Framework Paper

This paper describes the motivations behind the design of the redesigned TOEIC Bridge assessments to measure all four communication skills, the purposes of the assessments and how we defined English-language listening, reading, speaking and writing proficiency in everyday contexts for basic to intermediate learners. This information provides a basis for test development and subsequent validity research. 

Development of the Redesigned TOEIC Bridge® Tests

The redesigned TOEIC Bridge tests were developed using an assessment development methodology called evidence-centered design (ECD). This paper documents how ECD was used to design tasks that elicited evidence of test takers' basic- to intermediate-level English proficiency, and how various sources of data obtained throughout the test development process influenced item and test design decisions. 

Field Study Statistical Analysis for the Redesigned TOEIC Bridge® Tests

This paper reports the results of a field study that contributed to the development of the redesigned TOEIC Bridge tests. The statistical analyses provide initial evidence to support claims that redesigned TOEIC Bridge test scores are consistent, and that test scores are meaningful indicators of English proficiency at basic to intermediate levels.

The Redesigned TOEIC Bridge® Tests: Relations to Test-taker Perceptions of Proficiency in English

This paper describes two studies that examined the relationship between redesigned TOEIC Bridge test scores and self-assessments of language proficiency. The results provide evidence to support claims that test scores are meaningful indicators of language proficiency, and that test scores can be used to meaningfully differentiate levels of language proficiency as defined by language proficiency standards such as the Common European Framework of Reference for Languages (CEFR).  

Predicting Communicative Effectiveness in the International Workplace: Support for TOEIC® Speaking Test Scores from Linguistic Laypersons

This study investigated whether TOEIC® Speaking test scores were good predictors of test takers' communicative effectiveness, as judged by professionals in the international workplace. Professionals in 10 countries participated in a survey in which they evaluated the communicative effectiveness of test takers in different communicative scenarios. The results provide evidence that TOEIC Speaking test scores are a meaningful evaluation of proficiency that compares to how professionals perceive communicative effectiveness.  

TOEIC® Writing Test Scores as Indicators of the Functional Adequacy of Writing in the International Workplace: Evaluation by Linguistic Laypersons

This study investigated whether TOEIC Writing test scores were good predictors of the "functional adequacy" (or communicative effectiveness) of test takers' writing skills, as judged by professionals in the international workplace. Professionals in 10 countries participated in a survey in which they evaluated the functional adequacy of test takers’ writing in different communicative scenarios. The results provide evidence that TOEIC Writing test scores are a meaningful evaluation of proficiency that compares to how professionals perceive the functional adequacy of writing.  

Making the Case for the Quality and Use of a New Language Proficiency Assessment: Validity Argument for the Redesigned TOEIC Bridge® Tests

This paper summarizes the "validity argument" for the redesigned TOEIC Bridge tests. The validity argument consists of four major claims about score consistency, validity and fairness, appropriate test use and positive impacts; together, these provide a coherent narrative about the measurement quality and intended uses of test scores. By considering the claims and supporting evidence presented in the validity argument, readers should be able to better evaluate whether the redesigned TOEIC Bridge tests are appropriate for their situation.

Validity: What Does It Mean for the TOEIC® Tests?

This paper provides a nontechnical overview of test development and research projects undertaken to ensure that TOEIC test scores serve as valid indicators of test takers' skills to communicate in English in global workplace environments.

The main value of the TOEIC tests lies in their validity, which can be defined as the extent to which the tests do what we claim they can do.  

The Relationship Among TOEIC® Listening, Reading, Speaking and Writing Skills

Through examination of test scores, this research found that the TOEIC tests measure distinct but related skills, and that, taken together, they provide a reasonably complete picture of English-language proficiency. This finding provides additional evidence that a four-skill approach to language proficiency assessment is crucial.

Measuring English-Language Proficiency across Subgroups: Using Score Equity Assessment to Evaluate Test Fairness

English-language proficiency assessments are designed for a targeted test population and may include test takers from diverse demographic, sociocultural and educational backgrounds. The test is assumed to be fair and the scores earned by different subgroups of test takers are assumed to have the same meaning. One way of evaluating test fairness is to produce a linked test for each subgroup and compare the test score results of the linked test with the test scores of the original test they took.

Best Practices for Comparing TOEIC® Speaking Test Scores to Other Assessments and Standards: A Score User’s Guide

To better understand the meaning of test scores and to facilitate decision making, score users may need to understand how scores from two different tests are related. The relationship between scores from two different tests is typically summarized in a “concordance table” that indicates the correspondence between the scores on the two tests. Unfortunately, some concordance tables are produced and distributed without any research support, which can lead to inaccurate and unfair decisions about test takers.

Measuring English-Language Workplace Proficiency across Subgroups: Using CFA models to Validate Test Score Interpretation

This study used a statistical technique called "factor analysis" to determine which statistical model best explained performance on the TOEIC Listening and Reading test. Researchers found that a model (two-factor model) in which reading and listening skills were represented as distinct abilities best accounted for performance, consistent with how scores are supposed to be interpreted.

Linking TOEIC® Speaking Scores Using TOEIC® Listening Scores

In testing programs, multiple forms of a test are used across different administrations to prevent overexposure of test forms and to reduce the possibility of test takers gaining advance knowledge of test content. Because slight differences may occur in the statistical difficulty of the alternate forms, a statistical procedure known as test score linking has been commonly used to adjust for these differences in difficulty so that test forms are comparable.
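
As a rough illustration of what score linking does, the sketch below applies simple linear linking: a score on one form is mapped to the scale of another form by matching the means and standard deviations of the two score distributions. This is a generic textbook technique shown under the assumption that both forms were taken by comparable groups; it is not the procedure used in the paper, which links Speaking scores through Listening scores. The function name and all score values are hypothetical.

```python
from statistics import mean, pstdev


def linear_link(x, form_x_scores, form_y_scores):
    """Map score x from form X onto the scale of form Y.

    Assumes (hypothetically) that the two groups of test takers are
    comparable, so differences in the score distributions reflect
    differences in form difficulty only.
    """
    mu_x, sd_x = mean(form_x_scores), pstdev(form_x_scores)
    mu_y, sd_y = mean(form_y_scores), pstdev(form_y_scores)
    # Keep the same standardized position (z-score) on both forms.
    return mu_y + (sd_y / sd_x) * (x - mu_x)


# Made-up scores: form X is uniformly 10 points harder than form Y.
form_x = [100, 120, 140, 160, 180]
form_y = [110, 130, 150, 170, 190]
linked = linear_link(140, form_x, form_y)
print(linked)
```

With these made-up distributions, a score at the mean of form X is mapped to the mean of form Y.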

Expanding the Question Formats of the TOEIC® Speaking Test

Traditionally, researchers have used the term "authenticity" to refer to the degree to which tasks on a language test correspond to those used in the real world, with authenticity being a desired characteristic of tasks and tests. This white paper explains how the format of several questions in the TOEIC Speaking test was expanded to include a greater variety of real-world situations.

The Case for a Comprehensive, Four-Skills Assessment of English-Language Proficiency

This paper explains how four-skill language testing is the best way to evaluate whether someone can communicate in English, and explains how this approach can:

  • result in a fairer way of assessment for test takers
  • improve the quality of test users' decisions
  • create more positive impact for decision makers, teachers and learners

Analyzing Item Generation with Natural Language Processing Tools for the TOEIC® Listening Test

The TOEIC Listening test includes items or tasks related to the global workplace and set in a variety of authentic contexts. As the need for an ever-larger number of test forms has increased, an important goal for the TOEIC Listening test has been to increase the efficiency of item generation and to maintain a large pool of items across a wide range of contexts.

The Incremental Contribution of TOEIC® Listening, Reading, Speaking and Writing Tests to Predicting Performance on Real-Life English-Language Tasks

This study investigated whether proficiency in a particular language skill (e.g., speaking) could be better estimated by considering not only the TOEIC test scores corresponding to that skill, but also TOEIC test scores for other skills. The results supported this assertion, suggesting that scores on the four-skill TOEIC tests together provide a more valid measurement of English-language proficiency than any skill in isolation.

The TOEIC® Listening, Reading, Speaking and Writing Tests: Evaluating Their Unique Contribution to Assessing English-Language Proficiency

This study investigates:

  • The extent to which TOEIC test scores of one ability correlate with test takers' self-assessments of their English abilities across all four skills
  • Whether one English skill (e.g., reading) can be more accurately estimated or predicted using multiple other TOEIC test scores, i.e., listening, speaking and writing  

Constructed-Response (CR) Differential Item Functioning (DIF) Evaluations for TOEIC® Speaking and Writing Tests

Differential item functioning (DIF) is a statistical procedure used to identify items or tasks that are unexpectedly biased in some way, inappropriately favoring one group of test takers over another. One of the challenges for speaking and writing tests is the lack of proven, practical DIF techniques that can be used to analyze performance-based or "constructed-response" tests. This paper investigates several such techniques and illustrates how research is being conducted to ensure the fairness of score interpretations.   

Comparison of Content, Item Statistics, and Test Taker Performance on the Redesigned and Classic TOEIC® Listening and Reading Test

This paper compares the content, reliability and difficulty of the classic and 2006 redesigned TOEIC Listening and Reading tests. Although the redesigned tests included slightly different item types to better reflect current models of language proficiency, the tests were judged to be similar across versions.

Evidence-Centered Design: The TOEIC® Speaking and Writing Tests

Evidence-Centered Design (ECD) is an assessment development methodology which explicitly clarifies what an assessment measures and supports skills interpretations based on test scores. This paper describes the ECD processes used to develop the TOEIC Speaking and Writing tests. Evidence collected through the test design process produced foundational support for the validity of TOEIC Speaking and Writing test score interpretations.

Statistical Analyses for the TOEIC® Speaking and Writing Pilot Study

This paper reports the results of a pilot study that contributed to TOEIC Speaking and Writing test development. The analysis of the reliability of test scores found evidence of several types of score consistency, including inter-rater reliability (agreement of several raters on a score) and internal consistency (a measure based on correlation between items on the same test).
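
To make the idea of internal consistency concrete, the sketch below computes Cronbach's alpha, one widely used internal-consistency coefficient. Whether the pilot study used this particular coefficient is an assumption, and the function name and item scores are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a score matrix.

    item_scores: one row per test taker, one column per item.
    """
    k = len(item_scores[0])  # number of items
    # Variance of each item across test takers.
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of each test taker's total score.
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up ratings for four test takers on three items.
scores = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
]
print(round(cronbach_alpha(scores), 3))
```

Values closer to 1 indicate that the items rank test takers in a similar way; inter-rater reliability would instead compare scores that different raters assign to the same responses.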

Statistical Analyses for the Updated TOEIC® Listening and Reading Test

To ensure that tests continue to meet the needs of test takers and score users, it is important that testing programs periodically revisit their assessments. For this reason, in order to keep up with the continuously changing use of English and the ways in which individuals commonly communicate in the global workplace and everyday life, an updated TOEIC Listening and Reading test was designed and first launched in May 2016.

The Redesigned TOEIC® Listening and Reading Test: Relations to Test Taker Perceptions of Proficiency in English

After any test redesign project, such as the redesign of the TOEIC Listening and Reading test in 2006, it is important to provide evidence that test scores can still be meaningfully interpreted. This study examined the relationship between scores on the redesigned TOEIC Listening and Reading test and test takers' perceptions of their own English proficiency. Researchers found moderate correlations between the test scores and test takers' perceptions, providing evidence that scores on the redesigned TOEIC Listening and Reading tests are meaningful indicators of English ability.
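
For readers unfamiliar with how such a relationship is quantified, the sketch below computes a Pearson correlation coefficient between test scores and self-assessment ratings. The choice of Pearson's r is an assumption about the kind of correlation reported, and the function name and all data values are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical data: test scores and self-ratings on a 1-5 scale.
test_scores = [250, 310, 400, 455, 520, 610]
self_ratings = [2, 2, 3, 3, 5, 4]
print(round(pearson_r(test_scores, self_ratings), 2))
```

A coefficient near 0 indicates little linear relationship, while values approaching 1 indicate that higher test scores go with higher self-ratings.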

TOEIC Bridge™ Scores: Validity Evidence from Korea and Japan

This study sought to compare TOEIC Bridge scores to test takers' self-evaluations of their own abilities to perform everyday language tasks in English. The results suggest that the test scores correlated well with test takers' self-evaluations, providing further evidence in support of TOEIC Bridge scores as valid and fair indicators of English-language proficiency.

The Relationships of Test Scores Measured by the TOEIC® Listening and Reading Test and TOEIC® Speaking and Writing Tests

This study examines the relationship between TOEIC Listening and Reading scores and TOEIC Speaking and Writing scores in order to determine whether or not Listening and Reading scores should be used as predictors of Speaking and Writing scores and vice versa. Findings support the validity of test scores for the measured skills (e.g., Listening and Reading test scores provide meaningful interpretations of Listening and Reading skills).

The TOEIC® Speaking and Writing Tests: Relations to Test Taker Perceptions of Proficiency in English

This study sought to compare scores on the TOEIC Speaking and Writing tests to students' self-evaluations of their abilities to perform everyday English-language tasks. The researchers reported relatively strong correlations between test scores and the self-evaluations. This finding contributes further evidence in support of TOEIC Speaking and Writing test scores as indicators of English-language proficiency. This study was also published as Powers, Kim, Weng, and Van Winkle (2009).  

TOEIC® Listening and Reading Test Scale Anchoring Study

Scale anchoring is a process that groups test scores into score ranges or proficiency levels. It uses a combination of statistical methods and expert judgment to produce descriptions of the skills and knowledge typically exhibited by test takers at each proficiency level. This research report describes the scale anchoring process for TOEIC Listening and Reading tests, which facilitates meaningful score interpretations.

Background and Goals of the TOEIC® Listening and Reading Test Update Project

This report describes the goals and outcomes of a project to update the TOEIC Listening and Reading test in 2016. The use of English for communication, particularly in international workplace contexts, is continually evolving. Therefore, the TOEIC Listening and Reading test is reexamined periodically to ensure that the test content reflects current communication in the workplace and in daily life, thereby supporting meaningful interpretations about English-language skills and promoting a positive impact on English teaching and learning.

Background and Goals of the TOEIC® Listening and Reading Test Redesign Project

As time progresses, it becomes important to revisit the design of a test to ensure that its conceptualization of language proficiency aligns with current theory and test tasks continue to be indicative of real-world tasks. This report outlines the goals, theoretical alignment, procedures and outcomes of a redesign effort for the TOEIC Listening and Reading test.  

Field Study Results for the Redesigned TOEIC® Listening and Reading Test

This paper describes the results of a field study for the 2006 redesigned TOEIC Listening and Reading tests, including analyses of item and test difficulty, reliability and correlations between test sections with classic TOEIC Listening and Reading tests. Results are consistent with another comparability study (Liao, Hatrak and Yu, 2010), which found evidence of the reliability of the redesigned tests and suggested that scores on the redesigned test could be interpreted and used in similar ways to classic TOEIC Listening and Reading test scores.

Validating TOEIC Bridge™ Scores Against Teacher Ratings for Vocational Students in China

This study compared TOEIC Bridge scores with teachers' assessments of test takers' abilities to perform everyday language tasks in English. The authors reported moderate correlations between these assessments and test scores, which provide supporting evidence of the validity of TOEIC Bridge test scores as indicators of English-language proficiency.  

Validating TOEIC Bridge™ Scores Against Teacher and Student Ratings: A Small-Scale Study

This study sought to assess the degree to which TOEIC Bridge scores correspond to student self-assessments and teacher assessments of students, two measurements of English-language proficiency. TOEIC Bridge scores were found to be moderately correlated with these measurements, a finding which provides validity evidence that TOEIC Bridge scores can be meaningfully interpreted as indicators of English-language proficiency.  

Relating Scores on the TOEIC Bridge™ Test to Student Perceptions of Proficiency in English

This study investigated the relationship between TOEIC Bridge scores and students' evaluations of their own English-language proficiency. The TOEIC Bridge test scores were found to be correlated with self-reported reading and listening skills, providing evidence that TOEIC Bridge test scores are valid or meaningful indicators of English-language reading and listening proficiency.
