ETS develops assessments that are of the highest quality, accurately measure the necessary knowledge and skills, and are fair to all test takers. We understand that creating a fair, valid and reliable test is a complex process that involves multiple checks and balances.
That's why dozens of professionals — including test specialists, test reviewers, editors, teachers and specialists in the subject or skill being tested — are involved in developing every test question, or "test item." And it's why all questions (or "items") are put through multiple, rigorous reviews and meet the highest standards for quality and fairness in the testing industry.
To help you further understand our process, here's an overview of the key steps ETS takes when developing a new test.
Step 1: Defining Objectives
Educators, licensing boards or professional associations identify a need to measure certain skills or knowledge. Once a decision is made to develop a test to meet this need, test developers ask some fundamental questions:
- Who will take the test and for what purpose?
- What skills and/or areas of knowledge should be tested?
- How should test takers be able to use their knowledge?
- What kinds of questions should be included? How many of each kind?
- How long should the test be?
- How difficult should the test be?
Step 2: Item Development Committees
The questions in Step 1 are usually answered with the help of item development committees, which typically consist of educators and/or other professionals appointed by ETS with the guidance of the sponsoring agency or association. Responsibilities of these item development committees may include:
- defining test objectives and specifications
- helping ensure test questions are unbiased
- determining test format (e.g., multiple-choice, essay or constructed-response)
- considering supplemental test materials
- reviewing test questions, or test items, written by ETS staff
- writing test questions
Step 3: Writing and Reviewing Questions
Each test question — written by ETS staff or item development committees — undergoes numerous reviews and revisions to ensure it is as clear as possible, that it has only one correct answer among the options provided on the test and that it conforms to the style rules used throughout the test. Scoring guides for open-ended responses, such as short written answers, essays and oral responses, go through similar reviews.
Step 4: The Pretest
After the questions have been written and reviewed, many are pretested with a sample group similar to the population to be tested. The results enable test developers to determine:
- the difficulty of each question
- whether questions are ambiguous or misleading
- whether questions should be revised or eliminated
- whether incorrect alternative answers should be revised or replaced
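As a rough illustration of how pretest results can inform these decisions, the sketch below computes a question's difficulty as the proportion of the sample that chose the keyed answer, and lists answer options that nobody selected. The function names and the response data are hypothetical, and this is a minimal example of a common item-analysis calculation, not a description of the procedure ETS uses.

```python
from collections import Counter

def item_difficulty(responses, key):
    """Proportion of the pretest sample that chose the keyed answer."""
    return sum(r == key for r in responses) / len(responses)

def unchosen_options(responses, options):
    """Answer options that no one in the sample selected."""
    counts = Counter(responses)
    return [opt for opt in options if counts[opt] == 0]

# Hypothetical pretest responses to one multiple-choice question.
responses = ["B", "B", "C", "B", "A", "B", "C", "B", "B", "C"]
key = "B"

p = item_difficulty(responses, key)           # share answering correctly
unused = unchosen_options(responses, "ABCD")  # candidates for replacement

print(f"difficulty (proportion correct): {p:.2f}")
print(f"options never chosen: {unused}")
```

A very high or very low proportion correct may suggest the question is too easy or too hard for the intended population, and an incorrect option that no one chooses is doing little work and may be a candidate for revision.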
Step 5: Detecting and Removing Unfair Questions
To meet the stringent ETS Standards for Quality and Fairness, trained reviewers carefully inspect each individual test question, the test as a whole and any descriptive or preparatory materials. Their goal is to eliminate language, symbols, words, phrases and content generally regarded as sexist, racist or otherwise inappropriate or offensive to any subgroup of the test-taking population.
ETS statisticians can also use a process called Differential Item Functioning (DIF) analysis to identify questions on which two groups of test takers who have demonstrated similar knowledge or skills perform differently. If, among test takers of similar demonstrated ability, one group performs consistently better than another on a particular question, that question receives additional scrutiny and may be deemed biased or unsatisfactory. Note: If people in different groups actually differ in their average levels of relevant knowledge or skills, a fair test question will reflect those differences.
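One widely used way to carry out this kind of comparison is the Mantel-Haenszel procedure, which matches test takers on total score and then compares the two groups within each score band. The sketch below is a minimal version under stated assumptions: the function names, the three score bands and all counts are hypothetical, and the text above does not say which statistic is applied to any particular test.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across score bands.

    Each stratum is a tuple of counts for one score band:
    (reference_right, reference_wrong, focal_right, focal_wrong).
    A value near 1.0 means matched test takers in both groups
    answer the question correctly at similar rates.
    """
    num = 0.0
    den = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score bands
        num += ref_right * focal_wrong / n
        den += ref_wrong * focal_right / n
    return num / den

def mh_d_dif(odds_ratio):
    """Odds ratio on the delta scale; values near 0 suggest little DIF."""
    return -2.35 * math.log(odds_ratio)

# Hypothetical question A: the groups differ overall (the reference group
# has more high scorers), but within each score band they perform alike.
fair_item = [(10, 10, 20, 20), (30, 10, 15, 5), (54, 6, 9, 1)]

# Hypothetical question B: within every score band the reference group
# does better, so the question would be flagged for further review.
flagged_item = [(15, 5, 20, 20), (35, 5, 15, 5), (58, 2, 9, 1)]

fair_alpha = mantel_haenszel_odds_ratio(fair_item)
flagged_alpha = mantel_haenszel_odds_ratio(flagged_item)

print(f"question A: odds ratio {fair_alpha:.2f}, D-DIF {mh_d_dif(fair_alpha):.2f}")
print(f"question B: odds ratio {flagged_alpha:.2f}, D-DIF {mh_d_dif(flagged_alpha):.2f}")
```

Question A illustrates the note above: the two groups differ in overall performance, yet the question shows no DIF because test takers with similar scores perform alike regardless of group.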
Step 6: Assembling the Test
After the test is assembled, it is reviewed by other specialists, committee members and sometimes other outside experts. Each reviewer answers all questions independently and submits a list of correct answers to the test developers. The lists are compared with the ETS answer keys to verify that the intended answer is, indeed, the correct answer. Any discrepancies are resolved before the test is published.
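The key-verification step described above amounts to comparing each reviewer's independent answers with the answer key and listing every mismatch for follow-up. Here is a minimal sketch of that comparison; the function name, reviewer labels and question IDs are hypothetical.

```python
def find_key_discrepancies(answer_key, reviewer_answers):
    """Map each question ID to the reviewers whose answer differs from the key."""
    discrepancies = {}
    for reviewer, answers in reviewer_answers.items():
        for item_id, keyed in answer_key.items():
            given = answers.get(item_id)  # None if the reviewer skipped it
            if given != keyed:
                discrepancies.setdefault(item_id, []).append((reviewer, given))
    return discrepancies

# Hypothetical answer key and two independent reviewer submissions.
answer_key = {"Q1": "A", "Q2": "C", "Q3": "B"}
reviewer_answers = {
    "reviewer_1": {"Q1": "A", "Q2": "C", "Q3": "B"},
    "reviewer_2": {"Q1": "A", "Q2": "D", "Q3": "B"},
}

discrepancies = find_key_discrepancies(answer_key, reviewer_answers)
for item_id, mismatches in discrepancies.items():
    print(item_id, "needs review:", mismatches)
```

A mismatch does not say who is right. It only marks a question where either the key or the question itself needs another look before publication.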
Step 7: Making Sure — Even After the Test is Administered — that the Test Questions are Functioning Properly
Even after the test has been administered, statisticians and test developers review the results to make sure that test questions are working as intended. Before final scoring takes place, each question undergoes preliminary statistical analysis and the results are reviewed question by question. If a problem is detected, such as a misleading answer to a question, corrective action, such as not scoring the question, is taken before final scoring and score reporting take place.
Tests are also reviewed for reliability. Performance on one version of the test should reasonably predict performance on any other version of the test. If reliability is high, results will be similar no matter which version a test taker completes.
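One simple way to express this idea in numbers is to correlate the scores that the same group of people earns on two versions of a test. The sketch below computes a Pearson correlation on made-up scores; the function name and data are hypothetical, and it is meant only as an illustration of the concept, not as the reliability analysis ETS performs.

```python
import math
from statistics import mean

def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for five test takers on two versions of a test.
form_a = [50, 60, 70, 80, 90]
form_b = [52, 61, 69, 83, 88]

r = pearson_correlation(form_a, form_b)
print(f"correlation between versions: {r:.3f}")
```

A correlation close to 1 indicates that test takers are ranked in much the same way by both versions, which is what high reliability across versions looks like.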