The Scholastic Aptitude Test (SAT) has undergone a number of improvements since it was first administered in 1926. Although the test forms have appeared similar over the years, and scores are equated to ensure comparability, each new test form uses items that have never been used before. Test developers may begin writing test items two years before the actual SAT administration. A multiple-choice item typically has one correct answer and four incorrect choices that resemble the free-response answers less able students would give. Consultants also contribute to the development of test items; all items are reviewed and edited. New items are pretested by administering them to actual candidates two or three times; the responses are reviewed to detect ambiguity in language or intent. The assembler then works with the file of pretested items, the item statistics, and the SAT Classification Manual. Items are classified according to difficulty, discrimination, form, and content. Items are selected based on what the pool of questions offers and on what the test specifications call for. SAT developers have attempted to concentrate on power rather than speed, on questions that maximize intellectual skills and minimize rote learning, and on the verbal or mathematical ability being measured. Verbal items include sentence completion, antonyms, analogies, and reading comprehension. Mathematics items involve either data sufficiency or problem solving, and vary in the types of thinking, presentation, and data characteristics involved. The sentence completion and data sufficiency item types are described. Test developers strive to balance questions that certain populations may answer more successfully, and to balance the types of thinking required to answer successfully.
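The classification and selection steps can be illustrated with a minimal sketch. It assumes classical item statistics (proportion correct for difficulty, point-biserial correlation for discrimination) and a simple "most discriminating item per type" selection rule; these are illustrative assumptions, not necessarily the statistics or rules the SAT program used. The function names, the item dictionary fields, and the specification format are all hypothetical.

```python
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of pretest candidates who answered the item correctly.

    item_scores is a list of 0/1 values, one per candidate (assumed format).
    """
    return mean(item_scores)


def item_discrimination(item_scores, total_scores):
    """Point-biserial correlation between item score and total test score."""
    p = mean(item_scores)
    sd = pstdev(total_scores)
    # The correlation is undefined if everyone (or no one) got the item
    # right, or if all totals are equal; return 0.0 in those cases.
    if p in (0, 1) or sd == 0:
        return 0.0
    pairs = list(zip(item_scores, total_scores))
    mean_right = mean(t for s, t in pairs if s == 1)
    mean_wrong = mean(t for s, t in pairs if s == 0)
    return (mean_right - mean_wrong) / sd * (p * (1 - p)) ** 0.5


def select_items(pool, spec):
    """Pick items from a pretested pool to meet hypothetical specifications.

    pool: list of dicts with 'id', 'type', and 'discrimination' keys.
    spec: dict mapping item type to the number of items required.
    """
    chosen = []
    for item_type, count in spec.items():
        candidates = sorted(
            (item for item in pool if item["type"] == item_type),
            key=lambda item: item["discrimination"],
            reverse=True,
        )
        chosen.extend(candidates[:count])
    return chosen
```

In practice an assembler would also weigh difficulty, form, and content balance rather than discrimination alone; the sketch keeps a single criterion to stay short.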