
The College Board Vocabulary Study

Breland, Hunter M.; Jenkins, Laura; Jones, Robert J.
Publication Year:
Report Number:
RR-94-26, CBR-94-04
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
College Freshmen, High School Students, Reading Research, Vocabulary, Word Frequency


This study was conducted to provide an up-to-date source of word frequency information based on the kinds of reading materials to which high school and first-year college students are exposed. It began with a comprehensive listing of reading materials from curriculum surveys, state curriculum guides, private school reading lists, research surveys, federal reports, recommended reading lists, and other sources. Materials mentioned most often were sampled, or entire documents were obtained when they were available in electronic form. Included in the sample of reading materials were American and British novels, poetry, drama, essays, biographies, autobiographies, current periodicals of various types, historical documents, and text from an encyclopedia. A corpus of 14,360,884 words of running text was assembled. This corpus was analyzed using the most sophisticated lexicographic methods available, and the following statistics were generated for each word: its overall frequency of occurrence in the corpus, an index of its dispersion over 27 text categories, an estimate of the number of occurrences per one million words of running text that would be expected in a similar but different corpus, and a standard frequency index developed from a logarithmic transformation. This report describes the development of the corpus and the computation of the word frequency indexes. It also compares the corpus with other existing corpora and demonstrates the importance of up-to-date word frequency information. The comprehensive listing of reading materials examined and a list of sampled materials are included in the Appendixes. (55 pp.)
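The abstract names four per-word statistics but does not give their formulas. The following is a minimal Python sketch of how statistics of this kind can be computed, not the report's actual method. It rests on several assumptions: the function name `word_stats` and its input layout are hypothetical; dispersion is approximated as normalized entropy of a word's relative frequency across categories (0 when the word occurs in one category only, 1 when it is spread evenly); the per-million figure is the raw rate rather than a dispersion-adjusted estimate for a different corpus; and the log-based index takes the form 10 * (log10(per-million rate) + 4), which may differ from the report's definition.

```python
import math
from collections import Counter


def word_stats(categories):
    """Compute illustrative per-word statistics.

    categories: dict mapping a category name to a list of word tokens.
    Returns a dict mapping each word to its statistics.
    All formulas here are assumptions, not taken from the report.
    """
    # Ignore empty categories so per-category rates are well defined.
    cats = {name: toks for name, toks in categories.items() if toks}
    n_cat = len(cats)
    total = sum(len(toks) for toks in cats.values())
    counts = {name: Counter(toks) for name, toks in cats.items()}

    overall = Counter()
    for c in counts.values():
        overall.update(c)

    stats = {}
    for word, freq in overall.items():
        # Relative frequency of the word within each category.
        rates = [counts[name][word] / len(toks) for name, toks in cats.items()]
        rate_sum = sum(rates)
        probs = [r / rate_sum for r in rates if r > 0]

        # Assumed dispersion: entropy of the distribution over categories,
        # divided by its maximum possible value, log(n_cat).
        entropy = -sum(p * math.log(p) for p in probs)
        dispersion = entropy / math.log(n_cat) if n_cat > 1 else 1.0

        # Raw occurrences per one million running words.
        per_million = freq * 1_000_000 / total

        # Assumed log-transformed index.
        index = 10 * (math.log10(per_million) + 4)

        stats[word] = {
            "frequency": freq,
            "dispersion": dispersion,
            "per_million": per_million,
            "log_index": index,
        }
    return stats
```

Called with a mapping such as `{"novels": [...], "periodicals": [...]}`, the function returns one record per distinct word. A real analysis would also need tokenization and case-folding rules, which the abstract does not specify.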
