This chapter discusses some of the devices that help give test scores the kind of meaning they need in order to be useful as instruments of measurement. The concepts of scaling, norming, and equating and calibration are defined and then considered separately. The problems of comparable scores are also treated separately, but within the context of the equating of nonparallel tests. Types of scales discussed include raw score scales, mastery scales, linear transformation (standard score) scales, percentile rank scales, normalized scales (normalized standard scores), stanine scales, scaled scores, age equivalent scales, grade equivalent scales, EQ and AQ scales, nonnormative scales, and Tucker's proficiency scale. The discussion of norms and score interpretation covers the distinction between clinical and statistical norms; national norms; local norms; age and grade equivalents; age and grade norms; over- and under-achievement; expectancy tables; item norms; school mean norms; user-selected norms; "special-study" norms; norms that yield "direct meaning"; functional interpretations; quality ratings; profile charts; technical problems in the development of norms; sampling and sampling techniques (simple random sampling, stratified random sampling, systematic sampling, and cluster sampling); the size of tolerable error in norms; and general considerations in the development of norms. The discussion of equating and calibration covers the definition of equivalent scores, equating and calibration systems, the calibration of tests at different levels of ability, and various charts and tables of scores on different forms of the same test.
The discussion of comparable scores covers the definition of comparable scores, the distinction between equivalent and comparable scores, procedures for deriving comparable scores, the comparability of SAT-V and SAT-M scores, the difficulties of comparing College Board achievement test scores, and the uses of comparable scores along with the qualifications on their use.