The various methods for computing the reliability of scores on Advanced Placement Program® examinations are summarized. For the free-response portion of the examinations, raters can contribute to score unreliability both through systematic severity errors (some raters consistently rating more severely than others) and through inconsistency. Inconsistency appears to be a much greater problem than systematic severity errors. Question-to-question variation (a source of score unreliability) is seen as a greater problem than rater inconsistency. The impact of increasing or decreasing the number of topics is demonstrated by showing the proportion of students correctly classified as the number of topics changes. Procedures to enhance both rater and score reliability are discussed.
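
The abstract does not specify the formula used to project reliability as the number of topics changes; a standard tool for this purpose is the Spearman-Brown prophecy formula, sketched below. The function name and the illustrative reliability values are assumptions for illustration, not figures from the study.

```python
def spearman_brown(r, factor):
    """Predicted score reliability when test length changes.

    r      -- reliability of the current test (0 < r < 1)
    factor -- ratio of new length to old length (e.g., 2 doubles
              the number of topics; 0.5 halves it)
    """
    return factor * r / (1 + (factor - 1) * r)

# Hypothetical example: a free-response section with reliability 0.60,
# doubled in length (e.g., from 3 topics to 6).
print(round(spearman_brown(0.60, 2), 3))   # -> 0.75
```

Under this classical-test-theory projection, adding topics raises reliability with diminishing returns, which is consistent with the abstract's focus on how classification accuracy shifts as the number of topics changes.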