Variability in judgments of ESL compositions is inherent in the view that raters are “readers” who bring prior experiences to the task. Such a view, however, obliges researchers to understand how personal background and professional experience influence both scoring procedures and scoring criteria. These issues were explored by asking four raters to construct scoring criteria while assessing a corpus of 60 TOEFL essays without the aid of a scoring rubric, and to discuss their procedures and criteria in follow-up interviews. The study identified key points in the decision-making process at which raters’ behavior diverged and examined the impact of prior experience on these divergences. The identification of such divergences, and of potential explanations for them, was undertaken to lay the foundations for a principled account of rater variability.