Finding such evidence in the discourse is important because it strengthens the validity argument for the TOEFL iBT writing test and helps verify the rating scale descriptors used in operational rating. This study applied discourse-analytic measures to the writing of 480 test takers, each of whom responded to the two writing tasks. The discourse analysis focused on measures of accuracy, fluency, complexity, coherence, cohesion, content, orientation to source evidence, and metadiscourse. A multivariate analysis of variance (MANOVA) using a two-by-five (task type by proficiency level) factorial design with random permutations showed that the discourse produced by the test takers varied significantly on most variables under investigation. The discourse produced at different score levels also generally differed significantly. The findings are discussed in terms of the TOEFL iBT test validity argument, along with implications for rating scale validation and automated scoring.
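
For readers unfamiliar with permutation-based MANOVA, the general idea can be illustrated with a minimal sketch. It is not the study's actual procedure or software: the function names (`wilks_lambda`, `permutation_pvalue`), the choice of Wilks' lambda as the test statistic, and the synthetic data are all assumptions made for illustration. The sketch tests one factor (task type) by shuffling its labels within levels of the other factor (proficiency level), and for simplicity it treats observations as independent, ignoring the fact that each test taker wrote on both tasks.

```python
import numpy as np


def wilks_lambda(X, groups):
    """Wilks' lambda = det(W) / det(T); smaller values mean a stronger group effect."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    centered = X - X.mean(axis=0)
    T = centered.T @ centered  # total sums-of-squares-and-cross-products matrix
    W = np.zeros_like(T)       # pooled within-group SSCP matrix
    for g in np.unique(groups):
        d = X[groups == g] - X[groups == g].mean(axis=0)
        W += d.T @ d
    return np.linalg.det(W) / np.linalg.det(T)


def permutation_pvalue(X, groups, strata=None, n_perm=999, seed=0):
    """Permutation p-value for a group effect, shuffling labels within strata."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    if strata is None:
        strata = np.zeros(len(groups), dtype=int)  # one stratum: unrestricted shuffle
    strata = np.asarray(strata)
    observed = wilks_lambda(X, groups)
    as_extreme = 0
    for _ in range(n_perm):
        shuffled = groups.copy()
        for s in np.unique(strata):
            idx = np.flatnonzero(strata == s)
            shuffled[idx] = rng.permutation(groups[idx])
        if wilks_lambda(X, shuffled) <= observed:
            as_extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Synthetic data only (not the study's data): 2 task types x 5 levels, 3 measures.
rng = np.random.default_rng(42)
task = np.repeat([0, 1], 50)
level = np.tile(np.arange(5), 20)
X = rng.normal(size=(100, 3))
X[task == 1] += 1.0  # inject an artificial task effect

lam, p = permutation_pvalue(X, task, strata=level, n_perm=199)
print(f"Wilks' lambda = {lam:.3f}, permutation p = {p:.3f}")
```

Shuffling within strata keeps the proficiency-level composition of each task group fixed, so the reference distribution reflects only the task effect. A fuller analysis would also test the proficiency factor and the interaction, and would respect the repeated-measures structure of the design.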