SUMMARY: This paper reports the results of a pilot study that contributed to the development of the TOEIC Speaking and Writing tests. The analysis of test score reliability found evidence of several types of score consistency, including inter-rater reliability (the extent to which different raters agree on a score) and internal consistency (a measure based on the correlations among items on the same test). The correlational analysis found evidence that each test section measured three distinct claims about Speaking or Writing, as intended. These results informed the final specifications for the TOEIC Speaking and Writing tests and provided evidence of score reliability and of the validity of score interpretations.

ABSTRACT: A pilot study conducted in December 2006 evaluated the statistical properties of the TOEIC Speaking and Writing tests to determine whether the planned design for the tests had been achieved. The results of the study also helped fine-tune the final design of the tests before they were launched for operational use. The statistical analyses included determining the difficulty of the tests, estimating the correlations among different parts of the tests, and examining test score reliability and inter-rater reliability. This report documents the results of these analyses. This paper is part of the Research Foundation for TOEIC: A Compendium of Studies, published by ETS in 2010.
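
To make the two reliability notions named above concrete, the following is a minimal sketch rather than the study's actual procedure. The text here does not state which coefficients were computed, so Cronbach's alpha is assumed as one common internal-consistency index and a simple exact-agreement rate is assumed as one possible inter-rater index. The function names and all scores are hypothetical and serve only as illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (assumed index, not the study's).

    scores: list of rows, one row per test taker, one column per item.
    """
    k = len(scores[0])  # number of items
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two test takers")
    # Sum of the sample variances of the individual items (columns).
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Sample variance of each test taker's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - item_var_sum / total_var)


def exact_agreement(rater_a, rater_b):
    """Proportion of responses to which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rater score lists must be non-empty and equal in length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


if __name__ == "__main__":
    # Hypothetical item scores: 5 test takers x 3 items (not data from the study).
    item_scores = [
        [3, 2, 3],
        [1, 1, 2],
        [2, 2, 2],
        [3, 3, 3],
        [0, 1, 1],
    ]
    # Hypothetical scores assigned by two raters to the same 5 responses.
    rater_1 = [3, 1, 2, 3, 0]
    rater_2 = [3, 2, 2, 3, 1]

    print("Cronbach's alpha:", round(cronbach_alpha(item_scores), 3))
    print("Exact agreement:", exact_agreement(rater_1, rater_2))
```

Alpha approaches 1 when items rank test takers consistently, while the exact-agreement rate ignores how far apart two discrepant ratings are. Operational analyses often report adjacent agreement or a correlation between raters as well.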