A total of 2,080 final assessments were administered to 1,718 teachers in the six participating MET study districts. Assessment results for 194 teachers were excluded because of evidence that the assessments were either completed jointly by two or more participants or completed in too little time to represent a good-faith effort at answering the questions. The final sample included 1,886 assessments. Assessment scores were based on both selected-response and constructed-response (CR) questions. We used item-level statistics, including percent correct and biserial correlations, to systematically remove poorly performing items and thereby improve assessment reliability. Item-level statistics for each assessment are presented. Descriptive statistics and histograms indicate that participants' scores are well distributed over the range of possible scores. Assessment reliabilities were moderate to strong, ranging from 0.69 to 0.83.
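
The item-level statistics named above (percent correct, biserial correlation, and a reliability coefficient) can be illustrated with a minimal Python sketch. It is not the study's procedure, and several parts are assumptions: the items are treated as dichotomous (0/1), the item is correlated with the rest score (total minus the item itself), reliability is computed as Cronbach's alpha (the source does not say which coefficient was used), and the function names, the `min_item_rest_r` threshold, and the toy response matrix are all hypothetical and invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, pvariance


def percent_correct(responses, j):
    """Percentage of respondents who answered dichotomous item j correctly."""
    return 100.0 * mean(row[j] for row in responses)


def pearson(x, y):
    """Plain Pearson correlation; returns NaN if either variable is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def item_rest_correlation(responses, j):
    """Point-biserial correlation of item j with the rest score.

    Assumption: the item is excluded from the total so it does not
    correlate with itself.
    """
    item = [row[j] for row in responses]
    rest = [sum(row) - row[j] for row in responses]
    return pearson(item, rest)


def biserial_from_point_biserial(r_pb, p):
    """Convert a point-biserial r to a biserial r for proportion correct p.

    Uses r_b = r_pb * sqrt(p * (1 - p)) / phi(z_p), where phi is the
    standard normal density and z_p is the normal quantile at p.
    """
    if not 0.0 < p < 1.0:
        return float("nan")
    std_normal = NormalDist()
    ordinate = std_normal.pdf(std_normal.inv_cdf(p))
    return r_pb * sqrt(p * (1.0 - p)) / ordinate


def cronbach_alpha(responses):
    """Cronbach's alpha (assumed here as the reliability coefficient)."""
    k = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


def flag_weak_items(responses, min_item_rest_r=0.10):
    """Indices of items whose item-rest correlation is below a cutoff.

    The cutoff is a hypothetical value, not one taken from the study.
    Items with an undefined (NaN) correlation are flagged as well.
    """
    k = len(responses[0])
    return [
        j for j in range(k)
        if not item_rest_correlation(responses, j) >= min_item_rest_r
    ]


# Toy data invented for illustration: 5 respondents (rows) x 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

for j in range(len(responses[0])):
    p = percent_correct(responses, j) / 100.0
    r_pb = item_rest_correlation(responses, j)
    r_b = biserial_from_point_biserial(r_pb, p)
    print(f"item {j}: {100 * p:.0f}% correct, r_pb={r_pb:.2f}, r_b={r_b:.2f}")

print(f"alpha={cronbach_alpha(responses):.2f}, weak items={flag_weak_items(responses)}")
```

In a workflow of this kind, items flagged as weak would be dropped and the reliability coefficient recomputed on the remaining items, to check whether reliability actually improves.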