
Monitoring and Improving a Portfolio Assessment System IRT AP

Myford, Carol M.; Mislevy, Robert J.
Publication Year:
Report Number:
Report of the ETS Center for Performance Assessment
Document Type:
Page Count:
Subject/Key Words:
Rasch Model, Quality Assurance, Performance Assessment, Portfolios, Item Response Theory (IRT), Art, Advanced Placement Program (AP)


An assessment system can succeed in provoking productive and sustained performances, yet fail to support instruction and evaluation unless shared standards exist among students, teachers, and judges as to what is valued in performance and how it fits into an evaluative framework. Establishing and refining such a framework is especially difficult in large-scale settings that can involve hundreds of judges and thousands of students. This presentation advocates the interactive use of two complementary analytic perspectives and illustrates the approach in the context of the College Entrance Examination Board's Advanced Placement Studio Art portfolio assessment. The "naturalistic" component of the project involved in-depth discussions with judges about 18 portfolios from the 1992 assessment that received discrepant ratings. These discussions provided insights into the kinds of evidence, inference, arguments, and standards that underlie ratings. Since it is impossible to hold such discussions for each of the 50,000+ individual ratings produced in the assessment, summary results for each, in the form of numerical ratings, provided the data for the "statistical" component. Linacre's (1989) FACETS model was used to (i) summarize overall patterns in terms of effects for students, judges, and portfolio section, (ii) quantify the weight of evidence associated with these effects, and (iii) highlight rating profiles and judge/portfolio combinations that are unusual in light of the typical patterns. This focuses attention where it is apt to be most useful in improving the process (e.g., in clarifying expectations to students, improving judge training, or sharpening the definition of standards). Further, by making public the materials and results of both perspectives, one can better communicate the meaning and value of the work such an assessment engenders and the quality of the processes by which evidence about students' accomplishments is evaluated.
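The statistical component described above can be illustrated with a minimal sketch of a many-facet Rasch rating scale model, the family to which FACETS belongs. The abstract does not give the parameterization used in the report, so the additive form below (student proficiency minus section difficulty minus judge severity minus category threshold, on the logit scale) is an assumption, as are all function names and numeric values, which are purely illustrative. The sketch shows how the three kinds of effects combine into rating probabilities and how a standardized residual can flag a rating that is unusual given the typical pattern.

```python
import math

def category_probabilities(theta, section_difficulty, judge_severity, thresholds):
    """Rating probabilities for categories 0..K under an assumed
    many-facet rating scale model (hypothetical parameterization)."""
    # Logit-scale advantage of the student for this judge and section.
    eta = theta - section_difficulty - judge_severity
    # Log-numerator of category k is the sum over steps 1..k of (eta - tau).
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + eta - tau)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_num)
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rating(probs):
    """Model-expected rating: the mean of the category distribution."""
    return sum(k * p for k, p in enumerate(probs))

def standardized_residual(observed, probs):
    """(observed - expected) / sd; large absolute values mark ratings
    that are surprising under the fitted effects."""
    mean = expected_rating(probs)
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return (observed - mean) / math.sqrt(variance)

# Illustrative values only, not estimates from the report.
thresholds = [-1.0, 1.0]          # two steps, so categories 0, 1, 2
lenient = category_probabilities(0.5, 0.0, -0.5, thresholds)
severe = category_probabilities(0.5, 0.0, 1.0, thresholds)
```

Under these assumptions, the same student and section receive a lower expected rating from the more severe judge, and a rating far from its expectation yields a large residual. Such flagged judge/portfolio combinations are the natural candidates for the follow-up discussions the naturalistic component relies on.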
