
Sex-Related Performance Differences on Constructed-Response and Multiple-Choice Sections of Advanced Placement Examinations AP DIF

Mazzeo, John; Schmitt, Alicia P.; Bleistein, Carole A.
Publication Year:
Report Number:
ETS Research Report
Document Type:
Page Count:
Subject/Key Words:
Advanced Placement Program (AP), Constructed-Response Tests, Differential Item Functioning (DIF), Item Analysis, Multiple Choice Tests, Sex Differences


(29pp.) A number of studies in which scores on multiple-choice and constructed-response tests have been analyzed in terms of the sex of the test takers have indicated that the test performance of females relative to that of males was better on constructed-response tests than on multiple-choice tests. This report describes three exploratory studies of the performance of males and females on the multiple-choice and constructed-response sections of four Advanced Placement (AP) Examinations: United States History, Biology, Chemistry, and English Language and Composition. The studies were intended to evaluate some possible reasons for the apparent relationship between test format and the magnitude of sex-related differences in performance. For the first study, analyses were carried out to evaluate the extent to which such differences could be attributed to differences in the score reliabilities associated with these two modes of assessment. For the second study, analyses of the multiple-choice sections and follow-up descriptive analyses were conducted to assess the extent to which sex-related differences in multiple-choice scores could be attributed to the presence of differentially functioning items favoring males. For the third study, a set of exploratory analyses was undertaken to determine whether patterns of sex-related differences could be observed for different types of constructed-response questions.
The results of the first study provided little support for the "different-reliabilities" hypothesis. Across all exams and all ethnic groups, there were substantial differences between the scores of males and females even after taking into account differences in the reliabilities of the two sections. The results of the second study indicated that fairly small numbers of items exhibited substantial amounts of sex-related differential item functioning (DIF), and removing these items resulted in almost no reduction in the magnitude of sex-related differences on the multiple-choice sections. The results of the third study identified some consistent patterns across ethnic and racial groups regarding which questions females will perform best on, relative to males. However, taken as a whole, the results of the third study suggest that topic variability may have a greater effect than the variability associated with particular question types or broadly defined content areas.
