The Effect of the Position of an Item Within a Test on Item Responding Behavior: An Analysis Based on Item Response Theory
- Kingston, Neal M.; Dorans, Neil J.
- Publication Year:
- Report Number:
- Document Type:
- Subject/Key Words:
- Equated scores item analysis item response theory responses test construction test theory
The research described in this paper deals solely with the effect of the position of an item within a test on examinee's responding behavior at the item level. For simplicity's sake, this effect will be referred to as practice effect when the result is improved examinee performance and as fatigue effect when the result is poorer examinee performance. Item response theory item statistics were used to assess position effects because, unlike traditional item statistics, they are sample invariant. In addition, the use of item response theory statistics allows one to make a reasonable adjustment for speededness, which is important when, as in this research, the same item administered in different positions is likely to be affected differently by speededness, depending upon its location in the test. Five types of analyses were performed as part of this research. The first three types involved analyses of differences between the two estimations of item difficulty (b), item discrimination (a), and pseudoguessing (c) parameters. The fourth type was an analysis of the differences between equatings based on items calibrated when administered in the operational section and equatings based on items calibrated when administered in section V. Finally, an analysis of the regression of the difference between b's on item position within the operational section was conducted. The analysis of estimated item difficulty parameters showed a strong practice effect for analysis of explanations and logical diagrams items and a moderate fatigue effect for reading comprehension items. Analysis of other estimated item parameters, a and c, produced no consistent results for the two forms analyzed. Analysis of the difference between equatings for Form 3CGR1 reflected the differences between estimated b's found for the verbal, quantitative, and analytical item types. A large practice effect was evident for the analytical section, a small practice effect, probably due to capitalization on chance, was found for the verbal section. Analysis of the regression of the difference between b's on item position within the operational section for analysis of explanations items showed a rather consistent relationship for Form ZGR1 and a weaker but still definite relationship for Form 3CGR1. The results of this research strongly suggest one particularly important implication for equating. If an item type exhibits a within-test context effect, any equating method, e.g., IRT based equating, that uses item data either directly or as part of an equating section score should provide for administration of the items in the same position in the old and new forms. Although a within-test context effect might have a negligible influence on a single equating, a chain of such equatings might drift because of the systematic bias.