One of the persistent problems facing the researcher in the field of testing is that of determining the extent to which a test is appropriate for use with different subsamples of a general population. An obvious but somewhat difficult approach to the problem would be to validate the test under consideration for every identifiable subgroup in the population and to develop specific interpretative guides for use with each of the subgroups. Such an approach, however, throws little light on the question of what factors are responsible for any differences that appear. To develop such an understanding, one must deal with the data at the level of the individual test item. But a direct comparison of the difficulty of an individual item across subsamples of the population is complicated by the fact that the subsamples may actually differ significantly in the overall complex of abilities measured by the test. One needs a method that separates the overall differences from differences attributable to individual questions; in other words, one needs a method that provides an estimate of the interaction of items with subsamples. In this study, the two-factor analysis of variance design with repeated measures on one factor is shown to be appropriate to the problem if item difficulties are first subjected to the arcsin transformation. Application is illustrated using SAT candidates from three different subgroups of the total population: Black candidates from the Southeast, candidates from small town and rural centers in Indiana and Illinois, and candidates from centers in the Bronx, New York.
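The idea of separating overall subsample differences from item-specific ones can be illustrated with a minimal sketch. It is not the study's repeated-measures analysis. It only applies an arcsin transformation to item difficulties and then removes item and subsample main effects, leaving item-by-subsample interaction residuals. The proportions below are hypothetical, the function names are invented for illustration, and the form 2·arcsin(√p) is one common convention that the study may or may not have used.

```python
import math

def arcsin_transform(p):
    """Arcsin (angular) transform of a proportion; assumed form 2*asin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(p))

def interaction_residuals(table):
    """Double-center an items x groups table.

    Each residual is x_ij - item_mean_i - group_mean_j + grand_mean, i.e. the
    part of a cell not explained by item difficulty or overall group level.
    """
    n_items = len(table)
    n_groups = len(table[0])
    grand = sum(sum(row) for row in table) / (n_items * n_groups)
    item_means = [sum(row) / n_groups for row in table]
    group_means = [
        sum(table[i][j] for i in range(n_items)) / n_items
        for j in range(n_groups)
    ]
    return [
        [table[i][j] - item_means[i] - group_means[j] + grand
         for j in range(n_groups)]
        for i in range(n_items)
    ]

# Hypothetical proportions correct: rows are items, columns are subsamples.
p_correct = [
    [0.80, 0.70, 0.75],
    [0.60, 0.50, 0.55],
    [0.40, 0.45, 0.30],
]

transformed = [[arcsin_transform(p) for p in row] for row in p_correct]
residuals = interaction_residuals(transformed)

for i, row in enumerate(residuals):
    print(f"item {i}: " + "  ".join(f"{r:+.3f}" for r in row))
```

Residuals near zero suggest an item behaves the same way in every subsample once overall level is accounted for. A large residual flags an item that is unusually easy or hard for one subsample. Judging whether such a residual exceeds chance requires an error term, which is what the analysis of variance design supplies.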