This is a description of work in progress; only preliminary results are available so far. We are presenting the work now because both the substantive findings and the methodology are of interest, and the methodology may be useful in tackling a variety of research problems. The study involves the Test of English as a Foreign Language (TOEFL), a test of English proficiency taken by foreign students who plan to study here, typically at the college or graduate-school level. The test has several kinds of items: three kinds of listening comprehension items, incomplete sentences, incorrect words and phrases, vocabulary, and reading comprehension. The usual way of studying a test is to pool all the examinees and score each item as right or wrong. The questions that motivate this investigation, however, require a different approach. We therefore decided to look at how the clustering of items differs across groups of examinees that vary systematically in native language and overall ability. The items are clustered on the basis of all the responses: choosing the right answer, choosing one of the distractors, omitting an item, or not reaching it. We could then try to infer what the items in each cluster have in common.
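To make the idea of clustering items on all response categories concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure used in the study: the association measure (Cramér's V), the average-linkage rule, the response codes (`"O"` for an omitted item, `"N"` for an item not reached), the item names, and the toy data are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations
import math


def cramers_v(x, y):
    """Cramér's V between two categorical response vectors.

    0 means the two items' responses are independent; 1 means the
    response to one item fully determines the response to the other.
    Assumed measure for illustration only.
    """
    n = len(x)
    joint = Counter(zip(x, y))
    cx, cy = Counter(x), Counter(y)
    k = min(len(cx), len(cy)) - 1
    if k == 0:  # an item with a single observed category carries no information
        return 0.0
    chi2 = 0.0
    for a in cx:
        for b in cy:
            expected = cx[a] * cy[b] / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


def cluster_items(responses, n_clusters):
    """Average-linkage agglomerative clustering of items.

    responses: dict mapping item name -> list of response codes,
    one code per examinee, in the same examinee order for every item.
    """
    names = list(responses)
    # Distance between two items = 1 - association of their full response patterns.
    dist = {
        frozenset((a, b)): 1.0 - cramers_v(responses[a], responses[b])
        for a, b in combinations(names, 2)
    }

    def linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical responses of eight examinees from one group to four items.
# Letters are answer options; "O" = omitted, "N" = not reached.
toy_group = {
    "listening_1": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "listening_2": ["C", "C", "C", "C", "D", "D", "D", "D"],
    "vocab_1":     ["A", "B", "A", "B", "A", "B", "A", "B"],
    "vocab_2":     ["D", "O", "D", "O", "D", "O", "D", "N"],
}

for cluster in cluster_items(toy_group, n_clusters=2):
    print(cluster)
```

In this sketch the clustering would be run separately for each examinee group (for example, one native-language group at one ability level), and the resulting clusters compared across groups. Because every response code is treated as its own category, two items can end up in the same cluster when examinees are drawn to related distractors or tend to omit both, not only when they get both right.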