This report is the first systematic evaluation of the sentence equivalence item type introduced by the GRE revised General Test. To guide the investigation, we adopt a validity framework based on Kane’s approach to validation, which evaluates a hierarchy of inferences that should be documented to support score meaning and interpretation. We present evidence relevant to the generalization inference as well as evidence of construct representation. We analyzed the pool of sentence equivalence items in three studies. The first and second studies focused on the generalization inference and sought to document the construction principles behind the items, specifically the nature of the vocabulary tested. The third study focused on construct representation and evaluated the contribution of the stem, the keys, and the distractors to item difficulty. We concluded that the vocabulary tested by the sentence equivalence items is appropriate given the purpose of the GRE, namely, to assist in the selection of graduate students. Item difficulty was shown to be, in part, a function of both the familiarity of the vocabulary and the context in which the vocabulary is tested, which we argue is positive validity evidence.