A method, system and processor-readable storage medium for evaluating vocabulary similarity are disclosed. A generic rate may be determined for each word in a plurality of first responses. Each first response may respond to one of a plurality of first prompts, and at least one first response may respond to each of the first prompts. A specific rate may be determined for each word in a plurality of second responses, each of which responds to a second prompt. A target response may be received that is associated with the second prompt and has a plurality of words. A vocabulary similarity index may be computed for the target response based on one or more generic rates and one or more specific rates. A determination of whether the target response is off-topic may be made based on the vocabulary similarity index for the target response.
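
The steps above can be illustrated with a minimal sketch. The abstract does not define how the rates or the index are calculated, so everything concrete here is an assumption made for illustration: a word's rate is taken to be the fraction of responses containing it, the index is taken to be the mean difference between specific and generic rates over the words of the target response, and the threshold of 0.0 is arbitrary. The function names `word_rates`, `vocabulary_similarity_index` and `is_off_topic` are hypothetical.

```python
from collections import Counter


def word_rates(responses):
    """Assumed definition: fraction of responses that contain each word."""
    counts = Counter()
    for text in responses:
        # Count each word once per response.
        counts.update(set(text.lower().split()))
    n = len(responses)
    return {word: c / n for word, c in counts.items()} if n else {}


def vocabulary_similarity_index(target, generic_rates, specific_rates):
    """Assumed formula: mean of (specific rate - generic rate) over target words."""
    words = target.lower().split()
    if not words:
        return 0.0
    total = sum(
        specific_rates.get(w, 0.0) - generic_rates.get(w, 0.0) for w in words
    )
    return total / len(words)


def is_off_topic(index, threshold=0.0):
    """Assumed decision rule: flag the response when the index is at or below the threshold."""
    return index <= threshold
```

Under these assumptions, generic rates would come from `word_rates` applied to responses pooled across many prompts, and specific rates from `word_rates` applied to responses to the one prompt of interest. A target response that mostly uses words common for that prompt but rare elsewhere scores high; one that uses words unrelated to the prompt scores low.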