Lexical Association Metric for Knowledge-Free Extraction of Phrasal Terms
- Author(s): Deane, Paul
- Patent Issued: Dec 13, 2011
- Patent Number: 8,078,452
- Source: ETS Patent
- Document Type: Patent
- Family ID: 35150613
- Subject/Key Words: Patent, Active Patent, Automated Response Evaluation, Phrasal Terms, Corpus Analysis, Context (Linguistics), Ranking and Selection (Statistics)
Abstract
A method and system for determining a lexical association of phrasal terms are described. A corpus having a plurality of words is received, and a plurality of contexts, each including one or more context words proximate to a word in the corpus, is determined. An occurrence count is determined for each context, and a global rank is assigned to the context based on that count. Similarly, the number of occurrences of a word being used in a context is determined, and a local rank is assigned to the word-context pair based on that number. A rank ratio, equal to the global rank divided by the local rank, is then determined for each word-context pair. A mutual rank ratio is determined by multiplying the rank ratios corresponding to a phrase. The mutual rank ratio is used to identify phrasal terms in the corpus.
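The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the abstract does not state: phrases are limited to two-word sequences, a context is a single adjacent word together with its side (`"after"` or `"before"`), rank 1 goes to the highest count, tied counts share a rank, and a word's local ranks are computed among the contexts that word occurs in. The names `rank_by_count` and `mutual_rank_ratios` are hypothetical.

```python
from collections import Counter, defaultdict


def rank_by_count(counts):
    """Map each key to a 1-based rank, where rank 1 is the highest count.

    Assumption: tied counts share the same rank (dense ranking).
    """
    distinct = sorted(set(counts.values()), reverse=True)
    rank_of_count = {count: i + 1 for i, count in enumerate(distinct)}
    return {key: rank_of_count[count] for key, count in counts.items()}


def mutual_rank_ratios(tokens):
    """Score every two-word sequence in `tokens` by the product of its rank ratios."""
    bigrams = Counter(zip(tokens, tokens[1:]))

    context_counts = Counter()          # context -> total occurrences
    pair_counts = defaultdict(Counter)  # word -> context -> occurrences

    for (first, second), n in bigrams.items():
        # `second` fills the slot that follows `first`.
        context_counts[("after", first)] += n
        pair_counts[second][("after", first)] += n
        # `first` fills the slot that precedes `second`.
        context_counts[("before", second)] += n
        pair_counts[first][("before", second)] += n

    # Global rank: each context ranked against all contexts in the corpus.
    global_rank = rank_by_count(context_counts)
    # Local rank: each context ranked among the contexts of one word.
    local_rank = {word: rank_by_count(c) for word, c in pair_counts.items()}

    scores = {}
    for first, second in bigrams:
        after_ctx = ("after", first)
        before_ctx = ("before", second)
        ratio_second = global_rank[after_ctx] / local_rank[second][after_ctx]
        ratio_first = global_rank[before_ctx] / local_rank[first][before_ctx]
        # Per the abstract: multiply the rank ratios belonging to the phrase.
        scores[(first, second)] = ratio_first * ratio_second
    return scores


if __name__ == "__main__":
    corpus = "a b a b a c".split()
    for phrase, score in sorted(mutual_rank_ratios(corpus).items()):
        print(phrase, score)
```

Under these assumptions, a phrase scores highly when each word sits near the top of the other's local ranking even though the context itself is rare in the corpus overall, which is the pattern the abstract uses to identify phrasal terms.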