Cooccurrence and Constructions
- Author(s): Deane, Paul
- Patent Issued: Apr 03, 2012
- Patent Number: 8,147,250
- Source: ETS Patent
- Document Type: Patent
- Family ID: 34193204
- Subject/Key Words: Patent, Active Patent, Automated Scoring and Natural Language Processing, Automated Essay Scoring (AES), Automated Response Evaluation, Test Constructions, Ranking, Sentence Arrangement Test, Similarity Measures, Correspondence, Corpus Analysis, Cooccurrence, Text Analysis
Abstract
A method and system for performing automatic text analysis are described. A local ranking for one or more contexts with respect to a word and a global ranking for one or more contexts are generated. The rankings are based on the frequency with which the contexts appear in a corpus. A statistic may be generated using the local and global rankings, such as a log ratio rank statistic equal to the logarithm of the global rank divided by the local rank, to measure the similarity of contexts with respect to the words with which they combine. A source matrix of word-to-context values is then created. Singular value decomposition is used to create sub-matrices from the source matrix. Vectors from the sub-matrices corresponding to context(s) and/or word(s) are then selected to determine term-term or context-context similarity or term-context correspondence.
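The pipeline in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: the toy count matrix, the function names, reading the statistic as log(global rank / local rank), breaking rank ties by position, assigning 0 to word-context pairs that never occur, scaling the reduced vectors by the singular values, and using cosine as the similarity measure.

```python
import numpy as np


def rank_descending(freqs):
    """Rank 1 = most frequent. Ties are broken by position (an assumption)."""
    freqs = np.asarray(freqs, dtype=float)
    order = np.argsort(-freqs, kind="stable")
    ranks = np.empty(len(freqs), dtype=float)
    ranks[order] = np.arange(1, len(freqs) + 1)
    return ranks


def log_rank_ratio_matrix(counts):
    """Build a word-by-context source matrix of log(global_rank / local_rank).

    counts[i, j] is how often word i occurs in context j.
    Unseen word-context pairs are left at 0 (an assumption).
    """
    counts = np.asarray(counts, dtype=float)
    # Global ranking: contexts ordered by total frequency in the corpus.
    global_rank = rank_descending(counts.sum(axis=0))
    source = np.zeros(counts.shape, dtype=float)
    for i, row in enumerate(counts):
        # Local ranking: contexts ordered by frequency with this word.
        local_rank = rank_descending(row)
        seen = row > 0
        source[i, seen] = np.log(global_rank[seen] / local_rank[seen])
    return source


def reduce_with_svd(source, k):
    """Keep the top-k singular dimensions; return word and context vectors."""
    U, s, Vt = np.linalg.svd(source, full_matrices=False)
    word_vecs = U[:, :k] * s[:k]        # one row per word
    context_vecs = Vt[:k, :].T * s[:k]  # one row per context
    return word_vecs, context_vecs


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


# Toy corpus counts (hypothetical): 3 words (rows) by 3 contexts (columns).
counts = np.array([
    [5, 1, 0],
    [1, 4, 0],
    [3, 3, 6],
])
source = log_rank_ratio_matrix(counts)
word_vecs, context_vecs = reduce_with_svd(source, k=2)

term_term = cosine(word_vecs[1], word_vecs[2])              # word 1 vs. word 2
context_context = cosine(context_vecs[0], context_vecs[1])  # context 0 vs. 1
term_context = cosine(word_vecs[2], context_vecs[2])        # word 2 vs. context 2
```

Under this reading, a context that ranks higher for a particular word than it does in the corpus overall receives a positive value, and one that ranks lower receives a negative value. Word and context vectors come from the same reduced space, so the same similarity function serves all three comparisons named in the abstract.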