Systems and Methods for Identifying Collocation Errors in Text

Futagi, Yoko; Deane, Paul; Chodorow, Martin
Jun 25, 2013
Patent, Active Patent, Collocation (Linguistics), Automatic Error Detection, Automated Scoring and Natural Language Processing, Language Learning Tools, Syntactic Analysis, Linguistic Annotation, Text Analysis


Systems and methods are provided for detecting collocation errors in a text sample using a reference database derived from a corpus. Collocation candidates are identified within the text sample based on syntactic patterns in the text sample. For a given collocation candidate, detecting whether it contains a collocation error includes: determining a first association measure for the candidate using the reference database; determining whether the first association measure satisfies a predetermined condition and, if so, identifying the candidate as proper; determining an additional association measure for a variation of the candidate using the reference database; and determining whether the candidate contains an error based on the additional association measure of the variation.
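
The detection steps in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: pointwise mutual information (PMI) stands in for the association measure, a fixed threshold stands in for the predetermined condition, variations are produced by swapping the first word for a hand-listed alternative, and the counts form a toy reference database. Candidate extraction from syntactic patterns is omitted; candidate pairs are passed in directly.

```python
import math

# Toy reference database (hypothetical counts, not from any real corpus).
PAIR_COUNTS = {
    ("strong", "tea"): 50,
    ("powerful", "tea"): 1,
    ("powerful", "computer"): 40,
    ("strong", "computer"): 2,
}
WORD_COUNTS = {"strong": 500, "powerful": 400, "tea": 300, "computer": 600}
TOTAL = 100_000  # assumed total number of pairs in the reference corpus

# Hypothetical substitution table used to generate variations of a candidate.
ALTERNATIVES = {"powerful": ["strong"], "strong": ["powerful"]}


def pmi(pair):
    """Association measure (assumed here to be PMI) looked up in the toy database."""
    joint = PAIR_COUNTS.get(pair, 0)
    if joint == 0:
        return float("-inf")  # unseen pairs get the lowest possible score
    w1, w2 = pair
    return math.log2(joint * TOTAL / (WORD_COUNTS[w1] * WORD_COUNTS[w2]))


def variations(pair):
    """Generate variations by replacing the first word with a listed alternative."""
    w1, w2 = pair
    return [(alt, w2) for alt in ALTERNATIVES.get(w1, [])]


def check_collocation(pair, threshold=3.0):
    """Apply the two-stage check to one collocation candidate."""
    # Stage 1: a candidate whose own score meets the condition is treated as proper.
    if pmi(pair) >= threshold:
        return {"pair": pair, "error": False, "suggestion": None}
    # Stage 2: score the variations. Assumed decision rule: the candidate is
    # flagged as an error only if some variation meets the condition.
    best = max(variations(pair), key=pmi, default=None)
    if best is not None and pmi(best) >= threshold:
        return {"pair": pair, "error": True, "suggestion": best}
    return {"pair": pair, "error": False, "suggestion": None}


if __name__ == "__main__":
    for candidate in [("strong", "tea"), ("powerful", "tea")]:
        print(check_collocation(candidate))
```

In this sketch, a weakly associated candidate is flagged only when a better-attested variation exists, so the variation doubles as a suggested correction. A candidate with no attested variation is left unflagged, which is one possible design choice among several.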

