System and Method for Automated Detection of Plagiarized Spoken Responses
- Author(s): Evanini, Keelan; Wang, Xinhao
- Patent Issued: Sep 13, 2016
- Patent Number: 9,443,513
- Source: ETS Patent
- Document Type: Patent
- Family ID: 54142709
- Subject/Key Words: Patent, Active Patent, Automatic Speech Recognition, Plagiarism, Automated Scoring and Natural Language Processing, Automated Scoring of Speech, Spoken Language Assessment, Content-Based Scoring
Abstract
Systems and methods are provided for automated detection of plagiarized spoken responses. A spoken response is processed to generate a text that is representative of the spoken response. The text is processed to remove disfluencies in the text and to identify a plurality of sentences in the text. A first numerical measure indicative of a number of words and phrases of the text that are included verbatim in a source text is determined. The source text has been designated as a source of plagiarized content. A second numerical measure indicative of an amount of the text that paraphrases portions of the source text is determined. A third numerical measure indicative of a similarity between sentences of the text and sentences of the source text is determined. A model is applied to the first, second, and third numerical measures to classify the spoken response as being plagiarized or non-plagiarized.
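The pipeline in the abstract can be illustrated with a minimal sketch. The abstract does not say how each measure is computed or which model is used, so everything concrete here is an assumption made for illustration: trigram overlap stands in for the first (verbatim) measure, distinct-word overlap is a crude proxy for the second (paraphrase) measure, best-match Jaccard similarity stands in for the third (sentence similarity) measure, and the model is a logistic function with hand-picked, hypothetical weights. The input is assumed to be a transcript already produced by a speech recognizer, with sentence-final punctuation present; the filler-word list and all function names are likewise hypothetical.

```python
import math
import re

FILLERS = {"um", "uh", "er", "hmm"}  # hypothetical filler-word list


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def remove_disfluencies(text):
    """Drop filler words and immediate word repetitions (e.g. 'the the')."""
    kept = []
    for token in text.split():
        word = token.lower().strip(".,!?")
        if word in FILLERS:
            continue
        if kept and kept[-1].lower().strip(".,!?") == word:
            kept[-1] = token  # keep the later copy so trailing punctuation survives
            continue
        kept.append(token)
    return " ".join(kept)


def split_sentences(text):
    # Assumes the transcript carries sentence-final punctuation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def verbatim_measure(response, source, n=3):
    """First measure (assumed form): share of response n-grams found in the source."""
    resp = ngrams(tokenize(response), n)
    if not resp:
        return 0.0
    return len(resp & ngrams(tokenize(source), n)) / len(resp)


def paraphrase_measure(response, source):
    """Second measure (crude proxy): share of distinct response words also in the source."""
    resp = set(tokenize(response))
    if not resp:
        return 0.0
    return len(resp & set(tokenize(source))) / len(resp)


def sentence_similarity_measure(response, source):
    """Third measure (assumed form): mean best Jaccard match per response sentence."""
    src = [set(tokenize(s)) for s in split_sentences(source)]
    best = []
    for sentence in split_sentences(response):
        words = set(tokenize(sentence))
        scores = [len(words & s) / len(words | s) for s in src if words | s]
        best.append(max(scores, default=0.0))
    return sum(best) / len(best) if best else 0.0


# Hypothetical, hand-picked parameters; a real model would be trained on labeled data.
WEIGHTS = (4.0, 2.0, 3.0)
BIAS = -4.5


def classify(response_text, source_text, threshold=0.5):
    """Return the three measures and True if the response is flagged as plagiarized."""
    cleaned = remove_disfluencies(response_text)
    features = (
        verbatim_measure(cleaned, source_text),
        paraphrase_measure(cleaned, source_text),
        sentence_similarity_measure(cleaned, source_text),
    )
    z = BIAS + sum(w * f for w, f in zip(WEIGHTS, features))
    probability = 1.0 / (1.0 + math.exp(-z))
    return features, probability >= threshold


if __name__ == "__main__":
    source = ("The water cycle moves water between the ocean and the sky. "
              "Heat from the sun drives evaporation.")
    copied = ("Um the the water cycle moves water between the ocean and the sky. "
              "Uh heat from the sun drives evaporation.")
    print(classify(copied, source))
    print(classify("I like to play football with my friends on weekends.", source))
```

In this sketch the three measures are kept as separate features instead of being merged into one score, mirroring the abstract's description of a model applied to the first, second, and third measures together. A response that recites the source with added fillers scores high on all three, while an unrelated response scores zero on each.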