Systems and methods are provided for automatic detection of plagiarized spoken responses during standardized testing of language proficiency. A first spoken response to a speaking task that elicits spontaneous speech and a second spoken response to a source-based speaking task that is assumed to be non-plagiarized are digitally recorded. First and second sets of speaking proficiency features are calculated for the first and second spoken responses, respectively. The first spoken response is classified as plagiarized or non-plagiarized based on a comparison between the first and second sets of speaking proficiency features. Corresponding apparatuses, systems, and methods are also disclosed.
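
By way of illustration only, the comparison and classification steps described above might be sketched as follows. The abstract does not name the specific speaking proficiency features, the form of the comparison, or the classifier, so everything in this sketch is an assumption made for the example: the two features (`words_per_second`, `mean_pause_seconds`), the summed absolute difference used as a discrepancy score, and the fixed `threshold`. A practical embodiment would more plausibly use a trained statistical classifier over a larger feature set.

```python
from dataclasses import dataclass


@dataclass
class ProficiencyFeatures:
    """Hypothetical speaking proficiency features for one recorded response."""
    words_per_second: float    # assumed fluency measure
    mean_pause_seconds: float  # assumed pausing measure


def feature_differences(first: ProficiencyFeatures,
                        second: ProficiencyFeatures) -> list[float]:
    """Compare the two feature sets feature by feature (first minus second)."""
    return [
        first.words_per_second - second.words_per_second,
        first.mean_pause_seconds - second.mean_pause_seconds,
    ]


def classify_first_response(first: ProficiencyFeatures,
                            second: ProficiencyFeatures,
                            threshold: float = 1.0) -> str:
    """Label the first (spontaneous-task) response.

    Assumed rule: if the first response differs from the second response,
    which is taken to be non-plagiarized, by more than `threshold` in total,
    the first response is labeled plagiarized. The threshold is a placeholder,
    not a value taken from the disclosure.
    """
    score = sum(abs(d) for d in feature_differences(first, second))
    return "plagiarized" if score > threshold else "non-plagiarized"


# Illustrative values only; these are not measured data.
spontaneous = ProficiencyFeatures(words_per_second=3.5, mean_pause_seconds=0.2)
source_based = ProficiencyFeatures(words_per_second=2.0, mean_pause_seconds=0.6)
label = classify_first_response(spontaneous, source_based)
```

In this sketch the second response serves as a per-speaker baseline, so a first response that is markedly more fluent than the baseline yields a large discrepancy score.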