NLP-guided Video Thin-slicing for Automated Scoring of Non-Cognitive, Behavioral Performance Tasks
- Author(s): Leong, Chee Wee; Chen, Xianyang; Basheerabad, Vinay; Lee, Chong Min; Houghton, Patrick
- Patent Issued: May 13, 2025
- Patent Number: 12,300,244
- Source: ETS Patent
- Document Type: Patent
- Family ID: 1000006574956
- Subject/Key Words: Patent, Active Patent, Artificial Intelligence, Machine Learning, Video, Natural Language Processing (NLP), Automated Scoring, Automatic Speech Recognition, Noncognitive Assessment
Abstract
Data is received that encapsulates a video of a subject performing a task. This video is used to generate a transcript using an automatic speech recognition (ASR) system. A plurality of text segments are generated from the transcript and then tokenized. A textual representation of each segment is extracted by a transformer model using the tokenized text segment (i.e., the tokens corresponding to the text segment). Thereafter, for each segment, a fused representation derived from the textual representations and corresponding visual and audio features from the video is generated. A sparse attention machine learning model then selects an optimal slice of the video based on the fused representations. The optimal slice can then be input into one or more machine learning models trained to characterize performance of the task by the subject.
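The fusion and slice-selection steps described in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes per-segment text, visual, and audio feature vectors are already available, fuses them by simple concatenation, and uses sparsemax (a standard sparse alternative to softmax) as a stand-in for the sparse attention model. The names `fuse`, `select_slice`, and the `query` vector are hypothetical and introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def sparsemax(scores: Sequence[float]) -> List[float]:
    """Project scores onto the probability simplex (sparsemax).

    Unlike softmax, the result can contain exact zeros, so low-scoring
    segments are dropped outright instead of being down-weighted.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumsum += z
        # The support condition holds for a prefix of the sorted scores;
        # the last k that satisfies it determines the threshold tau.
        if 1 + k * z > cumsum:
            tau = (cumsum - 1) / k
    return [max(s - tau, 0.0) for s in scores]


def fuse(text: Sequence[Sequence[float]],
         visual: Sequence[Sequence[float]],
         audio: Sequence[Sequence[float]]) -> List[List[float]]:
    """Assumed fusion: concatenate the three modalities for each segment."""
    return [list(t) + list(v) + list(a) for t, v, a in zip(text, visual, audio)]


def select_slice(fused: Sequence[Sequence[float]],
                 query: Sequence[float]) -> Tuple[List[int], List[float]]:
    """Score each fused segment against a (hypothetical) learned query
    vector and keep the segments that receive non-zero sparse attention."""
    scores = [sum(f * q for f, q in zip(seg, query)) for seg in fused]
    weights = sparsemax(scores)
    selected = [i for i, w in enumerate(weights) if w > 0.0]
    return selected, weights


# Toy features for four transcript segments (illustrative values only).
text_feats = [[0.1, 0.2], [0.9, 0.1], [1.0, 0.3], [0.2, 0.0]]
visual_feats = [[0.0], [0.8], [0.9], [0.1]]
audio_feats = [[0.2], [0.4], [0.6], [0.0]]
query = [1.0, 0.0, 1.0, 0.5]  # stands in for learned attention parameters

fused = fuse(text_feats, visual_feats, audio_feats)
selected, weights = select_slice(fused, query)
print(selected)  # selected segment indices: [1, 2]
```

In this toy run, segments 1 and 2 receive all of the attention mass and the other two get exactly zero, which is the property that makes a sparse attention mechanism usable as a slice selector. In a full pipeline, the selected segments' time spans would be mapped back to the video and passed to the downstream scoring models.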