Inside SourceFinder: Predicting the Acceptability Status of Candidate Reading Comprehension Source Documents

Author(s):
Sheehan, Kathleen M.; Kostin, Irene; Futagi, Yoko; Hemat, Ramin; Zuckerman, Daniel
Publication Year:
2006
Report Number:
RR-06-24
Source:
Document Type:
Subject/Key Words:
Content vector analyses, GRE®, reading-comprehension stimuli, source-acceptability modeling, SourceFinder

Abstract

This paper describes the development, implementation, and evaluation of an automated system for predicting the acceptability status of candidate reading-comprehension stimuli extracted from a database of journal and magazine articles. The system uses a combination of classification and regression techniques to predict the probability that at least one test developer will deem a given document acceptable for use in completing a specified passage-creation assignment. The text features underlying the estimated models are extracted automatically with natural language processing techniques. Model performance is evaluated by comparing the proportion of acceptable documents located with the screening capability turned on to the proportion located with it turned off. The evaluation suggests that the estimated models capture useful information about the text characteristics that affect test developers' ratings of source acceptability, and that the models can help test developers find more high-quality sources in less time.
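
For readers who want a concrete picture of the kind of pipeline the abstract describes, the sketch below is a minimal, hypothetical illustration: text features are extracted automatically, a probabilistic classifier is fit to test developers' acceptability judgments, and new candidate documents are filtered by their predicted probability of acceptance. The feature set, the logistic-regression model, and the placeholder data are assumptions made for illustration only; they are not the features or models estimated in the report.

    # Hypothetical sketch of a source-screening pipeline (not the report's actual model).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def extract_features(text):
        """Toy stand-in for automated NLP feature extraction (assumed features only)."""
        words = text.split()
        n_words = len(words)
        avg_word_len = sum(len(w) for w in words) / max(n_words, 1)
        n_sentences = max(text.count("."), 1)
        return [n_words, avg_word_len, n_words / n_sentences]

    # Placeholder training data: candidate documents paired with 0/1 labels
    # (1 = at least one test developer rated the source acceptable).
    train_texts = [
        "A dense, well-argued essay on the evolution of coral reef ecosystems. ...",
        "Short promotional blurb. Buy now.",
        "An extended historical analysis of early printing technologies. ...",
        "Table of contents. Chapter one. Chapter two.",
    ]
    train_labels = [1, 0, 1, 0]

    X = np.array([extract_features(t) for t in train_texts])
    y = np.array(train_labels)
    model = LogisticRegression(max_iter=1000).fit(X, y)

    def acceptability_probability(text):
        """Predicted probability that a candidate document would be deemed acceptable."""
        return float(model.predict_proba([extract_features(text)])[0, 1])

    def screen(candidates, threshold=0.5):
        """Keep only candidates whose predicted acceptability clears the threshold."""
        return [c for c in candidates if acceptability_probability(c) >= threshold]

In this toy setup, turning screening "on" corresponds to reviewing only the documents returned by screen(), while "off" corresponds to reviewing all candidates; comparing the share of acceptable sources found under the two conditions mirrors the evaluation strategy the abstract describes.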
