Systems and methods for training raters to rate constructed responses to tasks are described herein. In one embodiment, a plurality of trainee raters are selected without regard to their prior rating experience. The trainee raters are then trained in individual training sessions, during which they are asked to rate responses to a task. Each session presents to the trainee rater the task, a rating rubric, and training responses to the task. The training program receives, through a graphical user interface, the ratings assigned by the trainee rater to the training responses. Upon receiving each assigned rating, the training program presents feedback substantially immediately and determines a score for the trainee rater's assigned rating. Thereafter, qualified raters are selected from the plurality of trainee raters based upon their performance during the training sessions as compared with a statistical model. Operational constructed responses are then assigned to be rated by the qualified raters.
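
The flow described above (receive a rating, give immediate feedback, score the rating, then select qualified raters) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the described embodiment: the classes TrainingResponse and TraineeRater, the reference rating attached to each training response, the exact/adjacent scoring rule, and the z-score cutoff that stands in for the unspecified statistical model.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class TrainingResponse:
    text: str
    reference_rating: int  # hypothetical: a pre-assigned reference rating


@dataclass
class TraineeRater:
    name: str
    scores: list = field(default_factory=list)  # one score per rated response


def score_rating(assigned: int, reference: int) -> float:
    """Hypothetical scoring rule: 1.0 if exact, 0.5 if adjacent, else 0.0."""
    diff = abs(assigned - reference)
    if diff == 0:
        return 1.0
    if diff == 1:
        return 0.5
    return 0.0


def rate_and_give_feedback(rater: TraineeRater,
                           response: TrainingResponse,
                           assigned: int) -> str:
    """Score the assigned rating, record it, and return feedback right away."""
    rater.scores.append(score_rating(assigned, response.reference_rating))
    return (f"You assigned {assigned}; "
            f"the reference rating is {response.reference_rating}.")


def select_qualified(raters, z_cutoff: float = 0.0):
    """Stand-in for the statistical model: keep raters whose mean score,
    expressed as a z-score across all trainees, is at or above z_cutoff."""
    means = {r.name: mean(r.scores) for r in raters if r.scores}
    if not means:
        return []
    mu = mean(list(means.values()))
    sigma = pstdev(list(means.values()))
    if sigma == 0:  # all trainees performed identically
        return sorted(means)
    return sorted(n for n, m in means.items() if (m - mu) / sigma >= z_cutoff)
```

In this sketch, trainees are compared only with one another; a deployed system might instead compare each trainee with a model fitted to historical rater performance, which the text above leaves unspecified.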