Systems and methods described herein utilize supervised machine learning to generate a model for scoring interview responses. The system may access a training response, which in one embodiment is an audiovisual recording of a person responding to an interview question. The training response may have an assigned human-determined score. The system may extract at least one delivery feature and at least one content feature from the audiovisual recording of the training response, and use the extracted features and the human-determined score to train a response scoring model for scoring interview responses. The response scoring model may be configured based on the training to automatically assign scores to audiovisual recordings of interview responses. The scores for interview responses may be used by interviewers to assess candidates.