A method and system for grading free-response tests uses highly accurate machine-readable data codes to uniquely associate a test-taker, a test, and a reader/grader, together with a portable sensing device that stores the codes it reads for subsequent entry into a host computer. The method permits multiple readers/graders to evaluate the same test without one reader/grader influencing another, while reducing the paper handling and key entry of data inherent in large-volume paper-and-pencil testing techniques.
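
The data flow summarized above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names ScanRecord, PortableReader, and HostComputer are hypothetical, the code values are made up, and the abstract does not say how a score is captured, so the sketch assumes the score is recorded alongside the three codes. Each device buffers only its own reader's records until upload, which is one plausible way to keep one reader/grader from seeing another's evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRecord:
    """One grading event: three codes plus a score (the score field is an assumption)."""
    test_taker_code: str
    test_code: str
    reader_code: str
    score: int


class PortableReader:
    """Hypothetical stand-in for the portable sensing device.

    It stores the codes it reads in a local buffer for later entry
    into the host computer.
    """

    def __init__(self, reader_code: str) -> None:
        self.reader_code = reader_code
        self._buffer: list[ScanRecord] = []

    def scan(self, test_taker_code: str, test_code: str, score: int) -> None:
        # In a real device these values would come from reading machine-readable codes.
        self._buffer.append(
            ScanRecord(test_taker_code, test_code, self.reader_code, score)
        )

    def upload(self, host: "HostComputer") -> int:
        """Transfer all buffered records to the host and return how many were sent."""
        sent = len(self._buffer)
        for record in self._buffer:
            host.receive(record)
        self._buffer.clear()
        return sent


class HostComputer:
    """Hypothetical host that collects scores per (test-taker, test), keyed by reader."""

    def __init__(self) -> None:
        self._scores: dict[tuple[str, str], dict[str, int]] = {}

    def receive(self, record: ScanRecord) -> None:
        key = (record.test_taker_code, record.test_code)
        self._scores.setdefault(key, {})[record.reader_code] = record.score

    def scores_for(self, test_taker_code: str, test_code: str) -> dict[str, int]:
        # Return a copy so callers cannot alter the stored scores.
        return dict(self._scores.get((test_taker_code, test_code), {}))
```

In this sketch, two readers can each scan the same test on separate devices, and the host sees both scores only after each device uploads; no key entry of the codes is involved at any step.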