(67 pp.) The construct validity of algebra word problems for measuring quantitative reasoning was examined from two perspectives, one focusing on an analysis of problem attributes and the other on the analysis of constructed-response solutions. Twenty problems that had appeared on the Graduate Record Examinations General Test were investigated. Constructed-response solutions to these problems were collected from 51 undergraduates. Regression analyses of problem attributes indicated that models including factors such as the need to apply algebraic concepts, problem complexity, and problem content could account for 37% to 62% of the variance in problem difficulty. With respect to constructed-response solutions, four classes of strategies were identified: equation formulation, ratio setup, simulation, and other (unsystematic) approaches. Higher achieving examinees used equation strategies more and unsystematic approaches less than lower achieving examinees. Examinees' errors were classified into eight principal categories. Problem conception errors were the best predictor of performance on the constructed-response problems and on SAT-M. In contrast, procedural errors contributed to the prediction of performance on the constructed-response problems but not to standing on SAT-M. Overall, these results provide support for the construct validity of GRE algebra word problems and of SAT-M as measures of quantitative reasoning. A preliminary theoretical framework for describing performance on algebra word problems is proposed, and its usefulness for more systematic design of tests is discussed.