Testing programs are often classified as high or low stakes to indicate how stringently they need to be evaluated. In practice, however, this classification falls short. A high‐stakes label is taken to imply that all indicators of measurement quality must meet high standards, whereas a low‐stakes label is taken to imply the opposite. This approach can result in inappropriate allocation of resources and inadequate attention to needed evidence. We argue that “stakes” are better thought of as a profile of consequences. We suggest generalizable criteria for evaluating and responding to stakes in testing, with applications to licensure, employment, and K–12 accountability testing.