
Applying Social Context to Psychometric Modeling in Educational Assessment

Focus on R&D

Issue 11

September 2018

By: Hans Sandberg

In his latest book, Sociocognitive Foundations of Educational Measurement, Robert J. Mislevy suggests rethinking how psychometric models are used in educational assessment by viewing learning in a social context. To find out more, Focus on ETS R&D sat down with Mislevy, who holds ETS's Frederic M. Lord Chair in Measurement and Statistics and is also Professor Emeritus of Measurement, Statistics and Evaluation at the University of Maryland.

What is a sociocognitive perspective?

The term reflects the fact that all learning and understanding take place in a social context shaped by linguistic, cultural and social patterns. We have a new understanding of cognition that integrates individual, situative and social perspectives. For example, consider the knowledge it takes to play the guitar. You may be sitting alone in your room practicing, jamming with friends or performing in a club. Much of what you have learned (and continue to learn) overlaps, but each setting introduces different ways of thinking and interacting with others, including local and historical traditions. The sociocognitive perspective gives us a more realistic understanding of how knowledge is acquired and practiced in different settings, and this understanding is becoming more important as society grows more diverse.

Why did you write the book? And why now?

I wanted to demonstrate how a sociocognitive perspective can help improve the ways we use psychometric models in educational measurement, which in turn can lead to better assessments. There are three things that make this important today. First, there is a growing gap between our current understanding of the psychology of learning on the one hand, and the ways we think and talk about educational measurement on the other. The latter has its roots in the trait and behavioral psychology from a hundred years ago. Those foundational concepts are as important as ever, but this gap hampers what we can do with assessment. Second, we have a much greater diversity among learners and in how people want to use assessments. Third, new technology gives us a chance to do new and more interactive assessments. The book represents a rethinking of what it is that we measure, and questions some of the familiar ways we do things. Such a reconceptualization in the way we think about educational measurement can improve the usefulness and validity of educational assessment in practice.

What does a sociocognitive perspective add to assessment development?

One big point in my book is that many of the deeper principles in educational measurement do apply to the new world of psychology and technology, but they are expressed in ways that evolved to fit our familiar forms of assessment. Some experts respond to this by saying, "yes, you have latent variables, measurement error and other things that fit the psychology of a hundred years ago, but that's not how we think about learning and assessment today."

In my view, we need to show how psychometric models can answer questions about evidence and inference in educational assessment, whatever psychological perspective or purposes you may have. How do you characterize evidence? What evidence do you need? How do you interpret it? How do you use statistical models to manage questions of design, reliability, fairness and validity?

People across disciplines are learning to accommodate new forms of learning and assessment, like simulation-based assessment, where both learning and measurement take place in a virtual environment. What I'm trying to do in my book is to show how all these things fit together. Let's look at differential item functioning (DIF) as an example. People who work in educational measurement, including here at ETS, already use various methods for detecting anomalous DIF patterns, which then are shared with content experts and item writers, who often rely on tacit knowledge. Using such a rule-of-thumb approach is simply seen as good testing practice, which of course it is. In my book, I show how the work of test developers and DIF analysts can be grounded in sociocognitive research and principles. It is not that we are doing DIF differently; it is that we can think about it and talk about it in ways that connect to current work on learning.

How do these principles impact the use and interpretation of educational assessment?

We are addressing new developments in psychology, new technological capabilities that we have developed and new delivery platforms. This allows us to better understand and interpret our current assessments and consider if we can do something more with them. Our aim is not to pull the rug from under the current assessments, but to improve them by deepening our understanding, adding context when and where it makes for better evidence, and ultimately developing new assessment possibilities. The only time when you should pull the rug from under an assessment is if you are doing something you shouldn't be doing anyway. You may think that you have a construct that means the same thing across all populations, and that you can gather evidence and interpret the numbers the same way; however, if you view it through the sociocognitive lens, you may find out that this is not the case. Taking a sociocognitive perspective on assessment can reduce the over-interpretation of scores.

How do you think your book can influence current and future assessments?

For current assessments, my hope is that the book will provide more realistic expectations of what you can learn from large-scale 'drop in from the sky' assessments, and that the principles laid out in this book can improve them. To the extent that I succeed, it will help educators and researchers use familiar kinds of assessment more wisely. In the future, I hope readers will see how concepts and tools from today's educational measurement methods can be useful with new forms of assessment, such as games and simulations. A situative perspective is necessary to meet design challenges for things like game-based assessments, which must serve purposes coming from the disparate worlds of games, learning and assessment. I would argue that deeper and more generalized insights will allow all of these technological possibilities to remain consistent with our social values: fairness, validity and so on. People who work in learning analytics and those who are developing simulations will not only have a language to understand and communicate in, but also a conceptual grounding that will help them bridge the divide.

Find out more about R.J. Mislevy's new book Sociocognitive Foundations of Educational Measurement.

R.J. Mislevy holds the Frederic M. Lord Chair in Measurement and Statistics, and is also Professor Emeritus of Measurement, Statistics and Evaluation at the University of Maryland.

Terms Explained

Behavioral psychology: Focuses on patterns of human behavior in response to classes of environmental stimuli. The construct in assessments with a behavioral orientation concerns how well people solve problems in a domain, rather than how they solve them. (The stronger stance, behaviorist psychology, rejects the study of how on principle.)

Construct: An abstract term used in psychology to facilitate the understanding of human behavior. A construct can be a description of a characteristic or set of characteristics that are not observed directly. It can represent the knowledge and skill that an assessment wants a student to demonstrate.

Differential Item Functioning (DIF): A tendency for test questions to be more or less difficult for one group of test takers than for a reference group that has been modeled as having the same overall ability. A DIF analysis seeks to identify test questions that are significantly harder for one group than for another in this sense.
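As an illustration that goes beyond the article, one common screening statistic for DIF is the Mantel-Haenszel common odds ratio, computed from 2x2 group-by-correctness tables at each matched ability level. A minimal sketch, using invented counts:

```python
# Mantel-Haenszel common odds ratio for one item, as a DIF screen.
# Hypothetical counts: for each ability stratum, a 2x2 table of
# (reference vs. focal group) x (correct vs. incorrect):
#   a = reference correct, b = reference incorrect,
#   c = focal correct,     d = focal incorrect.

def mantel_haenszel_odds_ratio(strata):
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

# Three matched-ability strata; all counts are invented for illustration.
strata = [
    (40, 10, 30, 20),   # low-ability stratum
    (60, 15, 50, 25),   # middle stratum
    (80, 10, 70, 15),   # high-ability stratum
]

ratio = mantel_haenszel_odds_ratio(strata)
# A ratio near 1 suggests little DIF; values far from 1 flag the item
# for review by content experts.
print(round(ratio, 2))  # prints 2.07
```

In practice the flagged items are then examined by content experts, as described above; the statistic only locates anomalies, it does not explain them.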

Latent variables: These are unobservable variables that are used in models to reason about patterns in observations. In statistical models, they connect psychological theories with observed variables that tend to "move together," for example responses to items in a test.
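As a hypothetical sketch (not from the article), a simple Rasch-style model shows how a single latent ability variable can make observed responses to several items "move together":

```python
import math
import random

random.seed(1)

# One latent ability drives responses to several items, so the
# observed responses tend to move together. Item difficulties are
# invented for illustration.
difficulties = [-1.0, 0.0, 1.0]

def respond(ability):
    # Probability of a correct answer rises with (ability - difficulty).
    return [1 if random.random() < 1 / (1 + math.exp(-(ability - d))) else 0
            for d in difficulties]

# Simulate many high-ability and low-ability test takers; the
# high-ability group answers more items correctly on average.
avg_high = sum(sum(respond(2.0)) for _ in range(1000)) / 1000
avg_low = sum(sum(respond(-2.0)) for _ in range(1000)) / 1000
print(avg_high > avg_low)  # prints True
```

The latent ability here is never observed directly; it is inferred from the pattern of correct and incorrect responses.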

Measurement error: The difference between a measured value and its true value.
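The classical test theory view, in which an observed score is a true score plus error, can be sketched numerically. This simulation is for illustration only; the distributions are invented:

```python
import random

random.seed(0)

# Classical test theory sketch: observed score = true score + error.
true_scores = [random.gauss(50, 10) for _ in range(10000)]
errors = [random.gauss(0, 5) for _ in range(10000)]
observed = [t + e for t, e in zip(true_scores, errors)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = true-score variance / observed-score variance.
# With these simulated values it should land near 100 / (100 + 25) = 0.8.
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 2))
```

The larger the measurement error relative to true-score variation, the lower the reliability of the observed scores.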

Psychometrics: The study of human psychological abilities by means of statistical models, i.e., simplified representations of selected aspects of human cognition and performance during assessments and in learning, as well as other human activities.

Situative: A psychological perspective that focuses on individuals interacting with each other and with physical and social subsystems in the environment. Learning is preparation for situated action, and situated action produces learning.

Trait psychology: A theory of human personality that focuses on the measurement of relatively stable patterns of human behavior, thought and emotion, referred to as traits.

Learn more

Mislevy, R.J.: A Dispatch from the Psychometric Front (Keynote at the Society for Learning Analytics Research's Learning Analytics & Knowledge conference, Friday April 29, 2016)