ETS Research Forum Innovations in Conceptualizing and Assessing Civic Competency and Engagement in Higher Education Washington, D.C., November 4, 2015

Speakers: Ou Lydia Liu, Ph.D., Director of Research in Higher Education, ETS; Judith Torney-Purta, Ph.D., Professor Emerita, University of Maryland, and Consultant, Educational Testing Service; Katrina Roohr, Ed.D., Associate Research Scientist, Higher Education Research, Educational Testing Service; Peter Levine, Associate Dean for Research and the Lincoln Filene Professor of Citizenship and Public Affairs, Jonathan Tisch College of Citizenship and Public Service, Tufts University.

Ou Lydia Liu, Ph.D., Director of Research in Higher Education, ETS - So thanks, everyone, for coming to this forum. We're very excited to have this opportunity to share with you the latest research we are doing in the domain of civic competency and engagement.

So I'm going to talk about measuring student learning outcomes in higher ed. My presentation aims to provide context and background for the presentations we are going to hear from Judith and Katrina surrounding the civic module. This work is part of a larger project at ETS in which we are looking at which core college-level learning outcomes are important and how we can design assessments to measure them. So I'm going to provide an overview of the general principles we followed in defining those critical learning outcomes and in assessment design. Hopefully by the end of my presentation you'll have a better understanding of why certain decisions were made the way they were for the civic module.

So student learning outcomes assessment has been widely used in the US. When we talk about SLO assessments, they can be generic assessments, measuring skills like critical thinking, written communication and civic competency, or they can be domain-specific assessments that reflect students' learning in a particular disciplinary area like physics or psychology. In this forum we focus on the generic side of assessment. In the past, many institutions used SLO assessments to check a box for fulfilling accreditation requirements; they typically didn't go beyond that use.

However, in 2005 the use of SLO assessments was put under the national spotlight because of the commission on the future of higher education established by former US Secretary of Education Margaret Spellings. That commission was charged with a couple of things: first, to identify the most prominent issues in US higher education, and then to identify strategies to improve its processes and outcomes. The resulting report pointed out a remarkable absence of accountability mechanisms to ensure that colleges succeed in educating students. Basically, the report was saying that when evaluating how well an institution is doing, it's not enough to just look at its inputs and resources, like how many faculty it has or what its research productivity is; it's also important to look at how much its students have learned. So the commission asked higher-ed institutions to demonstrate more direct evidence of student learning and to use standardized tools so that results are comparable. Ever since then, institutions have felt pressure to demonstrate more comparable evidence of student learning. The pressure has also come from within higher ed: leading higher-ed organizations such as AASCU and APLU have organized various accountability initiatives, one of them being the Voluntary System of Accountability, which also asks its member institutions to demonstrate evidence of learning, of course on a voluntary basis. And in addition there is pressure from the general public, who want to understand how higher-ed institutions operate; parents in particular would like to know how much their kids have learned, given the tuition dollars they pay.

So this figure comes from a report released last year by the National Institute for Learning Outcomes Assessment. It shows the reasons that higher-ed institutions are using SLO assessments. As we can tell, regional and program accreditation are still the primary drivers for institutions to use this kind of assessment, followed by institutional improvement. Many university administrators see an increasing importance in the internal use of assessment results, because they want to look at students' relative strengths and weaknesses and see how that information can be used to improve curriculum and instruction. And there are a number of other uses, as we see in the figure.

On-screen: [Graph titled 'Reasons for Using SLO Assessments'. The y-axis of the graph has 6 points: 0 equals No Importance, 2 equals Minor Importance, 4 equals Moderate Importance, and 6 equals High Importance. Along the bottom of the x-axis, there are 8 reasons, each with a bar from 2009 and a bar from 2013. All values are approximate. The results are as follows: Regional accreditation extends to 5.5 in 2009 and 5.6 in 2013; Program accreditation extends to 5.2 in 2009 and 5.3 in 2013; Institutional improvement extends to 5.1 in 2009 and 5.0 in 2013; Faculty or staff interest extends to 4.1 in 2009 and 4.4 in 2013; National calls extends to 4.0 in 2009 and 3.2 in 2013; Governing board/president extends to 2.9 in 2009 and 3.8 in 2013; Institutional membership initiatives extends to 2.8 in 2009 and 1.8 in 2013; Statewide/coordinating mandate extends to 2.0 in 2009 and 3.1 in 2013.]

And this one shows the wide range of tools that institutions are using to assess SLOs. National student surveys, for example the National Survey of Student Engagement (the NSSE survey), are the most widely used, followed by alumni surveys and locally-developed surveys. A locally-developed survey can be either a survey that only asks about students' engagement and experience, or a cognitive test designed by that university. Then we have the general knowledge and skills measures; these are typically the standardized tests we are going to focus on today. And there are a number of other ways to assess students' learning, for example rubrics. You see that there are two bars in this graph: the lighter one represents data from 2009 and the green one represents data from 2013. You can see a sharp increase in the use of rubrics. I think this relates to AAC&U's release of their VALUE rubrics, a set of very comprehensive college-level definitions for a wide range of competencies and skills, including critical thinking, written communication, quantitative literacy and other important skills. Faculty can use the rubrics to inform their own assessment activities. So as you can tell, it really depends on the purpose of the assessment; an institution can use different tools. Actually, according to this survey an institution uses three tools on average, so you can see that they use multiple tools, trying to serve multiple purposes.

On-screen: [Graph titled 'Tools to Assess SLO'. The y-axis is labeled "Percentage of Institutions" and starts at 0% and extends to 100%. There are 8 tools listed along the bottom of the graph. Each tool has a result from 2009 and 2013. Each value is approximate. The tools and results are as follows: National student surveys go to 74% in 2009 and 84% in 2013; Alumni surveys go to 50% in 2009 and 65% in 2013; Locally developed surveys go to 48% in 2009 and 62% in 2013; General knowledge and skills measures go to 38% in 2009 and 47% in 2013; Rubrics go to 22% in 2009 and 69% in 2013; Employer surveys go to 21% in 2009 and 46% in 2013; External performance assessments go to 9% in 2009 and 40% in 2013; Portfolios go to 9% in 2009 and 41% in 2013.]

So realizing the importance of student learning outcomes, we started an initiative at ETS about three or four years ago. The project is called HEIghten®, and we are charged to do two things: one is to identify the most critical learning outcomes at the college level, and the second is to design assessments for them. We started with a very broad-based research synthesis because we wanted to know what competencies and skills people are talking about in current theoretical frameworks and in the various reports put forward by organizations and institutions. So we looked at most of the influential frameworks in both higher ed and the workforce, and then we conducted several rounds of market research, both qualitative smaller-scale ones and quantitative larger-scale ones. In the quantitative market research we surveyed provosts or vice presidents for academic affairs from over 200 US institutions, balancing two-year and four-year, public and private, nonprofit and for-profit, to understand what their assessment priorities are: when they think about important student learning outcomes, what do they have on their list, and if they could choose, which ones would they want assessments for? And then we also actively reached out to higher-ed organizations to understand what they think are the most important learning outcomes.

Based on all these efforts, we identified six that we think are the most important ones. So we have critical thinking, written communication, quantitative literacy, digital information literacy, civic competency and engagement, and intercultural competency and diversity. So today here we are going to focus on the civic competency and engagement. But as you can tell, that is one of the six that we are working on.

So after we identified those learning outcomes, the next step was for us to take a very research-driven approach to assessment design. We started a framework paper for each of the six modules. The papers follow a similar structure: we have a review of influential frameworks, which is really to help us align with current research and thinking in the target domain, and then we also reviewed existing assessments. You are going to hear more about the civic assessments in Judith and Katrina's presentations. For example, for critical thinking we reviewed over 20 existing assessments, and for quantitative literacy we reviewed over 35. The exercise here is really to identify the strengths and limitations of existing assessments. One question we asked ourselves is: since there are already so many assessments out there, why do we need another one? What are the benefits of doing this? When we focus on the limitations and weaknesses of existing assessments, we know that these are the areas where we can contribute with an ETS assessment.

And then another big section in each of the framework papers is the operational definition, which is really necessary. Take civic, for example: if you look across the frameworks, you can see so many different definitions, but we need a very transparent and clear definition when it comes to a next-generation assessment. You're going to hear more from Judith's presentation about how we reviewed the different existing definitions and how we came up with our own. That's really our way to contribute to the community: we've looked at all the important components, and we are not saying that this assessment will encapsulate everything that's called out in the frameworks, but it will focus on a few dimensions that are really important.

And in the last section we provide assessment considerations: the item types that can be used to measure the definitions we proposed, the item formats, and things like accessibility, that is, how to design a test form that's accessible for students with disabilities. A limitation of many current theoretical frameworks is that they stop at the framework level, so there is little information about how to translate the framework into an actual assessment. We hope that the last section of our framework papers provides some insight to institutions when they think about how to design their own assessments or when they consider adopting an external measure.

So after we have the framework, we moved to assembling the pilot forms. In that activity we considered a number of parameters. The first is to ensure adequate construct coverage: if we say that civic competency should measure these three things, we need to make sure we have enough items capturing those three dimensions. We also try to use innovative item types, balancing that against the testing time we have, and a range of item formats to make the assessment more authentic and engaging for students. Being responsive to the fact that higher ed is multidisciplinary, the assessment items are embedded in contexts that are relevant to higher education, for example social sciences, humanities and natural sciences. And as I mentioned a minute ago, we also paid a great deal of attention to accessibility, because about 12 percent of undergraduate students have various forms of disabilities, and we wanted to make sure that these students also have access to our assessment.
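The construct-coverage check described above amounts to a simple blueprint audit. As a minimal sketch (the dimension names, item counts and the eight-item floor below are hypothetical illustrations, not the actual HEIghten blueprint):

```python
# Hypothetical blueprint audit: verify a pilot form covers each construct
# dimension with at least a minimum number of items. All names and counts
# here are illustrative only.

MIN_ITEMS_PER_DIMENSION = 8  # assumed floor for reporting a stable subscore

# Each item on the draft form is tagged with the dimension it measures.
pilot_form = (
    ["civic_knowledge"] * 12
    + ["analytic_skills"] * 10
    + ["participatory_skills"] * 5  # deliberately under-represented here
)

def audit_coverage(items, minimum):
    """Return the dimensions whose item count falls below the minimum."""
    counts = {}
    for dim in items:
        counts[dim] = counts.get(dim, 0) + 1
    return {dim: n for dim, n in counts.items() if n < minimum}

shortfalls = audit_coverage(pilot_form, MIN_ITEMS_PER_DIMENSION)
print(shortfalls)  # {'participatory_skills': 5}
```

A form assembler would then swap in or write additional items for any flagged dimension before the pilot administration.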

So right now we are working very closely with institutions to help them maximize their assessment efforts. One thing we always talk about is to think about the information you want to gather before implementing the assessment. For example, if an institution is interested in knowing the proficiency of its graduating seniors, then it can give the assessment to a group of representative seniors. If the purpose is to gather information on how much students have learned as they move through their college careers, which is what we call value added, then we can test a group of students when they just start college and retest them when they become seniors, which is referred to as a longitudinal design; or we can test a group of freshmen and a group of seniors at the same time, two different groups of students, which we call a cross-sectional design. Regardless of the design, there are a number of methodological issues to consider, and we've been writing a lot about these issues so our users understand the implications and consequences of going into these kinds of activities. Another thing, of course, is to make sure we have a representative sample on which conclusions about the entire student cohort will be based.

Another issue I want to call to our attention is motivation. Although the tests bear very important consequences for institutions at the group level, they typically have no consequences for students. So there are two levels of motivational issues for students: they may not have the motivation to sign up for the test, or, on a different level, they may not have the motivation to take the test seriously even if they show up at the testing site. We've done a lot of research in the past six or seven years to clarify the effect of students' test-taking motivation on their test performance, and we found that the performance difference between a group of motivated students and a group of unmotivated students could be as large as .7 standard deviations. To put that number in context, many studies suggest that the performance difference between freshmen and seniors is about .4 standard deviations. So the motivational effect could be nearly twice as large as the effect of four years of college education, which means we really need to pay attention to this issue. We've done several experimental studies trying to identify practical strategies that institutions can use to boost their students' motivation, and we are more than happy to share such strategies with institutions so they're likely to have more motivated students in their samples.
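The standard-deviation figures quoted here are standardized mean differences (Cohen's d). As a worked sketch of how such an effect size is computed, using made-up score lists rather than the actual study data:

```python
import statistics

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_var ** 0.5

# Illustrative scores only, not ETS data.
motivated = [50, 60, 45, 65, 55, 62]
unmotivated = [45, 52, 40, 58, 48, 50]

print(round(cohens_d(motivated, unmotivated), 2))  # 1.06
```

On this logic, a motivation effect of .7 SD set against a freshman-to-senior gain of .4 SD means the motivation confound is on the order of 1.75 times the four-year learning effect, which is why unmotivated examinees can badly distort group-level results.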

And we are also working to make sure that we offer actionable data for institutions from these assessments. We plan to offer scaled scores at the group and individual levels, and scaled subscores for institutions so they can see their students' relative strengths and weaknesses. We also provide proficiency levels with discrete performance descriptors, so our score users understand what it means for a student to be at the developing, proficient or advanced level, and can see that information across the institution. And we are thinking of introducing motivation into the score reports so institutions know the percentage of their students who were motivated; we have statistical procedures that can help us determine who is motivated or not. So these are the things we are doing on our end, and the purpose is really to help institutions maximize the information they can obtain from their assessment efforts.
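The transcript does not say which statistical procedures ETS uses, but one approach widely used in the low-stakes testing literature flags motivation from item response times: answers given faster than a rapid-guessing threshold are treated as non-effortful, and examinees whose proportion of effortful responses falls below a cutoff are flagged. A minimal sketch, with the 5-second threshold and 90 percent cutoff as assumed values:

```python
# Hypothetical response-time-based motivation flag (a common approach in the
# low-stakes assessment literature; thresholds here are assumptions, not
# ETS's operational procedure).

RAPID_GUESS_SECONDS = 5.0   # responses faster than this count as guesses
EFFORT_CUTOFF = 0.90        # flag examinees below 90% effortful responses

def response_time_effort(times, threshold=RAPID_GUESS_SECONDS):
    """Proportion of item responses at or above the rapid-guess threshold."""
    effortful = sum(1 for t in times if t >= threshold)
    return effortful / len(times)

def is_motivated(times, cutoff=EFFORT_CUTOFF):
    return response_time_effort(times) >= cutoff

# One examinee working steadily, one racing through half the items.
steady = [22.0, 31.5, 18.0, 45.2, 27.3, 33.1, 29.8, 40.0]
racing = [22.0, 2.1, 1.8, 45.2, 2.5, 3.0, 29.8, 1.2]

print(is_motivated(steady))  # True
print(is_motivated(racing))  # False
```

A score report could then state the percentage of motivated examinees, or group-level results could be recomputed with flagged examinees filtered out.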

So that's the end of my presentation. If you go to this website, you can find more information about these larger projects. Now I'm going to turn it over to Judith.

On-screen: [the website]

Judith Torney-Purta, Ph.D., Professor Emerita, University of Maryland, Consultant, Educational Testing Service - Good morning, afternoon. There are some seats if people who are standing in the back want to come up and sit down. Okay, good. Well, you may be depressed, but I'm enthusiastic about this. I've been in this field for 50 years this year, and if anybody had told me 15 or even 20 years ago that this would be happening, I would have said they were smoking something, because this is an unusual event in many respects. It's especially appropriate for this presentation to be part of this series, because I was attending this series two or three years ago, and I met Tom out in the corridor afterwards, and on a whim I said, "By the way, I'm retiring in a couple of months. Let me know if there's anything I can do to join any of these projects." A few months later he asked about my interest in being a consultant to the excellent team that was preparing this paper, and it has been a really excellent experience for me over the last 18 months or so, so I thank you for this opportunity as much as anything else.

As your invitation to this event notes, ETS has come to recognize the importance of building skills of civic competency and engagement in order to foster young people's effective participation in democratic institutions. High levels of these skills are associated both with economic growth, and also with efforts to promote fairness and justice. Furthermore, assessments in this area of civic learning fit well into the HEIghten program. They're not a last-minute addition by any means. This paper has the potential I think to be a step forward for the field of civic competency and engagement, especially but not exclusively in higher education. Following the background that Lydia provided, I want to focus on the paper that you have received as a springboard, and I invite you at any point to take out a copy of the paper if you'd like to. It's in your folder. I will be referring to parts of it at various times.

I will consider the overall rationale, some historical background, and a comment about the current issues and contexts which surround college students as they acquire civic competency and become engaged. Then I'll move to the major substance of the paper itself, which begins with a review of existing conceptualizations and approaches to assessment. When I joined the project 18 months ago, the second and third authors had already built the paper's scaffolding, matching that of the other papers in the series. I expanded some parts, I filled in some gaps (not all of them, I'm discovering, but many of them) and drew more fully on the research background. To do this I actually relied, I think, on some of you in this room to help me, because I did not know that much about the programs in higher education, having done most of my work on the pre-collegiate years.

Our team began basically by differentiating the field, gathering as many sources of information as we could, and then moving slowly toward a synthesis, which is what we're presenting today. I will introduce the two foundational domains and constructs for this new assessment framework, based on the current state of the field, and I believe this can guide us in moving forward into the next steps of assessment building; Katrina Roohr, the next speaker, will give you a lot more detail about that and about item types. The question I intend to address is: why add civic learning to the competencies on the ETS agenda?

A meeting was convened a decade ago at the Carnegie Foundation by Tom Ehrlich, who was a major thinker in this field. The group concluded, and I know Peter was there—others may have been—quote, "Civic engagement is a public good and also a private good." Assessing what college students understand and how they are motivated in the civic and political arena would clearly be a positive step. Furthermore, surveys of employers point to the importance of civic-related values.

The recent National Conference on Citizenship considered how to connect service learning experiences to the working world. Service learning, for example, creates valuable capacities, but there are few standards or frameworks for program evaluation or for credentialing students. It's clear that majors in political science are likely to have politically-relevant experience. There has also been a movement, I discovered, toward public anthropology and engaged sociology; the humanities and sciences are not far behind. One of the most intriguing concepts we found in the review was that of civic-minded graduates, who take the sum of what they have learned, in whatever major and from whatever value position, and apply it to civic and political problems. I think this is a concept that may have legs, as they say. Finally, numerous projects have linked higher education institutions and students to their communities through service learning.

Finally, psychologists like me pinpoint early adulthood as the period when individuals' lifelong social agendas are often formed. This was me in Chicago. For those of you who know Kay Deaux's book on immigrants, it started to happen when she moved to New York City. So this is a period when we have an opportunity to have a long-lasting effect on young people. Civic learning engages students intellectually and particularly benefits their ability to view political and community problems as complex, rather than as simple problems having an easy solution that can be implemented in the next couple of weeks. It also engages them personally and socially.

As many of you are aware, civic learning has become an increasingly important topic of discussion. The National Commission on Civic Renewal in the late 1990s was chaired by William Bennett and Sam Nunn and called adults "civic spectators." Campus Compact, the organization of colleges and universities with service learning programs, is celebrating thirty years, and I have the sad news that Liz Hollander, who was a major leader of this organization, died last week after a long career. We will miss her a great deal.

A little more than ten years ago, the Civic Mission of Schools report addressed the K-12 civic education community, arguing for promising practices and suggesting that a corps of college students could serve as mentors to younger students. CIRCLE, the Center for Information and Research on Civic Learning and Engagement, was founded at about the same time in Bill Galston's office, by a group that included Peter and me, actually. Tom Ehrlich and Anne Colby at the Carnegie Foundation spearheaded major research: the Political Engagement Project (PEP), led by Liz Beaumont, surveyed students on 21 campuses before and after a program experience, and this has been incorporated into the American Democracy Project at AASCU. In addition, the Association of American Colleges and Universities (a different organization, if you can keep them straight) and leaders such as Caryn McTighe Musil and a national taskforce issued the Crucible Moment report and a roadmap. The Higher Education Research Institute at UCLA and the Lumina Foundation's support of the Degree Qualifications Profile, with a component on civic learning, have also made contributions. However, in my opinion there has been insufficient attention to synthesizing these ideas and projects, and especially insufficient attention to assessing students' abilities, competencies and engagement in a valid and reliable way, one that would allow optimal program planning. So what are the issues here?

Ashley Finley of AAC&U articulated the absence of a coherent definition this way: "It cannot be expected that students or faculty are responding to the same ideas about civic engagement when taking a survey or asked to respond in an interview." No single project, of course, can create agreement about definition, but I have learned from a number of sometimes painful, but always good, learning experiences that articulating foundational constructs and trying to create an assessment framework is a big step forward. That process requires collecting existing examples of questions and scales to identify core issues, and then rewriting, revising and reformulating. Assessing the psychometric quality of the instrument will be necessary. In my view, the field is plagued by a sustained lack of attention to the psychometric quality of instruments. Just look at the fact that they're now giving the immigration and naturalization test to, I think, 8th graders in some states of this country. It wasn't necessarily valid for its intended purpose; it certainly is not valid for this one. ETS is supremely well qualified to deal with that issue.

With many organizations making proposals and sometimes advocating a very particular type of action, we need to capture a balanced but multidimensional view of the field, one appropriate for the diversely-talented current generation of college students, traditional and nontraditional. Many college students never took a strong and involving secondary school civics course, and many attend universities that do not have general studies requirements covering the principles of the American political system, or even concepts essential to understanding political news, such as what a caucus is or what apportionment involves. Many students are enthusiastic about civic activities but wary of the conflict in political organizations; they hold high ideals, but find them difficult to carry into practice. Finally, the issue of multiple contexts, each with its own policy challenges, adds complexity.

Some definitions and measures are specific to each of these four contexts. Some universities focus on general studies courses on the foundations of American democracy. Other institutions work in the extracurricular space where student personnel services are found, often in programs designed to promote students as leaders. There is an impressive range of service learning opportunities, with many innovations and some useful evaluations and research; many of these programs focus not only on learning by students, but on the campus's obligation to help solve community problems. Finally, there are programs that promote voter registration in a bipartisan way and others that do not shy away from partisan political conflicts. I have placed online activities in this box, although these are very wide-ranging, from individuals participating in consumer action against child labor to organizing an active political demonstration.

At the NCOC conference there was a panel on online activity, and I stood up and asked whether Twitter was still where researchers should continue to focus if they wanted a picture of student civic activity. No one on the platform was willing to provide an answer to that question. They think there will be something new coming along and they don't want to miss it. At least that's one interpretation.

Now to the substance of the research report; I have time to only briefly introduce the first three tables in the paper. This is where our team tried for maximum coverage and to differentiate maximally between approaches and projects. The large majority dealt with US higher education, but we included a small number of international projects and a few dealing with upper secondary students. Table 1, on pages 6 through 9, includes the constructs or concepts around which programs are based and quotes detailed definitions. If you want to know how many ways there are to say civic competency, you'll find it in Table 1. If an assessment did not have an explicit framework designed for some kind of program guidance, that project probably appears only in Table 2, the assessment table, which includes the measure's name; the institution or researcher who was the source; the format (is it multiple choice, open ended, or agree-disagree items, often called Likert scales?); the mode of delivery (paper and pencil, computer, phone survey); the number of items; the target audience; and the themes and topics, including, when available, scale and subscale titles. In addition to guiding our team's work, this table provides a service to the field as a starting point for individual researchers. If you know someone starting a dissertation in this field, give him or her this table. Table 3, pages 31 to 33, synthesizes what we observed in our review into a set of phrases about what we expect students to understand or be able to do; in other words, a kind of operational definition.

Okay, I'm going to give you some samples of the tables. The space on a PowerPoint slide is inadequate to represent even Table 1, which has only three columns. However, I want to give you a sample of the conceptualizations from three organizations: AAC&U, AASCU and the American Association of Community Colleges, with which you may be less familiar. You will see from a scan of the second column that these examples, as well as most of the rest, include knowledge, skills, motivation and action in some form, sometimes with different phraseology. I have marked with an asterisk one entry in column two for each organization and tried to give you the flavor of that definition in the third column. For example, AACC was especially strong on skills, such as the ability to judge the quality of political and civic information. But you have to look at the paper for further definitions and a lot more detail.

On-screen: [Condensed Samples of Conceptualizations (Table 1, pp. 6-9):

Framework/Organization | Terms or Constructs | Excerpts from definition
Association of American Colleges and Universities | *Civic literacy; Civic inquiry; Civic action | *Foundational knowledge of fundamental principles – e.g., historical
American Association of Community Colleges | Intellectual skills; Participation skills; *Research skills; Persuasion skills | *Track issues in media; research on community issues; judge information reliability
American Association of State Colleges and Universities (from PEP) | *Voting, volunteerism, voicing an opinion, direct action on a social problem, consumer-oriented action]


Now for a brief sample from Table 2. We're doing assessments. Again I've tried to represent a range on the slide. The Intercollegiate Studies Institute believes that foundational knowledge about American exceptionality is an essential component of informed citizenship; their website contains a rich set of details. The IEA Civic Education Study, which I had the privilege to coordinate, is a research effort. I include it because of the process of constructing the assessment instrument: beginning with case studies from 24 countries, distilling and synthesizing them into a content framework that nearly 30 national representatives could agree on, and then constructing and adapting knowledge and attitude items to make them clear enough to translate into 20-some languages. For the third example I've chosen the Personal and Social Responsibility Inventory, constructed by Robert Reason at Iowa State based on a literature review and on consultation with AAC&U. The dimension "taking others' perspectives seriously" appears here and frequently across the frameworks and assessments reviewed. An innovative feature of this instrument, and a few others actually, is asking students to respond both about themselves and about their college or university's atmosphere and activities. Again I refer you to Table 2 for a very wide range of these instruments.

On-screen: Condensed Samples of Assessment Descriptions (Table 2, pp. 11-20):

Measure/Source; Target Audience; Themes/topics

  • Civic Literacy Exam (Intercollegiate Studies Institute). Target audience: college freshmen and seniors, by phone interview or on the web. Themes/topics: knowledge of themes related to "ordered liberty in America," such as history and economics.
  • CIVED Study (IEA, the International Association for the Evaluation of Educational Achievement). Target audience: 14-year-olds and 17- to 18-year-olds, nationally representative samples of schools in 29 countries. Themes/topics: civic knowledge and analytic skills concerning democratic institutions and citizens; also attitude and behavior items.
  • Personal and Social Responsibility Inventory (Iowa State University). Target audience: college students (responding about self and institution) and university personnel. Themes/topics: taking others' perspectives seriously; contributing to the community.

Here is the basis of our synthesis and the framework that provides the dimensions our team believes might appear in a future assessment. Civic competency is the domain that includes the more cognitive aspects of civic learning; assessments of knowledge, analytic skills, and participatory skills lend themselves to items with correct answer keys. The second major domain, civic engagement, includes constructs that do not have literally correct answers but are also essential program outcomes for many.

On-screen: Assessment Framework (Table 3, pp. 31-33):

Domain; Constructs

  • Civic Competency: Civic Knowledge; Analytic Skills; Participatory Skills
  • Civic Engagement: Motivation, Attitudes, and Efficacy; Democratic Values; Participation and Activities


Civic and political knowledge is probably the most familiar construct here. We expect it will include what some call foundational knowledge, but also knowledge of contemporary political institutions and civic processes. The first section of Table 3 includes categories of knowledge students are expected to possess.

On-screen: Civic Knowledge

Facts, concepts, and principles
Concepts: Local, national, international, and past or present

Possession of:

  • Knowledge of government structures and processes
  • Factual information on institutions and processes

Ability to:

  • Relate national to international or historical to current

Understanding of:

  • Fundamental principles (e.g., democratic processes)
  • Legal aspects (e.g., voting, citizenship)


Civic skills have two parts. Analytic skills questions will assess abilities and understandings, asking students to look for supporting evidence or to analyze a mocked-up news story, a graph, or a political cartoon, to give a few examples. The inclusion of participatory skills here is a little bit innovative. When one looks at many service-learning programs or social movements, it is clear that there are more and less valuable ways for students to approach groups, especially if there are conflicts in those groups. For a student to enter a community discussion with a single ready-made solution to sell to the group is usually not the best approach, but many 20-year-olds don't realize that. Thus this section assesses abilities and understandings, focusing probably on hypothetical situations presented to the students that call for participatory and leadership skills, such as how to enhance cooperation, give everyone a chance to express themselves, and avoid the premature closing of discussion.

On-screen: Civic Skills

  • Analytic Skills: application of political and civic knowledge to identify perspectives and to recognize, interpret, and respond to issues presented in text scenarios and graphics.
  • Participatory Skills: ability to make reasoned judgments about political and civic situations or problem-solving processes, especially in group and/or community contexts.


Civic engagement also has three parts. A sense of realistic political efficacy is one, along with motivation to be interested in political matters and to communicate with others about them. Support for basic democratic norms is part of the large majority of the conceptualizations and especially central in some of them. Then there is a range of actual participation. In all three civic engagement areas researchers have developed open-access sets of well-identified and usually psychometrically strong items, which can serve as a starting point for the reformulation and piloting needed to build psychometrically strong subscales in these three areas.

On-screen: Civic Engagement

  • Motivations, Attitudes, & Efficacy: interest, involvement, or engagement in attending to political information
  • Democratic Norms & Values: belief in basic principles of democracy; sense of civic responsibility; valuing pluralism and diversity
  • Participation & Activities: civic and political behavior and actions (face-to-face and online)


Now the conclusions, and a review of where I hope I have brought you in describing this effort to enhance civic competency and engagement in higher education by proposing an assessment framework. We have reviewed a wide range of constructs and contexts. What I have presented today barely skims the surface. The tables and the text take you one step further, and there is more still to be done to enrich the picture. My purpose has been to show you that it is possible to synthesize work in this area into two major foundational domains, each with three parts, and that it appears feasible, though certainly challenging, to develop the next generation of assessments.

I know you will find Katrina's more detailed discussion of interest. Then we will have Peter's discussion, and after lunch we will have an opportunity for all of you to ask questions. Thank you very much.

Katrina Roohr, Ed.D. Associate Research Scientist, Higher Education Research, Educational Testing Service - Good, I guess it's afternoon now. So I'm going to talk about how we can actually translate the framework that Judith just discussed into a potential assessment for civic competency and engagement. So just a brief review. I know that we just went over this, so this will be pretty quick, but the assessment framework has two main components, civic competency and civic engagement. Civic competency includes civic knowledge and civic skills, both analytic and participatory/involvement skills, and civic engagement includes three parts: motivations, attitudes, and efficacy; democratic norms and values; and participation and activities.

And one thing I wanted to point out is that what you have here is sort of two distinct columns. So in terms of translating this framework into an assessment, at least as of now we're thinking of civic competency as one main component and civic engagement as another, really because they're focusing on two different pieces, with civic competency focusing on the cognitive piece and civic engagement focusing more on the motivational side and self-reported perceptions and attitudes. So with that in mind, that has really been guiding how we've been thinking about the different question types you may want to use on an assessment, which could be driven by whether the items are cognitive or what is sometimes called non-cognitive, which in this case is that engagement piece.

So as we translate an assessment framework into an actual assessment, there are a number of challenges that we also have to think about. As you saw on the previous slide, and as you heard in Judith's presentation, both civic competency and civic engagement are very multidimensional and very complex. So with that we need to think about how an assessment can present information to stakeholders and provide useful information that could be used for things such as instructional improvement, which Lydia had discussed.

So with that we have to think about ways to create reliable and distinct subscores, which can be a challenge. We really need to make sure that we have a sufficient number of items covering these different subdomains to ensure that we can actually report reliable and distinct subscores. And we'll have to do a number of analyses to ensure that those subscores are providing meaningful information to institutions and that the information is not misleading.
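
One standard check on whether a subdomain has enough internally consistent items to support a reported subscore is an internal-consistency index such as Cronbach's alpha. The presentation does not prescribe a particular statistic; the sketch below, with made-up item responses, is only meant to illustrate the kind of analysis involved.

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for one subscale.

    item_scores[r][i] is respondent r's score on item i of the subscale.
    """
    k = len(item_scores[0])  # number of items in the subscale
    # Variance of each item across respondents
    item_vars = [pvariance([resp[i] for resp in item_scores]) for i in range(k)]
    # Variance of the total (summed) subscale score
    total_var = pvariance([sum(resp) for resp in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 0/1-scored responses to a four-item "civic knowledge" subscale
knowledge = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
print(round(cronbach_alpha(knowledge), 3))  # 0.667
```

In practice one would also examine the correlations among subscores: subscales that correlate too highly with one another add little distinct information beyond the total score.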

We also have to think about fairness for subgroups. A number of research studies have shown that there are in fact gender differences on certain political items. So that's something we really have to consider as we're developing test items, to make sure we don't have any items that might be biased against particular subgroups.

Lydia discussed both the issue of accessibility and motivation, which are things that we really have to think about as we're developing this assessment as well, and Judith briefly touched upon the issue of context. One area that I also want to point out is the potential issue of inauthentic responding. As mentioned, civic engagement relies more on self-report type items, so it's possible that a student may, intentionally or unintentionally, respond inauthentically to a particular item. For example, on a Likert-type item where an examinee would either strongly disagree or strongly agree, they might feel as though they should always strongly agree with a particular item so that they appear more civically engaged. So this is something we have to think about when developing the assessment, to make sure there aren't potential issues of inauthentic responding or socially desirable responses.
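
To make the concern concrete, one crude screen for this kind of responding is to flag examinees who pick the top category on every self-report item. The function name, threshold, and data below are invented for this sketch; they are not part of the project's design.

```python
def flag_uniform_extremes(responses, scale_max=6, min_items=8):
    """Return IDs of respondents whose answers are all at the scale maximum,
    a crude screen for socially desirable (inauthentic) responding."""
    return [
        rid for rid, answers in responses.items()
        if len(answers) >= min_items and all(a == scale_max for a in answers)
    ]

# Hypothetical data: s1 straight-lines the top category, s2 varies.
responses = {
    "s1": [6, 6, 6, 6, 6, 6, 6, 6, 6, 6],
    "s2": [3, 4, 6, 5, 2, 6, 4, 3, 5, 6],
}
print(flag_uniform_extremes(responses))  # ['s1']
```

A real analysis would rely on better-validated indices (for example, consistency or response-time checks), but even a simple flag like this shows how pattern screening can complement careful item design.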

So with that in mind, some of those assessment challenges have really been leading us into the principles for our choice of question or item types. One thing, and Lydia pointed this out as well, is that we really need to think about using a variety of question types or item formats. So we've been thinking about the use of technology-enhanced items, which means we would use a computer-based administration for the assessment, which could potentially make an assessment more authentic and also help keep the test taker engaged. So instead of just a straightforward multiple-choice item, there might be some kind of statement in an exam where you have to click on a box. It may still be a selected-response type of item, but it looks a little different, so it creates more of an authentic feel as an examinee takes the assessment. Given that, though, we also have to make sure that we're still producing a psychometrically sound instrument, so that the assessment scores are both valid and reliable. That's something we've been thinking about as we've considered the various item formats we could use on an assessment of civic competency and engagement.

These items should also address some of the assessment challenges. So, for example, a situational judgment item, of which I'll present an example on the next slide, could potentially be used as a way of addressing inauthentic responding, and let me show you an example. So this is an example of a situational judgment item from a paper by Peeters & Lievens in 2005, and as you can see it's not related to civic competency and engagement, but I just wanted to give you an understanding of what these items would look like. In this particular example the stem says, "You have so many assignments to complete and so much studying to do that you feel you will never get caught up or accomplish anything. You are truly overwhelmed. What would you do?" There are four responses, at varying levels: one is to prioritize your activities, one is to decide what you can do reasonably and focus on getting that work done, one is to talk to your professors, explain the situation, and potentially ask for an extension, and the other is to maybe take a break for a day, go out with friends, and then go back to working hard. Now there are a number of responses here that an examinee could select. In a civic engagement context where there might not be a correct answer, an examinee could select whichever option best fit something they would do, and that would potentially map onto some kind of motivation or behavior in a civic context. Now this particular item does have a more correct answer (okay, it is turning red), which is answer A.
And so these types of items could also potentially be used to assess civic skills, the analytic and participatory/involvement skills, where there may be a more correct approach that is understood based on the situation and on the civic knowledge you would need to answer the question appropriately or to understand how you would act in that particular situation. So this item type has the advantage of being potentially useful both to measure civic skills and to capture some of the areas in civic engagement, such as democratic norms and values, that could be susceptible to inauthentic responding.
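
The scoring idea can be sketched as a partial-credit key: each response option carries a weight reflecting how close it is to the keyed best answer, rather than being simply right or wrong. The weights below are invented for illustration and are not taken from Peeters & Lievens or from the ETS framework.

```python
# One hypothetical partial-credit key per situational judgment item.
KEYS = [
    {"A": 2, "B": 1, "C": 1, "D": 0},  # the "overwhelmed student" item above
    {"A": 0, "B": 2, "C": 1, "D": 1},  # a second, made-up item
]

def score_sjt(selected):
    """Sum the keyed weights of the options an examinee selected.

    selected[i] is the option letter chosen on item i.
    """
    return sum(key[choice] for key, choice in zip(KEYS, selected))

print(score_sjt(["A", "C"]))  # 3 (full credit on item 1, partial on item 2)
```

For the civic engagement side, the same item format could instead be used without a scoring key, treating each selected option as an indicator of a motivation or behavior.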

We also have been thinking about the use of Likert-type items. This example is from the Carnegie Foundation's PEP project that Judith described briefly. In this particular item you can see that there is a list of activities, and an examinee would have to say, on a scale of 1 to 6, whether they would certainly not do this or certainly do this. These items are really useful because they're a quick and easy way to collect a lot of information about students, especially if we think about the types of activities students are participating in in relation to civic engagement. So this type of item would be more appropriate for the civic engagement section of the assessment, because it's more about self-report and students' perceptions versus the cognitive piece.

On-screen: Sample Likert-Type Item

Below is a list of activities. In the future, what do you expect that you will do?

(Scale: 1 = Will certainly not do this, 6 = Will certainly do this)

  1. Write to a newspaper or magazine to express your opinion about an issue
  2. Sign a written or e-mail petition about a political or social issue
  3. NOT buy something because of the conditions under which it was made
  4. Call in to a radio or television talk show to express your opinion

Source: Carnegie Foundation for the Advancement of Teaching, Political Engagement Project

So with that we also have to think about the type of task a student performs when actually answering an item. In the paper we talk about a number of possible task types, but here's a sample of just a few. For example, one type of task could be to analyze a document or argument, where the examinee would have to review an existing document, argument, or graphic before answering a question. One particular task type that I really want to point out is justification. This is where an examinee would provide a rationale for a previous response to a self-report item, and this is another way we've been thinking about reducing some of that inauthentic responding. Let me provide an example of what a justification sample item might look like.

On-screen: Possible Task Types

Task Type; what the examinee does:

  • Analyze a document/argument: reviews an existing document, argument, or graphic before answering a question
  • Draw conclusions: draws inferences from information provided or extrapolates additional likely consequences
  • Fact checker/recognize bias: reviews and analyzes facts and opinions, recognizing misleading information and/or bias against certain groups
  • Justification: provides a rationale for a previous response to a self-report item
  • Perspective taking: role-plays, takes perspectives, or chooses which response is the best choice for particular participants/stakeholders


So first this would follow an item where the examinee would check off the types of activities they've participated in over the past year, and then the examinee would have to provide a specific example of one of the activities they engaged in. Now you could argue that an examinee could still just make something up, but the idea behind this is that it involves a little more thought than simply checking off a box. And I guess if an examinee can come up with an example, that's also still pretty impressive. The benefit of this particular task type is that it could help reduce some of those socially desirable responses, and it could also provide additional information to the institution about the specific civic and political activities students are involved in. So that's the advantage of having this particular justification task.

And the next thing we have to think about is the context in which these tasks are embedded. There are sort of two different pieces: one is the level, including local, national, and global, and the other is the setting, which can fall within the level, including workplace, institution, community or neighborhood, political organizations, or online/virtual. Judith already discussed briefly some of the online and virtual pieces of that setting, but I think it's something that's really important moving forward. A lot more college students have been engaging online with political activities. I see it on Facebook all the time. You know, "Here's an article from the New York Times. Here's something from the Huffington Post." And especially right now, with all of the GOP debates and things like that, you're seeing a lot more live streaming. So this is another way that you're seeing more people being engaged, and it's really important that we capture that when we think about the context of various test items.

Lydia also briefly mentioned accessibility, and as she said, about 12 percent of the college-level population has disabilities. So something we've been thinking about from the very beginning is the idea of universal test design: making sure that the assessment is designed for all students within our intended population, which in this case would be all college students regardless of major. With universal test design, the idea is that you develop the test with accessibility in mind so that you need the minimum number of item adaptations, and by that I mean the minimum number of ways in which you have to increase the font size, use different color codings, or things like that. That's where the advantage of an online administration can really come into play, because it's a lot easier to adapt the font size on an online test form. That being said, there are certain things we have to consider where it may not be as simple as increasing the font size. So it's possible, and likely, that we will have to develop a separate test form, for example, for visually impaired students. An example might be a political cartoon, which would need a detailed description that could then be put into Braille or audio form.

So in terms of next steps for this particular project, we have a number of things we need to do before an assessment could even become operational. The very first thing we're going to do moving forward is consider our various item formats and task types, some of which are presented in the paper, prototype them with a small sample of students, and conduct some cognitive interviews. That means actually interacting with students face-to-face and getting their perceptions about the test items, as well as whether they feel any might be biased, for example. The other thing we're going to focus on in the cognitive interviews is whether, on the civic engagement piece, there are any items that might actually make a student feel as though they should respond a certain way, to understand whether that issue of inauthentic responding is actually coming into play. We'll also be getting feedback from a number of user audiences, institutions and individuals such as yourselves, to get a feel for how we should be thinking about this assessment.

We'll then take all that information and revise the test blueprint we've currently created, which will bring us into item writing and actual test development for a large pilot study. The pilot study will likely include over 40 institutions and enough students that we can use the data to conduct a number of validation studies. And this will all happen before the assessment goes operational, so we have a number of steps to get through before we'll see that. But that's where having these types of forums is really beneficial, to get your input on what we've been thinking about and how we've been thinking about these issues. So that's all I have for today. Thank you very much, and now I'm going to pass it on to Peter, who's going to provide a different perspective on these presentations. Thank you.

On-screen: [Ou Lydia Liu, Judith Torney-Purta, Katrina Roohr]

Peter Levine, Associate Dean for Research and the Lincoln Filene Professor of Citizenship and Public Affairs, Jonathan Tisch College of Citizenship and Public Services, Tufts University - Thank you. That's great. I'll let you get to lunch soon; I won't take too long. But I was thinking, Judith, we've been to a lot of meetings, and we've seen a lot of papers, and we've written a few, and we've seen a lot of projects, and I'm getting to the point where I only remember a very small subset of those fancy meetings. But I think we'll remember this one, because I think this work is well thought out and high quality, but also because I think it has the potential to have a lasting effect, so we'll be constantly reminded of this meeting because it'll actually be out there in the world.

So just a few thoughts. One is to put this in its historical context. In different ways both Lydia and Judith did that, but I'll just underline the challenge and the importance of this. There's a long tradition of American higher education being understood as predominantly or mainly for democracy. The founding of many of the older universities in America put public service or democracy at the center of their stated purpose. The Morrill Act, which expanded public higher education, said that. You may or may not know this, but right after World War II higher education was dramatically expanded thanks to the GI Bill, and the template, or the announcement, for that was the Truman Commission Report on Higher Education in 1947, which said, "Education for democracy should come first among the principal goals of higher education."

So Judith did a very nice job of talking about the movement for civic education and democracy in higher ed since the 1980s. Campus Compact, AAC&U, AASCU, and the other organizations have done tremendous work. This is how I've made my life in the last 20 years; these are my friends and colleagues. I think we've achieved something, but we're certainly not at a place where the country would say, "Oh, yeah, the primary goal of higher education is democracy." They normally forget to mention it at all in big blue-ribbon reports. Instead, the primary purpose of higher education is taken to be producing economic returns. That's what it is. So that's one context, and I think the work that my good friends and colleagues have done is important, overlooked, often high quality, but honestly marginal against the big multibillion-dollar investments.

Meanwhile there's the movement for measurement and accountability in K-12, and in higher ed as well, that Lydia mentioned, citing Margaret Spellings. That's a complicated topic. I think there's mixed impact from that, but I don't think it's all bad at all, and I do think that one of the things higher education would need to make democracy education really central again is a culture of measurement and constant improvement. In general, it's better for accountability to be owned by the people who are doing the work rather than simply imposed from the outside, and that's what we're trying to do here. We're trying to help colleges and universities measure. We're not saying, "You have to measure." Likewise, we're not telling the students, "You fail if you don't pass the test," but we are collecting data.

So why do we need that culture of measurement and accountability? Well, first of all let me say a little bit more about why I don't think we have it. There is some measurement going on, and again I think it was Judith who showed a slide with HERI surveys and other things, and these are valuable. These are completely valuable. They have been hard to develop, they've been hard work, and they have taught us things. But we're still in a place where most of the measurement is inadequate to the task. A lot of the measurement that does happen counts various kinds of outputs. So a lot of the attention in higher ed is spent on trying to count how many volunteer hours you have, for example, so that you can report to the federal government. Outputs are something, but they're not ultimate outcomes. You can spend hours, you can waste those hours, and we don't know whether people are either learning or changing the world.

And then there's a lot of self-report. There's self-report by individual students, which is maybe almost inevitable, although Katrina helped us see how you could make that better. There's also a lot of self-report by institutions, and I've done that myself on behalf of Tufts, and we get recognition actually, but some of that is about the skill of the self-reporter. If you're a very professional member of this field, you can write about what you do at the university in a way that will get you recognition. So the Carnegie classification for engaged universities is a very valuable thing, also very hard to achieve, very well done by Tom Ehrlich and others, but I think a lot of the winners of that are the people who know how to play that game.

So the consequence of all that is that we don't know what works, not very well. We know a little bit, but we don't know much about what works. We don't direct our resources in the right direction. A very typical phenomenon is that there are wonderful programs on a campus, and they are going to the most motivated and highly skilled students, and we're not even seeing who isn't motivated and highly skilled. And in an environment where there is this pressure for measurement, if you're not doing it, then all the real attention goes to the things that are measured. That's of course dramatically true in K-12, but I think it's increasingly true in higher ed as well.

Briefly, there are some special measurement challenges for civics. I think all the presenters did a really good job of describing the measurement challenges they've dealt with, but I just want to underline how civics is especially difficult. Three things loom large. I'll preface this by saying I was working for ETS last week too, on the NAEP, the National Assessment of Educational Progress civics test, which is, well, it's an assessment, a zero-stakes test given only to 8th graders, so much younger kids. The NAEP is a high-quality instrument and I'm proud to be part of it, but all of the following limitations or challenges are really brought into stark relief by what NAEP does and doesn't measure.

So one thing is that civics would be in large part about current events, current activities, current and often local events. When you're a citizen, you're worried about what's happening now, here, and that's really hard to do with any kind of standardized measure. So the NAEP doesn't have anything about current events. When you read that only 24 percent of American kids are proficient on the NAEP, that means they're proficient in understanding things that happened at least 50 years ago, because NAEP doesn't measure anything either local or current, and it can't really. It takes three years to design and field the NAEP. It's national. It's supposed to be longitudinal, meaning you're supposed to be able to track change over time; that's the purpose of it. So you can't put in a current-events question about ISIS. What would you ask? First of all, you'd be writing it now, and I hope that we won't be thinking about ISIS in three years, although we probably will, but also the question would have to change. So current events are gone, and that's a huge problem when you think about what you'd want to know. And local matters are gone for the same reason, because a national test can't ask what you think about the D.C. board of whatever. I'm sorry, I was a D.C. resident for many years. I'm actually proud; Susan knows I'm actually a D.C. statehood proponent. I just blanked on a good D.C. issue though, not for lack of caring.

[Unidentified female] Well, D.C. statehood would be one.

Yes, I'm for it. The second kind of special problem for civics is that it's very much about interaction with other people. What we want people to do is interact well; that's what a good citizen does. And I think the paper does the right job of proposing a framework for actually measuring interaction, and it would be a step forward, but it's a fundamental problem because we are still imagining people taking a test where you're not allowed to look at your neighbor's paper, and that's just especially problematic or challenging for civics. I'm not saying that you guys don't have the best available answers, but it's especially problematic. I mean, scientists need to collaborate too, but you're not studying collaboration. A good physicist knows how to collaborate, but when you test them on physics, you're testing them on nature. When it's civics, you're testing them on collaboration. It's the actual end, and so it's an especially challenging problem.

And the third problem is values. I think in the 21st century we don't do a good job of thinking about the ways facts and values come together, just in general. That's a deep, complicated problem, and so we don't even do things like science in a way that handles the fact-value issue wisely. But it's really in your face with civics, because you're measuring values and dispositions, and because even every factual item you might ask is value-laden. So in a context like NAEP, which is a federally funded and overseen assessment that by law can't ask about values, the pressure is to try to make the questions value-free, and that means making them as concrete, factual, and true-or-false as possible, but that's actually impossible. And it's impossible in a way that's not true, I think, for physics, even though there are complications there too. So if you ask a question like "How many branches of government are there?" for an 8th grader, that's a bad question. But even if you ask a more sophisticated version of "How many branches of government are there?", it looks like a true-or-false matter you can look up, but actually there's a whole bunch of values laden in it. First of all, I don't think there are three branches of government. I think that we forget about the administrative agencies, the national security state, the military, the military-industrial complex. We're on K Street; we forget about a branch of government that is located on K Street. What I'm doing is suggesting a critical interpretation of the Constitution which says that it's not actually descriptive of the government that we have. I believe that's actually true, which means that the standard answer to how many branches of government we have is false. What I'm saying is there's a value conflict. But there are also other things laden in that question.
One is the assumption that citizens should spend their time primarily thinking about the formal structure of the national government, which I do think is important; but there are no questions on the NAEP at all about the structure of their local government or about global issues, and there's much less about what they would do in their community. So there's a value judgment being made with that imaginary question, and the challenge for values is particularly acute. The paper is explicit about this and wise about it, but this is just part of the challenge.

So I think the framework, and then the actual assessments that come out of it, will make a huge contribution by being thoughtful about those questions, by being relatively reliable, and by being smart about the practical challenges, everything from response bias to just affordability; you could do different kinds of studies, but they would be much more expensive. And of course it will give comparable data, which is critical, because we have a lot of homegrown assessment measures in the field, and not only are they typically not very good, they're also not comparable, so you can't tell what's going on.

My colleagues on the team invited me, or really encouraged me, to talk about two other things we're doing at the Tisch College of Citizenship at Tufts. So apologies for being self-promotional, especially when you want lunch, but I'm going to mention them briefly, partly because I actually do want to promote them, but also because what I'm going to argue — this is the last thing I'll say — is that we probably want a toolkit of assessment that's broader than a test, although a standardized test should be part of it. We probably want some other options, and here are two others that we actually have.

One is the National Study of Learning, Voting and Engagement, NSLVE, which you can google; it's on the Tisch College website. We have about 700 colleges enrolled in that study right now. If a college signs up — it's free — they get a detailed report of their students' voting from us. It's private to them, and they can use it any way they want, but it tells them who voted. Not the names; we keep the individual voting records private from the school, but we tell them how their education majors voted, how their in-state students voted, how their freshmen voted. This is complementary because the problem with it is that voting is only one discrete, concrete act. It doesn't tell you what they know, or why, and it certainly doesn't tell you who they voted for, not that we'd want it to. The advantages, though, are that there's no self-report, since this is the actual voting record, it's free, and it's eminently comparable, so we immediately know how your voter turnout compared to everybody else's. So that's one complementary effort.

And then a different complementary effort we've been developing, with support from Bringing Theory to Practice and some others, is an elaborate multiplayer video game called Civic Seed. You play it. It takes a bit too long to play right now, so we need to cut it, but on the other hand I want to emphasize the length, because it's a pretty deep learning experience; it takes six or eight hours.

You have to interact with other students, and you have to build stuff together, and as a result you learn quite a bit about the specific host communities where our university is located, Medford-Somerville, Massachusetts. At the end you get a badge which says that you are qualified to do community service in that community without being uninformed and disrespectful of the community. So that's an example of something interactive and badge generating, which is different from what you guys are doing, although you could give a badge for high scores on the test. It's actually quite a bit more expensive to develop, although some of that money has already been raised, because it's an elaborate multiplayer game, not a bunch of question items. So you're welcome to look those things up and work with us on them, but I think what we need to work towards collaboratively is a suite of measurement tools that includes surveys, tests, games, badges and hard data about things like voting, which we can start using not only to measure but ultimately to improve, because of course the only thing that really matters is to improve. So there has to be a cycle where people are actually using those data in colleges to make decisions, and I guess that's what's on us to accomplish. Thank you.

End of Video: ETS Research Forum Innovations in Conceptualizing and Assessing Civil Competency and Engagement in Higher Education

Video Duration: 1:10:15