LTRC 2012 34th Language Testing Research Colloquium – PRINCETON April 1–5, 2012

Pre-conference Workshops

The 2012 LTRC organizing committee is pleased to announce three pre-conference workshops. Based on feedback from many of our colleagues, we have identified a set of topics that will interest both researchers and test developers. LTRC participants will have the opportunity to develop their knowledge of a highly relevant method or test development process under the tutelage of renowned scholars in the area. We hope that the LTRC participants will take advantage of these workshops and the opportunities to engage in discussion with the instructors and colleagues.

Workshop 1
Assessing the Ability to Convey Semantic and Pragmatic Meanings in Language Assessments

Sunday, April 1 – Monday, April 2, 2012
Instructors: James E. Purpura & Kirby C. Grabowski, Teachers College, Columbia University

Communication is typically a social, cultural, and cognitive activity in which two or more participants use linguistic resources and relevant contextual elements to construct meanings collaboratively through written or spoken interaction. These meanings include the literal meaning of utterances, the intended meaning of these utterances given some context, and the extended meanings of utterances that are derivable primarily from the context of the situation itself and from an understanding of the shared norms, assumptions, expectations and presuppositions of the interlocutors. In other words, communication at any level of L2 proficiency involves not only the conveyance of semantic propositions (ideas, information, beliefs, intentions) but also a host of implied meanings relating to the interpersonal relationship of the interlocutors, their affective stance, and their role and identity in the social and cultural context of communication. Because the primary purpose of communication is the conveyance of these meanings (rather than the linguistic precision with which they are conveyed), and because all language assessments, implicitly or explicitly, elicit semantic and pragmatic meanings, language testers need to address the role that meaning and its effective conveyance play in assessments.

This workshop will introduce participants to the principles underlying the assessment of semantic and pragmatic knowledge. We will first discuss the importance of assessing for meaning and the implications this has for validity. We will then review how the assessment of semantic and pragmatic meanings has been conceptualized and operationalized in the language assessment literature. We will then define the constructs of semantic and pragmatic knowledge to be used in the workshop. Participants will then be asked to code tasks for these constructs. The afternoon will be devoted to the principles underlying the development of assessments designed to tap into semantic and pragmatic knowledge in a range of task types. Participants will then have the opportunity to develop tasks and present them to other participants. The second day will be devoted to issues revolving around scoring. Participants will have the opportunity to score language data for a range of semantic and pragmatic meanings and to discuss the challenges of assessing these constructs.

Dr. James E. Purpura is Associate Professor of Linguistics and Language Education at Teachers College, Columbia University, where he teaches courses in second language (L2) assessment and research methods. His research focuses on the assessment of grammar and pragmatics, the cognitive underpinnings of L2 performance, and the interface between L2 learning and assessment in instructed settings. He has written Strategy use and second language test performance (1999, CUP), Assessing grammar (2004, CUP), and is currently co-authoring a book on learning-oriented assessment in language classrooms (Routledge). In addition to being a senior associate editor of Language Assessment Quarterly, Dr. Purpura is co-editor of the series entitled, New Perspectives in Language Assessment (Routledge). Dr. Purpura served as the President of ILTA (2007–2009) and is a member of the TOEFL Committee of Examiners.

Dr. Kirby C. Grabowski is Lecturer of Linguistics and Language Education in the TESOL and Applied Linguistics Programs at Teachers College, Columbia University, where she teaches courses in second language assessment and language pedagogy. Her research interests include the assessment of grammatical and pragmatic knowledge, discourse analysis and program evaluation. Dr. Grabowski served as Managing Editor for Teachers College, Columbia University, Working Papers in TESOL and Applied Linguistics (2007–2009), and she is currently on the Editorial Advisory Board of Language Assessment Quarterly. She was also a consultant on the development and validation of the Oxford Online Placement Test (OOPT). Dr. Grabowski was a 2007 Spaan Fellow at the English Language Institute, University of Michigan, and she was a 2007–2008 Research Fellow at the Office of Policy and Research at Teachers College, Columbia University. She was the 2011 Jacqueline Ross TOEFL Dissertation Award recipient from Educational Testing Service.


Sunday, April 1
9–9:30 a.m. Introduction to the assessment of semantic & pragmatic knowledge in language assessment
9:30–10:30 a.m. Defining the components of semantic & pragmatic knowledge
10:50 a.m.–12:15 p.m. Hands-on task: Coding items for semantic & pragmatic knowledge, review and discussion
1:15–3 p.m. Developing tasks to measure semantic & pragmatic meanings; hands-on task development activity
3:20–5 p.m. Review of tasks; discussion

Monday, April 2
9–10:30 a.m. Scoring considerations: task types, rubrics, rater agreement
10:50 a.m.–Noon Application of scoring rubrics to data; discussion of scores
Noon–12:15 p.m. Wrap-up & evaluation


Workshop 2
Latent Growth Modeling for Language Testing Research

Sunday, April 1, 2012
Instructor: Gregory R. Hancock, University of Maryland

This is an introductory workshop on modeling longitudinal data, with an emphasis on latent growth modeling, and is aimed at language testing professionals and graduate students in the field. Participants are expected to have a basic understanding of multiple regression and introductory structural equation modeling (SEM). The workshop will use the SIMPLIS language within LISREL (Scientific Software International), which was the first SEM program and remains one of the most widely used. The free student version of LISREL will be installed on all workshop computers.

This workshop will start by reviewing the basic principles of SEM with measured and latent variables, illustrating the use of SIMPLIS for such models. Next, latent means models, which add a mean structure to typical covariance-based structural models, will be introduced and illustrated within SIMPLIS. Having laid the foundations, we will then cover the basic aspects of linear latent growth models, including different time centering, uneven and varied time points, and time-independent covariates. We will then proceed to more complex topics, including time-dependent covariates and models for assessing treatments and interventions. As time allows, we may also overview nonlinear models, multidomain and cohort-sequential models, second-order growth models, latent-difference score models, and the principles of growth mixtures and models with categorical data. The workshop will have hands-on exercises to allow participants to practice running and interpreting latent growth models using SIMPLIS.
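As a rough illustration of what "different time centering" and "uneven time points" mean in a linear latent growth model, the sketch below computes model-implied means in plain Python rather than SIMPLIS. It assumes the usual linear form, in which the mean at each occasion is the intercept mean plus the slope mean times the time code. The function names, the measurement occasions, and the parameter values are all hypothetical and are not taken from the workshop materials.

```python
# Sketch only: model-implied means for a linear latent growth model.
# All numbers and names are invented for illustration.

def implied_means(intercept_mean, slope_mean, time_codes):
    """Mean of the observed score at each occasion: alpha0 + alpha1 * time code."""
    return [intercept_mean + slope_mean * t for t in time_codes]


def recenter(intercept_mean, slope_mean, shift):
    """Intercept and slope means after the time codes are replaced by (t - shift)."""
    return intercept_mean + slope_mean * shift, slope_mean


# Four unevenly spaced occasions (say, months 0, 1, 3 and 6).
time_codes = [0, 1, 3, 6]
alpha0, alpha1 = 50.0, 2.0  # hypothetical intercept and slope means

# Time centered at the first occasion: the intercept is "status at month 0".
means_start = implied_means(alpha0, alpha1, time_codes)

# Time centered at the last occasion: the intercept is "status at month 6".
shift = time_codes[-1]
alpha0_end, alpha1_end = recenter(alpha0, alpha1, shift)
centered_codes = [t - shift for t in time_codes]
means_end = implied_means(alpha0_end, alpha1_end, centered_codes)

print(means_start)             # implied means under the original coding
print(alpha0_end, alpha1_end)  # intercept mean changes, slope mean does not
print(means_end)               # same implied means under the new coding
```

Under these assumptions, re-centering changes how the intercept is interpreted but leaves the slope mean and the implied trajectory unchanged, which is why the choice of centering is a matter of interpretation rather than model fit.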

Participants are encouraged to bring a laptop computer. If you do not already have a recent version of LISREL (8.8) on your laptop, then please download the free student version at the following link:

Dr. Gregory R. Hancock is Professor and Chair of the Measurement, Statistics and Evaluation program in the Department of Human Development and Quantitative Methodology at the University of Maryland, College Park, and Director of the Center for Integrated Latent Variable Research (CILVR). His research interests include structural equation modeling (SEM) and latent growth models, and the use of latent variables in (quasi) experimental design. His research has appeared in such journals as Psychometrika, Multivariate Behavioral Research, Structural Equation Modeling: A Multidisciplinary Journal, Psychological Bulletin, British Journal of Mathematical and Statistical Psychology, Journal of Educational and Behavioral Statistics, Educational and Psychological Measurement, Review of Educational Research, and Communications in Statistics: Simulation and Computation. He also co-edited with Ralph O. Mueller the volumes Structural Equation Modeling: A Second Course (2006) and The Reviewer's Guide to Quantitative Methods in the Social Sciences (2010), and with Karen M. Samuelsen the volume Advances in Latent Variable Mixture Models (2008). He is past chair (three terms) of the SEM special interest group of the American Educational Research Association (AERA), serves on the editorial board of a number of journals, is Associate Editor of Structural Equation Modeling: A Multidisciplinary Journal, has taught dozens of methodological workshops in the United States and abroad, and is a Fellow of AERA and the 2011 recipient of the Jacob Cohen Award for Distinguished Contributions to Teaching and Mentoring by Division 5 of the American Psychological Association. His former students are employed in academic positions around the U.S. and internationally, as well as at leading research agencies. Dr. Hancock holds a Ph.D. from the University of Washington.


Sunday, April 1
9–10:30 a.m. Structural Equation Modeling, Foundations
  • Review of structural equation modeling and introduction to the SIMPLIS language
10:45–11:30 a.m. Structural Equation Modeling, Foundations (Cont’d)
  • Introduction to measured and latent mean structure models
11:30 a.m.–12:15 p.m. Structural Equation Models for Longitudinal Designs: Group Level Inference
  • Measured and latent variable panel designs
  • Repeated measure latent mean designs
1–3 p.m. Introduction to Latent Growth Modeling
  • Basics of covariance and mean structures for linear growth
  • Centering, uneven and varied time points
  • Time-independent covariates
Hands-on activities
3:15–5 p.m. Advanced Topics (as time allows)
  • Models for assessing treatments and interventions
  • Time-dependent covariates
  • Nonlinear models
  • Multidomain and cohort-sequential models
  • Overview of growth mixtures and models with categorical data 
Hands-on activities


Workshop 3
Scaling and Equating Test Scores

Monday, April 2, 2012
Instructors: Samuel A. ("Skip") Livingston & Shuhong Li, Educational Testing Service

Why are raw scores from a test converted to scaled scores? Why do we need to equate scores across different forms of a test, and how can we do it? What are the issues that need to be taken into account in implementing these procedures? The answers to these questions can be found in this workshop.

Testing organizations generally report test takers' scores as "scaled scores" or "converted scores" that are not simply the number or percentage of questions answered correctly. Instead, the scores are computed by a procedure that compensates for the changes in the difficulty of the test that can occur as the test questions are replaced with new questions. The procedure is called "equating" the scores. This workshop will introduce the participants to the basic concepts underlying the choice of a score scale and the equating of the scores. Participants will learn why it is possible for a score scale to have too much precision and why it is important not to have too little. They will learn how equating differs from statistical prediction and why an equating adjustment can be determined for a group of test takers but not for each individual test taker. They will learn the advantages and limitations of several different designs for collecting the data needed to equate test scores.
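To make the idea of an equating adjustment concrete, here is a minimal sketch of linear equating, one of the methods listed in the workshop schedule. It assumes an equivalent-groups design, in which two comparable groups each take one form, and it matches the mean and standard deviation of the two raw-score distributions. The function name and the score data are invented for illustration and are not drawn from any operational test.

```python
# Sketch only: linear equating under an assumed equivalent-groups design.
# The score lists are made-up data, not results from a real test.
from statistics import mean, pstdev


def linear_equate(x, scores_x, scores_y):
    """Map raw score x on form X to the raw-score scale of form Y.

    The conversion places x at the same number of standard deviations
    from the mean on form Y as it is on form X.
    """
    mean_x, mean_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mean_y + (sd_y / sd_x) * (x - mean_x)


# Hypothetical raw scores from two comparable groups of test takers.
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

for raw in (12, 14, 16):
    print(raw, round(linear_equate(raw, form_x, form_y), 2))
```

The conversion is estimated from group-level statistics (means and standard deviations) and then applied to every score on the new form. That is one way to see why the adjustment is determined for a group rather than separately for each test taker.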

The morning session will consist of a series of illustrated lectures, interspersed with written exercises to help the participants determine how well they have understood the concepts presented. The afternoon session will include a small-group exercise in which the participants will choose a score scale and an equating design for a hypothetical new test. It will conclude with a hands-on exercise in which participants determine an equating conversion from a set of data.

Dr. Samuel A. ("Skip") Livingston is a principal psychometrician in the Research & Development Division at Educational Testing Service in Princeton, NJ. Since coming to ETS in 1974, he has developed performance tests in the health professions, conducted research on methods of equating test scores, and coordinated statistical work for various types of tests, including college placement tests, college outcomes assessments, and teacher and school administrator licensing tests. He has served as a consultant to test users including professional associations, government agencies and private corporations. He has served on the board of advisory editors for the Journal of Educational Measurement and Applied Measurement in Education and has reviewed manuscripts for many other journals, including the Modern Language Journal. He is the author of Equating Test Scores (without IRT) and a co-author (with Michael Zieky and Marianne Perie) of Cutscores: A Manual for Setting Standards of Performance on Educational and Occupational Tests.

Dr. Shuhong Li is a psychometrician on the English Language Assessment team in the Research and Development Division of Educational Testing Service. She holds a doctoral degree in educational measurement from the University of Massachusetts Amherst. Since joining ETS in 2005, she has worked on the Internet-based Test of English as a Foreign Language (TOEFL iBT®) and the paper-based TOEFL® (TOEFL PBT), and has recently assumed the role of Statistical Coordinator for TOEFL® Junior™ PBT. Dr. Li's current research interests include the application of statistical/psychometric models to testing problems, with a particular emphasis on the applications of item response theory and psychometric models and principles in the context of language assessment.


Monday, April 2
9–10:40 a.m.
  • Raw scores and scaled scores
  • Choosing a score scale
  • Basic concepts of test score equating
  • Linear equating and equipercentile equating
10:50 a.m.–12:30 p.m.
  • Data collection designs for equating
  • Equating through common items
  • Equating constructed-response tests and performance assessments

1:15–3 p.m. Small-group exercise: Choosing a score scale and an equating design
3:20–5 p.m. Hands-on task: equating scores on a new test form