
Recent Publications in Statistics and Psychometrics by ETS Researchers

Below are some recent publications that our researchers have authored.


  • Psychometrics and Game-Based Assessment
    R. J. Mislevy, S. Corrigan, A. Oranje, K. DiCerbo, M. I. Bauer, A. A. von Davier, & M. John (2016)
    In F. Drasgow (ed.) Technology and Testing: Improving Educational and Psychological Measurement, pp. 23–48

    The authors of this chapter discuss psychometric models and validity issues related to game-based assessments, as well as the design implications of linking assessment and psychometric methods with game design. The book is part of the NCME Applications of Educational Measurement and Assessment Book Series. Learn more

  • Agent-Based Modeling of Collaborative Problem Solving
    Y. Bergner, J. J. Andrews, M. Zhu, & J.E. Gonzales (2016)
    ETS Research Report, RR-16-27. Princeton, NJ: Educational Testing Service.

    The authors explore how agent-based modeling can be used to represent collaborative problem solving, test the sensitivity of outcomes to different population characteristics, and generate simulated data for refining and developing psychometric models. View citation record >

  • Exponential Family Distributions Relevant to IRT
    S. J. Haberman (2016)
    In W. J. van der Linden (ed.) Handbook of Item Response Theory, Volume Two: Statistical Tools. Boca Raton: Chapman and Hall/CRC, 2016, pp. 47–69

    Exponential families provide a statistical framework for development of customary models for analysis of item responses and for analysis of the statistical properties associated with these models. View citation record >

  • A Program for Nonparametric Raw-to-Scale Conversion
    S. J. Haberman (2016)
    ETS Research Memorandum, RM-16-03

    Raw-to-scale conversions are developed in which no assumptions are made concerning the distribution of the raw scores, but the target scale score distribution is based on a continuous exponential family and on specified moment constraints. View citation record >

  • Designing Tests to Measure Personal Attributes and Noncognitive Skills
    P.C. Kyllonen (2016)
    In S. Lane, M. R. Raymond, & T. M. Haladyna (eds.) Handbook of Test Development, Second Edition. New York: Routledge, 2016, pp. 190–211

    The author of this chapter discusses why noncognitive skills are important for both education and the workplace. The chapter also explores various construct frameworks for these skills and different methods for measuring them. View citation record >


  • Uncovering Multivariate Structure in Classroom Observations in the Presence of Rater Errors
    D.F. McCaffrey, K. Yuan, T.D. Savitsky, J.R. Lockwood, M.O. Edelen (2015)
    Educational Measurement: Issues and Practice, v34, n2, pp. 34–46, Summer 2015

    The authors examine the factor structure of scores from the CLASS-S protocol obtained from observations of middle school classroom teaching. They demonstrate that errors in scores made by two raters on the same lesson have a factor structure that is distinct from the factor structure at the teacher level. They consider alternative hierarchical estimation approaches designed to prevent the contamination of estimated teacher-level factors. View the publisher's abstract >

  • Matching and Weighting with Functions of Error-Prone Covariates for Causal Inference
    J.R. Lockwood, & D.F. McCaffrey (2015)
    Journal of the American Statistical Association, 2015 (currently online only)

    The authors establish necessary and sufficient conditions for matching and weighting with functions of observed covariates to yield unconfounded causal effect estimators, generalizing results from the standard (i.e., no measurement error) case. View citation record >

  • An Alternative Way to Model Population Ability Distributions in Large-Scale Educational Surveys
    E. Wetzel, X. Xu, & M. von Davier (2015)
    Educational and Psychological Measurement, v75, n5, pp. 739–763, Oct 2015

    The authors explore an alternative way to model population ability distributions in large-scale educational surveys, where a latent regression model is often used to compensate for the shortage of cognitive information. They introduce an approach based on latent class analysis (LCA) to identify multiple groups that can account for the variation among students. View citation record >

  • Bayesian Networks in Educational Research
    R. G. Almond, R. J. Mislevy, L. Steinberg, D. Yan, & D. Williamson (2015)
    Statistics for Social and Behavioral Sciences

    The authors explain and illustrate how Bayesian Networks, which combine statistical methods and computer-based expert systems, can be used to develop models for educational assessment. Learn more

  • Prediction of True Test Scores from Observed Item Scores and Ancillary Data
    S. J. Haberman, L. Yao, & S. Sinharay (2015)
    British Journal of Mathematical and Statistical Psychology, Vol. 68, No. 2, pp. 363–385

    The authors develop new methods to evaluate performance of test takers when items are scored both by human raters and by computers. Learn more

  • Alternative Statistical Frameworks for Student Growth Percentile Estimation
    J. R. Lockwood & K. E. Castellano (2015)
    Statistics and Public Policy, Online First

    The authors describe two alternative statistical approaches for estimating student growth percentiles (SGP). The first estimates percentile ranks of current test scores conditional on past test scores directly. The second estimates SGP directly from longitudinal item-level data. Learn more >

  • An Application of Exploratory Data Analysis in the Development of Game-Based Assessments
    K. E. DiCerbo, M. Bertling, S. Stephenson, Y. Jia, R. J. Mislevy, M. I. Bauer, & T. Jackson (2015)
    Serious Games Analytics: Methodologies for Performance Measurement, Assessment, and Improvement, pp. 319–342
    Editors: C. S. Loh, Y. Sheng, & D. Ifenthaler

    The authors of this chapter discuss the use of exploratory data analysis (EDA) using the 4R’s (revelation, resistance, re-expression and residuals) to better understand players’ knowledge, skills and attributes (KSAs) in order to gain evidence that the authors suggest can be combined in a measurement model based on Bayesian Networks. Learn more

  • Assessing Collaborative Problem Solving with Simulation Based Tasks
    J. Hao, L. Liu, A. von Davier, & P. Kyllonen (2015)
    In Proceedings for the 11th International Conference on Computer Supported Collaborative Learning

    The authors discuss preliminary results from a project on assessing collaborative problem solving using a web-based simulation. In the experiment, two participants collaborated via a chat box to complete a science task. Responses from 486 individuals and 278 teams (dyads) recruited from Amazon Mechanical Turk™ were compared. View citation record

  • Methodological Challenges in the Analysis of MOOC Data for Exploring the Relationship between Discussion Forum Views and Learning Outcomes
    Y. Bergner, D. Kerr, & D. E. Pritchard (2015)
    In Proceedings of the 8th International Conference on Educational Data Mining, pp. 234–241

    The article discusses methodological challenges and problems with missing data that researchers face when seeking to understand the diverse group of students who take massive open online courses (MOOCs). Solving these challenges is important to policymakers and providers of education. View citation record

  • Estimation of Ability from Homework Items When There are Missing and/or Multiple Attempts
    Y. Bergner, K. Colvin, & D. E. Pritchard (2015)
    In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge, pp. 118–12

    Missing data and multiple attempts at an answer by a user present two challenges to scoring in massive open online courses (MOOCs). The authors discuss these challenges with regard to estimating ability from homework items in a large-enrollment electrical engineering MOOC. View citation record

  • An Exploratory Study Using Social Network Analysis to Model Eye Movements in Mathematics Problem Solving
    M. Zhu & G. Feng (2015)
    In Proceedings of the Fifth International Conference on Learning Analytics and Knowledge, pp. 383–387

    The paper applies techniques from social network analysis to eye movement patterns in mathematics problem solving. The authors construct and visualize transition networks using eye-tracking data collected from 37 eighth-grade students as they solved linear function problems. View citation record

  • Use of Jackknifing to Evaluate Effects of Anchor Item Selection on Equating With the Nonequivalent Groups With Anchor Test (NEAT) Design
    R. Lu, S. Haberman, H. Guo & J. Liu (2015)
    ETS Research Report No. RR-15-10.

    The authors evaluate the impact of anchor item selection on equating stability. Anchor selection can strongly influence equating results in practice, even when examinee samples are large, which poses a major hazard to the practical use of equating. View citation record

  • Repeater Analysis for Combining Information From Different Assessments
    S. Haberman & L. Yao (2015)
    Journal of Educational Measurement, Vol. 52, No. 2, pp. 223–251

    The article discusses how information from several assessments, for example TOEFL iBT® test scores and GRE® revised General Test scores, can be combined in a rational way. It suggests principles for exploring how various assessments relate to each other. Augmentation approaches developed for individual tests are applied to provide an accurate evaluation of combined assessments. The proposed methodology can be applied to other situations involving multiple assessments. View citation record

  • Pseudo-Equivalent Groups and Linking
    S. Haberman (2015)
    Journal of Educational and Behavioral Statistics, Vol. 40, No. 3, pp. 254–273

    The author explores an approach to linking test forms in the case of a nonequivalent groups design with no satisfactory common items. He compares the reasonableness of results from pseudo-equivalent groups to results from kernel equating. View citation record

  • Analyzing Process Data from Game/Scenario-Based Tasks: An Edit Distance Approach
    J. Hao, Z. Shu, & A. von Davier (2015)
    Journal of Educational Data Mining, Vol. 7, No. 1, pp. 33–50

    In this paper the authors describe their research on evaluating students' performances in game/scenario-based tasks by comparing how far their action strings (a string of characters) are from the action string that corresponds to the best performance, where the proximity is quantified by the edit distance between the strings. View citation record

  • Gamification in Assessment: Do Points Affect Test Performance?
    Y. Attali & M. Arieli-Attali (2015)
    Computers & Education, Vol. 83, pp. 57–63

    The authors examine the premise that gamification promotes motivation and engagement and therefore supports the learning process. They examine the effects of points, a basic element of gamification, on performance in a computerized assessment of mastery and fluency of basic mathematics concepts. View citation record

  • The Changing Nature of Educational Assessment
    R. Bennett (2015)
    Review of Research in Education, Vol. 39, No. 1, pp. 370–407

    The author describes the evolution of technology-based assessment from an initial stage of infrastructure building, through a second stage of gradual qualitative change and efficiency improvement, to a third stage of innovative assessments aligned with the two Common Core State Assessment (CCSA) consortia: the Partnership for the Assessment of Readiness for College and Careers (PARCC) and the Smarter Balanced Assessment Consortium (SBAC). The article then explores the emerging third stage in depth. View citation record

  • A Comparison of IRT Proficiency Estimation Methods Under Adaptive Multistage Testing
    S. Kim, T. Moses, & H. Yoo (2015)
    Journal of Educational Measurement, Vol. 52, No. 1, pp. 70–79

    The article reports on an investigation of item response theory (IRT) proficiency estimators’ accuracy under multistage testing (MST). View citation record

  • The Impact of Measurement Error on the Accuracy of Individual and Aggregate SGP
    D. F. McCaffrey, K. E. Castellano, & J. R. Lockwood (2015)
    Educational Measurement: Issues and Practice, Vol. 34, No. 1, pp. 15–21

    The authors discuss a potential bias due to measurement errors. This bias can affect student growth percentiles (SGPs) for individual students as well as mean or median SGPs on the aggregate level. The authors discuss various techniques that seek to decrease this bias. View citation record



Find Other ETS-authored Statistics and Psychometrics Publications

ETS ReSEARCHER is a database that contains information on ETS-authored or -published works, such as ETS Research Reports, ETS Research Memorandums or publications written by ETS researchers and published by third parties, such as scholarly journals.

Find other publications now
