Abstract

The value of cross-country comparisons is at the heart of large-scale international surveys. Yet the validity of such comparisons is often challenged, particularly in the case of latent traits whose estimates are based on self-reported answers to a small number of questionnaire items. Many believe self-reports to be unreliable and not comparable; indeed, formal statistical procedures very often reject the assumption that the questions are understood and answered in the same way in different countries (measurement invariance). A methodological conference on the comparability of questionnaire scales was hosted by the OECD on 8 and 9 November 2018. This meeting report summarises the discussions held at the conference about measurement invariance testing and instrument design. The report first provides a brief introduction to the measurement models, and the accompanying invariance analyses, typically used in large-scale international surveys, and points to the main limitations of these current standard approaches. It then presents classical and novel ways to deal with imperfect comparability of measurements when scaling and reporting on continuous traits and on categorical latent variables. Next, it discusses the extent to which item design can improve the cross-country comparability of the measured constructs (e.g. by adopting innovative item formats such as anchoring vignettes and situational judgement test items). It concludes with some general considerations for survey design and for reporting on invariance analyses and survey results.

Highlights

  • The value of cross-country comparisons is at the heart of large-scale international surveys, including those piloted by the Organisation for Economic Co-operation and Development (OECD), such as the Programme for International Student Assessment (PISA), the Programme for the International Assessment of Adult Competencies (PIAAC), and the Teaching and Learning International Survey (TALIS)

  • International surveys may fall short of their objective to perform valid comparisons across countries. These issues of cross-cultural comparability were recently the focus of a methodological conference hosted at the OECD headquarters

  • The conference successfully stimulated an exchange between leading academic experts in cross-cultural measurement, industry representatives involved in the production of large-scale, cross-national survey data, and secondary users of these datasets. Opportunities to discuss these issues with a broad set of experts from different professions and disciplines are rare, and this exchange was highly appreciated by conference participants



Introduction

The value of cross-country comparisons is at the heart of large-scale international surveys, including those piloted by the Organisation for Economic Co-operation and Development (OECD), such as the Programme for International Student Assessment (PISA), the Programme for the International Assessment of Adult Competencies (PIAAC), and the Teaching and Learning International Survey (TALIS). Formal statistical tests of measurement invariance, however, typically deliver a single accept-or-reject verdict on the full set of equality restrictions. The binary nature of the test leaves practitioners with no idea about the extent to which misspecifications in the measurement model affect secondary analyses of the latent trait, and its global nature provides little information about the specific restrictions (groups and item parameters) that are responsible for the rejection. In this situation, survey organisations may be tempted to increase the chances of instruments passing the tests by limiting participation to groups that are more similar, or by including redundant items and limiting the variation in question types.

This risk was well illustrated in the simulation study presented by Artur Pokropek: fitting an "approximate invariance" model to situations where a few groups and items are affected by large bias (partial invariance) leads to bias in the estimation of latent means (Pokropek et al., 2019). Another undesirable property of these methods is that the "ideal" situation, in which there is no variation in measurement parameters, is a limit case and a "corner solution" for the estimation procedure. On the other hand, when certain known features (such as writing system, level of development, or climate zone) are expected to interfere with measurements in predictable ways, this information can be incorporated in the priors used to estimate Bayesian random parameter models.
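To fix ideas, the restrictions under test can be written down explicitly. In the standard multiple-group factor model (a sketch using common notation, not notation taken from the report itself), the response of person $i$ in country $g$ to item $j$ is

\[
y_{ijg} = \nu_{jg} + \lambda_{jg}\,\eta_{ig} + \varepsilon_{ijg},
\]

where $\eta_{ig}$ is the latent trait, $\lambda_{jg}$ the item loading, and $\nu_{jg}$ the item intercept. Metric invariance requires $\lambda_{jg} = \lambda_j$ for all countries $g$; scalar invariance additionally requires $\nu_{jg} = \nu_j$. Because standard procedures test all of these equality restrictions jointly (typically with a chi-square or likelihood-ratio statistic), a rejection by itself reveals neither how consequential the misspecification is nor which countries and item parameters are responsible.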

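The "corner solution" remark can be made concrete in the same notation (again a sketch under standard assumptions for Bayesian approximate-invariance models, not a specification taken from the report). Instead of imposing exact equality, group-specific parameters are shrunk towards common values through small-variance priors:

\[
\nu_{jg} \sim N\!\left(\nu_j, \sigma^2_\nu\right), \qquad \lambda_{jg} \sim N\!\left(\lambda_j, \sigma^2_\lambda\right).
\]

Exact invariance corresponds to the limit $\sigma^2_\nu = \sigma^2_\lambda = 0$, a boundary point of the parameter space rather than an interior solution. And when a known feature, such as the writing system, is expected to shift particular item intercepts, the prior mean of $\nu_{jg}$ can be made a function of that country-level covariate rather than being held common across all countries.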