Abstract
International large-scale assessments like the International Computer and Information Literacy Study (ICILS) (Fraillon et al., 2015) provide important empirically based knowledge, through described proficiency scales, of what characterizes tasks at different difficulty levels and what that says about students at different ability levels. In international comparisons, one of the threats to validity is country differential item functioning (DIF), also called item-by-country interaction. DIF is a measure of how much harder or easier an item is for a respondent from a given group compared with respondents of equal ability from other groups. If students from one country find a specific item much harder or easier than students from other countries, the comparison of countries is impaired. Therefore, great effort is directed towards detecting DIF and removing or changing items that show it. From another angle, however, this phenomenon can be seen not only as a threat to validity but also as an insight into what distinguishes students from different countries, and possibly their education, on a content level, providing even more pedagogically useful information. Therefore, in this paper, the data from ICILS 2013 are re-analyzed to address the research question: Which kinds of tasks do Danish, Norwegian, and German students find difficult and/or easy in comparison with students of equal ability from other countries participating in ICILS 2013? The analyses show that Norwegian and Danish students find items related to computer literacy easier than their peers from other countries. On the other hand, Danish and, to a certain degree, Norwegian students find items related to information literacy more difficult. In contrast, German students do not find computer literacy easier, but they do seem to be comparably better at designing and laying out posters, web pages, etc. This paper shows that essential results can be identified by comparing the distributions of item difficulties in international large-scale assessments. This is a more constructive approach to the challenge of DIF, but it does not eliminate the serious threat to the validity of the comparison of countries.
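As a brief formal sketch (the notation below is ours, not taken from the paper): under the Rasch model used in the re-analysis, the probability that person v answers item i correctly depends only on the person's ability \(\theta_v\) and the item's difficulty \(\delta_i\),

\[
\Pr(X_{vi} = 1 \mid \theta_v, \delta_i) = \frac{\exp(\theta_v - \delta_i)}{1 + \exp(\theta_v - \delta_i)}.
\]

Country DIF can then be expressed as a country-specific deviation of an item's difficulty from its pooled value,

\[
\mathrm{DIF}_i^{(g)} = \delta_i^{(g)} - \delta_i,
\]

where \(\delta_i^{(g)}\) is the difficulty of item i estimated within country g and \(\delta_i\) is its difficulty in the joint calibration across countries; a value far from zero signals item-by-country interaction.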
Highlights
International large-scale assessments, like the Programme for International Student Assessment (PISA) and the International Association for the Evaluation of Educational Achievement (IEA) studies Progress in International Reading Literacy Study (PIRLS) and International Computer and Information Literacy Study (ICILS), are best known for the so-called league tables, which provide information about the relative abilities of students across countries
For teachers, teacher educators, and developers of teaching material, they can provide much more important empirically based knowledge of what characterizes tasks at different difficulty levels, and what that says about students at different ability levels: What can they be expected to do, what is their present zone of proximal development, and which tasks are they not yet able to perform? This knowledge is summed up in so-called described proficiency scales, which are developed on the basis of analyses of items of similar difficulty and detailed studies of tasks at a given difficulty interval (Fraillon et al., 2015; OECD, 2014)
The student responses found in the dataset from the International Computer and Information Literacy Study (ICILS) 2013 (Fraillon et al., 2014) are re-analyzed using the Rasch model (Rasch, 1960)
Summary
International large-scale assessments, like the Programme for International Student Assessment (PISA) and the International Association for the Evaluation of Educational Achievement (IEA) studies Progress in International Reading Literacy Study (PIRLS) and International Computer and Information Literacy Study (ICILS), are best known for the so-called league tables, which provide information about the relative abilities of students across countries. The test constructor needs to ensure that the test measures in the same way for the different persons being measured. This means that the result of a test should not depend on anything but the student's proficiency in the area the test is intended to measure. It should not matter what background the student comes from, nor which specific items are used to test this particular student
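The paper's re-analysis is Rasch-based; as a minimal, self-contained illustration of how a violation of this invariance can be screened for, the sketch below applies the Mantel-Haenszel procedure, a standard DIF detection method, to simulated dichotomous responses. The function name, the simulated data, and the flagging threshold are ours, for illustration only, and do not reproduce the study's actual method.

```python
import numpy as np

def mantel_haenszel_dif(responses, group, strata):
    """Mantel-Haenszel DIF estimate for one dichotomous item.

    responses : 0/1 array of item scores
    group     : 0 = reference country, 1 = focal country
    strata    : matching variable (in practice, raw score on the other items)

    Returns the DIF estimate on the ETS delta scale; absolute values
    beyond roughly 1.5 are conventionally flagged as sizeable DIF.
    """
    num, den = 0.0, 0.0
    for s in np.unique(strata):
        m = strata == s
        ref, foc = m & (group == 0), m & (group == 1)
        a = responses[ref].sum()        # reference group, correct
        b = (1 - responses[ref]).sum()  # reference group, incorrect
        c = responses[foc].sum()        # focal group, correct
        d = (1 - responses[foc]).sum()  # focal group, incorrect
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        return np.nan
    alpha_mh = num / den                # common odds ratio across strata
    return -2.35 * np.log(alpha_mh)     # ETS delta scale

# Simulated illustration: the focal country finds the item 0.8 logits easier
# than the reference country, even though ability distributions are equal.
rng = np.random.default_rng(0)
n = 4000
group = rng.integers(0, 2, n)
theta = rng.normal(0.0, 1.0, n)                      # equal ability distributions
delta = 0.0 - 0.8 * group                            # country-specific difficulty
p = 1 / (1 + np.exp(-(theta - delta)))
responses = rng.binomial(1, p)
strata = np.digitize(theta, np.linspace(-2, 2, 9))   # proxy for raw-score matching
print(mantel_haenszel_dif(responses, group, strata))
```

Stratifying on a matching variable approximates the comparison of "students of equal ability" described above; here the simulated ability is used as the stratifier for brevity, whereas operational analyses match on the observed raw score.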