Abstract

Rating scales are used to assess the performance of examinees presented with open-ended tasks. Drawing on an argument-based approach to validation, this study reports on the development of an analytic rating scale designed for a Spanish test for academic purposes. The study is one of the first to set out detailed scale development and validation activities for a rating scale for Spanish as a second language. The rating scale was grounded in a communicative competence model and was developed and validated over two phases. The first version was trialed by five raters, and its quality was analyzed by means of many-facet Rasch measurement. Based on the raters’ experience and the statistical results, the rating scale was modified, and a second version was trialed by six raters. After the rating process, raters were sent an online questionnaire to collect their opinions and perceptions of the rating scale, the training, and the feedback provided during the rating process. The results suggest that the rating scale was of good quality, and raters’ comments were generally positive, although they noted that more samples and training were needed. The study has implications for rating scale development and validation for languages other than English.