Rater Differences Research Articles

Rating scales are popular methods for generating quantitative data directly by persons rather than automated technologies. But scholars increasingly challenge their foundations. This article contributes epistemological and methodological analyses of the processes involved in person-generated quantification. They are crucial for measurement because data analyses can reveal information about study phenomena only if relevant properties were encoded systematically in the data. The Transdisciplinary Philosophy-of-Science Paradigm for Research on Individuals (TPS-Paradigm) is applied to explore psychological and social-science concepts of measurement and quantification, including representational measurement theory, psychometric theories and their precursors in psychophysics. These are compared to theories from metrology specifying object-dependence of measurement processes and subject-independence of outcomes as key criteria, which allow tracing data to the instances measured and the ways they were quantified. Separate histories notwithstanding, the article’s basic premise is that general principles of scientific measurement and quantification should apply to all sciences. It elaborates principles by which these metrological criteria can be implemented also in psychology and social sciences, while considering their research objects’ peculiarities. Application of these principles is illustrated by quantifications of individual-specific behaviors (‘personality’). The demands rating methods impose on data-generating persons are deconstructed and compared with the demands involved in other quantitative methods (e.g., ethological observations). These analyses highlight problematic requirements for raters. Rating methods sufficiently specify neither the empirical study phenomena nor the symbolic systems used as data nor rules of assignment between them. Instead, pronounced individual differences in raters’ interpretation and use of items and scales indicate considerable subjectivity in data generation. Together with recoding scale categories into numbers, this introduces a twofold break in the traceability of rating data, compromising interpretability of findings. These insights question common reliability and validity concepts for ratings and provide novel explanations for replicability problems. Specifically, rating methods standardize only data formats but not the actual data generation. Measurement requires data generation processes to be adapted to the study phenomena’s properties and the measurement-executing persons’ abilities and interpretations, rather than to numerical outcome formats facilitating statistical analyses. Researchers must finally investigate how people actually generate ratings to specify the representational systems underlying rating data.

Read full abstract

Childhood aggression and its resulting consequences inflict a huge burden on affected children, their relatives, teachers, peers and society as a whole. Aggression during childhood rarely occurs in isolation and is correlated with other symptoms of childhood psychopathology. In this paper, we aim to describe and improve the understanding of the co-occurrence of aggression with other forms of childhood psychopathology. We focus on the co-occurrence of aggression and other childhood behavioural and emotional problems, including other externalising problems, attention problems and anxiety–depression. The data were brought together within the EU-ACTION (Aggression in Children: unravelling gene-environment interplay to inform Treatment and InterventiON strategies) project. We analysed the co-occurrence of aggression and other childhood behavioural and emotional problems as a function of the child’s age (ages 3 through 16 years), gender, the person rating the behaviour (father, mother or self) and assessment instrument. The data came from six large population-based European cohort studies from the Netherlands (2x), the UK, Finland and Sweden (2x). Multiple assessment instruments, including the Child Behaviour Checklist (CBCL), the Strengths and Difficulties Questionnaire (SDQ) and Multidimensional Peer Nomination Inventory (MPNI), were used. There was a good representation of boys and girls in each age category, with data for 30,523 3- to 4-year-olds (49.5% boys), 20,958 5- to 6-year-olds (49.6% boys), 18,291 7- to 8-year-olds (49.0% boys), 27,218 9- to 10-year-olds (49.4% boys), 18,543 12- to 13-year-olds (48.9% boys) and 10,088 15- to 16-year-olds (46.6% boys). We replicated the well-established gender differences in average aggression scores at most ages for parental ratings. The gender differences decreased with age and were not present for self-reports. Aggression co-occurred with the majority of other behavioural and social problems, from both externalising and internalising domains. At each age, the co-occurrence was particularly prevalent for aggression and oppositional and ADHD-related problems, with correlations of around 0.5 in general. Aggression also showed substantial associations with anxiety–depression and other internalizing symptoms (correlations around 0.4). Co-occurrence for self-reported problems was somewhat higher than for parental reports, but we found neither rater differences, nor differences across assessment instruments in co-occurrence patterns. There were large similarities in co-occurrence patterns across the different European countries. Finally, co-occurrence was generally stable across age and sex, and if any change was observed, it indicated stronger correlations when children grew older. We present an online tool to visualise these associations as a function of rater, gender, instrument and cohort. In addition, we present a description of the full EU-ACTION projects, its first results and the future perspectives.

Read full abstract

Rater Differences Research Articles

Articles published on Rater Differences

The raters' differences in Arabic writing rubrics through the Many-Facet Rasch measurement model.

Dynamics of Rater Differences in Assessing the Age Appropriateness of Media Content: A Multilevel Moderated Mediation Analysis

Higher depressive symptoms in early adolescents with Autism Spectrum Disorder by self- and parent-report compared to typically-developing peers

Rater Negotiation Scheme: How writing raters resolve score discrepancies

Rater experience and the predictive validity of Psychopathy Checklist: Youth Version scores

Clinical Utility of Preoperative Bilingual Language fMRI Mapping in Patients with Brain Tumors.

Combining Dual Scaling with Semi-Structured Interviews to Interpret Rating Differences

The effect of training and rater differences on oral proficiency assessment

Parent versus teacher ratings on the BRIEF-preschool version in children with and without ADHD

Applying many-facet Rasch modeling in the assessment of creativity.

INTERRATER AND TEST-RETEST RELIABILITY OF THE Y BALANCE TEST IN HEALTHY, EARLY ADOLESCENT FEMALE ATHLETES

Quantitative Data From Rating Scales: An Epistemological and Methodological Enquiry.

Mastery motivation, parenting, and school achievement among Hungarian adolescents

Childhood aggression and the co-occurrence of behavioural and emotional problems: results across ages 3\u201316\xa0years from multiple raters in six cohorts in the EU-ACTION project

Sequential Effects in Essay Ratings: Evidence of Assimilation Effects Using Cross-Classified Models.

Evaluating Differential Rater Functioning Over Time in the Context of Solo Music Performance Assessment

Problem behaviour and psychosocial functioning in young children with Williams syndrome: parent and teacher perspectives.

Markers\u2019 criteria in assessing English essays: an exploratory study of the higher secondary school certificate (HSCC) in the Punjab province of Pakistan

Informant Discrepancies in the Assessment of Attention-Deficit/Hyperactivity Disorder

Measuring Accountability in the Performance Appraisal Context: Rater Status and Organization Culture as Determinants of Rater Accountability

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Rater Differences Research Articles

Articles published on Rater Differences

The raters' differences in Arabic writing rubrics through the Many-Facet Rasch measurement model.

Dynamics of Rater Differences in Assessing the Age Appropriateness of Media Content: A Multilevel Moderated Mediation Analysis

Higher depressive symptoms in early adolescents with Autism Spectrum Disorder by self- and parent-report compared to typically-developing peers

Rater Negotiation Scheme: How writing raters resolve score discrepancies

Rater experience and the predictive validity of Psychopathy Checklist: Youth Version scores

Clinical Utility of Preoperative Bilingual Language fMRI Mapping in Patients with Brain Tumors.

Combining Dual Scaling with Semi-Structured Interviews to Interpret Rating Differences

The effect of training and rater differences on oral proficiency assessment

Parent versus teacher ratings on the BRIEF-preschool version in children with and without ADHD

Applying many-facet Rasch modeling in the assessment of creativity.

INTERRATER AND TEST-RETEST RELIABILITY OF THE Y BALANCE TEST IN HEALTHY, EARLY ADOLESCENT FEMALE ATHLETES

Quantitative Data From Rating Scales: An Epistemological and Methodological Enquiry.

Mastery motivation, parenting, and school achievement among Hungarian adolescents

Childhood aggression and the co-occurrence of behavioural and emotional problems: results across ages 3\u201316\xa0years from multiple raters in six cohorts in the EU-ACTION project

Sequential Effects in Essay Ratings: Evidence of Assimilation Effects Using Cross-Classified Models.

Evaluating Differential Rater Functioning Over Time in the Context of Solo Music Performance Assessment

Problem behaviour and psychosocial functioning in young children with Williams syndrome: parent and teacher perspectives.

Markers\u2019 criteria in assessing English essays: an exploratory study of the higher secondary school certificate (HSCC) in the Punjab province of Pakistan

Informant Discrepancies in the Assessment of Attention-Deficit/Hyperactivity Disorder

Measuring Accountability in the Performance Appraisal Context: Rater Status and Organization Culture as Determinants of Rater Accountability