Number Of Scale Categories Research Articles

Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, number of scale categories, and number of essays rated. This research used Monte Carlo methods to draw samples from known population models to examine the accuracy of selected estimators of interrater reliability between two raters. In addition to the estimators listed above, we included the polychoric correlation coefficient based on its alignment with the context in which student language assessments are rated. Although each estimator produced an estimate close to the population parameter, polychoric correlations provided the closest estimates, with mean and median bias equal to 0.00 (SD = 0.05) across all conditions. The use of Pearson product-moment and Spearman rank-order correlation coefficients might result in the underestimation of interrater reliability by as much as a third.
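The Monte Carlo design described in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a bivariate normal latent model for the two raters, assumes category cut points at equal-probability quantiles, and estimates only the Pearson coefficient. The function and parameter names (`simulate_bias`, `n_categories`, `n_essays`, `n_reps`) are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def categorize(z, cuts):
    """Map a latent score to a category 0..k-1 by counting thresholds below it."""
    return sum(z > c for c in cuts)


def simulate_bias(rho, n_categories, n_essays, n_reps=200, seed=0):
    """Mean bias (estimate minus rho) of Pearson's r on categorized ratings.

    Assumptions (not taken from the study): latent scores are standard
    bivariate normal with correlation rho, and cut points split the
    latent scale into equally probable categories.
    """
    rng = random.Random(seed)
    nd = NormalDist()
    cuts = [nd.inv_cdf(i / n_categories) for i in range(1, n_categories)]
    estimates = []
    for _ in range(n_reps):
        rater1, rater2 = [], []
        for _ in range(n_essays):
            z1 = rng.gauss(0, 1)
            # Second rater's latent score correlates rho with the first.
            z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
            rater1.append(categorize(z1, cuts))
            rater2.append(categorize(z2, cuts))
        # Skip degenerate samples in which a rater used a single category.
        if len(set(rater1)) > 1 and len(set(rater2)) > 1:
            estimates.append(pearson(rater1, rater2))
    return mean(estimates) - rho


if __name__ == "__main__":
    for k in (2, 3, 5, 10):
        print(k, round(simulate_bias(0.8, k, n_essays=100), 3))
```

Under these assumptions, the bias is negative and shrinks toward zero as the number of categories grows, which is the direction of the underestimation the abstract reports for Pearson correlations.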

Green and Rao, using the criterion of information recovery and simulated data, concluded that seven-point scales were best [3]. Benson, pointing out that their criterion was not always applicable, suggested that wherever only averages are required, dichotomous or trichotomous scales suffice [1]. Jacoby and Matell reached similar conclusions using summary rating indices [9]. Lehmann and Hulbert's simulation concluded that a minimum of five categories was desirable unless there was averaging, either across individuals or scales [10]. Most research on the number of scale categories has focused on the desirable lower limit. At the upper limit, there are possible costs of respondent confusion and increased questionnaire length associated with increasing the number of categories [7]. Apart from this type of measurement error, however, Miller has found that over a wide range of unidimensional sensory variables, humans are incapable of discriminating among more than about 10 discrete categories. He concluded that this resulted from limited information processing capacity which appeared reasonably invariant over a number of different sensory attributes [11]. If this phenomenon were to apply to such common marketing measurement problems as rating stimuli on dimensions, then it would raise legitimate

Articles published on Number Of Scale Categories

Interrater Reliability Estimators Commonly Used in Scoring Language Assessments: A Monte Carlo Investigation of Estimator Accuracy

Investigating With IRT and MDS Approaches Translation and Adaptation of Rating Scales for Spanish-Speaking Populations

Information Processing Capacity and Attitude Measurement
