Abstract

The choice of the number m of response categories is a crucial issue in categorization of a continuous response. The paper exploits the Proportional Odds Models' property which allows to generate ordinal responses with a different number of categories from the same underlying variable. It investigates the asymptotic efficiency of the estimators of the regression coefficients and the accuracy of the derived inferential procedures when m varies. The analysis is based on models with closed-form information matrices so that the asymptotic efficiency can be analytically evaluated without need of simulations. The paper proves that a finer categorization augments the information content of the data and consequently shows that the asymptotic efficiency and the power of the tests on the regression coefficients increase with m. The impact of the loss of information produced by merging categories on the efficiency of the estimators is also considered, highlighting its risks especially when performed in its extreme form of dichotomization. Furthermore, the appropriate value of m for various sample sizes is explored, pointing out that a large number of categories can offset the limited amount of information of a small sample by a better quality of the data. Finally, two case studies on the quality of life of chemotherapy patients and on the perception of pain, based on discretized continuous scales, illustrate the main findings of the paper.

Highlights

  • The analysis is based on models with closed-form information matrices so that the asymptotic efficiency can be analytically evaluated without need of simulations

  • The impact of the loss of information produced by merging categories on the efficiency of the estimators is considered, highlighting its risks especially when performed in its extreme form of dichotomization

  • A critical point in surveys with rating questions is the choice of the number m of response categories to use in the discretization of a measurement obtained on a continuous scale

Read more

Summary

Introduction

A critical point in surveys with rating questions is the choice of the number m of response categories to use in the discretization of a measurement obtained on a continuous scale (in which the only marks are those related to the minimum and the maximum level). The current paper investigates the impact that the choice of m has on the efficiency of the estimators in case of discretization of a continuous response variable in data analysis (in Section 4) performed through a proportional odds model (POM) [12, 13]. Another critical point in data analysis concerns the appropriate number of categories with respect to a given sample size n [21].

Theoretical background
The models
The asymptotic efficiency of the estimators
Hypothesis testing
Merging categories
Choice of m for given n
Case studies
Quality of life measured on linear analogue self-assessment scale
Pain measured on visual analog pain scale
Findings
Final remarks
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call