Abstract
The choice of the number m of response categories is a crucial issue when a continuous response is categorized. The paper exploits the property of Proportional Odds Models that allows ordinal responses with different numbers of categories to be generated from the same underlying variable. It investigates how the asymptotic efficiency of the estimators of the regression coefficients and the accuracy of the derived inferential procedures change as m varies. The analysis is based on models with closed-form information matrices, so the asymptotic efficiency can be evaluated analytically without the need for simulations. The paper proves that a finer categorization increases the information content of the data, and consequently shows that the asymptotic efficiency and the power of the tests on the regression coefficients increase with m. The impact on the efficiency of the estimators of the information lost by merging categories is also considered, highlighting the risks of merging, especially in its extreme form of dichotomization. Furthermore, the appropriate value of m for various sample sizes is explored, pointing out that a large number of categories can offset the limited information in a small sample through better data quality. Finally, two case studies, one on the quality of life of chemotherapy patients and one on the perception of pain, both based on discretized continuous scales, illustrate the main findings of the paper.
Highlights
The analysis is based on models with closed-form information matrices, so the asymptotic efficiency can be evaluated analytically without the need for simulations.
The impact on the efficiency of the estimators of the information lost by merging categories is considered, highlighting the risks of merging, especially in its extreme form of dichotomization.
A critical point in surveys with rating questions is the choice of the number m of response categories to use when discretizing a measurement obtained on a continuous scale.
Summary
A critical point in surveys with rating questions is the choice of the number m of response categories to use when discretizing a measurement obtained on a continuous scale (one on which only the minimum and maximum levels are marked). The current paper investigates the impact that the choice of m has on the efficiency of the estimators when a continuous response variable is discretized and the data analysis (in Section 4) is performed through a proportional odds model (POM) [12, 13]. Another critical point in data analysis concerns the appropriate number of categories for a given sample size n [21].
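The idea of deriving ordinal responses with different m from one underlying variable can be illustrated with a minimal sketch based on the usual latent-variable reading of the POM (a linear predictor plus a standard logistic error, cut at ordered thresholds). The coefficient `beta`, the threshold values, the sample size and the helper `categorize` are illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000     # hypothetical sample size
beta = 1.0   # hypothetical regression coefficient
x = rng.normal(size=n)

# Latent continuous response: linear predictor plus standard logistic error.
y_star = beta * x + rng.logistic(size=n)

def categorize(latent, thresholds):
    """Map each latent value to an ordinal category 1..m, where m = len(thresholds) + 1."""
    return np.searchsorted(thresholds, latent, side="right") + 1

# Illustrative thresholds for m = 8; the coarser scales reuse a subset of them,
# so all three responses come from the same underlying variable.
thresholds_8 = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
thresholds_4 = thresholds_8[1::2]   # -2, 0, 2
thresholds_2 = thresholds_8[3:4]    # 0 (dichotomization)

y8 = categorize(y_star, thresholds_8)
y4 = categorize(y_star, thresholds_4)
y2 = categorize(y_star, thresholds_2)

# Merging adjacent pairs of categories reproduces the coarser response exactly.
merged_8_to_4 = (y8 + 1) // 2
merged_4_to_2 = (y4 + 1) // 2
```

Because the coarser thresholds are a subset of the finer ones, merging adjacent categories of `y8` gives `y4`, and merging again gives the dichotomized `y2`. This is the sense in which a coarser categorization discards information about the same latent response.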