Abstract

The person-centered approach in categorical data analysis is introduced as a complement to the variable-centered approach. The former groups persons, animals, or objects on the basis of their combination of characteristics, which can be displayed in multiway contingency tables. Configural Frequency Analysis (CFA) and log-linear modeling (LLM) are the two most prominent (and related) statistical methods. Both compare observed frequencies ($f_{o_{i \ldots k}}$) with expected frequencies ($f_{e_{i \ldots k}}$). While LLM primarily uses a model-fitting approach, CFA analyzes the residuals of non-fitting models. Residuals with significantly more observed than expected frequencies ($f_{o_{i \ldots k}} > f_{e_{i \ldots k}}$) are called types, while residuals with significantly fewer observed than expected frequencies ($f_{o_{i \ldots k}} < f_{e_{i \ldots k}}$) are called antitypes. The R package confreq is presented and its use is demonstrated with several data examples. Results of contingency table analyses can be displayed in tables, but also in graphics that represent the size and type of each residual. The expected frequencies represent the null hypothesis, and different null hypotheses result in different expected frequencies. Different kinds of CFA are presented: the first-order CFA based on the null hypothesis of independence, CFA with covariates, and the two-sample CFA. The calculation of the expected frequencies can be controlled through the design matrix, which can be easily handled in confreq.
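
The comparison of observed and expected frequencies described above can be sketched in a few lines. The following is a minimal illustration of a first-order CFA under stated assumptions, not the confreq implementation: the function name `first_order_cfa`, the toy counts, the standardized residual z = (f_o - f_e) / sqrt(f_e) as the test statistic, and the Bonferroni adjustment of alpha are all choices made here for illustration. Expected frequencies are derived from the marginal totals only, which corresponds to the null hypothesis of independence.

```python
from itertools import product
from math import prod, sqrt
from statistics import NormalDist


def first_order_cfa(observed, alpha=0.05):
    """Sketch of a first-order CFA (hypothetical helper, not the confreq API).

    observed maps each configuration (a tuple of category codes, one per
    variable) to its observed frequency f_o.
    """
    n = sum(observed.values())
    n_vars = len(next(iter(observed)))

    # Marginal totals for every category of every variable.
    margins = [{} for _ in range(n_vars)]
    for config, f_o in observed.items():
        for v, cat in enumerate(config):
            margins[v][cat] = margins[v].get(cat, 0) + f_o

    # All possible configurations, including those never observed.
    cells = list(product(*(sorted(m) for m in margins)))

    # Assumption: two-sided normal test with Bonferroni-adjusted alpha.
    z_crit = NormalDist().inv_cdf(1 - (alpha / len(cells)) / 2)

    results = {}
    for config in cells:
        f_o = observed.get(config, 0)
        # Independence model: N times the product of marginal proportions.
        f_e = n * prod(margins[v][cat] / n for v, cat in enumerate(config))
        z = (f_o - f_e) / sqrt(f_e)
        if z > z_crit:
            label = "type"        # significantly more cases than expected
        elif z < -z_crit:
            label = "antitype"    # significantly fewer cases than expected
        else:
            label = "-"
        results[config] = (f_o, f_e, z, label)
    return results


# Toy 2 x 2 table with made-up counts, for illustration only.
toy_table = {(1, 1): 40, (1, 2): 10, (2, 1): 10, (2, 2): 40}
cfa_result = first_order_cfa(toy_table)
for config, (f_o, f_e, z, label) in cfa_result.items():
    print(config, f_o, round(f_e, 2), round(z, 2), label)
```

In this toy table every expected frequency is 25, so the two diagonal configurations come out as types and the two off-diagonal configurations as antitypes. A different null hypothesis would change only the line that computes `f_e`.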

Highlights

  • Data that include categorical variables are often seen in the social sciences and psychological research

  • The term categorical variables typically refers to variables that, according to Stevens's [1] influential taxonomy of scale levels, have at least a nominal or ordinal scale level

  • Although Stevens's taxonomy was criticized almost as soon as it was introduced (see, e.g., [2], but see [3]) and can be regarded as the initial spark for a controversy about scale levels and the measurement of social science variables as such (e.g., [3,4,5,6,7]), it can at least provide a useful heuristic for the practice of data analysis

Summary

Introduction

Data that include categorical variables are often seen in the social sciences and psychological research. Although Stevens's taxonomy was criticized almost as soon as it was introduced (see, e.g., [2], but see [3]) and can be regarded as the initial spark for a (still ongoing) controversy about scale levels and the measurement of social science variables as such (e.g., [3,4,5,6,7]), it can at least provide a useful heuristic for the practice of data analysis. From such a practice perspective, the term categorical variables can be used to characterize variables that comprise few distinct trait expressions or attributes that result from the classification of any type of observation into “one of a set of mutually exclusive and collectively exhaustive categories” [8] p. […] Analysis (CFA), and provides a link to the R package vcd [14,15] for the visualization of cross-tabulated categorical data.

A Person-Centered Perspective on Data
Introduction to the confreq Framework in R
Working with confreq
A First Look on a Classical Data Example
The CFA Main Effect Model of Independency
Modifying the CFA-Model Design Matrices
Introducing Covariates into the CFA-Model
Comparing Pattern Frequencies for Two Samples with CFA
Summary and Conclusions
