Abstract

One practical challenge in observational studies and quasi-experimental designs is selection bias, which becomes more concerning when data are non-normal and contain missing values. Recently, a Bayesian robust two-stage causal modeling approach with instrumental variables was developed; it addresses selection bias and handles non-normal data and missing data simultaneously in one model. The method provides reliable parameter and standard error estimates when missing data and outliers exist. The modeling technique can be widely applied in empirical studies, particularly in social, psychological, and behavioral areas where any of the three issues (i.e., selection bias, data with outliers, and missing data) is commonly seen. To implement this method, we developed an R package named ALMOND (Analysis of LATE (Local Average Treatment Effect) for Missing Or/and Nonnormal Data). Package users have the flexibility to directly apply the Bayesian robust two-stage causal models or write their own Bayesian models from scratch within the package. To facilitate the application of the Bayesian robust two-stage causal modeling technique, we provide a tutorial for the ALMOND package in this article and illustrate the application with two examples from empirical research.

Highlights

  • Specialty section: This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology

  • Incorporating instrumental variables (InsVs) in the analytic model is a frequently used and effective way to separate two sources of variation in the treatment: the variation that is associated with the causal outcome and the variation that is associated with the model residuals

Summary

BAYESIAN ROBUST TWO-STAGE CAUSAL MODELING WITH MISSING DATA

A two-stage modeling procedure is used to incorporate InsVs. Let Xi and Yi be the treatment and the outcome for individual i (i = 1, . . . , N), respectively, and Zi = (Zi1, . . . , ZiJ)′ be a vector of J InsVs. A Bayesian robust two-stage causal modeling approach has recently been proposed (Shi and Tong, accepted). The framework has two general types of linear models, which accommodate continuous and categorical (i.e., dichotomous) treatment variables, respectively. When the outcome data are not normally distributed, the traditional two-stage causal models can be extended to robust models by assuming that the error term in stage two follows a Student’s t distribution. To handle missing data, a probit link function is added to the second stage of the model so that the missingness in the outcome variable is explained by the link function. Selection models can be added to the traditional two-stage model and to the robust model using t distributions, denoted as the cont-normal-selection model (M5) and the cont-robust-selection model (M6), respectively.
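
To make the two-stage idea concrete, the following is a minimal sketch of the classical (frequentist) two-stage least squares procedure with a single instrument and a continuous treatment. It is not the Bayesian robust model described above and it does not use the ALMOND package or its API: there are no t-distributed errors, no probit selection model, and no Gibbs sampling. The function names (`ols_simple`, `two_stage_least_squares`) and the toy data are assumptions introduced only for illustration.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """Classical 2SLS with one instrument z, treatment x, and outcome y."""
    # Stage 1: regress the treatment on the instrument and keep fitted values.
    a1, b1 = ols_simple(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted (instrument-driven) treatment.
    a2, b2 = ols_simple(x_hat, y)
    return {"stage1": (a1, b1), "stage2": (a2, b2)}


# Toy data (made up): u is an unobserved confounder that affects both x and y
# and is constructed to be exactly uncorrelated with the instrument z.
z = [0, 1, 0, 1, 0, 1, 0, 1]
u = [1, 1, -1, -1, 2, 2, -2, -2]
x = [1.0 + 2.0 * zi + ui for zi, ui in zip(z, u)]
y = [0.5 + 3.0 * xi + 4.0 * ui for xi, ui in zip(x, u)]

fit = two_stage_least_squares(z, x, y)
naive_slope = ols_simple(x, y)[1]
print("2SLS treatment effect:", round(fit["stage2"][1], 6))
print("Naive OLS slope:", round(naive_slope, 6))
```

In this toy setup the true treatment effect is 3. The two-stage estimate recovers it because only the part of the treatment that is driven by the instrument enters stage two, whereas the naive regression of the outcome on the treatment is pulled upward (to roughly 5.86) by the confounder.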

OVERVIEW OF THE ALMOND PACKAGE
Components of the Package
Gibbs Sampling Algorithm for the Computation
Example 1–Early Childhood Reading
Example 2–Public Housing Voucher Program
DATA AVAILABILITY STATEMENT