Abstract

Estimating fixed effects models with rare events data can be challenging. Researchers often face difficult trade-offs when choosing among the Linear Probability Model (LPM), logistic regression with group intercepts, and the conditional logit. In this paper, I survey these trade-offs and argue that the LPM with fixed effects produces more accurate estimates and predicted probabilities than maximum likelihood specifications when fewer than 25 percent of the observations on the dependent variable are ones. I use Monte Carlo simulations to show when the LPM with fixed effects should be preferred. I perform these simulations on time-series cross-sectional (TSCS) data structures common in the literature as well as on big data. This paper clarifies the choice among fixed effects models for TSCS data and offers a novel technique for identifying which one to use as a function of the frequency of events in y.
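As a rough illustration of the kind of Monte Carlo exercise described above, the sketch below simulates a rare-event panel from a logistic data-generating process with group intercepts and fits the LPM with fixed effects through the within (group-demeaning) transformation. It is not the paper's simulation design: the function names, the panel dimensions, and the parameter values (`beta`, `base`, the spread of the group intercepts) are all assumptions chosen for illustration, and the maximum likelihood comparators are omitted.

```python
import numpy as np


def simulate_panel(n_groups=200, n_periods=20, beta=0.5, base=-3.0, seed=0):
    """Simulate a binary panel from a logit DGP (all values are illustrative)."""
    rng = np.random.default_rng(seed)
    groups = np.repeat(np.arange(n_groups), n_periods)
    alpha = rng.normal(0.0, 0.5, n_groups)  # hypothetical group intercepts
    # x is correlated with the group intercepts, so pooled OLS would be biased.
    x = rng.normal(size=n_groups * n_periods) + alpha[groups]
    p = 1.0 / (1.0 + np.exp(-(base + alpha[groups] + beta * x)))
    y = rng.binomial(1, p)
    return groups, x, y, p


def lpm_within(groups, x, y):
    """LPM slope with group fixed effects via the within transformation."""
    n_groups = groups.max() + 1
    counts = np.bincount(groups, minlength=n_groups)
    x_mean = np.bincount(groups, weights=x, minlength=n_groups) / counts
    y_mean = np.bincount(groups, weights=y, minlength=n_groups) / counts
    x_dm = x - x_mean[groups]  # demean x within each group
    y_dm = y - y_mean[groups]  # demean y within each group
    return float(x_dm @ y_dm / (x_dm @ x_dm))


def monte_carlo(n_reps=50, beta=0.5):
    """Average the LPM slope, the true average marginal effect, and the event rate."""
    slopes, targets, rates = [], [], []
    for rep in range(n_reps):
        groups, x, y, p = simulate_panel(beta=beta, seed=rep)
        slopes.append(lpm_within(groups, x, y))
        # True average marginal effect of x under the logit DGP: beta * p * (1 - p).
        targets.append(np.mean(beta * p * (1.0 - p)))
        rates.append(y.mean())
    return float(np.mean(slopes)), float(np.mean(targets)), float(np.mean(rates))


if __name__ == "__main__":
    slope, target, rate = monte_carlo()
    print(f"event rate: {rate:.3f}")
    print(f"mean LPM-FE slope: {slope:.4f}")
    print(f"mean true average marginal effect: {target:.4f}")
```

Comparing the mean LPM slope with the true average marginal effect is one possible accuracy criterion. A fuller comparison along the lines of the abstract would add a logit with group intercepts and a conditional logit fitted to the same draws, and would vary `base` to move the share of ones in y.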
