Abstract

The presidential election results of 2016 surprised many poll-watchers, suggesting possible biases in estimated support for the major party candidates and posing a challenge for poll aggregation as a prediction tool. Using data from earlier elections and the 2016 campaign, we conducted an evaluation of poll aggregation and state-level error in estimating the percentage spread between the two major candidates. We find that state-level estimates of the error magnitude for the FiveThirtyEight and Upshot models were approximately correct in 2016. However, a proportional bias that we term “prediction shrinkage,” due to non-major party preference during polling, had a large impact on state-level estimates. We suggest that prediction shrinkage may be largely avoided by using log-ratios of candidate preferences instead of percentage spread. We present a statistical rationale for simulation-based assessments of election probabilities, discussing aspects that may be understood by practitioners but not fully explicated in the literature. For 2016, we fit a smoothing mixed effects model that is sensitive to both national and state-specific trends and requires data from only a single election year. The model outperformed all the major prediction site estimates. Simulations of electoral college outcomes indicate that, on the eve of the election, the probability of a Trump victory was about 50%. The results do not support the contention that the poll averages were highly biased, but suggest that standard poll aggregation techniques were poorly equipped to respond to a late change in relative support for the candidates. We suggest that an increased emphasis on fundamental statistical tradeoffs of bias and variance, prior to focusing on poll adjustments or demographic behavior, may be the key to improved prediction.
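
A small numerical illustration may help make the contrast between percentage spread and log-ratios concrete. The sketch below is not the authors' model; it assumes, purely for illustration, that non-major party or undecided poll respondents end up splitting between the two major candidates in proportion to existing support. All numbers and function names (`spread`, `log_ratio`, `reallocate`) are hypothetical.

```python
import math


def spread(d, r):
    """Percentage-point spread between two candidates."""
    return d - r


def log_ratio(d, r):
    """Log-ratio of the two candidates' support."""
    return math.log(d / r)


def reallocate(d, r, other_final):
    """Rescale d and r proportionally so that d + r + other_final == 100.

    Assumption (illustrative only): respondents leaving the 'other'
    category split between the candidates in proportion to current support.
    """
    scale = (100.0 - other_final) / (d + r)
    return d * scale, r * scale


# Hypothetical poll: 42% vs 48%, with 10% other/undecided.
poll_d, poll_r = 42.0, 48.0

# Hypothetical election day: only 4% vote for other candidates.
vote_d, vote_r = reallocate(poll_d, poll_r, other_final=4.0)

print("poll spread:", spread(poll_d, poll_r))
print("vote spread:", spread(vote_d, vote_r))
print("poll log-ratio:", log_ratio(poll_d, poll_r))
print("vote log-ratio:", log_ratio(vote_d, vote_r))
```

Under this assumption the polled spread is smaller in magnitude than the final spread, while the log-ratio is unchanged, because a common scale factor cancels in the ratio. Real reallocation need not be proportional, so this shows only the direction of the effect described in the abstract.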
