Abstract Ecological inference methods estimate the unknown inner cells of two-way contingency tables by inferring conditional probability distributions. This is one of the most long-standing problems in the social sciences, and it arises frequently in political science and sociology. To solve it, ecological inference algorithms assume an asymmetric relationship, in which a main characteristic (e.g. race or social class) mapped to rows influences a dependent variable, usually the vote, mapped to columns. A difficulty arises, however, because these models are asymmetric: different solutions are reached depending on which variable is assigned to rows and which to columns. In this paper, we propose two new sets of ecological inference algorithms and explore whether accuracy can be improved by handling the problem symmetrically. We assess the accuracy of the proposed methods using real data from more than 550 concurrent elections where the true district-level cross-classifications of votes (straight- and split-tickets) are known. Our empirical assessment clearly identifies the symmetric solutions as more accurate. They outperform asymmetric methods 90% of the time and reduce error, on average, by 11%. Our results are based on data from simultaneous elections, so further research is required to determine whether our conclusions hold in other ecological inference contexts. The proposed methods are implemented in the R package lphom and are readily available to interested readers.