Abstract

Scholars estimating policy positions from political texts typically code words or sentences and then build left‐right policy scales based on the relative frequencies of text units coded into different categories. Here we reexamine such scales and propose a theoretically and linguistically superior alternative based on the logarithm of odds‐ratios. We contrast this scale with the current approach of the Comparative Manifesto Project (CMP), showing that our proposed logit scale avoids widely acknowledged flaws in previous approaches. We validate the new scale using independent expert surveys. Using existing CMP data, we show how to estimate more distinct policy dimensions, for more years, than has been possible before, and make this dataset publicly available. Finally, we draw some conclusions about the future design of coding schemes for political texts.

Highlights

  • Estimates of legislators’ ideal points that are distinct from aggregate policy stances of the parties to which they belong

  • Text is a direct by-product of political activity by the political actors whose positions we wish to estimate, whether this text takes the form of speeches, debates, written submissions, written rulings, or—by far the most commonly used in the profession for estimating party policy positions—election manifestos issued by political parties

  • The wide availability of these materials in electronic form has led to a large number of automated and semiautomated methods for scaling positions from political texts based on the statistical analysis of word patterns (e.g., Bara, Weale, and Biquelet 2007; Benoit and Laver 2003; Hilliard, Purpura, and Wilkerson 2007; Hopkins and King 2010; Klemmensen, Hobolt, and Hansen 2007; Laver and Garry 2000; Lowe 2008; Martin and Vanberg 2007; Monroe and Maeda 2004; Pennings and Keman 2002; Quinn et al. 2010; Slapin and Proksch 2008; Yu, Kaufmann, and Diermeier 2008)

Summary

Scaling Policy Preferences from Coded Political Texts

Scholars estimating policy positions from political texts typically code words or sentences and build left-right policy scales based on the relative frequencies of text units coded into different categories. CMP data form the basis for hundreds of published studies by third-party authors and are almost always used to estimate policy positions for political parties on left-right scales. Almost everyone using CMP data does so for the same reason: they want to estimate positions of parties on different common policy dimensions. Doing this typically implies assuming that a set of party positions, whether a cross-section or a time series, can be located on some (continuously defined) metric scale. Spatial theories of policy preferences typically assume that party positions exist on a continuous scale, usually an interval scale, whereas content coding schemes such as the CMP record only absolute and relative category counts of discrete text units. By justifying and demonstrating what types of coding categories are best compared to create continuous scales, our findings provide direct lessons for the future design of improved political text coding schemes.
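To make the contrast between a frequency-based scale and a log odds-ratio scale concrete, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the authors' code: the function names, the example counts, and the smoothing constant of 0.5 added to each count (to keep the logarithm finite when a category has no mentions) are all assumptions, since this summary does not give the exact formulas. The frequency-based scale is written here as (R - L) / N, where R and L are counts of text units coded right and left and N is the total number of coded units.

```python
import math


def relative_frequency_scale(right: int, left: int, total: int) -> float:
    """Frequency-based position: (R - L) / N.

    `total` is the number of all coded text units, so the result also
    depends on how much of the text falls outside the two poles.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (right - left) / total


def logit_scale(right: int, left: int, smoothing: float = 0.5) -> float:
    """Log odds-ratio position: log((R + s) / (L + s)).

    The smoothing constant `s` is an assumption of this sketch; it keeps
    the value finite when either count is zero.
    """
    return math.log((right + smoothing) / (left + smoothing))


# Hypothetical counts for two manifestos of equal length. Both have the
# same 2:1 balance of right to left mentions, but manifesto B devotes
# twice as much of its text to the dimension.
manifesto_a = {"right": 20, "left": 10, "total": 100}
manifesto_b = {"right": 40, "left": 20, "total": 100}

for name, m in (("A", manifesto_a), ("B", manifesto_b)):
    freq = relative_frequency_scale(m["right"], m["left"], m["total"])
    logit = logit_scale(m["right"], m["left"])
    print(f"manifesto {name}: frequency scale = {freq:.3f}, logit scale = {logit:.3f}")
```

In this sketch the frequency-based score doubles from manifesto A to manifesto B even though the balance of right and left mentions is unchanged, while the logit score barely moves. The logit score is also symmetric: swapping the left and right counts only flips its sign.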

How Should Policy Mentions Be Counted?
Previous Approaches to Scaling Policy Measures
Previous Scaling Procedures
The Logit Scale of Position
Estimating Scale Uncertainty
New Policy Scales
CMP Category
Comparing Scales
Environmental Protection State Involvement in Economy
Comparisons to Expert Surveys of Policy
Findings
Relative Proportional Scale