Abstract
We consider Poisson reduced rank models in which the parameters depend on time. Our main motivation comes from studies in comparative politics, where one wants to locate party positions in a given political space. For this purpose, several empirical methods that use text as a data source have been proposed. Because text data have a complex structure, extracting information from them is generally difficult. Furthermore, political texts usually contain a large number of words, so that a simultaneous analysis of word counts becomes challenging. In this paper, we model all word counts simultaneously with Poisson models and provide a statistical method suited to analyzing political text data. We propose a novel model that allows the political lexicon to change over time, and we develop an estimation procedure based on LASSO and fused LASSO penalization techniques that addresses high dimensionality through substantial dimension reduction. This model gives insights into the potentially changing use of words by left- and right-wing parties over time. The procedure automatically identifies the words that discriminate between party positions. To account for the dependence structure of word counts over time, we propose integer-valued time series processes and use them to implement a suitable bootstrap method for constructing confidence intervals for the model parameters. We apply our approach to manifesto data from German parties over seven federal elections after German reunification. The approach requires neither a priori information nor expert knowledge to process the data.