Abstract

In the Harmonization Project we use data from 22 cross-national survey projects that together cover 142 countries or territories over almost 50 years (1966–2013). The large volume of the data, their multilevel structure, and the large number of source files encouraged us to develop a set of custom tools for extracting, transforming, and loading data into a common database, automating repeatable routines that would otherwise be performed manually. We created an environment based on freeware and open-source software that constitutes an alternative to the statistical packages typically used for such purposes in social science research. This platform allows us to store and manage data in a single place and, crucially, to manipulate them easily when preparing the harmonized data set for substantive analyses. This article presents our motivation for choosing a custom programming and database environment, describes the principles guiding our software choices, and outlines the stages of data processing.

