Abstract

Open and reproducible research is receiving increasing attention in the research community. Whereas empirical research may benefit from research data centres or scientific use files that allow data to be used in a safe environment or via remote access, methodological research suffers from a lack of adequate data sources. In the economic and social sciences, an additional drawback results from the presence of complex survey designs in the data-generating process, which have to be considered when developing and applying estimators. In the present paper, we present a synthetic but realistic dataset based on social science data that fosters the evaluation and development of estimators in the social sciences. The focus is on supporting comparable and reproducible research in a realistic framework providing individual and household data. The outcome is provided as an open research data resource.

Highlights

  • Statistical applications using individual and household data are generally split into two areas: design-based and model-based inference

  • This paper presents the AMELIA dataset, which provides a realistic framework for open and reproducible research based on EU-SILC data

  • In addition to mimicking the original distribution to provide one dataset, we aim to provide a realistic dataset that is safe in terms of anonymity but supports methodological research in economic and social sciences and social statistics considering the items above

Introduction

Statistical applications using individual and household data are generally split into two areas: design-based and model-based inference. Official statistics is mainly interested in parameters of a finite population, such as totals, means, and proportions. The appropriate concept of inference is design-based, that is, inference with respect to the underlying sampling process. Empirical researchers using household- and individual-level data are mainly interested in statistical models, which rely on model-based inference. Kalton (2002) stresses the importance of design-based inference and points out that newer model-based methods, such as imputation for handling missing values and small area statistics, make it necessary to consider both types of inference. Design- and model-based inference use the same data, which in the social sciences and humanities mainly stem from complex samples. In Europe, the major data source is the European Union Statistics on Income and Living Conditions (EU-SILC, http://ec.europa.eu/eurostat/web/microdata/european-union-statistics-on-income-and-living-conditions).
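To make the design-based view concrete, the following minimal Python sketch estimates a finite-population total and mean from a sample with unequal inclusion probabilities. It is only an illustration under stated assumptions: the Horvitz-Thompson total and the weighted (Hájek-type) mean are standard textbook estimators chosen here for illustration, not estimators prescribed by the paper, and the function names, incomes, and inclusion probabilities are hypothetical toy values rather than AMELIA or EU-SILC data.

```python
from typing import Sequence


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Design-based estimate of a population total: sum of y_i / pi_i."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(not 0.0 < p <= 1.0 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(yi / p for yi, p in zip(y, pi))


def hajek_mean(y: Sequence[float], pi: Sequence[float]) -> float:
    """Weighted mean using design weights w_i = 1 / pi_i."""
    weights = [1.0 / p for p in pi]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Hypothetical toy sample: three incomes, the last unit sampled at a lower rate.
incomes = [20000.0, 30000.0, 50000.0]
inclusion_probs = [0.5, 0.5, 0.25]

ht_total = horvitz_thompson_total(incomes, inclusion_probs)
weighted_mean = hajek_mean(incomes, inclusion_probs)
unweighted_mean = sum(incomes) / len(incomes)  # ignores the sampling design

print(ht_total, weighted_mean, unweighted_mean)
```

In this toy sample the unit with the highest income was sampled at a lower rate, so the design-weighted mean is larger than the unweighted one. Differences of this kind are why the sampling design has to be considered when estimators are developed and evaluated.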
