Abstract

In his Perspective “Ensuring the data-rich future of the social sciences” (special section on Dealing with Data, 11 February, p. [719][1]), G. King discusses the potential to transform the future of social science using existing “huge quantities of digital information about people.” As King states, the business sector has proven that powerful informatics applied to data about people can support better decision-making and foster innovation. The same could be done to tackle difficult social problems once we figure out how to assemble and share large data sets while protecting privacy. King suggests building data archives to facilitate data reuse. Large data archives already exist for survey data ([1][2]–[5][3]), but little attention has been given to government administrative data, which are a key source of information about our society. From the day we are born until we die, most of our activities leave traces in various government databases. Indeed, these government information systems continuously generate data about all aspects of our society, much like the satellites we use to monitor our physical surroundings. Unfortunately, most administrative data are left to languish in legacy databases after their original use. We must invest in efforts to build a systematic pipeline to extract data from these systems for secondary analysis. A well-integrated federated data system of administrative databases updated on an ongoing basis could hold a collective representation of our society. Designed, built, and properly managed by experts under appropriate protocols and oversight, such a data infrastructure could transform the social sciences, just as well-maintained satellite data have transformed astronomy, while still protecting individual privacy. A secure data infrastructure is critical for government decision support systems, transparency and accountability in government, and ultimately, computational social science research.
The Census Bureau ([5][3]) and others ([6][4], [7][5]) have already demonstrated the enormous potential of federated administrative data systems.

1. The Centers for Disease Control and Prevention (CDC) National Center for Health Statistics Research Data Center ([www.cdc.gov/rdc/][7]).
2. The Dataverse Network Project ( ).
3. NORC Data Enclave ([www.norc.uchicago.edu/DataEnclave/default.htm][8]).
4. Odum Institute Dataverse Network Project ( ).
5. U.S. Census Bureau Longitudinal Employer Household Dynamics (LEHD) ( ).
6. U.S. Environmental Protection Agency, Central Data Exchange ([www.epa.gov/cdx/index.htm][11]).
7. H. C. Kum, D. F. Duncan, C. J. Stewart, Gov. Inform. Q. 26, 295 (2009). doi:10.1016/j.giq.2008.12.009

[1]: /lookup/doi/10.1126/science.1197872
[2]: #ref-1
[3]: #ref-5
[4]: #ref-6
[5]: #ref-7
[7]: http://www.cdc.gov/rdc/
[8]: http://www.norc.uchicago.edu/DataEnclave/default.htm
[11]: http://www.epa.gov/cdx/index.htm
