Challenges of Large-Scale Data Processing in the 1990s: The IPUMS Experience.

Diana L Magnuson, Steven Ruggles

doi:10.1109/mahc.2022.3214736

Abstract

When it was launched in 1991, the Integrated Public Use Microdata Series (IPUMS) project faced a challenging environment and limited resources. Few datasets were interoperable and much data collected at great public expense was inaccessible to most researchers. Documentation of datasets was nonstandardized, incomplete, and inadequate for automated processing. With insufficient attention to preservation, valuable scientific data were disappearing (see Bogue et al., 1976). IPUMS was established to address these critical issues. At the outset, IPUMS faced daunting barriers of inadequate data processing, storage, and network capacity. This anecdote describes the improvised computational infrastructure developed in the decade from 1989 to 1999 to process, manage, and disseminate the world's largest population datasets. We use a combination of archival sources, interviews, and our own memories to trace the development of the IPUMS computing environment during a period of explosive technical innovation. The development of IPUMS is part of a larger story of the development of social science infrastructure in the late 20th century and its contribution to democratizing data access.

Journal: IEEE Annals of the History of Computing
Publication Date: Oct 1, 2022
Citations: 2

Similar Papers

Interoperable and accessible census and survey data from IPUMS
Tracy A Kugler ... Catherine A Fitch
Scientific Data | VOL. 5
27 Feb 2018

IPUMS Redesign
Steven Ruggles ... Catherine A Fitch
Historical Methods: A Journal of Quantitative and Interdisciplinary History | VOL. 36
01 Jan 2003

IPUMS-CPS: An Integrated Version of the March Current Population Survey, 1962–2002
Miriam L King ... Michele Tertilt
Historical Methods: A Journal of Quantitative and Interdisciplinary History | VOL. 36
01 Jan 2003

Persistent Advantage or Disadvantage?: Evidence in Support of the Intergenerational Drag Hypothesis
William Darity ... Jason Dietrich
The American Journal of Economics and Sociology | VOL. 60
01 Apr 2001
