Abstract

We present performance data relating to the use of migration in a system we are creating to provide web access to heterogeneous document collections in legacy formats. Our goal is to enable sustained access to collections such as these when faced with increasing obsolescence of the necessary supporting applications and operating systems. Our system allows searching and browsing of the original files within their original contexts utilizing binary images of the original media. The system uses static and dynamic file migration to enhance collection browsing, and emulation to support both the use of legacy programs to access data and long-term preservation of the migration software. While we provide an overview of the architectural issues in building such a system, the focus of this paper is an in-depth analysis of file migration using data gathered from testing our software on 1,885 CD-ROMs and DVDs. These media are among the thousands of collections of social and scientific data distributed by the United States Government Printing Office (GPO) on legacy media (CD-ROM, DVD, floppy disk) under the Federal Depository Library Program (FDLP) over the past 20 years.

Highlights

  • The electronic publication of important scientific and social data and documents over the past 20 years has created a preservation crisis because the software necessary to access them is facing rapid obsolescence (Hedstrom, 2003; Rothenberg, 1999)

  • Ensuring that our grandchildren will have continued access to these data and documents requires the creation of new software systems that enable their conversion to modern renditions where feasible, and the execution of obsolete software where necessary (Department of Defense, 2007; National Archives and Records Administration [NARA], 2003)

  • We describe our efforts to create a software system to enable web access to a large heterogeneous document collection

Summary

Migration Performance for Legacy Data Access

We present performance data relating to the use of migration in a system we are creating to provide web access to heterogeneous document collections in legacy formats. The system uses static and dynamic file migration to enhance collection browsing, and emulation to support both the use of legacy programs to access data and long-term preservation of the migration software. While we provide an overview of the architectural issues in building such a system, the focus of this paper is an in-depth analysis of file migration using data gathered from testing our software on 1,885 CD-ROMs and DVDs. These media are among the thousands of collections of social and scientific data distributed by the United States Government Printing Office (GPO) on legacy media (CD-ROM, DVD, floppy disk) under the Federal Depository Library Program (FDLP) over the past 20 years.

Introduction
System Overview
Document Migration
MS PowerPoint
Migration Tools
Successful conversions
Findings
Converted Total
