Abstract

The Intermediate Data Structure (IDS) provides a standard format for storing and sharing individual-level longitudinal life-course data (Alter and Mandemakers 2014; Alter, Mandemakers and Gutmann 2009). Once the data are in the IDS format, a standard set of programs can be used to extract data for analysis, facilitating the analysis of data across multiple databases. Currently, life-course databases store information in a variety of formats, and the process of translating data into IDS can be long and tedious. The IDS Transposer is a software tool that automates this process for source data in any format, allowing database administrators to specify how their datasets are to be represented in IDS. This article describes how the IDS Transposer works, first by going through an example step-by-step, and then by discussing each part of the process and potential options and exceptions in detail.

Highlights

  • It is important to note at the outset that there is no single correct way to represent a given dataset in Intermediate Data Structure (IDS). The process of translating a dataset from its native format into the IDS standard involves numerous decisions about how the particular dataset will be represented in IDS.

  • The IDS Transposer retains all of the flexibility inherent in the IDS standard; it allows the user to specify exactly how a given dataset should be represented in IDS

  • The IDS Transposer is a powerful tool for moving data into the IDS standard

Summary

INTRODUCTION

It is important to note at the outset that there is no single correct way to represent a given dataset in IDS. The process of translating a dataset from its native format into the IDS standard involves numerous decisions about how the particular dataset will be represented in IDS. The IDS Transposer takes two types of files, all in .csv (comma-separated values) or tab-delimited format: the input data files, which are created from but not necessarily identical to the tables of the original dataset; and two mapping files, titled ENTITY and RELATIONSHIP, which the user prepares to indicate how each element of the input data files is to be represented in the resulting IDS dataset. Following the IDS standard, file names, dataset names, and table names are in all uppercase (e.g. INDIVIDUAL), and field (column) names are in title case (e.g. Type). Table names and fields/columns in IDS tables are in bold (e.g. Id_D). For this example, we will use a synthetic dataset titled FAMREC, which was created to resemble datasets produced through family reconstitution. Jane Wang and Ashok Bhargav are the authors of the IDS Transposer software.
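The input format and naming conventions described above can be illustrated with a short Python sketch. It is not part of the IDS Transposer: the helper names (`read_delimited`, `is_ids_table_name`, `is_ids_field_name`) and the sample columns are hypothetical, chosen only to show how a tab-delimited input file might be read and how the uppercase and title-case conventions could be checked.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse comma- or tab-delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def is_ids_table_name(name):
    """IDS table names are written in all uppercase, e.g. INDIVIDUAL."""
    return name.isupper()


def is_ids_field_name(name):
    """IDS field names are written in title case, e.g. Type or Id_D."""
    return name.istitle()


# Hypothetical input rows: these columns are illustrative only and are
# not the actual layout of the FAMREC dataset.
sample = "person_id\tbirth_date\n1\t1850-03-02\n2\t1852-11-17\n"
rows = read_delimited(sample, delimiter="\t")

for row in rows:
    print(row["person_id"], row["birth_date"])

print(is_ids_table_name("INDIVIDUAL"), is_ids_field_name("Id_D"))
```

Passing `delimiter=","` instead reads a .csv file with the same function, so both of the input formats mentioned above are handled by one reader.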

DESIRED IDS OUTPUT
DATA PREPARATION
MAPPING FILES
ENTITY MAPPING FILE
RELATIONSHIP MAPPING FILE
IDS TRANSPOSER
AN ALTERNATIVE DATA STRUCTURE
CONCLUSION