Design and implementation of a generalized laboratory data model.

Michael C Wendl, Asif T Chinwalla, Todd Hepler, Scott Smith, Kevin Crouse, Benjamin J Oberkfell, Craig S Pohl, Mike Nhan, Lynn Carmichael, Shin Leong, David J Dooling, Richard K Wilson, Elaine R Mardis, Ladeana W Hillier

doi:10.1186/1471-2105-8-362

Open Access

https://doi.org/10.1186/1471-2105-8-362

Journal: BMC Bioinformatics
Publication Date: Sep 26, 2007
Citations: 31
License type: CC BY 2.0

Affiliation: Washington University (Genome Sequencing Center)

Abstract

Background

Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial and it appears that in many cases a database project ultimately proves unserviceable.

Results

We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions.

Conclusion

The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments.
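
As a rough illustration of the ideas summarized in the abstract, the following minimal sketch keeps regular entities and events in separate tables, links them through a many-to-many table, and stores a processing directive as data (an ordered list of event types) so that a new pipeline can be added without changing the schema. It is not the authors' implementation: the use of SQLite, all table and column names, the helper functions, and the step names ("pick", "prep", "sequence") are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
SCHEMA = """
CREATE TABLE entity (id INTEGER PRIMARY KEY, kind TEXT NOT NULL, name TEXT NOT NULL);
CREATE TABLE event_type (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE event (
    id INTEGER PRIMARY KEY,
    event_type_id INTEGER NOT NULL REFERENCES event_type(id),
    status TEXT NOT NULL
);
-- many-to-many association between regular entities and events
CREATE TABLE entity_event (
    entity_id INTEGER NOT NULL REFERENCES entity(id),
    event_id INTEGER NOT NULL REFERENCES event(id),
    PRIMARY KEY (entity_id, event_id)
);
-- a directive is stored as rows: an ordered sequence of event types
CREATE TABLE directive_step (
    directive TEXT NOT NULL,
    position INTEGER NOT NULL,
    event_type_id INTEGER NOT NULL REFERENCES event_type(id),
    PRIMARY KEY (directive, position)
);
"""


def make_db():
    """Create an in-memory database holding the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def _event_type_id(conn, name):
    row = conn.execute("SELECT id FROM event_type WHERE name = ?", (name,)).fetchone()
    return row[0]


def define_directive(conn, directive, step_names):
    """Register a pipeline as ordered event types; no schema change is needed."""
    for position, name in enumerate(step_names):
        conn.execute("INSERT OR IGNORE INTO event_type (name) VALUES (?)", (name,))
        conn.execute(
            "INSERT INTO directive_step VALUES (?, ?, ?)",
            (directive, position, _event_type_id(conn, name)),
        )


def record_event(conn, type_name, entity_ids, status="completed"):
    """Insert one event and associate it with any number of entities."""
    cur = conn.execute(
        "INSERT INTO event (event_type_id, status) VALUES (?, ?)",
        (_event_type_id(conn, type_name), status),
    )
    event_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO entity_event VALUES (?, ?)",
        [(entity_id, event_id) for entity_id in entity_ids],
    )
    return event_id


def next_step(conn, directive, entity_id):
    """Return the first step of the directive with no completed event for the entity."""
    row = conn.execute(
        """
        SELECT t.name
        FROM directive_step s
        JOIN event_type t ON t.id = s.event_type_id
        WHERE s.directive = ?
          AND NOT EXISTS (
              SELECT 1
              FROM entity_event ee
              JOIN event e ON e.id = ee.event_id
              WHERE ee.entity_id = ?
                AND e.event_type_id = s.event_type_id
                AND e.status = 'completed'
          )
        ORDER BY s.position
        LIMIT 1
        """,
        (directive, entity_id),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_db()
    define_directive(conn, "shotgun", ["pick", "prep", "sequence"])
    plate = conn.execute(
        "INSERT INTO entity (kind, name) VALUES ('plate', 'P001')"
    ).lastrowid
    record_event(conn, "pick", [plate])
    print(next_step(conn, "shotgun", plate))  # prints "prep"
```

In this sketch a directive acts as a very simple project manager: asking for the next step of an item only consults the event history and the directive rows, so reordering or adding steps is a data change rather than a code or schema change.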

Highlights

Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data.
Processing volumes continue to grow, processes change almost fluidly, and evolving research directions dictate increasing degrees of heterogeneity in the data. The latter point is well illustrated by the trend toward maintaining traditional genomic DNA sequencing pipelines and medical/patient sequencing pipelines simultaneously.
These factors place enormous demands on a data-tracking system, and it is only a slight exaggeration to say that an inferior laboratory information management system (LIMS) can threaten a lab's very viability.

Summary

Introduction

Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. The methods themselves evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. In a number of cases, the rate at which data can be generated has increased by several orders of magnitude. This scale-up has contributed to the rise of "big biology" projects of the type that could not have been realistically undertaken only a generation ago, e.g., the Human Genome Project [1]. Such dramatic expansions in throughput have largely been enabled by engineering innovation, e.g., hardware advancements and automation. Biologists have steadily been adopting the automated and flexible manufacturing paradigms already established in industry to increase production, as well as to reduce costs and errors.

