VarioML framework for comprehensive variation data representation and exchange

Myles Byrne, Juha Muilu, Raymond Dalgleish, Anni Ahonen-Bishopp, Anthony J Brookes, Andrew Devereau, Ivo FAC Fokkema, Peter EM Taschner, David Atlan, George P Patrinos, Gudmundur A Thorisson, Morris A Swertz, Michael Cornell, Mauno Vihinen, Tomasz Adamusiak, Owen Lancaster, Christophe Béroud

doi:10.1186/1471-2105-13-254

Abstract

Background: Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Complex standards have proven too time-consuming to implement.

Results: The GEN2PHEN project addressed these difficulties by developing a comprehensive data model for capturing biomedical observations, Observ-OM, and building the VarioML format around it. VarioML pairs a simplified open specification for describing variants with a toolkit for adapting the specification into one's own research workflow. Straightforward variant data can be captured, federated, and exchanged with no overhead; more complex data can be described without loss of compatibility. The open specification enables push-button submission to gene variant databases (LSDBs), e.g., the Leiden Open Variation Database, using the Cafe Variome data publishing service, while VarioML bidirectionally transforms data between XML and web-application code formats, opening up new possibilities for open source web applications building on shared data. A Java implementation toolkit makes VarioML easily integrated into biomedical applications. VarioML is designed primarily for LSDB data submission and transfer scenarios, but can also be used as a standard variation data format for JSON and XML document databases and user interface components.

Conclusions: VarioML is a set of tools and practices improving the availability, quality, and comprehensibility of human variation information. It enables researchers, diagnostic laboratories, and clinics to share that information with ease, clarity, and without ambiguity.

Highlights

Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult
Elements can be extended by adding new schema elements: Phenotype is an example of an observation element which reuses properties from the ontology term element
JSON is the common data serialization format recognized as the lingua franca for data exchange over the web, proven to be faster and consume fewer resources than XML [61]
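
The XML and JSON interchange mentioned in the highlights can be illustrated with a minimal sketch. It uses only the Python standard library and is not the VarioML toolkit (which the abstract describes as a Java implementation). The element and attribute names (`variant`, `name`, `phenotype`, `scheme`, `term`, `accession`) and the sample values are hypothetical placeholders, not the actual VarioML schema. The helper functions `element_to_dict` and `dict_to_element` are likewise assumptions made for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record: tag names, attribute names and values are
# illustrative placeholders, NOT the real VarioML schema.
SAMPLE_XML = """
<variant>
  <name scheme="HGVS">c.123A&gt;G</name>
  <phenotype term="example phenotype" accession="EX:0000001"/>
</variant>
"""


def element_to_dict(elem):
    """Convert an XML element into a plain dict.

    Attributes become string entries, non-blank text is stored under
    the key "text", and child elements are grouped into lists by tag.
    Limitation of this sketch: an attribute called "text", or an
    attribute sharing a name with a child tag, would collide.
    """
    node = dict(elem.attrib)
    text = (elem.text or "").strip()
    if text:
        node["text"] = text
    for child in elem:
        node.setdefault(child.tag, []).append(element_to_dict(child))
    return node


def dict_to_element(tag, node):
    """Rebuild an XML element from the dict layout used above."""
    elem = ET.Element(tag)
    for key, value in node.items():
        if key == "text":
            elem.text = value
        elif isinstance(value, list):
            # Lists hold child elements that share the tag `key`.
            for child in value:
                elem.append(dict_to_element(key, child))
        else:
            elem.set(key, value)
    return elem


root = ET.fromstring(SAMPLE_XML)
as_dict = element_to_dict(root)

# XML -> JSON text
as_json = json.dumps({root.tag: as_dict}, indent=2)
print(as_json)

# JSON text -> XML
restored = json.loads(as_json)[root.tag]
rebuilt = dict_to_element(root.tag, restored)
print(ET.tostring(rebuilt, encoding="unicode"))
```

In this layout the phenotype is simply another child element carrying ontology-term style properties, which loosely mirrors the extension idea in the highlights. Because children are grouped by tag, the relative order of differently named siblings is not preserved; a real exchange format would need explicit rules for ordering and for name collisions.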

Summary

Introduction

Sharing of data about variation and the associated phenotypes is a critical need, yet variant information can be arbitrarily complex, making a single standard vocabulary elusive and re-formatting difficult. Cost-effective sequencing, paired with variant discovery, promises to make early detection and intervention accessible for the millions of individuals with genetic diseases. Realizing this potential is blocked by the problem of integrating and coordinating the steps towards “a pipeline leading from discovery to delivery” [4]. The GEN2PHEN project was initiated in 2008 to unify human and model organism genetic variation databases, and remove the obstacles to translation of variant data from laboratory to clinic to public [5]. This has involved attempting to unify the divergent data representations of various database communities



Journal: BMC Bioinformatics	Publication Date: Oct 3, 2012
Citations: 35	License type: cc-by
