e!DAL - a framework to store, share and publish research data.

Daniel Arend, Steffen Flemming, Christian Colmsee, Jinbo Chen, Matthias Lange, Denny Hecht, Uwe Scholz

doi:10.1186/1471-2105-15-214

Open Access

https://doi.org/10.1186/1471-2105-15-214

Abstract

Background: The life-science community faces a major challenge in handling “big data”, highlighting the need for high quality infrastructures capable of sharing and publishing research data. Data preservation, analysis, and publication are the three pillars in the “big data life cycle”. The infrastructures currently available for managing and publishing data are often designed to meet domain-specific or project-specific requirements, resulting in the repeated development of proprietary solutions and lower quality data publication and preservation overall.

Results: e!DAL is a lightweight software framework for publishing and sharing research data. Its main features are version tracking, metadata management, information retrieval, registration of persistent identifiers (DOI), an embedded HTTP(S) server for public data access, access as a network file system, and a scalable storage backend. e!DAL is available as an API for local non-shared storage and as a remote API featuring distributed applications. It can be deployed “out-of-the-box” as an on-site repository.

Conclusions: e!DAL was developed based on experiences coming from decades of research data management at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK). Initially developed as a data publication and documentation infrastructure for the IPK’s role as a data center in the DataCite consortium, e!DAL has grown towards being a general data archiving and publication infrastructure. The e!DAL software has been deployed into the Maven Central Repository. Documentation and software are also available at: http://edal.ipk-gatersleben.de.

Highlights

The life-science community faces a major challenge in handling “big data”, highlighting the need for high quality infrastructures capable of sharing and publishing research data
The DataCite consortium [10] was founded to support data citation, providing a means to increase the acceptance of research data as legitimate contributions to scholarly records
To implement the data life cycle, all data and metadata updates are recorded as individual versions
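
The idea that every data or metadata update is recorded as its own version can be illustrated with a small append-only history. This is only a minimal sketch: the class names `EntityVersion` and `VersionedEntity`, their fields, and the use of a plain string map for metadata are assumptions made for illustration and are not taken from the e!DAL API.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One immutable snapshot of an entity's metadata (hypothetical type, not e!DAL's). */
final class EntityVersion {
    final int number;                    // 1-based position in the history
    final Instant created;               // when this version was recorded
    final Map<String, String> metadata;  // defensive, read-only copy

    EntityVersion(int number, Instant created, Map<String, String> metadata) {
        this.number = number;
        this.created = created;
        this.metadata = Collections.unmodifiableMap(new HashMap<>(metadata));
    }
}

/** Append-only history: an update never overwrites, it adds a new version. */
final class VersionedEntity {
    private final List<EntityVersion> versions = new ArrayList<>();

    /** Records the given metadata as a new version and returns it. */
    EntityVersion update(Map<String, String> metadata) {
        EntityVersion v = new EntityVersion(versions.size() + 1, Instant.now(), metadata);
        versions.add(v);
        return v;
    }

    /** Returns the most recent version. */
    EntityVersion current() {
        if (versions.isEmpty()) {
            throw new IllegalStateException("no version recorded yet");
        }
        return versions.get(versions.size() - 1);
    }

    /** Read-only view of all versions, oldest first. */
    List<EntityVersion> history() {
        return Collections.unmodifiableList(versions);
    }
}
```

Because earlier versions are never modified, a reference to a specific version keeps pointing at the same content after later updates.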

Summary

Background

The availability of cross-domain data has increased dramatically over the last decade, driven by forces including systems biology, imaging for phenomics, and high-throughput technologies such as next-generation sequencing (NGS).

Standalone data-repository server

In order to support shared and collaborative access, e!DAL has a server module, which provides an RMI service to handle native Java clients, a WebDAV server to offer access as a network file system, and an HTTP server to support access by any web browser. According to the client capabilities, the supported e!DAL features range from browsing and downloading published data (HTTP), to providing a metadata-aware and version-aware remote file system (WebDAV), to providing full-featured API access (RMI). This wide range of functionality has been implemented to support application scenarios and desktop users that need data access in a file browser.

Development, scalability and code-quality control

Besides platform independence, the major advantage to implementing e!DAL in Java is the availability of open-source standard frameworks, like authentication services (JAAS), persistence frameworks (Hibernate), code-weaving tools (AspectJ), and build-management and dependency-management systems (Maven). The support of this platform-independent protocol will enable direct access to the e!DAL-API for a wide spectrum of programming languages and infrastructures.
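
The relationship between the three access routes of the server module (HTTP, WebDAV, RMI) and the features each one offers can be written down as a small lookup. This is a rough sketch for orientation only: the enum names `Capability` and `AccessProtocol` and their methods are hypothetical and do not come from the e!DAL code base; the mapping itself follows the description of the server module and assigns each protocol only the features named there.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical feature labels, chosen for illustration (not part of the e!DAL API). */
enum Capability { BROWSE, DOWNLOAD, METADATA_AWARE_FILES, VERSION_AWARE_FILES, FULL_API }

/** The three access routes named in the text, each with the features attributed to it. */
enum AccessProtocol {
    HTTP(EnumSet.of(Capability.BROWSE, Capability.DOWNLOAD)),
    WEBDAV(EnumSet.of(Capability.METADATA_AWARE_FILES, Capability.VERSION_AWARE_FILES)),
    RMI(EnumSet.of(Capability.FULL_API));

    private final Set<Capability> capabilities;

    AccessProtocol(Set<Capability> capabilities) {
        this.capabilities = Collections.unmodifiableSet(capabilities);
    }

    Set<Capability> capabilities() {
        return capabilities;
    }

    boolean supports(Capability c) {
        return capabilities.contains(c);
    }

    /** Returns the first protocol (in declaration order) that offers the capability. */
    static AccessProtocol forCapability(Capability c) {
        for (AccessProtocol p : values()) {
            if (p.supports(c)) {
                return p;
            }
        }
        throw new IllegalArgumentException("no protocol offers " + c);
    }
}

class AccessProtocolSketch {
    public static void main(String[] args) {
        // Print the protocol-to-feature table.
        for (AccessProtocol p : AccessProtocol.values()) {
            System.out.println(p + " -> " + p.capabilities());
        }
    }
}
```

A client that only needs to fetch published data would end up on HTTP, while a native Java application needing the complete API would use RMI.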

Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben

Data security