Abstract

BackgroundThe life-science community faces a major challenge in handling “big data”, highlighting the need for high quality infrastructures capable of sharing and publishing research data. Data preservation, analysis, and publication are the three pillars in the “big data life cycle”. The infrastructures currently available for managing and publishing data are often designed to meet domain-specific or project-specific requirements, resulting in the repeated development of proprietary solutions and lower quality data publication and preservation overall.Resultse!DAL is a lightweight software framework for publishing and sharing research data. Its main features are version tracking, metadata management, information retrieval, registration of persistent identifiers (DOI), an embedded HTTP(S) server for public data access, access as a network file system, and a scalable storage backend. e!DAL is available as an API for local non-shared storage and as a remote API featuring distributed applications. It can be deployed “out-of-the-box” as an on-site repository.Conclusionse!DAL was developed based on experiences coming from decades of research data management at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK). Initially developed as a data publication and documentation infrastructure for the IPK’s role as a data center in the DataCite consortium, e!DAL has grown towards being a general data archiving and publication infrastructure. The e!DAL software has been deployed into the Maven Central Repository. Documentation and Software are also available at: http://edal.ipk-gatersleben.de.

Highlights

  • The life-science community faces a major challenge in handling “big data”, highlighting the need for high quality infrastructures capable of sharing and publishing research data

  • The DataCite consortium [10] was founded to support data citation, providing a means to increase the acceptance of research data as legitimate contributions to scholarly records

  • To implement the data life cycle, all data and metadata updates are recorded as individual versions

Read more

Summary

Background

The availability of cross-domain data has increased dramatically over the last decade, driven by forces including systems biology, imaging for phenomics, and high-throughput technologies such as next-generation sequencing (NGS). Standalone data-repository server In order to support shared and collaborative access, e!DAL has a server module, which provides an RMI service to handle native JAVA clients, a WebDAV server to offer access as a network file system, and an HTTP server to support access by any web browser. According to the client capabilities, the supported e!DAL features range from browsing and downloading published data (HTTP), to providing a metadata-aware and version-aware remote file system (WebDAV), to providing full-featured API access (RMI) This wide range of functionality has been implemented to support application scenarios and desktop users that need data access in a file browser. Development, scalability and code-quality control Besides platform independence, the major advantage to implementing e!DAL in JAVA is the availability of open-source standard frameworks, like authentication services (JAAS), persistence frameworks (Hibernate), code-weaving tools (AspectJ), and build-management and dependency-management systems (Maven). The support of this platform-independent protocol will enable direct access to the e!DAL-API for a wide spectrum of programing languages and infrastructures

Leibniz Institute of Plant Genetics and Crop Plant Research Gatersleben
Data security
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call