Abstract

Background: In recent years data integration has become an everyday undertaking for life sciences researchers. Aggregating and processing data from disparate sources, whether through purpose-built software or via manual processes, is a common task for scientists. However, the scope and usability of most current integration tools fail to cope with the fast-growing and highly dynamic nature of biomedical data.

Results: In this work we introduce a reactive and event-driven framework that simplifies real-time data integration and interoperability. This platform facilitates otherwise difficult tasks, such as connecting heterogeneous services; indexing, linking and transferring data from distinct resources; or subscribing to notifications regarding the timeliness of dynamic data. For developers, the framework automates the deployment of integrative and interoperable bioinformatics applications, using atomic data storage for content change detection and enabling agent-based intelligent extract, transform and load tasks.

Conclusions: This work bridges the gap between the growing number of services that access specific data sources or algorithms and the growing number of users who perform simple integration tasks on a recurring basis, through a streamlined workspace available to researchers and developers alike.

Highlights

  • Further details are available in the framework documentation, online at https://bioinformatics.ua.pt/i2x/docs. This framework brings a new perspective to the scientific data integration landscape, summarised in three main features, discussed in detail

  • Automated real-time data integration is achieved through the deployment of intelligent agents, which can operate remotely, to monitor data sources
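As an illustration of the agent-based monitoring idea, the following is a minimal sketch of content change detection, assuming a hash-based comparison of each record against the last fingerprint seen. The class name `ChangeDetectingAgent`, the `fetch` callable, and the record keys are hypothetical and are not taken from the framework's API; the framework documentation describes the actual mechanism.

```python
import hashlib
from typing import Callable, Dict, List


class ChangeDetectingAgent:
    """Hypothetical agent that polls a source and reports changed records."""

    def __init__(self, fetch: Callable[[], Dict[str, str]]):
        # fetch is an assumed callable returning {record_key: payload}
        self.fetch = fetch
        # last fingerprint stored for each record key
        self.seen: Dict[str, str] = {}

    @staticmethod
    def fingerprint(payload: str) -> str:
        # SHA-256 digest of the payload, used only to compare content
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def poll(self) -> List[str]:
        """Return the keys whose content is new or has changed since the last poll."""
        changed = []
        for key, payload in self.fetch().items():
            digest = self.fingerprint(payload)
            if self.seen.get(key) != digest:
                self.seen[key] = digest
                changed.append(key)
        return changed


# Usage with an in-memory stand-in for a remote data source
source = {"P1": "v1", "P2": "v1"}
agent = ChangeDetectingAgent(lambda: dict(source))
first = agent.poll()    # every record is new on the first poll
second = agent.poll()   # nothing has changed
source["P2"] = "v2"
third = agent.poll()    # only the modified record is reported
```

In a real deployment the changed keys would typically trigger a downstream event (for example, a notification or a load step) rather than simply being returned.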

Summary

Introduction

In recent years data integration has become an everyday undertaking for life sciences researchers. The scale of information available for life sciences research is growing rapidly, bringing increasing challenges in hardware and software [1, 2]. The value of these raw data can only be realised if end users exploit them adequately. From next-generation sequencing hardware [6] to the growing availability of biomedical sensors, tapping this ongoing data stream is an unwieldy mission [7].

