Abstract
Vast amounts of clinical and biomedical research data are produced daily. These data can help enable data-driven healthcare through novel biomedical discoveries, improved diagnostic processes, epidemiology, and education. However, finding and gaining access to these data, and to the metadata needed to use them, remains a challenge. Furthermore, managing the data and enabling widespread, albeit controlled, use poses a major challenge for data producers. The data sources are often geographically distributed, have diverse characteristics, and are subject to a host of logistical and legal constraints that require appropriate governance and access control guarantees. To overcome these obstacles, a set of guiding principles known as FAIR has previously been introduced: data should be Findable, Accessible, Interoperable, and Reusable. In this paper, we introduce and describe an abstract framework that models these goals and could be a step toward supporting data-driven research. We also develop a system that instantiates our framework, called the Data integration and indexing System (DiiS). The system provides an integration model for making healthcare data available on a global scale. We describe the challenges that inhibit data producers, data stewards, and data brokers from achieving FAIR goals when sharing biomedical data, and we attempt to address some of the key challenges through the proposed system. We evaluated our framework using the software architecture testing technique and also examined how our system addresses different challenges in data integration. Our evaluation shows that the DiiS framework is a user-friendly data integration system that would greatly contribute to biomedical research.
Highlights
The growing amount of available biomedical data poses new challenges for data management.
Data re-usability is a highly desirable goal, both for advancing science and for replicating or validating the results of previous studies.
We primarily focus on three types of data: (a) Radiology teaching files or teaching files used by doctors and radiologists; (b) Electronic health records; (c)
A distributed and integrated data repository is key to advancing biomedical research
Summary
Data re-usability is a highly desirable goal, both for advancing science and for replicating or validating the results of previous studies. Recognizing this need, publishers and funding bodies may require researchers to submit the data generated by their work and make it available to the research community. The National Institutes of Health (NIH) is encouraging funded investigators to use cloud computing to conduct research and make their work accessible to larger audiences: “The Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) initiative establishes partnerships with commercial cloud service providers (CSPs) to reduce economic and technological barriers to accessing and computing on large biomedical datasets.” Several studies have highlighted the need for integration of healthcare data [2,3,4]. We focus on research in the field of biomedical data integration and the requirements thereof.