GUDM: Automatic Generation of Unified Datasets for Learning and Reasoning in Healthcare.

Rahman Ali, Muhammad Siddiqi, Eui-Nam Huh, Shujaat Hussain, Byeong Kang, Taqdir Ali, Muhammad Idris, Sungyoung Lee

doi:10.3390/s150715772

Abstract

A wide array of biomedical data are generated and made available to healthcare experts. However, because these data are so diverse, it is difficult to predict outcomes from them. It is therefore necessary to combine these diverse data sources into a single unified dataset. This paper proposes a global unified data model (GUDM) that provides a global unified data structure for all data sources, and generates a unified dataset with a “data modeler” tool. The proposed tool implements a user-centric, priority-based approach that can easily resolve the problems of unified data modeling and overlapping attributes across multiple datasets. The tool is illustrated using sample diabetes mellitus data. The diverse data sources used to generate the unified dataset for diabetes mellitus include clinical trial information, a social media interaction dataset and physical activity data collected using different sensors. To demonstrate the significance of the unified dataset, we adopted a well-known rule creation process based on rough set theory to create rules from the unified dataset. An evaluation of the tool on six different sets of locally created diverse datasets shows that, on average, the tool reduces the time effort of experts and knowledge engineers in creating unified datasets by 94.1%.

Highlights

A successful decision support system relies on high-quality information, either created by a knowledge engineer or automatically generated from the data.
This article describes the problem of fusing multiple heterogeneous datasets into a unified dataset for different types of high-level analysis, knowledge acquisition and reasoning.
An expert-centric, priority-based approach has been proposed and implemented as the “data modeler” tool. This application has an extensible framework with an easy-to-use GUI that allows knowledge engineers to import multiple heterogeneous datasets through its import manager, and it combines them into the unified dataset.

Summary

Introduction

A successful decision support system relies on high-quality information, either created by a knowledge engineer or automatically generated from the data. A huge volume of human-centric personal data is available, but integrating it from various sources into a unified dataset is challenging. The integration of multiple heterogeneous data sources is an important research issue that is not limited to the healthcare arena. To enable the use of healthcare data in clinical decisions, automatic generation of a single unified dataset is desirable [1]. This task is very challenging due to a number of technical issues, such as semantic heterogeneity, different naming conventions, resolving conflicts among attribute values, finding intrinsic relationships, handling missing values and overlapping information and converting local datasets into a global unified data model [2,3]. This paper focuses on the last four challenges and leaves the rest as future work.
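The overlapping-attribute and missing-value challenges can be illustrated with a minimal sketch. It is not the authors' “data modeler” implementation: it assumes each source is assigned a numeric priority by the user (lower number wins), that records share a patient ID, and that a missing value is represented as None. The function name `unify`, the attribute names and all sample values are hypothetical.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]
Dataset = Dict[str, Record]  # patient ID -> attribute/value record


def unify(sources: List[Dataset], priorities: List[int]) -> Dataset:
    """Merge local datasets into one table with a shared schema.

    For an attribute present in several sources, keep the first
    non-missing value in priority order (assumption: lower = higher priority).
    """
    # Source indices ordered from highest to lowest priority.
    order = sorted(range(len(sources)), key=lambda i: priorities[i])
    # Global schema: union of every attribute name seen in any source.
    schema = sorted({a for src in sources for rec in src.values() for a in rec})
    patient_ids = sorted({pid for src in sources for pid in src})

    unified: Dataset = {}
    for pid in patient_ids:
        row: Record = {}
        for attr in schema:
            value = None  # stays None if no source has a value
            for i in order:
                candidate = sources[i].get(pid, {}).get(attr)
                if candidate is not None:
                    value = candidate
                    break
            row[attr] = value
        unified[pid] = row
    return unified


# Made-up sample data: a clinical source and a sensor source that
# overlap on "weight_kg" and "glucose".
clinical = {
    "p1": {"glucose": 7.8, "weight_kg": 82},
    "p2": {"glucose": None, "weight_kg": 70},
}
sensor = {
    "p1": {"steps": 4200, "weight_kg": 80},
    "p2": {"steps": 9100, "glucose": 6.1},
}

unified = unify([clinical, sensor], priorities=[1, 2])
for pid, row in unified.items():
    print(pid, row)
```

In this example the clinical source has priority, so its weight for p1 is kept over the sensor's value, while the missing clinical glucose reading for p2 is filled from the sensor source. A real tool would also need to reconcile attribute names and units before applying such a rule.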



Journal: Sensors	Publication Date: Jul 2, 2015
Citations: 57	License type: CC BY 4.0

