Abstract

Today's electronic world produces an enormous amount of data every second in various formats, especially in healthcare units. The Semantic Web emerged to make such data usable by representing it in machine-readable form, a step towards automated knowledge discovery. This paper proposes comprehensive pre-processing techniques that convert raw data into a structured format, so that an onto-graph can be constructed for selected features in the healthcare domain. A Cluster-based Missing Value Imputation (CMVI) algorithm is proposed to improve the quality of the imputed data, since imputation is the most important step in data pre-processing. Missing values were randomly induced into the Pima Indian Diabetic dataset at missing ratios of 1%, 3% and 5% for each attribute, in up to 50% of the attributes of the original dataset. The experimental observations show that the pre-processed data is of better quality than the raw, unprocessed data in terms of imputation accuracy, measured by the coefficient of determination (R2), the index of agreement (d2) and the Root Mean Square Error (RMSE). The documented results show that the proposed techniques outperform the traditional approaches, with higher R2 and d2 scores and lower RMSE scores. Further, the importance of the knowledge graph and the various types of ontological representation are discussed briefly, since constructing the .owl file is the first step towards automation in the Semantic Web.
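The evaluation protocol described above (hide a fixed ratio of known values, impute them, then score the imputed values against the hidden originals) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the data is synthetic rather than the Pima Indian Diabetic dataset, and plain mean imputation stands in as a baseline because the CMVI algorithm itself is not specified in this abstract. R2 is taken here as 1 - SS_res/SS_tot and d2 as Willmott's index of agreement with squared differences; the paper may define either metric differently.

```python
import math
import random


def rmse(observed, imputed):
    """Root Mean Square Error between true and imputed values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, imputed)) / n)


def r_squared(observed, imputed):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, imputed))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def index_of_agreement(observed, imputed):
    """Willmott's index of agreement with squared terms (assumed meaning of d2)."""
    mean_o = sum(observed) / len(observed)
    num = sum((o - p) ** 2 for o, p in zip(observed, imputed))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2
              for o, p in zip(observed, imputed))
    return 1.0 - num / den


def induce_missing(column, ratio, rng):
    """Hide `ratio` of the entries at random; return (masked copy, hidden indices)."""
    k = max(1, round(ratio * len(column)))
    hidden = sorted(rng.sample(range(len(column)), k))
    masked = list(column)
    for i in hidden:
        masked[i] = None
    return masked, hidden


# Synthetic stand-in for one numeric attribute (not real patient data).
rng = random.Random(42)
column = [rng.gauss(120.0, 30.0) for _ in range(200)]

for ratio in (0.01, 0.03, 0.05):
    masked, hidden = induce_missing(column, ratio, rng)
    known = [v for v in masked if v is not None]
    fill = sum(known) / len(known)          # baseline: mean imputation, not CMVI
    truth = [column[i] for i in hidden]     # values that were hidden
    guess = [fill] * len(hidden)            # values the baseline imputes
    print(f"ratio={ratio:.0%} hidden={len(hidden)} RMSE={rmse(truth, guess):.2f}")
```

A better imputation method would replace the `fill` line; the three metric functions can then be applied to `truth` and `guess` unchanged, with higher R2 and d2 and lower RMSE indicating closer agreement with the hidden values.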
