Abstract

In the current biomedical data movement, numerous efforts have been made to convert and normalize large amounts of traditional structured and unstructured data (e.g., EHRs, reports) into semi-structured data (e.g., RDF, OWL). With the growing volume of semi-structured data entering the biomedical community, data integration and knowledge discovery across heterogeneous domains have become an important research problem. At the application level, detecting related concepts among medical ontologies is an important goal of life science research. It is even more crucial to determine how different concepts are related within a single ontology or across multiple ontologies by analyzing predicates in different knowledge bases. However, given today's information explosion, it is extremely difficult for biomedical researchers to find existing or potential predicates that link cross-domain concepts without support from schema pattern analysis. Therefore, a mechanism is needed that performs predicate-oriented pattern analysis to partition heterogeneous ontologies into smaller, more closely related topics, and that generates queries to discover cross-domain knowledge from each topic. In this paper, we present such a model: it analyzes predicate-oriented patterns based on how closely the predicates are related and generates a similarity matrix. Based on this similarity matrix, we apply a novel unsupervised learning algorithm to partition large data sets into smaller, more closely related topics and generate meaningful queries to fully discover knowledge over a set of interlinked data sources. We have implemented a prototype system named BmQGen and evaluated the proposed model with a colorectal surgical cohort from the Mayo Clinic.
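
To make the idea of a predicate similarity matrix and topic partitioning more concrete, the following is a minimal sketch and not the BmQGen implementation. It assumes that two predicates are "close" when they connect overlapping sets of concepts (Jaccard similarity), and it substitutes a simple threshold-based grouping for the unsupervised algorithm, whose details the abstract does not give. The triples, the function names, and the threshold value are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("patient1", "hasDiagnosis", "colorectalCancer"),
    ("patient2", "hasDiagnosis", "colorectalCancer"),
    ("patient1", "underwent", "colectomy"),
    ("patient2", "underwent", "colectomy"),
    ("colectomy", "treats", "colorectalCancer"),
    ("drugA", "targets", "geneX"),
    ("geneX", "encodes", "proteinY"),
]


def predicate_concepts(triples):
    """Map each predicate to the set of concepts it connects."""
    concepts = {}
    for s, p, o in triples:
        concepts.setdefault(p, set()).update((s, o))
    return concepts


def similarity_matrix(triples):
    """Return (sorted predicates, symmetric Jaccard similarity matrix).

    Assumption: Jaccard overlap of connected concepts stands in for the
    paper's notion of a "close relationship" between predicates.
    """
    concepts = predicate_concepts(triples)
    preds = sorted(concepts)
    n = len(preds)
    sim = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        a, b = concepts[preds[i]], concepts[preds[j]]
        sim[i][j] = sim[j][i] = len(a & b) / len(a | b)
    return preds, sim


def partition(preds, sim, threshold):
    """Group predicates whose similarity reaches the threshold.

    Assumption: connected components over the thresholded matrix are a
    stand-in for the paper's unsupervised learning step.
    """
    parent = list(range(len(preds)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(len(preds)), 2):
        if sim[i][j] >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(preds):
        groups.setdefault(find(i), []).append(p)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    preds, sim = similarity_matrix(TRIPLES)
    for topic in partition(preds, sim, threshold=0.2):
        print(topic)
```

With the toy data and a threshold of 0.2, the clinical predicates fall into one topic and the molecular predicates into another; each topic could then serve as the scope for generating queries.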

Highlights

  • Researchers and health care practitioners prefer to work in an evidence-based manner, using available research results when making decisions in health care

  • The Semantic Web can provide a platform for exchanging biomedical knowledge

  • Our research focuses on predicate-oriented pattern analysis


Summary

Introduction

Researchers and health care practitioners prefer to work in an evidence-based manner, using available research results when making decisions in health care. The Semantic Web Health Care and Life Sciences Interest Group (HCLSIG) [2] aims to use Semantic Web technologies for innovative research and collaboration in the health care and life science domains. In this drive, large amounts of medical data have been specified and shared via machine-readable formats, such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL). Ontologies are developed to extend the work of others and to be shared across different domains. These Semantic Web technologies make it easier and more practical to integrate, query, and analyze the full scale of relevant biomedical and healthcare data, as well as EHRs, for cost-effective health care systems [3].

