Abstract

A large number of ontologies have been introduced by the biomedical community in recent years. Knowledge discovery for entity identification from ontologies has become an important research area, and it is of particular interest to discover how associations are established to connect concepts within a single ontology or across multiple ontologies. However, due to the exponential growth of biomedical big data and their complicated associations, it has become very challenging to detect key associations among entities in an efficient, dynamic manner. Therefore, there exists a gap between the increasing need for association detection and the large volume of biomedical ontologies. In this paper, to bridge this gap, we present a knowledge discovery framework, the BioBroker, for grouping entities to facilitate the process of biomedical knowledge discovery in an intelligent way. Specifically, we developed an innovative knowledge discovery algorithm that combines a graph clustering method and an indexing technique to discover knowledge patterns over a set of interlinked data sources in an efficient way. We demonstrate the capabilities of the BioBroker for query execution with a use case study on a subset of the Bio2RDF life science linked data.

Highlights

  • We found that K-Means yielded the highest Silhouette Width (SW) score of 0.9, Hierarchical Fuzzy C-Means (HFCM) came second with 0.88, and the other three algorithms all produced the same SW score of 0.76

  • We selected HFCM as the algorithm of choice because it additionally provides soft partitioning capabilities, which were useful for distributed query processing

  • This paper presents a predicate-pattern-based model equipped with an indexing technique for query suggestion, visualization, scalable querying, and reasoning over large biomedical ontology schemas and data
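
The Silhouette Width used to compare the clustering algorithms above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Euclidean distance, hard cluster labels, and a hypothetical toy dataset, whereas the paper's own features and clustering setup are not shown here. For each point, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to any other cluster, and the point's score is (b - a) / max(a, b); the SW score is the mean over all points.

```python
import math


def silhouette_width(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette width needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to the given members, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention: singleton clusters score 0
            continue
        a = mean_dist(i, own)  # cohesion: distance within own cluster
        b = min(               # separation: nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated groups of two points each.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_sw = silhouette_width(toy_points, [0, 0, 1, 1])  # labels match the groups
bad_sw = silhouette_width(toy_points, [0, 1, 0, 1])   # labels split the groups
```

Scores near 1 indicate compact, well-separated clusters, while scores at or below 0 indicate points that sit closer to another cluster than to their own. In the toy data, the labeling that matches the two groups scores much higher than the one that splits them.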


Summary

Introduction

Because appropriate tools and computational infrastructure that can be fully understood and used by the personnel involved are lacking, there is very little capacity to carry out analyses of these datasets [1]. As the demand for the integration and analysis of data has grown steadily, the first effort toward connecting scattered biomedical data materialized as a data movement within the biomedical community, namely Linked Data [2].
