Abstract

Background: Fulfilling the vision of the Semantic Web requires an accurate data model for organizing knowledge and sharing a common understanding of the domain. Fitting this description, ontologies are the cornerstones of the Semantic Web and can be used to solve many problems in clinical information and biomedical engineering, such as word sense disambiguation, semantic similarity, question answering, and ontology alignment. Manual construction of an ontology is labor intensive and requires domain experts and ontology engineers. To reduce the labor-intensive nature of ontology generation and minimize the need for domain experts, we present a novel automated ontology generation framework, Linked Open Data approach for Automatic Biomedical Ontology Generation (LOD-ABOG), which is empowered by Linked Open Data (LOD). LOD-ABOG performs concept extraction using knowledge bases, mainly UMLS and LOD, along with Natural Language Processing (NLP) operations, and applies relation extraction using LOD, the Breadth-First Search (BFS) graph method, and Freepal repository patterns.

Results: Our evaluation shows improved results in most ontology generation tasks compared to those obtained by existing frameworks. We evaluated the performance of the individual tasks (modules) of the proposed framework using the CDR and SemMedDB datasets. For concept extraction, the evaluation shows an average F-measure of 58.12% for the CDR corpus and 81.68% for SemMedDB; an F-measure of 65.26% and 77.44% for biomedical taxonomic relation extraction using the CDR and SemMedDB datasets, respectively; and an F-measure of 52.78% and 58.12% for biomedical non-taxonomic relation extraction using the CDR corpus and SemMedDB, respectively. Additionally, the comparison with a manually constructed baseline Alzheimer ontology shows an F-measure of 72.48% for concept detection, 76.27% for relation extraction, and 83.28% for property extraction. We also compared the proposed framework with an ontology-learning framework called “OntoGain”; the comparison shows that LOD-ABOG performs 14.76% better in terms of relation extraction.

Conclusion: This paper has presented the LOD-ABOG framework, which shows that current LOD sources and technologies are a promising solution for automating the process of biomedical ontology generation and for extracting relations to a greater extent. In addition, unlike existing frameworks, which require domain experts throughout the ontology development process, the proposed approach requires their involvement only for improvement purposes at the end of the ontology life cycle.
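The abstract names breadth-first search over LOD as one ingredient of relation extraction but gives no implementation detail. As a minimal sketch of the general idea only, the following assumes the linked data has already been loaded into an in-memory adjacency map of `(predicate, object)` pairs; the function name `bfs_relation_path`, the `max_depth` cutoff, and the toy triples are all hypothetical and are not taken from LOD-ABOG.

```python
from collections import deque


def bfs_relation_path(graph, source, target, max_depth=3):
    """Return the shortest chain of (subject, predicate, object) triples
    linking source to target, or None if none exists within max_depth hops.

    graph maps a node to a list of (predicate, neighbor) pairs.
    """
    if source == target:
        return []
    visited = {source}
    queue = deque([(source, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) >= max_depth:
            continue  # do not expand beyond the hop limit
        for predicate, neighbor in graph.get(node, []):
            if neighbor in visited:
                continue
            new_path = path + [(node, predicate, neighbor)]
            if neighbor == target:
                return new_path
            visited.add(neighbor)
            queue.append((neighbor, new_path))
    return None


# Toy graph with invented triples, for illustration only.
toy_graph = {
    "donepezil": [("treats", "alzheimer_disease")],
    "alzheimer_disease": [("is_a", "dementia")],
    "dementia": [("is_a", "brain_disease")],
}

path = bfs_relation_path(toy_graph, "donepezil", "dementia")
```

Because the queue is processed level by level, the first path that reaches the target is also one with the fewest hops, which is why a depth limit is a natural way to bound the search on a large graph.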

Highlights

  • Fulfilling the vision of the Semantic Web requires an accurate data model for organizing knowledge and sharing a common understanding of the domain

  • Legaz-García et al [23] use agglomerative clustering to construct concept hierarchies and generate formal specification output that complies with a Web Ontology Language (OWL) format by using ontology alignment, while Missikoff et al [24] proposed an unsupervised approach that combines linguistic and statistics-based methods to perform automated ontology generation tasks from texts

  • We propose an automated framework called “Linked Open Data-Based Framework for Automatic Biomedical Ontology Generation” (LOD-ABOG) that addresses each of the aforementioned challenges at once: overcoming the high cost of manually constructing a domain-specific ontology, transforming large volumes of data, achieving domain independence, and achieving a high degree of domain coverage

Summary

Introduction

Fulfilling the vision of the Semantic Web requires an accurate data model for organizing knowledge and sharing a common understanding of the domain. The rule-based approach relies on a manually crafted set of rules that represent knowledge and decide what to do or conclude across various scenarios. It achieves a very high level of precision but quite low recall. Machine learning-based approaches use various supervised and unsupervised methods to automate ontology generation tasks. Huang et al [22] proposed a supervised approach that uses predefined syntactic patterns and machine learning to detect relations between two entities in Wikipedia texts. A few unsupervised approaches have also been proposed [23, 24]: Legaz-García et al [23] use agglomerative clustering to construct concept hierarchies and generate formal specification output that complies with an OWL format by using ontology alignment, while Missikoff et al [24] proposed an unsupervised approach that combines linguistic and statistics-based methods to perform automated ontology generation tasks from texts.
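To make the precision/recall trade-off of rule-based extraction concrete, here is a small sketch using a single hand-written "X such as Y" pattern (a Hearst-style rule). The pattern, the function name `extract_is_a`, and the example sentences are illustrative assumptions, not rules from any of the cited systems.

```python
import re

# One hand-crafted rule: "<hypernym> such as <hyponym>".
# Deliberately narrow: it only captures a single word on each side.
SUCH_AS = re.compile(r"\b(\w+) such as (\w+)", re.IGNORECASE)


def extract_is_a(sentence):
    """Return (hyponym, "is_a", hypernym) triples matched by the rule."""
    return [
        (m.group(2).lower(), "is_a", m.group(1).lower())
        for m in SUCH_AS.finditer(sentence)
    ]


# The rule fires on the exact phrasing it was written for...
hit = extract_is_a("Cholinesterase inhibitors such as donepezil are used in dementia.")

# ...but misses the same fact when it is phrased differently.
miss = extract_is_a("Donepezil is a kind of cholinesterase inhibitor.")
```

When the rule matches, the extracted relation is usually right, which is where the high precision comes from; every paraphrase it was not written for goes undetected, which is where recall is lost.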
