Ontology-guided data preparation for discovering genotype-phenotype relationships

Adrien Coulet,Malika Smaïl-Tabbone,Pascale Benlian,Marie-Dominique Devignes,Amedeo Napoli

doi:10.1186/1471-2105-9-s4-s3

Abstract

Background: The complexity and volume of post-genomic data are two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in the life sciences. Bio-ontologies may now play key roles in knowledge discovery in the life sciences by providing semantics to data and to extracted units, taking advantage of the progress of Semantic Web technologies in the understanding and availability of tools for knowledge representation, extraction, and reasoning.

Results: This paper presents a method that exploits bio-ontologies to guide data selection within the preparation step of the KDD process. We propose three scenarios in which domain knowledge and ontology elements such as subsumption, properties, and class descriptions are taken into account for data selection, before the data mining step. Each of these scenarios is illustrated within a case study concerning the search for genotype-phenotype relationships in a familial hypercholesterolemia dataset. The guiding of data selection based on domain knowledge is analysed and shows a direct influence on the volume and significance of the data mining results.

Conclusions: The method proposed in this paper is an efficient alternative to numerical methods for data selection based on domain knowledge. In turn, the results of this study may be reused in ontology modelling and data integration.

Highlights

Complexity and amount of post-genomic data constitute two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in life sciences
The National Center for Biomedical Ontology (NCBO) has recently developed BioPortal, which offers a unified panorama of available bio-ontologies [4,5]
A dataset is defined as a relation between a set of objects and a set of attributes
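
As an illustration of the last highlight, a dataset viewed as a relation between a set of objects and a set of attributes can be sketched in a few lines of Python. This is only a minimal sketch: the patient and attribute names below are hypothetical and are not taken from the paper's familial hypercholesterolemia dataset.

```python
from typing import Set, Tuple

# Hypothetical toy data: these names are illustrative assumptions only.
OBJECTS: Set[str] = {"patient_1", "patient_2", "patient_3"}
ATTRIBUTES: Set[str] = {"variant_A", "variant_B", "high_LDL"}

# The dataset is a binary relation: a set of (object, attribute) pairs,
# where a pair means "this object has this attribute".
RELATION: Set[Tuple[str, str]] = {
    ("patient_1", "variant_A"),
    ("patient_1", "high_LDL"),
    ("patient_2", "variant_B"),
    ("patient_3", "variant_A"),
    ("patient_3", "high_LDL"),
}


def attributes_of(obj: str) -> Set[str]:
    """Return every attribute related to the given object."""
    return {a for (o, a) in RELATION if o == obj}


def objects_with(attr: str) -> Set[str]:
    """Return every object related to the given attribute."""
    return {o for (o, a) in RELATION if a == attr}
```

Calling `objects_with("high_LDL")` on this toy relation returns the objects that carry that attribute, which is the kind of lookup a data mining step would build on.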

Summary

Introduction

The complexity and volume of post-genomic data are two major factors limiting the application of Knowledge Discovery in Databases (KDD) methods in the life sciences. Bio-ontologies may now play key roles in knowledge discovery in the life sciences by providing semantics to data and to extracted units, taking advantage of the progress of Semantic Web technologies in the understanding and availability of tools for knowledge representation, extraction, and reasoning. The KDD process is based on three main operations: data preparation, data mining, and interpretation of the extracted units. This process is guided and controlled by a domain expert. The National Center for Biomedical Ontology (NCBO) has recently developed BioPortal, which offers a unified panorama of available bio-ontologies [4,5].
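
To make the idea of ontology-guided data selection more concrete, the sketch below selects dataset attributes using subsumption, one of the ontology elements named in the abstract. It is not the authors' implementation: the class hierarchy, the attribute-to-class mapping, and the function names are all hypothetical assumptions chosen for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy hierarchy: each class maps to its direct parent
# (None marks a root). These class names are assumptions, not the paper's ontology.
PARENT: Dict[str, Optional[str]] = {
    "Variant": None,
    "CodingVariant": "Variant",
    "MissenseVariant": "CodingVariant",
    "NonCodingVariant": "Variant",
    "Phenotype": None,
    "LipidPhenotype": "Phenotype",
}

# Hypothetical mapping from dataset attributes to ontology classes.
ATTRIBUTE_CLASS: Dict[str, str] = {
    "variant_A": "MissenseVariant",
    "variant_B": "NonCodingVariant",
    "high_LDL": "LipidPhenotype",
}


def is_subsumed_by(cls: str, ancestor: str) -> bool:
    """True if `cls` equals `ancestor` or is one of its descendants."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT.get(current)
    return False


def select_attributes(attributes: List[str], ancestor: str) -> List[str]:
    """Keep only the attributes whose class is subsumed by `ancestor`."""
    return [a for a in attributes if is_subsumed_by(ATTRIBUTE_CLASS[a], ancestor)]
```

Selecting with a general class such as `"Variant"` keeps both variant attributes, while a more specific class such as `"CodingVariant"` narrows the selection, which is how the choice of class can change the volume of data passed on to data mining.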

Journal: BMC Bioinformatics
Publication Date: Apr 1, 2008
Citations: 48
License type: CC BY 2.0

Similar Papers

Contributions of KDD to the Knowledge Management Process
Hércules Antonio Do Prado ... Eduardo Amadeu Moresi
CLEI Electronic Journal | VOL. 7
06 Sep 2018

Application of knowledge discovery in database (KDD) techniques in cost overrun of construction projects
Mai Monir Ghazal ... Ahmed Hammad
International Journal of Construction Management | VOL. ahead-of-print
12 Mar 2020

Knowledge Discovery in Spatial Databases
Martin Ester ... Hans-Peter Kriegel
01 Jan 1998

Application of data mining techniques in pharmacovigilance.
Andrew M Wilson ... Anne Holbrook
British journal of clinical pharmacology | VOL. 57
30 Sep 2003
