Abstract

Background: Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology. We previously developed an RDF-based resource called Chem2Bio2RDF that enabled querying of such data using the SPARQL query language. Whilst this work has proved useful in its own right as one of the first major resources in these disciplines, its utility could be greatly improved by the application of an ontology for annotation of the nodes and edges in the RDF graph, enabling a much richer range of semantic queries to be issued.

Results: We developed a generalized chemogenomics and systems chemical biology OWL ontology called Chem2Bio2OWL that describes the semantics of chemical compounds, drugs, protein targets, pathways, genes, diseases and side-effects, and the relationships between them. The ontology also includes data provenance. We used it to annotate our Chem2Bio2RDF dataset, making it a rich semantic resource. Through a series of scientific case studies we demonstrate how this (i) simplifies the process of building SPARQL queries, (ii) enables useful new kinds of queries on the data and (iii) makes possible intelligent reasoning and semantic graph mining in chemogenomics and systems chemical biology.

Availability: Chem2Bio2OWL is available at http://chem2bio2rdf.org/owl. The document is available at http://chem2bio2owl.wikispaces.com.

Highlights

  • Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology

  • We describe the creation of an ontology that covers chemogenomics and the entities of systems chemical biology, as defined by the scope of our Chem2Bio2RDF data resource, and demonstrate its use as a knowledge base

  • The following SPARQL query searches two chemogenomics databases: PREFIX compound: PREFIX bindingdb: PREFIX drugbank: PREFIX uniprot: SELECT ?uniprot_id FROM FROM FROM FROM WHERE { {?compound compound:Compound ID (CID) ?compound_cid
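A query that spans two sources has to name each source's own predicates. As a rough illustration of why ontology annotation can shorten such queries, the Python sketch below uses a toy in-memory triple list in place of an RDF store. Every identifier in it (the `bindingdb:` and `drugbank:` predicate names, the `onto:` properties, the record and protein IDs) is invented for the example and is not taken from Chem2Bio2RDF or Chem2Bio2OWL. It applies a single subproperty-style inference step so that one query pattern covers both sources.

```python
# Toy triples from two hypothetical sources; all identifiers are invented.
TRIPLES = [
    ("bindingdb:rec1", "bindingdb:cid", "cid:1"),
    ("bindingdb:rec1", "bindingdb:uniprot", "uniprot:PA"),
    ("drugbank:rec7", "drugbank:cid", "cid:1"),
    ("drugbank:rec7", "drugbank:target", "uniprot:PB"),
    ("bindingdb:rec2", "bindingdb:cid", "cid:2"),
    ("bindingdb:rec2", "bindingdb:uniprot", "uniprot:PC"),
]

# Hypothetical ontology mapping: source-specific predicate -> shared property.
SUBPROPERTY_OF = {
    "bindingdb:cid": "onto:hasCompound",
    "drugbank:cid": "onto:hasCompound",
    "bindingdb:uniprot": "onto:hasTarget",
    "drugbank:target": "onto:hasTarget",
}


def annotate(triples, mapping):
    """Return the triples plus one inferred triple per mapped predicate."""
    inferred = [(s, mapping[p], o) for s, p, o in triples if p in mapping]
    return triples + inferred


def targets_of(compound, triples):
    """Find targets of a compound using only the shared ontology properties."""
    records = {s for s, p, o in triples
               if p == "onto:hasCompound" and o == compound}
    return sorted(o for s, p, o in triples
                  if p == "onto:hasTarget" and s in records)


ANNOTATED = annotate(TRIPLES, SUBPROPERTY_OF)
print(targets_of("cid:1", ANNOTATED))  # ['uniprot:PA', 'uniprot:PB']
```

Without the annotation step, `targets_of` finds nothing, because the raw triples only carry source-specific predicates; a caller would have to write one pattern per source instead.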


Summary

Introduction

Systems chemical biology and chemogenomics are considered critical, integrative disciplines in modern biomedical research, but require data mining of large, integrated, heterogeneous datasets from chemistry and biology. Recent Semantic Web efforts [1,2,3] have converted various chemical and biological data sources into semantic formats (e.g., RDF, OWL) and linked them into very large networks. The Web Ontology Language (OWL) is a language for making these descriptions, designed for use within the Semantic Web. A variety of ontologies have been developed in the life sciences. The Gene Ontology (GO) [9] is arguably the most widely used of them. It aims to formalize the representation of information about biological processes, molecular functions, and cellular components across multiple organisms. Several ontologies have been developed recently to formalize chemical biology experiments and provide guidance for data annotation. Holford et al. created logical rules using the Semantic Web Rule Language to answer research questions pertaining to pseudogenes [29].

