Abstract

Background: Over the past several centuries, chemistry has permeated virtually every facet of human life, enriching fields as diverse as medicine, agriculture, manufacturing, warfare, and electronics, among numerous others. Unfortunately, this diverse adoption of chemistry has given rise to application-specific, incompatible chemical information formats and representation strategies. Although a number of efforts have been dedicated to unifying the computational representation of chemical information, disparities between the various chemical databases persist and stand in the way of cross-domain, interdisciplinary investigations. Through a common syntax and formal semantics, Semantic Web technology offers the ability to accurately represent, integrate, reason about, and query across diverse chemical information.

Results: Here we specify and implement the Chemical Entity Semantic Specification (CHESS) for the representation of polyatomic chemical entities, their substructures, bonds, atoms, and reactions using Semantic Web technologies. CHESS provides a means to capture aspects of their corresponding chemical descriptors, connectivity, functional composition, and geometric structure while specifying mechanisms for data provenance. We demonstrate that, using our readily extensible specification, it is possible to efficiently integrate multiple disparate chemical data sources with very little additional effort while retaining appropriate correspondence of chemical descriptors. We demonstrate the impact of some of our representational decisions on the performance of chemically-aware knowledgebase searching and rudimentary reaction candidate selection. Finally, we provide access to the tools necessary to carry out chemical entity encoding in CHESS, along with a sample knowledgebase.

Conclusions: By harnessing the power of Semantic Web technologies with CHESS, it is possible to provide a means of facile cross-domain chemical knowledge integration with full preservation of data correspondence and provenance. Our representation builds on existing cheminformatics technologies and, by virtue of the RDF specification, remains flexible and amenable to application- and domain-specific annotations without compromising chemical data integration. We conclude that the adoption of a consistent and semantically-enabled chemical specification is imperative for surviving the coming chemical data deluge and supporting systems science research.

Highlights

  • Introduction to methodology and encoding rules. J Chem Inf Comput Sci. 28, 31–36 (1988).

  • Overview: To rectify the aforementioned chemical information integration problems, we propose the Chemical Entity Semantic Specification (CHESS), a Resource Description Framework (RDF)-based chemical information specification that is backed by the CHEMINF ontology [22].

  • While the accurate annotation and searching of bond-level information is currently impossible in major chemical information repositories, we have developed a demonstrative set of chemical entities annotated with O-H bond dissociation enthalpies (BDEs), including the computational method, software, and some of the parameters used to compute this information for ethanol (Appendix 8).
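
To make the idea of bond-level annotation concrete, the following is a minimal sketch in plain Python of how a molecule, its atoms, and its bonds could be held as subject-predicate-object triples, so that a descriptor can be attached to one specific bond. Every "ex:" term, the helper functions, and the method label are hypothetical placeholders chosen for illustration; they are not the actual CHESS or CHEMINF vocabulary, and no enthalpy value is included.

```python
# Hypothetical sketch: all "ex:" terms are illustrative placeholders,
# not actual CHESS or CHEMINF identifiers.
ETHANOL_TRIPLES = [
    ("ex:ethanol", "rdf:type", "ex:Molecule"),
    # Atoms: heavy atoms plus the hydroxyl hydrogen only, for brevity.
    ("ex:ethanol", "ex:hasAtom", "ex:C1"),
    ("ex:ethanol", "ex:hasAtom", "ex:C2"),
    ("ex:ethanol", "ex:hasAtom", "ex:O3"),
    ("ex:ethanol", "ex:hasAtom", "ex:H4"),
    ("ex:C1", "ex:element", "C"),
    ("ex:C2", "ex:element", "C"),
    ("ex:O3", "ex:element", "O"),
    ("ex:H4", "ex:element", "H"),
    # Bonds are first-class nodes, so they can carry their own annotations.
    ("ex:ethanol", "ex:hasBond", "ex:bond_C1_C2"),
    ("ex:ethanol", "ex:hasBond", "ex:bond_C2_O3"),
    ("ex:ethanol", "ex:hasBond", "ex:bond_O3_H4"),
    ("ex:bond_C1_C2", "ex:connectsAtom", "ex:C1"),
    ("ex:bond_C1_C2", "ex:connectsAtom", "ex:C2"),
    ("ex:bond_C2_O3", "ex:connectsAtom", "ex:C2"),
    ("ex:bond_C2_O3", "ex:connectsAtom", "ex:O3"),
    ("ex:bond_O3_H4", "ex:connectsAtom", "ex:O3"),
    ("ex:bond_O3_H4", "ex:connectsAtom", "ex:H4"),
]


def objects(triples, subject, predicate):
    """Return every object of triples matching (subject, predicate, ?)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def bonds_between(triples, molecule, element_a, element_b):
    """Find bonds of `molecule` that join the two given elements."""
    wanted = sorted([element_a, element_b])
    found = []
    for bond in objects(triples, molecule, "ex:hasBond"):
        atoms = objects(triples, bond, "ex:connectsAtom")
        elements = sorted(objects(triples, atom, "ex:element")[0] for atom in atoms)
        if elements == wanted:
            found.append(bond)
    return found


def annotate_bond(triples, bond, descriptor, method):
    """Return a new triple list with a descriptor node attached to `bond`."""
    return triples + [
        (bond, "ex:hasDescriptor", descriptor),
        (descriptor, "rdf:type", "ex:BondDissociationEnthalpy"),
        (descriptor, "ex:computedWith", method),  # provenance placeholder
    ]


oh_bonds = bonds_between(ETHANOL_TRIPLES, "ex:ethanol", "O", "H")
annotated = annotate_bond(ETHANOL_TRIPLES, oh_bonds[0], "ex:bde_1", "ex:some_method")
print(oh_bonds)
```

Because each bond is its own node rather than an implicit edge, the descriptor and its provenance attach to the O-H bond alone, and a search for "O-H bonds carrying a BDE descriptor" reduces to simple triple matching.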


Summary

Introduction

No efficient way to create a canonical molecular representation for this line notation existed, meaning that a given molecule could be referred to by multiple different WLN strings in different chemical databases. This shortcoming was overcome with the introduction of the SMILES notation [7], which explicitly represents chemical molecules as graphs, with atoms as nodes and bonds as edges, and which came with an efficient algorithm for creating a canonical, reproducible SMILES string for a given molecule. It is reasonable to believe that universal adoption of the IUPAC standard InChI keys as database indexes could greatly facilitate knowledge federation.
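
The value of a shared, canonical index can be illustrated with a small sketch. The two record lists below stand in for hypothetical databases that each describe ethanol; the record layout, field names, and the `federate` helper are assumptions made for illustration. The two SMILES strings are both valid but non-canonical spellings of ethanol, while the standard InChIKey is the same in both sources.

```python
# Hypothetical records from two imaginary databases describing ethanol.
ETHANOL_KEY = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"  # standard InChIKey of ethanol

SOURCE_A = [{"smiles": "CCO", "inchikey": ETHANOL_KEY, "name": "ethanol"}]
SOURCE_B = [{"smiles": "OCC", "inchikey": ETHANOL_KEY, "synonym": "ethyl alcohol"}]


def index_by(records, field):
    """Build a lookup table keyed on the chosen identifier field."""
    return {record[field]: record for record in records}


def federate(records_a, records_b, field):
    """Pair up records from two sources that share the same identifier."""
    index_a = index_by(records_a, field)
    index_b = index_by(records_b, field)
    shared = index_a.keys() & index_b.keys()
    return {key: {"a": index_a[key], "b": index_b[key]} for key in shared}


# Joining on the SMILES strings as written finds no match, because the
# same molecule is spelled differently in each source.
by_smiles = federate(SOURCE_A, SOURCE_B, "smiles")

# Joining on the InChIKey links the two records.
by_inchikey = federate(SOURCE_A, SOURCE_B, "inchikey")

print(len(by_smiles), len(by_inchikey))
```

A canonicalization step applied to the SMILES strings before indexing would serve the same purpose; the sketch uses InChIKeys only because they are already identical across the two sources.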
