Structure-based classification and ontology in chemistry.

Janna Hastings, Lian Duan, Marcus Ennis, Colin Batchelor, Robert Stevens, Despoina Magka, Christoph Steinbeck

doi:10.1186/1758-2946-4-8

Abstract

Background: Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies.

Results: We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches.

Conclusion: Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems which are capable of hybrid reasoning combining several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research.
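
The 'pentacyclic compound' example mentioned in the abstract can be illustrated with a minimal sketch of an algorithmic structure-based class test. It is not the method used by ChEBI or by any particular cheminformatics toolkit. It assumes that a molecule is given as a heavy-atom graph (a number of atoms plus a list of unique bonds) and that "number of rings" means the number of independent cycles in that graph, i.e. bonds minus atoms plus connected components. The names ring_count, is_pentacyclic and acene_skeleton are hypothetical and introduced only for this illustration.

```python
def ring_count(num_atoms, bonds):
    """Independent rings in a molecular graph: bonds - atoms + components.

    Assumption: `bonds` lists each bond once as a pair of atom indices.
    """
    parent = list(range(num_atoms))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    components = num_atoms
    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            components -= 1
    return len(bonds) - num_atoms + components


def is_pentacyclic(num_atoms, bonds):
    """Structure-based class test: exactly five independent rings."""
    return ring_count(num_atoms, bonds) == 5


def acene_skeleton(n_rings):
    """Hypothetical helper: carbon skeleton of n linearly fused six-membered rings."""
    row = 2 * n_rings + 1  # atoms along each of the two long edges
    bonds = []
    for i in range(row - 1):
        bonds.append((i, i + 1))              # top edge
        bonds.append((row + i, row + i + 1))  # bottom edge
    for i in range(0, row, 2):
        bonds.append((i, row + i))            # bonds joining the two edges
    return 2 * row, bonds


# Five fused rings (a pentacene-like skeleton) fall in the class; two do not.
print(is_pentacyclic(*acene_skeleton(5)))
print(is_pentacyclic(*acene_skeleton(2)))
```

A role-based class such as 'analgesic' cannot be tested this way, because membership does not follow from the molecular graph alone.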

Highlights

Recent years have seen an explosion in the availability of data in the chemistry domain
Logic-based knowledge representation can be contrasted with algorithmic ‘knowledge representation’, in which software algorithms procedurally define outputs based on stated inputs, and with statistical ‘knowledge representation’, in which complex statistical models are trained to produce outputs based on a given set of inputs by learning weights for a complex set of internal parameters
Analysis of structural features used in class definitions. By examination of the definitions of higher-level structural classes included in the Chemical Entities of Biological Interest ontology (ChEBI), we have identified the following categories of elementary features used in structural chemical class definitions: 1. Interesting parts (IP), such as the carboxy group or the cholestane scaffold 2

Summary

Introduction

Recent years have seen an explosion in the availability of data in the chemistry domain. In biomedicine and the natural sciences more generally, hierarchical organisation and large-scale data management are being facilitated by formal ontologies: machine-understandable encodings of human domain knowledge. Such ontologies are used in several different ways [2,3,4]. They ensure standardisation of terminology and identification across all entities in a domain so that multiple sources of data can be aggregated through comparable reference terms. They provide hierarchical organisation so that such aggregation can be performed at different levels for novel data-driven scientific discovery. An advantage of logic-based knowledge representation is that it allows the knowledge to be explicitly expressed as knowledge, i.e. as statements that are comprehensible, true and self-contained, and available for modification by persons without a computational background such as domain experts; this is in contrast to statistical methods that operate as black boxes and to procedural methods that require a programmer in order to manipulate or extend them.
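
The idea of aggregating data at different levels of a hierarchy can be sketched as follows. This is an illustrative sketch under stated assumptions, not ChEBI's data model: the hierarchy is a small hypothetical is-a table with a single parent per class (real ontologies allow several), the class names other than 'pentacyclic compound' are invented for the example, and the record counts are made up.

```python
from collections import Counter

# Hypothetical is-a table: child class -> parent class.
IS_A = {
    "pentacyclic compound": "polycyclic compound",
    "bicyclic compound": "polycyclic compound",
    "polycyclic compound": "cyclic compound",
    "cyclic compound": "chemical entity",
}


def ancestors(cls, is_a):
    """Return every class above `cls`, nearest parent first."""
    chain = []
    while cls in is_a:
        cls = is_a[cls]
        chain.append(cls)
    return chain


def roll_up(counts, is_a):
    """Add each class's record count to the class itself and all its ancestors."""
    totals = Counter()
    for cls, n in counts.items():
        totals[cls] += n
        for ancestor in ancestors(cls, is_a):
            totals[ancestor] += n
    return dict(totals)


# Made-up counts of data records annotated directly to each class.
direct_counts = {"pentacyclic compound": 3, "bicyclic compound": 2}
for cls, total in roll_up(direct_counts, IS_A).items():
    print(cls, total)
```

Because every source annotates against the same reference terms, counts from different sources can be summed before rolling up, and the same data can be read at the level of 'pentacyclic compound' or at any broader class above it.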

Journal: Journal of Cheminformatics	Publication Date: Apr 5, 2012
Citations: 86	License type: CC BY 2.0
