Abstract

Small molecule chemistry is of central importance to a number of R&D companies in diverse areas such as the pharmaceutical, nutraceutical, food flavoring, and cosmeceutical industries. To store and manage thousands of chemical compounds in such an environment, we have built a state-of-the-art master chemical database with unique structure identifiers. Here, we present the concept and methodology we used to build the system, which we call the Unique Compound Database (UCD). In the UCD, each molecule is registered only once (uniqueness), structures with alternative representations are entered in a uniform way (normalization), and the chemical structure drawings are recognizable to chemists and to a cartridge. In brief, molecular structures are entered as neutral entities that can be associated with a salt. The salts are listed in a dictionary and bound to the molecule with the appropriate stoichiometric coefficient in an entity called a “substance”. The substances are associated with batches. Once a molecule is registered, some properties (e.g., ADMET prediction, IUPAC name, chemical properties) are calculated automatically. The UCD has both automated and manual data controls. Moreover, the UCD concept enables the management of user errors in structure entry by reassigning or archiving the batches. It also allows records to be updated to include newly discovered properties of individual structures. As our research spans a wide variety of scientific fields, the database enables registration of mixtures of compounds, enantiomers, tautomers, and compounds with unknown stereochemistry.
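The molecule, salt, substance, and batch relationships described above can be illustrated with a minimal Python sketch. It is not the UCD implementation: the class and function names, the identifier format, the two salt dictionary entries, and the use of a pre-normalized structure key (a plain string here) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Dict, List, Optional

# Hypothetical salt dictionary; the actual UCD entries are not given in the text.
SALT_DICTIONARY = {"HCl": "hydrochloride", "Na": "sodium"}


@dataclass(frozen=True)
class Substance:
    """A neutral molecule bound to an optional salt with a stoichiometric coefficient."""
    molecule_id: str
    salt_code: Optional[str]
    coefficient: Fraction


class Registry:
    """Toy registry: one identifier per molecule, batches attached to substances."""

    def __init__(self) -> None:
        self._molecule_ids: Dict[str, str] = {}
        self._batches: Dict[Substance, List[str]] = {}

    def register_molecule(self, structure_key: str) -> str:
        # Assumption: the caller supplies an already normalized structure key.
        # Stripping whitespace stands in for real chemical normalization.
        key = structure_key.strip()
        if key not in self._molecule_ids:
            self._molecule_ids[key] = f"MOL-{len(self._molecule_ids) + 1:06d}"
        return self._molecule_ids[key]

    def register_batch(self, structure_key: str, batch_id: str,
                       salt_code: Optional[str] = None,
                       coefficient: int = 0) -> Substance:
        # Salts must come from the dictionary; free-text salts are rejected.
        if salt_code is not None and salt_code not in SALT_DICTIONARY:
            raise KeyError(f"unknown salt: {salt_code}")
        substance = Substance(
            molecule_id=self.register_molecule(structure_key),
            salt_code=salt_code,
            coefficient=Fraction(coefficient) if salt_code else Fraction(0),
        )
        self._batches.setdefault(substance, []).append(batch_id)
        return substance

    def batches_of(self, substance: Substance) -> List[str]:
        return list(self._batches.get(substance, []))

    def molecule_count(self) -> int:
        return len(self._molecule_ids)
```

In this sketch, two batches of the same hydrochloride salt share one substance, while the free base of the same molecule is a different substance that reuses the same molecule identifier.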

Highlights

  • Small molecule chemistry is of central importance to a number of R&D companies in diverse areas, such as the pharmaceutical, nutraceutical, food flavoring, and cosmeceutical industries

  • In this paper we show how a chemical registration system, which we call the Unique Compound Database (UCD), can be built and implemented at the corporate level of companies working with chemicals

  • Most of the requirements were related to chemical structure representation and storage; the starting point was that it must be possible to create and modify structures in the platform and to record physical samples attached to their related compounds

Summary

Background

General introduction

Small molecule chemistry is of central importance to a number of R&D companies in diverse areas, such as the pharmaceutical, nutraceutical, food flavoring, and cosmeceutical industries. A common challenge in chemical structure registration is to represent stereogenic centers precisely, even when the absolute configuration is not known. We solved this issue by using a leading system for describing stereocenters, i.e., the Accelrys enhanced stereochemical representation (V3000 format), which uses labels embedded in the structure to allow precise configuration of the molecule for each possibility. In cases where the exact structure of the compound is not known, the user can use the “No Structure” function of Symyx Draw to fill in the form. This form contains various fields for data associated with the batch (e.g., common name, source, internal identifier).
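To make the embedded stereo labels concrete, the following Python sketch reads the enhanced stereochemistry collections from a V3000 fragment. It assumes the collection names defined in the CTfile format (STEABS for absolute centers, STERACn for racemic "and" groups, STERELn for relative "or" groups) and the convention that an atom list starts with its own length. The function name and the fragment are illustrative only and are not taken from the UCD.

```python
import re
from typing import Dict, List

# Matches lines such as: M  V30 MDLV30/STERAC1 ATOMS=(2 4 5)
_STEREO_LINE = re.compile(
    r"^M\s+V30\s+MDLV30/(STEABS|STERAC\d+|STEREL\d+)\s+ATOMS=\(([\d\s]+)\)"
)


def parse_stereo_groups(molfile_text: str) -> Dict[str, List[int]]:
    """Return a mapping from stereo group name to its atom indices."""
    groups: Dict[str, List[int]] = {}
    for line in molfile_text.splitlines():
        match = _STEREO_LINE.match(line.strip())
        if match is None:
            continue
        numbers = [int(token) for token in match.group(2).split()]
        # The first number is the count of atom indices that follow.
        count, atoms = numbers[0], numbers[1:]
        if count != len(atoms):
            raise ValueError(f"atom count mismatch in line: {line!r}")
        groups[match.group(1)] = atoms
    return groups


# Illustrative fragment only, not a complete molfile.
EXAMPLE_FRAGMENT = """\
M  V30 BEGIN COLLECTION
M  V30 MDLV30/STEABS ATOMS=(1 2)
M  V30 MDLV30/STERAC1 ATOMS=(2 4 5)
M  V30 END COLLECTION
"""
```

For the fragment above, atom 2 is marked as an absolute center, and atoms 4 and 5 form one racemic group, meaning the pair is present in both mirror-image configurations.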
