Canonical Tautomer Research Articles

Background: Integration of medicinal chemistry data from numerous public resources is an increasingly important part of academic drug discovery and translational research because it brings a wealth of knowledge about compounds together in one place. However, different data sources can report the same or related compounds in various forms (e.g., tautomers, racemates), which highlights the need to organise related compounds in hierarchies that alert the user to relevant bioactivity data. To generate these compound hierarchies, we have developed and implemented canSARchem, a new compound registration and standardization pipeline, as part of the canSAR public knowledgebase. canSARchem builds on previously developed ChEMBL and PubChem pipelines and is developed using KNIME. We describe the pipeline, which we make publicly available, and we provide examples of the strengths and limitations of using hierarchies for bioactivity data exploration. Finally, we identify canonicalization enrichment in FDA-approved drugs, illustrating the benefits of our approach.

Results: We created a chemical registration and standardization pipeline in KNIME and made it freely available to the research community. The pipeline consists of five steps to register the compounds and create the compound hierarchy: 1. Structure checker, 2. Standardization, 3. Generation of canonical tautomers and representative structures, 4. Salt strip, and 5. Generation of abstract structure to generate the compound hierarchy. Unlike ChEMBL's RDKit pipeline, we carry out compound canonicalization before deriving the parent structure, similar to PubChem's OpenEye pipeline. canSARchem has a lower rejection rate than both PubChem and ChEMBL. We use our pipeline to assess the impact of grouping the compounds in hierarchies for bioactivity data exploration. We find that FDA-approved drugs show statistically significant sensitivity to canonicalization compared to the majority of bioactive compounds, which demonstrates the importance of this step.

Conclusions: We use canSARchem to standardize all the compounds uploaded to canSAR (> 3 million), enabling efficient data integration and the rapid identification of alternative compound forms with useful bioactivity data. Comparison with the PubChem and ChEMBL pipelines showed comparable performance in compound standardization, but only PubChem and canSAR canonicalize tautomers, and canSAR has a slightly lower rejection rate. Our results highlight the importance of compound hierarchies for bioactivity data exploration. We make canSARchem available under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) at https://gitlab.icr.ac.uk/cansar-public/compound-registration-pipeline.
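The ordering of the registration steps described in the abstract above (tautomer canonicalization before salt stripping) can be illustrated with a toy sketch. This is not the canSARchem KNIME workflow: every function name here is hypothetical, the tautomer step is a one-entry lookup table standing in for a real canonicalizer, salt stripping is a crude "keep the longest fragment" rule, and step 5 (abstract structure) is left out because the abstract does not define it.

```python
from collections import defaultdict

# Hypothetical stand-in for a real tautomer canonicalizer:
# maps 2-hydroxypyridine to its 2-pyridone form.
TOY_CANONICAL_TAUTOMER = {"Oc1ccccn1": "O=c1cccc[nH]1"}


def check_structure(smiles):
    """Step 1 (toy): reject empty input or unbalanced brackets."""
    s = smiles.strip()
    return bool(s) and s.count("(") == s.count(")") and s.count("[") == s.count("]")


def standardize(smiles):
    """Step 2 (toy): only trims whitespace."""
    return smiles.strip()


def canonical_tautomer(smiles):
    """Step 3 (toy): map each dot-separated fragment through the lookup table."""
    return ".".join(TOY_CANONICAL_TAUTOMER.get(f, f) for f in smiles.split("."))


def strip_salt(smiles):
    """Step 4 (toy): keep the longest fragment as a crude proxy for the parent."""
    return max(smiles.split("."), key=len)


def register(smiles):
    """Run steps 1-4 in order; return None if the structure check fails."""
    if not check_structure(smiles):
        return None
    standardized = standardize(smiles)
    tautomer = canonical_tautomer(standardized)  # canonicalize first ...
    parent = strip_salt(tautomer)                # ... then derive the parent
    return {
        "input": smiles,
        "standardized": standardized,
        "canonical_tautomer": tautomer,
        "parent": parent,
    }


def build_hierarchy(inputs):
    """Group accepted input structures under their shared parent form."""
    groups = defaultdict(list)
    for smiles in inputs:
        record = register(smiles)
        if record is not None:
            groups[record["parent"]].append(record["input"])
    return dict(groups)


hierarchy = build_hierarchy(["Oc1ccccn1.Cl", "O=c1cccc[nH]1", "CCO", "C(C"])
```

In this sketch the hydrochloride salt of one tautomer and the free form of the other end up under the same parent key, which is the kind of grouping the hierarchy is meant to expose; the malformed input `C(C` is rejected at the first step.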


The gas-phase structures of sodium cationized complexes of 5- and 6-halo-substituted uracils are examined via infrared multiple photon dissociation (IRMPD) action spectroscopy and theoretical electronic structure calculations. The halouracils examined in this investigation are 5-fluorouracil, 5-chlorouracil, 5-bromouracil, 5-iodouracil, and 6-chlorouracil. Experimental IRMPD action spectra of the sodium cationized halouracil complexes are measured using a 4.7 T Fourier transform ion cyclotron resonance mass spectrometer coupled to the FELIX free electron laser (FEL). Irradiation of the mass-selected sodium cationized halouracil complexes by the FEL was carried out over the frequency range from 950 to 1900 cm⁻¹. Theoretical linear IR spectra predicted for the stable low-energy conformations of the sodium cationized halouracils, calculated at the B3LYP/6-31G(d) level of theory, are compared with the measured IRMPD action spectra to identify the structures accessed in the experiments. Relative stabilities of the low-energy conformations are determined from single-point energy calculations performed at the B3LYP/6-311+G(2d,2p) level of theory. The evolution of IRMPD spectral features as a function of the size (F, Cl, Br, and I) and position (5 versus 6) of the halogen substituent is examined to elucidate the effects of the halogen substituent and of noncovalent interactions with sodium cations on the structure of the nucleobase. The present results are compared with results from energy-resolved collision-induced dissociation and IRMPD action spectroscopy studies previously reported for the protonated and sodium cationized forms of uracil and of halo-, methyl-, and thioketo-substituted uracils. The present results suggest that only a single conformer is accessed for all of the 5-halouracil complexes, whereas multiple conformers are accessed for the Na+(6ClU) complex. In all cases, the experimental IRMPD action spectra confirm that the sodium cation binds to the O4 carbonyl oxygen atom of the canonical diketo tautomer in the ground-state conformers and, as predicted by theory, gains additional stabilization via chelation interactions with the halogen substituent in the 5-halouracil complexes.


Articles published on Canonical Tautomer

CanSAR chemistry registration and standardization pipeline

Molecular structure of 5-fluorouracil from gas-phase electron diffraction data and quantum-chemical calculations

Ground and Excited States of Gas-Phase DNA Nucleobase Cation-Radicals. A UV-vis Photodisociation Action Spectroscopy and Computational Study of Adenine and 9-Methyladenine.

Tautomerism of Guanine Analogues.

Proton Transfer in Guanine-Cytosine Base Pairs in B-DNA.

The Role of Proton Transfer on Mutations.

Dinuclear Metal-Mediated Guanine–Uracil Base Pairs: Theoretical Studies of GUM₂²⁺ (M = Cu, Ag, and Au) Ions

Ammoniated Complexes of Uracil and Transition Metal Ions: Structures of [M(Ura-H)(Ura)(NH3)]+ by IRMPD Spectroscopy and Computational Methods (M = Fe, Co, Ni, Cu, Zn, Cd).

Gas-phase quasi-degeneracy of zwitterionic and canonical tautomers of glycine and proline induced by the presence of the MAlF4 (M = Li, Na, K) salts

Structure, stability, energy barrier and ionization energies of chemically modified DNA-bases: Quantum chemical calculations on 37 favored and rare tautomeric forms of tetraphosphoadenine

Gas-phase interactions between lead(II) ions and cytosine: tandem mass spectrometry and infrared multiple-photon dissociation spectroscopy study.

Infrared multiple photon dissociation action spectroscopy of sodium cationized halouracils: Effects of sodium cationization and halogenation on gas-phase conformation

Does the G·G*syn DNA mismatch containing canonical and rare tautomers of the guanine tautomerise through the DPT? A QM/QTAIM microstructural study

Excited states of protonated DNA/RNA bases.

Is the DPT tautomerization of the long A·G Watson–Crick DNA base mispair a source of the adenine and guanine mutagenic tautomers? A QM and QTAIM response to the biologically important question

Performance of M06, M06-2X, and M06-HF Density Functionals for Conformationally Flexible Anionic Clusters: M06 Functionals Perform Better than B3LYP for a Model System with Dispersion and Ionic Hydrogen-Bonding Interactions

Complexation of anions to gas-phase amino acids: Conformation is critical in determining if the global minimum is canonical or zwitterionic

Discovery of Most Stable Structures of Neutral and Anionic Phenylalanine through Automated Scanning of Tautomeric and Conformational Spaces.

On relation between prototropy and electron delocalization for neutral and redox adenine – DFT studies

Photochemistry and photophysics of the amino and imino tautomers of 1-methylcytosine: tautomerisation as a side product of the radiationless decay

