Molecular Descriptors Research Articles

Unlike the energy, which is a scalar property, machine learning (ML) prediction of vector or tensor properties poses the additional challenge of achieving proper invariance (covariance) with respect to molecular rotation. For the energy gradients needed in molecular dynamics (MD), this symmetry is automatically fulfilled when taking the analytic derivative of the energy, which is a scalar invariant (provided that properly invariant molecular descriptors are used). However, if a property cannot be obtained by differentiation, other methods must be applied to retain the covariance. Several approaches have been suggested to treat this issue. For nonadiabatic couplings and polarizabilities, for example, it was possible to construct virtual quantities from which these tensorial properties are obtained by differentiation, which guarantees the covariance. Another possible solution is to build rotational equivariance into the design of the neural network employed in the model. Here, we propose a simpler alternative technique that requires neither the construction of auxiliary properties nor the application of special equivariant ML techniques. We suggest a three-step approach based on the molecular tensor of inertia. In the first step, the molecule is rotated to its principal axes using the eigenvectors of this tensor. In the second step, the ML procedure predicts the vector property relative to this orientation, based on a training set in which all vector properties were expressed in this same coordinate system. In the third step, the ML estimate of the vector property is transformed back to the original orientation. This rotate-predict-rotate (RPR) procedure should thus guarantee proper covariance of a vector property and extends trivially to tensors such as the polarizability.
The RPR procedure has the advantage that accurate models can be trained very fast for thousands of molecular configurations, which might be beneficial where many training sets are required (e.g., in active learning). We have implemented the RPR technique using the MLatom and Newton-X programs for ML and MD, and assessed it on the dipole moment along MD trajectories of 1,2-dichloroethane.
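
The three RPR steps described in this abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' MLatom/Newton-X implementation: the function names, the toy model, and the example geometry are hypothetical. In particular, the abstract does not say how the sign ambiguity of the eigenvectors is resolved, so the convention used here (positive mass-weighted third moment along the first two axes, third axis from a cross product) is an assumption, and it breaks down for symmetric molecules with degenerate moments or vanishing third moments.

```python
import numpy as np

def principal_axes(coords, masses):
    """Return the center of mass and a 3x3 matrix whose columns are principal axes."""
    com = masses @ coords / masses.sum()
    r = coords - com
    # Inertia tensor: sum_i m_i (|r_i|^2 * identity - r_i r_i^T)
    r2 = (r * r).sum(axis=1)
    inertia = (masses * r2).sum() * np.eye(3) - (r * masses[:, None]).T @ r
    _, vecs = np.linalg.eigh(inertia)  # columns ordered by ascending moment
    # ASSUMED sign convention (not from the abstract): flip the first two axes
    # so that the mass-weighted third moment along each of them is positive.
    skew = masses @ (r @ vecs[:, :2]) ** 3
    signs = np.where(skew < 0, -1.0, 1.0)
    v1 = vecs[:, 0] * signs[0]
    v2 = vecs[:, 1] * signs[1]
    v3 = np.cross(v1, v2)  # completes a right-handed frame
    return com, np.column_stack([v1, v2, v3])

def rpr_predict(coords, masses, model):
    """Rotate-predict-rotate wrapper around a frame-dependent vector model."""
    com, axes = principal_axes(coords, masses)
    local = (coords - com) @ axes  # step 1: rotate into the principal-axis frame
    mu_local = model(local)        # step 2: predict the vector in that frame
    return axes @ mu_local         # step 3: rotate the prediction back

def toy_model(local_coords):
    """Hypothetical stand-in for a trained ML model; deliberately not covariant."""
    return np.tanh(local_coords).sum(axis=0) + np.array([0.1, 0.2, 0.3])

# Made-up asymmetric geometry and masses, for illustration only.
example_coords = np.array([[0.0, 0.0, 0.0],
                           [1.5, 0.1, 0.0],
                           [2.1, 1.4, 0.3],
                           [-0.6, 0.9, -0.8]])
example_masses = np.array([12.0, 12.0, 35.5, 1.0])
mu = rpr_predict(example_coords, example_masses, toy_model)
```

Because the model only ever sees coordinates in the principal-axis frame, rotating or translating the input molecule changes the output only through the final back-rotation, which is what gives the covariance.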


Triacylglycerols (TAG) are the main components of vegetable oils, and any attempt to simulate vegetable oil processes demands knowledge of their properties. However, experimental values are scarce, since several TAG are unavailable in pure form or too expensive for experimental measurement. Moreover, correlating physical properties with TAG molecular structure is not simple. A TAG is a molecule composed of 3 fatty acids (FA) esterified to a glycerol (GL) backbone, which makes its properties dependent on the carbon number (CN) of each FA, the number of unsaturations (UN) of each FA, and the position of each FA on the GL backbone. Few models are available in the literature for predicting TAG melting properties, in particular the melting temperature (Tfus) and enthalpy (ΔHfus) and the solid-solid transition properties of the TAG polymorphic forms. The works of Wesdorp, Moorthy et al., and Zeberg-Mikkelsen and Stenby present models based on Group-Contribution theory that are still in use despite some flaws, particularly regarding the polymorphic transitions. Therefore, this work aimed to evaluate Artificial Neural Network (ANN) models for predicting the Tfus and ΔHfus of TAG (β form) as well as the transition temperatures and enthalpies of the other polymorphic forms (α and β′). The database was composed of experimental temperature and enthalpy data from the literature. For each TAG, 7 inputs were provided: the total CN, and the CN and UN at the sn-1, sn-2 and sn-3 positions. A feed-forward Multilayer Perceptron (MLP) model was used, and the topology was evaluated with respect to the number of hidden layers (HL), the number of neurons (NN) and the activation function at each hidden layer, and the convergence algorithm. The numbers of HL and NN were screened using a Central Composite Rotatable Design (CCRD). The models were further evaluated by Explainable Artificial Intelligence (XAI) and feature evaluation strategies.
The ANN architectures showed significantly higher accuracy than the literature models for the calculation and prediction of TAG melting properties of the 3 polymorphic forms, with R² higher than 0.91 for all databases (except for the prediction of the melting temperature of the β form, where Wesdorp's model showed better predictive ability, although the results were very similar). The good results were probably related to the well-defined physicochemical relationship between the inputs (molecular structure descriptors) and the outputs (melting properties), which could be described by the XAI evaluation. This is an important advantage for improving the performance of process and product design involving TAG molecules.
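
The 7-input descriptor described in this abstract can be illustrated with a short sketch. The function name, the ordering of the features, and the choice to count only fatty-acid carbons in the total CN are assumptions made here for illustration; the abstract does not specify them. The two example TAG use the standard shorthand for palmitic acid (C16:0) and oleic acid (C18:1).

```python
from typing import List, Tuple

# A fatty acid is represented as (carbon number CN, number of unsaturations UN).
FattyAcid = Tuple[int, int]

def tag_features(sn1: FattyAcid, sn2: FattyAcid, sn3: FattyAcid) -> List[int]:
    """Build a 7-element descriptor: total CN, then CN and UN at sn-1, sn-2, sn-3.

    ASSUMPTIONS: the feature order and a total CN that excludes the glycerol
    carbons are illustrative choices, not taken from the abstract.
    """
    total_cn = sn1[0] + sn2[0] + sn3[0]
    return [total_cn, sn1[0], sn1[1], sn2[0], sn2[1], sn3[0], sn3[1]]

PALMITIC: FattyAcid = (16, 0)  # C16:0
OLEIC: FattyAcid = (18, 1)     # C18:1

ppp = tag_features(PALMITIC, PALMITIC, PALMITIC)  # tripalmitin
pop = tag_features(PALMITIC, OLEIC, PALMITIC)     # oleic acid at sn-2
```

Encoding the position explicitly is what lets a model distinguish positional isomers such as POP and PPO, which share the same total CN and the same fatty acids.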



Articles published on Molecular Descriptors

Predicting viral proteins that evade the innate immune system: a machine learning-based immunoinformatics tool

Prediction of potential antitumor components in Ganoderma lucidum: A combined approach using machine learning and molecular docking

Computer prediction of acute toxicity of thioderivatives of 3,5-bis(5-mercapto-4-R-4H-1,2,4-triazol-3-yl)phenol

Exponential Wiener index of some silicate networks

Learning glass transition temperatures via dimensionality reduction with data from computer simulations: Polymers as the pilot case.

Neighbourhood Sum-Based Structural Analysis for Sodalite System

Analysis of octane isomer properties via topological descriptors of line graphs

A simple approach to rotationally invariant machine learning of a vector quantity.

Theoretical investigation of Diels–Alder reaction mechanism and regioselectivity with functionalized fullerene derivatives

Ready-to-use Models Built Using a Diverse Set of 266 Aroma Compounds for the Estimation of Gas Chromatographic Retention Indices for the 50%-Cyanopropylphenyl-50%-Dimethylpolysiloxane Stationary Phase.

A machine learning and DFT assisted analysis of benzodithiophene based organic dyes for possible photovoltaic applications

Facile Synthesis of (S)‐2‐Aryl‐N‐(1‐phenylethylisonicotinamides) Derivatives via SMC Reaction: Their Thermodynamic and Spectroscopic Features via DFT Approach

Prediction of Melting and Solid Phase Transitions Temperatures and Enthalpies for Triacylglycerols using Artificial Neural Networks

Prediction of acetylene solubility by a mechanism-data hybrid-driven machine learning model constructed based on COSMO-RS theory

Exploring blood-brain barrier passage using atomic weighted vector and machine learning.

New insight into molecular interactions of surface-active ionic liquid (SAIL) with some biomolecules: Experimental and computational approaches

Beyond the Arbitrariness of Drug-Likeness Rules: Rough Set Theory and Decision Rules in the Service of Drug Design

Predicting variable-length ACE inhibitory peptides based on graph convolutional network

Exploring the Influence of Ionic Liquid Anion Structure on Gas-Ionic Liquid Partition Coefficients of Organic Solutes Using Machine Learning.

Synthesis, Characterization, and Molecular Docking of Novel Isatin-thiosemicarbazone containing 1,2,3-triazole Derivatives as Potential Anti-cancer Agents
