Background
Computer-aided drug development can alleviate limitations of the drug research and development process, such as long cycles, high costs, and limited targeting. Deep learning models can be used to predict the activity of drug compound molecules. However, existing models do not adequately address semantic redundancy, conflict, and noise among data modalities, which leads to low accuracy in molecular activity prediction and many false-positive results that require further resolution.

Objective
This study aims to design a parallel deep learning model for multimodal molecular information fusion that dynamically adjusts the weights of different modal features during fusion, so that high-quality modal information contributes more to molecular activity prediction and the model's predictive accuracy improves.

Methods
First, three modal features of drug molecules are extracted: one-dimensional fingerprints, two-dimensional topological structures, and three-dimensional geometric structures. Then, a dynamic weighting mechanism is proposed to fuse these three modal features organically. Finally, the fused features are input into the model's classifier to obtain molecular activity predictions.

Results
Experimental results show that on the BBBP dataset, the proposed model improved the Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) value by 29.96% compared to the Weave model and 26.38% compared to the GC model. Compared to the GraphMVP model and the multi-channel substructure graph model MSGG, the ROC-AUC value increased by 20.44% and 15.80%, respectively. The method also performed well on imbalanced samples and small-molecule datasets.

Conclusion
This study provides an effective multimodal dynamic fusion method for virtual drug screening, enriching computer-aided drug development theory and improving drug research and development efficiency.
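The dynamic weighting mechanism described in the Methods can be illustrated with a minimal sketch: each modality's feature vector receives a scalar gate score, the scores are normalized with a softmax, and the fused representation is the weighted sum. All names, shapes, and gate parameters below are illustrative assumptions, not the paper's actual implementation, which learns these weights end to end inside the network.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dynamic_fusion(modalities, gate_w, gate_b):
    """Fuse same-dimension modality feature vectors with dynamic weights.

    modalities: list of feature vectors (e.g. 1-D fingerprint, 2-D topology,
                and 3-D geometry embeddings), all of the same dimension.
    gate_w, gate_b: per-modality gate parameters (hypothetical; in practice
                    these would be learned jointly with the classifier).
    Returns the fused vector and the softmax-normalized modality weights.
    """
    # One scalar quality score per modality, then normalize to sum to 1.
    scores = np.array([w @ m + b for m, w, b in zip(modalities, gate_w, gate_b)])
    weights = softmax(scores)
    # Weighted sum: higher-scoring modalities contribute more to the fusion.
    fused = sum(a * m for a, m in zip(weights, modalities))
    return fused, weights
```

In a full model, `fused` would feed the downstream classifier, and the gate parameters would be trained so that noisy or conflicting modalities receive smaller weights.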