Predictive Computational Models Research Articles

Computational methods are crucial for efficient and cost-effective drug toxicity prediction. Unfortunately, the data used for prediction is often imbalanced, resulting in biased models that favor the majority class. This paper proposes an approach to apply a hybrid class balancing technique and evaluate its performance on computational models for toxicity prediction in Tox21 datasets. The process begins by converting chemical compound data structures (SMILES strings) from various bioassay datasets into molecular descriptors that can be processed by algorithms. Subsequently, Undersampling and Oversampling techniques are applied in two different schemes on the training data. In the first scheme (Individual), only one balancing technique (Oversampling or Undersampling) is used. In the second scheme (Hybrid), the training data is divided according to a ratio (e.g., 90-10), applying a different balancing technique to each proportion. We considered eight resampling techniques (four Oversampling and four Undersampling), six molecular descriptors (based on MACCS, ECFP, and Mordred), and five classification models (KNN, MLP, RF, XGB, and SVM) over 10 bioassay datasets to determine the configurations that yield the best performance. We defined three testing scenarios: without balancing techniques (baseline), Individual, and Hybrid. We found that using the ENN technique in the MACCS-MLP combination resulted in a 10.01% improvement in performance. The increase for ECFP6-2048 was 16.47% after incorporating a combination of the SMOTE (10%) and RUS (90%) techniques. Meanwhile, using the same combination of techniques, MORDRED-XGB showed the most significant increase in performance, achieving a 22.62% improvement. Integrating any of the class balancing schemes resulted in a minimum of 10.01% improvement in prediction performance compared to the best baseline configuration. In this study, Undersampling techniques were more appropriate due to the significant overlap among samples. By eliminating specific samples from the predominant class that are close to the minority class, this overlap is greatly reduced.
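
The Hybrid scheme described in this abstract (split the training data by a ratio, then balance each portion with a different technique) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`hybrid_balance`, `random_undersample`, `random_oversample`, `group_by_label`) are hypothetical, the per-class (stratified) split is an assumption, and plain random oversampling stands in for SMOTE so that the example needs only the standard library.

```python
import random
from collections import Counter


def group_by_label(pairs):
    """Group (features, label) pairs into {label: [features, ...]}."""
    groups = {}
    for x, y in pairs:
        groups.setdefault(y, []).append(x)
    return groups


def random_undersample(pairs, rng):
    """Randomly drop samples so every class matches the smallest class."""
    groups = group_by_label(pairs)
    if not groups:
        return []
    target = min(len(xs) for xs in groups.values())
    return [(x, y) for y, xs in groups.items() for x in rng.sample(xs, target)]


def random_oversample(pairs, rng):
    """Randomly duplicate samples so every class matches the largest class.

    Stand-in for SMOTE, which would synthesize new points instead.
    """
    groups = group_by_label(pairs)
    if not groups:
        return []
    target = max(len(xs) for xs in groups.values())
    out = []
    for y, xs in groups.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out.extend((x, y) for x in xs + extra)
    return out


def hybrid_balance(pairs, under_fraction=0.9, seed=0):
    """Undersample one portion of the training data and oversample the rest.

    Assumption: the split is made per class so both portions contain
    every class; the abstract does not say how the split is drawn.
    """
    rng = random.Random(seed)
    under_part, over_part = [], []
    for y, xs in group_by_label(pairs).items():
        xs = xs[:]
        rng.shuffle(xs)
        cut = int(len(xs) * under_fraction)
        under_part.extend((x, y) for x in xs[:cut])
        over_part.extend((x, y) for x in xs[cut:])
    return random_undersample(under_part, rng) + random_oversample(over_part, rng)


# Toy training set: 200 majority-class (0) and 20 minority-class (1) vectors.
toy = [([i, 0], 0) for i in range(200)] + [([i, 1], 1) for i in range(20)]
balanced = hybrid_balance(toy, under_fraction=0.9, seed=42)
counts = Counter(y for _, y in balanced)
print(counts)
```

With a 90-10 split of this toy set, the undersampled portion contributes 18 samples per class and the oversampled portion 20 per class, so the combined training set is balanced. Only the training data is resampled; a test set would be left untouched.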

The identification of compound-protein interactions (CPIs) is crucial for drug discovery and understanding mechanisms of action. Accurate CPI prediction can elucidate drug-target-disease interactions, aiding in the discovery of candidate compounds and effective synergistic drugs, particularly from traditional Chinese medicine (TCM). Existing in silico methods face challenges in prediction accuracy and generalization due to compound and target diversity and the lack of large-scale interaction datasets and negative datasets for model learning. To address these issues, we developed a computational model for CPI prediction by integrating the constructed large-scale bioactivity benchmark dataset with a deep learning (DL) algorithm. To verify the accuracy of our CPI model, we applied it to predict the targets of compounds in TCM. An herb pair of Astragalus membranaceus and Hedyotis diffusa was used as a model, and the active compounds in this herb pair were collected from various public databases and the literature. The complete targets of these active compounds were predicted by the CPI model, resulting in an expanded target dataset. This dataset was next used for the prediction of synergistic antitumor compound combinations. The predicted multi-compound combinations were subsequently examined through in vitro cellular experiments. Our CPI model demonstrated superior performance over other machine learning models, achieving an area under the Receiver Operating Characteristic curve (AUROC) of 0.98, an area under the precision-recall curve (AUPR) of 0.98, and an accuracy (ACC) of 93.31% on the test set. The model's generalization capability and applicability were further confirmed using external databases. Utilizing this model, we predicted the targets of compounds in the herb pair of Astragalus membranaceus and Hedyotis diffusa, yielding an expanded target dataset. Then, we integrated this expanded target dataset to predict effective drug combinations using our drug synergy prediction model DeepMDS. Experimental assays on the breast cancer cell line MDA-MB-231 confirmed the efficacy of the best predicted multi-compound combinations: Combination I (Epicatechin, Ursolic acid, Quercetin, Aesculetin, and Astragaloside IV) exhibited a half-maximal inhibitory concentration (IC50) value of 19.41 μM and a combination index (CI) value of 0.682, and Combination II (Epicatechin, Ursolic acid, Quercetin, Vanillic acid, and Astragaloside IV) displayed an IC50 value of 23.83 μM and a CI value of 0.805. These results validated the ability of our model to make accurate predictions for novel CPI data outside the training dataset and evaluated the reliability of the predictions, showing good applicability potential in drug discovery and in the elucidation of the bioactive compounds in TCM. Our CPI prediction model can serve as a useful tool for accurately identifying potential CPIs for a wide range of proteins, and is expected to facilitate drug research and repurposing and to support the understanding of TCM.

Articles published on Predictive Computational Models

Recent Advances in Artificial Intelligence to Improve Immunotherapy and the Use of Digital Twins to Identify Prognosis of Patients with Solid Tumors

Integrating forecasting methods to support finite element analysis and explore heat transfer complexities

Computational model for policy simulation and prediction of the regulatory impact of front-of-package food labels

An omics-driven computational model for angiogenic protein prediction: Advancing therapeutic strategies with Ens-deep-AGP

Galactic cosmic ray environment predictions for the NASA BioSentinel Mission, part 2: Post-mission validation

Integrative Modeling of Signaling Network Dynamics Identifies Cell Type-Selective Therapeutic Strategies for FGFR4-Driven Cancers.

Hybrid Class Balancing Approach for Chemical Compound Toxicity Prediction.

Modeling Cortical Versus Hippocampal Network Dysfunction in a Human Brain Assembloid Model of Epilepsy and Intellectual Disability.

GLNNMDA: a multimodal prediction model for microbe-drug associations based on global and local features

A general prediction model for compound-protein interactions based on deep learning.

Computational modeling and uncertainty prediction of hyperelastic constitutive responses of damaged brain tissue under different temperature and strain rates.

Applying Machine Learning Techniques: Uncertainty Quantification in Nonlinear Dynamics Characters Predictions via Gated Recurrent Unit-Based Reduced-Order Models

Computational Hemodynamics-Based Growth Prediction for Small Abdominal Aortic Aneurysms: Laminar Simulations Versus Large Eddy Simulations.

A computational predictive model for nanozyme diffusion dynamics: optimizing nanosystem performance

Improvised grey wolf optimizer assisted artificial neural network (IGWO-ANN) predictive models to accurately predict the permeate flux of desalination plants

Pangenome reconstruction of Lactobacillaceae metabolism predicts species-specific metabolic traits.

Finite elements for Matérn-type random fields: Uncertainty in computational mechanics and design optimization

Beyond Static Planning: Computational Predictive Modeling to Avoid Coronary Artery Occlusion in TAVR

Mathematical modeling of reverse osmosis system design and performance

ProkDBP: Toward more precise identification of prokaryotic DNA binding proteins.
