Abstract

With the large-scale development of drugs, understanding drug phase behavior in complex systems is becoming increasingly important. In particular, the solubility of drugs in biorelevant media urgently needs to be understood. To address this challenge, a new strategy based on machine learning models is proposed. First, five machine learning models (extra trees [ET], gradient boosting [GB], k-nearest neighbors [k-NN], random forest [RF], and extreme gradient boosting [XGBoost]) are trained on 15 molecular descriptors of the drug molecules. Of these, XGBoost was identified as the best model for predicting drug solubility in various solvents. Next, the input feature vectors were expanded with the MACCS chemical fingerprint and used to train the XGBoost model. Coupling the MACCS fingerprint with XGBoost significantly improved the accuracy of the drug solubility predictions. These results demonstrate that the proposed strategy can predict drug solubility, and it is expected to provide useful information for drug development and drug solvent screening.
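
The two-stage workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic random numbers standing in for real descriptors, fingerprints, and solubilities, and the hyperparameters, the 3-fold cross-validation, the R² metric, and the 166-bit fingerprint length (the size of the public MACCS key set) are all assumptions made for illustration. The sketch uses scikit-learn for four of the models and adds XGBoost only if the `xgboost` package is installed.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

try:  # XGBoost is optional here; the sketch still runs without it
    from xgboost import XGBRegressor
except ImportError:
    XGBRegressor = None

N_SAMPLES = 120      # assumption: size of the toy data set
N_DESCRIPTORS = 15   # from the abstract: 15 molecular descriptors
N_FP_BITS = 166      # assumption: length of a MACCS-style bit vector


def make_synthetic_data(seed=0):
    """Return random stand-ins for descriptors, fingerprints, and solubility."""
    rng = np.random.default_rng(seed)
    descriptors = rng.normal(size=(N_SAMPLES, N_DESCRIPTORS))
    fingerprints = rng.integers(0, 2, size=(N_SAMPLES, N_FP_BITS))
    # Toy target that depends on a few descriptors and a few fingerprint bits.
    y = (
        descriptors[:, 0]
        - 0.5 * descriptors[:, 1]
        + fingerprints[:, :5].sum(axis=1)
        + rng.normal(scale=0.1, size=N_SAMPLES)
    )
    return descriptors, fingerprints, y


def candidate_models(seed=0):
    """Build the regressors named in the abstract (hyperparameters assumed)."""
    models = {
        "ET": ExtraTreesRegressor(n_estimators=50, random_state=seed),
        "GB": GradientBoostingRegressor(n_estimators=50, random_state=seed),
        "k-NN": KNeighborsRegressor(n_neighbors=5),
        "RF": RandomForestRegressor(n_estimators=50, random_state=seed),
    }
    if XGBRegressor is not None:
        models["XGBoost"] = XGBRegressor(n_estimators=50, random_state=seed)
    return models


def compare_models(X, y, seed=0):
    """Mean 3-fold cross-validated R^2 for each candidate model."""
    return {
        name: float(cross_val_score(model, X, y, cv=3, scoring="r2").mean())
        for name, model in candidate_models(seed).items()
    }


descriptors, fingerprints, y = make_synthetic_data()

# Stage 1: compare the models on the 15 descriptors alone.
baseline = compare_models(descriptors, y)

# Stage 2: expand the feature vector by appending the fingerprint bits.
expanded_X = np.hstack([descriptors, fingerprints])
expanded = compare_models(expanded_X, y)

for name in baseline:
    print(f"{name}: descriptors R2={baseline[name]:.3f}, "
          f"descriptors+fingerprint R2={expanded[name]:.3f}")
```

With real data, the fingerprint columns would come from a cheminformatics toolkit rather than a random generator, and the scores printed here say nothing about the accuracy reported in the paper.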

