Abstract

The accurate prediction of the solubility of drugs remains problematic. For a long time, the shortfalls were attributed to the lack of high-quality solubility data from the chemical space of drugs. This study considers the quality of solubility data, particularly of ionizable drugs. A database is described, comprising 6355 entries of intrinsic solubility for 3014 different molecules, drawing on 1325 citations. In an earlier publication, many factors affecting the quality of the measurement were discussed, and suggestions were offered for extracting more reliable information from legacy data. Many of the suggestions have been implemented in this study. By correcting solubility for ionization (i.e., deriving intrinsic solubility, S0) and by normalizing temperature (by transforming measurements performed in the range 10-50 °C to 25 °C), it can now be estimated that the average interlaboratory reproducibility is 0.17 log unit. Empirical methods to predict solubility have at best hovered around a root mean square error (RMSE) of 0.6 log unit. Three prediction methods are compared here: (a) Yalkowsky’s general solubility equation (GSE), (b) the Abraham solvation equation (ABSOLV), and (c) Random Forest regression (RFR) statistical machine learning. The latter two methods were trained using the new database. The RFR method outperforms the other two models, as anticipated. However, the ability to predict the solubility of drugs to the level of the quality of data is still out of reach. The data quality is not the limiting factor in prediction. The statistical machine learning methodologies are probably up to the task. What may be missing are solubility data from a few sparsely covered regions of the chemical space of drugs (particularly of research compounds). Also, new descriptors that can better differentiate the factors affecting solubility between molecules could be critical for narrowing the gap between the accuracy of the prediction models and that of the experimental data.
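As a rough illustration of the ionization correction mentioned in the abstract, the sketch below assumes the simplest case: a monoprotic acid or base whose solubility-pH profile follows the Henderson-Hasselbalch relationship. The function names and example values are hypothetical, and the procedure used to build the database may involve more elaborate treatments (for example, multiprotic compounds or aggregation) that are not covered here. A small RMSE helper is included because that is the error metric quoted for the prediction methods.

```python
import math


def intrinsic_log_s(log_s_at_ph, ph, pka, kind):
    """Estimate log S0 from a log S value measured at a known pH.

    Assumption: a monoprotic compound obeying Henderson-Hasselbalch,
    i.e. log S = log S0 + log10(1 + 10**(pH - pKa)) for an acid and
    log S = log S0 + log10(1 + 10**(pKa - pH)) for a base.
    """
    if kind == "acid":
        ionization_term = math.log10(1.0 + 10.0 ** (ph - pka))
    elif kind == "base":
        ionization_term = math.log10(1.0 + 10.0 ** (pka - ph))
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_s_at_ph - ionization_term


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical acid: pKa 4.4, measured log S = -3.0 at pH 7.4.
    # The compound is almost fully ionized, so log S0 is about 3 units lower.
    print(round(intrinsic_log_s(-3.0, 7.4, 4.4, "acid"), 2))
```

When the pH equals the pKa, the correction is log10(2), or about 0.30 log unit, which is already larger than the 0.17 log unit interlaboratory reproducibility quoted above. This is one way to see why uncorrected data for ionizable drugs can look noisier than they are.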

Highlights

  • In pharmaceutical research, the aqueous solubility of exploratory compounds is a very important physical property to assess [1,2]

  • The general solubility equation (GSE) requires no “training.” Although the GSE is rooted in sound thermodynamic principles, some assumptions had to be made in developing it: test compounds are taken to be nonionized and fully miscible in octanol, and the water and octanol phases are assumed not to be appreciably mutually soluble

  • The GSE is popular for its simplicity and ease of calculation
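To show how little calculation the GSE involves, here is a minimal sketch using its commonly cited form, log S0 = 0.5 - 0.01 (mp - 25) - log P, where mp is the melting point in °C and log P is the octanol-water partition coefficient. The function name and the example compound are hypothetical, and treating compounds that melt below 25 °C as having no melting-point penalty is the usual convention rather than something stated in the text above.

```python
def gse_log_s0(log_p, melting_point_c):
    """Predict intrinsic aqueous solubility (log molar) with the GSE.

    Commonly cited form: log S0 = 0.5 - 0.01 * (mp - 25) - log P.
    Assumption: for compounds that are liquid at 25 degrees C
    (mp below 25), the melting-point term is set to zero.
    """
    melting_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - melting_term - log_p


if __name__ == "__main__":
    # Hypothetical compound with log P = 3.0 and mp = 125 degrees C.
    print(gse_log_s0(3.0, 125.0))
```

No fitted parameters appear anywhere in the function, which is what is meant by the equation requiring no “training”: the only inputs are a measured or calculated log P and a melting point.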


Summary

Introduction

The aqueous solubility of exploratory compounds is a very important physical property to assess [1,2]. Peroral drugs with very low solubility may not release sufficient compound from the solid form during intestinal transit to generate therapeutic benefit. Striking a balance between too little and too much solubility is an important part of compound advancement during drug development. Given the large number of compounds tested in drug discovery, solubility is measured by high-throughput methods, which generate “kinetic” values in buffers containing 0.5-5% v/v DMSO [2,3]. Compounds advanced into later stages of research are fewer in number. More rigorous methods are used to measure their equilibrium solubility, often in media more reflective of the biological fluids to which drugs are exposed [7].

