Abstract

Ten years ago we issued, in conjunction with the Journal of Chemical Information and Modeling, an open prediction challenge to the cheminformatics community: could the intrinsic solubilities of 32 druglike compounds be predicted using only a high-precision set of 100 compounds as a training set? The "Solubility Challenge" was widely recognized as a success and spurred many discussions about prediction methods and data quality. Even allowing for the obvious limitations of the challenge, the conclusions were somewhat unexpected. Although contestants employed the entire spectrum of approaches then available for predicting aqueous solubility and had an extremely tight data set at their disposal, it was not possible to identify the best methods for predicting aqueous solubility: a variety of methods and combinations all performed equally well (or badly). Several authors have since suggested that it is not the poor quality of the solubility data that limits the accuracy of the predictions, but the deficient methods used. Now, ten years after the original Solubility Challenge, we revisit it and challenge the community to a new test with a much larger database that includes estimates of interlaboratory reproducibility.
