Abstract
A greater number of toxicity data are becoming publicly available allowing for in silico modeling. However, questions often arise as to how to incorporate data quality and how to deal with contradicting data if more than a single datum point is available for the same compound. In this study, two well-known and studied QSAR/QSPR models for skin permeability and aquatic toxicology have been investigated in the context of statistical data quality. In particular, the potential benefits of the incorporation of the statistical Confidence Scoring (CS) approach within modeling and validation. As a result, robust QSAR/QSPR models for the skin permeability coefficient and the toxicity of nonpolar narcotics to Aliivibrio fischeri assay were created. CS-weighted linear regression for training and CS-weighted root-mean-square error (RMSE) for validation were statistically superior compared to standard linear regression and standard RMSE. Strategies are proposed as to how to interpret data with high and low CS, as well as how to deal with large data sets containing multiple entries.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.