Relevance of Machine Learning to Predict the Inhibitory Activity of Small Thiazole Chemicals on Estrogen Receptor.

Thangavelu Prabha, Thangavelu Saravanan, Venkatesan Jayaprakash, Jubie Selvaraj, Thangavel Sivakumar, Karuppaiyan Ravindran, Sudeepan Jayapalan, M.V.N.L Chaitanya

doi:10.2174/1573409919666221121141646

Abstract

Drug discovery relies on hybrid technologies to identify new chemical substances. One such strategy is quantitative structure-activity relationship (QSAR) modelling with artificial intelligence, which predicts in silico how chemical alterations affect biological activity. The present study applied a machine learning approach, using a new open-source Python data-analysis script, to build a QSAR model from 53 thiazole derivatives for the discovery of anticancer leads. The script was executed on the 53 small thiazole chemicals in the Google Colaboratory interface. A total of 82 CDK molecular descriptors were downloaded from the "chemdes" web server and used in the study. After training the model, we checked its performance via cross-validation of the external test set. The ordinary least squares (OLS) regression of the generated QSAR model gave R2 = 0.542, F = 8.773, adjusted R2 (Q2) = 0.481, and standard error = 0.061. The regression coefficients (reg.coef_) were -0.00064 (PC1), -0.07753 (PC2), -0.09078 (PC3), -0.08986 (PC4), and 0.05044 (PC5), with an intercept (reg.intercept_) of 4.79279, obtained through the statsmodels formula module. Test set prediction was performed with the multiple linear regression, support vector machine, and partial least squares regression estimators of the sklearn module, which gave model scores of 0.5424, 0.6422, and 0.6422, respectively. Hence, we conclude that the R2 values (i.e., the model scores) obtained with this script via three different algorithms agreed well, with little difference between them, and that the model may be useful in the design of similar thiazole derivatives as anticancer agents.
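The reported regression can be read as a linear equation in the five principal components, and the reported R2, adjusted R2, and F value are linked by standard formulas. The following is a minimal sketch of both, not the authors' script. The coefficients, intercept, and R2 are taken from the abstract; the function names, the example PC scores, and the training-set size `N_TRAIN = 43` are hypothetical, since the abstract reports neither the train/test split nor the scale of the predicted activity.

```python
# Coefficients and intercept as reported in the abstract
# (OLS regression on five principal components).
COEFFICIENTS = {
    "PC1": -0.00064,
    "PC2": -0.07753,
    "PC3": -0.09078,
    "PC4": -0.08986,
    "PC5": 0.05044,
}
INTERCEPT = 4.79279


def predict_activity(pc_scores):
    """Evaluate the linear model: intercept + sum(coef_i * PC_i)."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * pc_scores[name] for name in COEFFICIENTS
    )


def adjusted_r2(r2, n_samples, n_predictors):
    """Standard adjusted R^2 for a linear model fitted with an intercept."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


def f_statistic(r2, n_samples, n_predictors):
    """Overall F statistic of the regression, expressed through R^2."""
    return (r2 / n_predictors) / ((1.0 - r2) / (n_samples - n_predictors - 1))


# Hypothetical PC scores for a single compound (illustrative, not study data).
example_scores = {"PC1": 0.0, "PC2": 1.0, "PC3": 0.0, "PC4": 0.0, "PC5": 0.0}
predicted = predict_activity(example_scores)

# Hypothetical training-set size; the abstract does not report it.
N_TRAIN = 43
adj = adjusted_r2(0.542, N_TRAIN, len(COEFFICIENTS))
f_value = f_statistic(0.542, N_TRAIN, len(COEFFICIENTS))

print(round(predicted, 5), round(adj, 3), round(f_value, 2))
```

With the assumed training-set size of 43, these formulas give values close to the reported adjusted R2 and F. That is consistent with a split of this size but does not confirm it.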

Full Text