Predicting the antibacterial activity of new chemical compounds is an important task because of the growing problem of bacterial drug resistance. Generalized linear models (GLMs) were built for 85 amidrazone derivatives from the results of antimicrobial activity tests, expressed as the minimum inhibitory concentration (MIC) against Gram-positive bacteria: Staphylococcus aureus, Enterococcus faecalis, Micrococcus luteus, Nocardia corallina, and Mycobacterium smegmatis. For compounds with experimentally measured MIC values, the potential predictors included physicochemical properties (e.g., molecular weight, number of hydrogen bond donors and acceptors, topological polar surface area, percentage content of carbon, nitrogen, and oxygen, melting point, and lipophilicity). The presence of R1 and R2 substituents, as well as interactions between melting temperature and the R1 or R2 substituents, was also considered. The set of potential predictors also included possible biological effects (e.g., antibacterial, antituberculotic) of the tested compounds, calculated with the PASS (Prediction of Activity Spectra for Substances) program. Using GLMs with least absolute shrinkage and selection operator (LASSO), least-angle regression, and stepwise selection, statistically significant models were chosen that had optimal values of the adjusted coefficient of determination and of seven fit criteria, e.g., Akaike's information criterion. The most frequently selected variables were molecular weight, PASS_antieczematic, PASS_anti-inflam, squared melting temperature, PASS_antitumor, and experimental lipophilicity. In addition, depending on the bacterial strain, interactions between melting temperature and the R1 or R2 substituents were selected, indicating that the relationship between MIC and melting temperature depends on the type of R1 or R2 substituent.
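As an illustration only, the sketch below shows how predictors of the kind described above (a squared melting temperature term and melting-temperature-by-substituent interactions) could be encoded, and how two of the named fit measures, the adjusted coefficient of determination and Akaike's information criterion, could be computed for a fitted model. The function names, the 0/1 coding of the R1 and R2 substituents, and the Gaussian least-squares form of the AIC are assumptions made for this example; they are not taken from the study, and the selection procedures themselves (LASSO, least-angle regression, stepwise selection) are not implemented here.

```python
import math


def design_row(melting_temp, r1_present, r2_present):
    """Build one row of predictors (hypothetical encoding).

    Substituent presence is coded as 0/1, so each interaction term lets the
    slope of the response on melting temperature differ by substituent.
    """
    r1 = 1.0 if r1_present else 0.0
    r2 = 1.0 if r2_present else 0.0
    return {
        "mt": melting_temp,
        "mt_sq": melting_temp ** 2,    # squared melting temperature
        "r1": r1,
        "r2": r2,
        "mt_x_r1": melting_temp * r1,  # interaction: melting temperature x R1
        "mt_x_r2": melting_temp * r2,  # interaction: melting temperature x R2
    }


def residual_sum_of_squares(y, y_hat):
    return sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))


def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y)
    mean_y = sum(y) / n
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    r2 = 1.0 - residual_sum_of_squares(y, y_hat) / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def aic_gaussian(y, y_hat, n_params):
    """AIC for a Gaussian least-squares fit, up to an additive constant:
    n * ln(RSS / n) + 2 * k. Lower values indicate a better trade-off
    between fit and model size.
    """
    n = len(y)
    return n * math.log(residual_sum_of_squares(y, y_hat) / n) + 2 * n_params


if __name__ == "__main__":
    # Made-up numbers, used only to exercise the functions.
    observed = [1.0, 2.0, 3.0, 4.0, 5.0]
    fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
    print(design_row(150.0, True, False))
    print(adjusted_r2(observed, fitted, n_predictors=1))
    print(aic_gaussian(observed, fitted, n_params=2))
```

With this coding, a nonzero coefficient on `mt_x_r1` means the effect of melting temperature on the response changes when the R1 substituent is present, which is the kind of dependence the selected interaction terms indicate.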