Modeling the toxicity of nitroaromatic compounds is challenging because experimental data are limited. Nitrobenzene derivatives are widely used in industry and can cause environmental contamination. Their toxicity has been studied extensively, including in several QSPR studies. Predictive QSPR models can help improve chemical safety, but their limitations must be considered, and the molecular factors that affect toxicity should be investigated carefully. Current QSPR methods, molecular modeling techniques, machine learning algorithms, and computational chemistry tools are essential for developing accurate and robust models. In this work, we applied these methods to a series of fifty nitrobenzene derivatives. QSPR models were built with the Monte Carlo approach, using the SMILES representation of molecular structure and optimal molecular descriptors. The correlation ideality index (CII) and the correlation contradiction index (CCI) were additionally introduced as validation parameters to estimate the predictive ability of the developed models. Models built with CII had better statistical quality than models built without it. The best QSPR model (Split-3), with the statistical parameters R² = 0.968, CCC = 0.984, IIC = 0.861, CII = 0.979, Q² = 0.954, Q²_F1 = 0.946, Q²_F2 = 0.938, Q²_F3 = 0.947, R²_m = 0.878, RMSE = 0.187, MAE = 0.151, F_training = 390, F_invisible = 218, F_calibration = 240, and R²_test = 0.905, was selected to identify the promoters of increasing and decreasing activity.
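
As an illustration of how some of the validation statistics listed above are conventionally computed, the following minimal Python sketch evaluates R², the concordance correlation coefficient (CCC), RMSE, and MAE from paired observed and predicted values. It is not the software used in this work. The function name `validation_metrics` and the example values are hypothetical, R² is assumed here to be the squared Pearson correlation, and CCC is assumed to follow Lin's definition with population (divide-by-n) variances.

```python
import math


def validation_metrics(observed, predicted):
    """Return R2, CCC, RMSE and MAE for paired observed/predicted values.

    Assumptions (not taken from the study): R2 is the squared Pearson
    correlation, and CCC is Lin's concordance correlation coefficient
    computed with population (divide-by-n) variances.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 values")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n

    # Population variances and covariance of the two series.
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n

    if var_o == 0 or var_p == 0:
        raise ValueError("R2 is undefined when either series is constant")

    r2 = cov ** 2 / (var_o * var_p)
    # CCC penalises both scatter and systematic shift between the series.
    ccc = 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)

    errors = [o - p for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n

    return {"R2": r2, "CCC": ccc, "RMSE": rmse, "MAE": mae}


# Hypothetical toxicity endpoints, for illustration only.
example = validation_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this hypothetical example the predictions are perfectly correlated with the observations but shifted by a constant, so R² stays at 1 while CCC drops below 1. This is why a concordance measure is usually reported alongside R².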