Four highly discriminating fourth-generation topological indices (TIs), termed superaugmented eccentric distance sum connectivity indices, as well as their topochemical versions (denoted by , , and ), have been conceptualized in this study. The values of these indices for all possible structures with three, four, and five vertices containing one heteroatom were computed using an in-house computer program. The proposed superaugmented eccentric distance sum connectivity topochemical indices exhibited exceptionally high discriminating power, low degeneracy, and high sensitivity toward both the presence and the relative position of heteroatom(s) for all possible structures with five vertices containing at least one heteroatom. Intercorrelation analysis revealed no correlation between the proposed indices and either the Zagreb indices or the molecular connectivity index. Subsequently, the proposed TIs were successfully used to develop models for predicting the checkpoint kinase inhibitory activity of 2-arylbenzimidazoles. A data set comprising 47 differently substituted analogs of 2-arylbenzimidazoles was selected for the study. The values of various TIs for each analog in the data set were computed using an in-house computer program. The resulting data were analyzed, and suitable models were developed through decision tree (DT), random forest (RF), and moving average analysis (MAA). The performance of the models was assessed by calculating the specificity, sensitivity, overall accuracy, and Matthews correlation coefficient. A decision tree was constructed for the checkpoint kinase inhibitory activity to determine the importance of the topological indices. The decision tree identified the proposed TIs -, - as the most important indices. The decision tree fitted the input (training) data with an accuracy of 96% and correctly predicted the 10-fold cross-validated data with an accuracy of 77%. 
The random forest correctly predicted the checkpoint kinase inhibitory activity with an accuracy of 83%. Single-index-based models were also developed for the prediction of checkpoint kinase inhibitory activity using MAA. The prediction accuracy of the single-index-based models derived through MAA ranged from 90% to 95%. The exceptionally high discriminating power, low degeneracy, and high sensitivity toward branching and the presence of heteroatoms make the proposed indices potentially of considerable use in drug design, isomer discrimination, similarity/dissimilarity studies, quantitative structure-activity/property relationships, lead optimization, and combinatorial library design.
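The four performance measures named above (specificity, sensitivity, overall accuracy, and the Matthews correlation coefficient) can all be derived from a binary confusion matrix. The following is a minimal sketch under stated assumptions, not the authors' in-house program: the function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives, and false negatives (hypothetical inputs).
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Sensitivity: fraction of active compounds correctly classified as active.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of inactive compounds correctly classified as inactive.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Overall accuracy: fraction of all compounds classified correctly.
    accuracy = (tp + tn) / total

    # Matthews correlation coefficient; conventionally set to 0 when undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Illustrative counts only (not data from the study).
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
```

Unlike overall accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, so it remains informative when the active and inactive classes are unevenly sized.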