Abstract

Human intestinal absorption (HIA) data for 175 diverse drugs, described by 336 calculated descriptors, were modeled to develop global predictive models applicable to the whole medicinal chemistry space. With this aim, we employed two automated procedures: (a) the Sphere Exclusion Algorithm (SEA) to select members of the training and test sets based on structural dissimilarity, and (b) the k‐Nearest Neighbors (kNN) method combined with Genetic Algorithms (kNN‐QSAR‐GA) to select significant and independent descriptors. This methodology helped us to derive optimal Quantitative Structure–Property Relationship (QSPR) models based on three and four descriptors. The best three‐descriptor model is based on the Delta Chi Index of order 3 (Cluster), the hydrogen‐type E‐State index ShsOH, and AlogP99 ($q^2_{\mathrm{LOO}}$ = 0.7401 and $q^2_{\mathrm{ext}}$ = 0.7989). The best four‐descriptor model is based on the auto‐correlation descriptor (Moran) weighted by atomic weights, order 7; AI‐State_Indices_AISssssC; the number of hydrogen bond acceptors; and AlogP99 ($q^2_{\mathrm{LOO}}$ = 0.8196 and $q^2_{\mathrm{ext}}$ = 0.6999). Based on extensive validation tests of the models M1–M4, and on comparison of their overall performance and $q^2_{\mathrm{ext}}$ statistics with reported models built using other approaches, it is shown that (a) the models are highly stable and robust, and (b) for the first time in HIA modeling, the combination of automated training set selection (SEA) followed by variable selection (kNN‐QSAR‐GA) is a promising methodology for building multiple stable models that are useful in consensus prediction.
From the analysis of the physical meaning of the selected descriptors, it is inferred that the HIA of small organic compounds can be accurately predicted using calculated descriptors that encode the following fundamental properties: (1) lipophilicity, (2) hydrogen bonding capacity, (3) size, and (4) shape. The analysis also uncovers the role of new calculated descriptors in the HIA profile of small organic compounds. Finally, because the models reported herein are based on computed properties, they appear to be a valuable tool in virtual screening, where selection and prioritization of candidates is required.
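
To make the workflow above more concrete, the following is a minimal sketch of two of its ingredients: a sphere-exclusion style split into training and test sets, and a leave-one-out $q^2$ for an unweighted kNN predictor, computed as one minus the ratio of the leave-one-out squared prediction error to the total sum of squares. It is an illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean distance, the rule that sphere centers go to the training set while the other points inside each sphere go to the test set, and the toy data are all hypothetical, and the genetic-algorithm descriptor selection is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sphere_exclusion_split(X, radius):
    """One simple sphere-exclusion variant (assumed here, other variants exist).

    The first remaining compound becomes a sphere center and joins the
    training set; every other remaining compound within `radius` of that
    center joins the test set. Repeat until no compounds remain.
    Returns (train_indices, test_indices).
    """
    remaining = list(range(len(X)))
    train, test = [], []
    while remaining:
        center = remaining.pop(0)
        train.append(center)
        inside = [i for i in remaining if euclidean(X[center], X[i]) <= radius]
        test.extend(inside)
        remaining = [i for i in remaining if i not in inside]
    return train, test


def knn_predict(X_train, y_train, x, k):
    """Predict y for x as the plain mean of its k nearest training neighbors."""
    order = sorted(range(len(X_train)), key=lambda i: euclidean(X_train[i], x))
    nearest = order[:k]
    return sum(y_train[i] for i in nearest) / len(nearest)


def q2_loo(X, y, k):
    """Leave-one-out q^2 = 1 - PRESS / total sum of squares."""
    y_mean = sum(y) / len(y)
    press = 0.0
    for i in range(len(X)):
        X_rest = X[:i] + X[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        press += (y[i] - knn_predict(X_rest, y_rest, X[i], k)) ** 2
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_tot


# Toy one-descriptor data set (hypothetical values, not HIA data).
X_toy = [[0.0], [0.1], [1.0], [1.1], [2.0], [2.1]]
y_toy = [0.0, 0.0, 10.0, 10.0, 20.0, 20.0]

train_idx, test_idx = sphere_exclusion_split(X_toy, radius=0.5)
q2 = q2_loo(X_toy, y_toy, k=1)
```

In this sketch the radius controls how dissimilar the training compounds are from one another, and therefore the relative sizes of the two sets; an external $q^2$ could be obtained analogously by predicting the test compounds from the training compounds only.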
