Abstract

ISIDA Property-Labelled Fragment Descriptors (IPLF) were introduced as a general framework to numerically encode molecular structures in chemoinformatics, as counts of specific subgraphs in which atom vertices are coloured with respect to some local property/feature. Combining various colouring strategies of the molecular graph - notably pH-dependent pharmacophore and electrostatic potential-based flagging - with several fragmentation schemes, the different subtypes of IPLFs may range from classical atom pair and sequence counts, to monitoring population levels of branched fragments or feature multiplets. The pH-dependent feature flagging, pursued at the level of each significantly populated microspecies involved in the proteolytic equilibrium, may furthermore add some competitive advantage over classical descriptors, even when the chosen fragmentation scheme is one of the state-of-the-art pattern extraction procedures (feature sequence or pair counts, etc.) in chemoinformatics. The implemented fragmentation schemes support counting (1) linear feature sequences, (2) feature pairs, (3) circular feature fragments a.k.a. "augmented atoms" or (4) feature trees. Fuzzy rendering - optionally allowing nonterminal fragment atoms to be counted as wildcards, ignoring their specific colours/features - ensures for a seamless transition between the "strict" counts (sequences or circular fragments) and the "fuzzy" multiplet counts (pairs or trees). Also, bond information may be represented or ignored, thus leaving the user a vast choice in terms of the level of resolution at which chemical information should be extracted into the descriptors. Selected IPLF subsets were - tree descriptors, in particular - successfully tested in both neighbourhood behaviour and QSAR modelling challenges, with very promising results. They showed excellent results in similarity-based virtual screening for analogue protease inhibitors, and generated highly predictive octanol-water partition coefficient and hERG channel inhibition models.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call