Extended continuous similarity indices: theory and application for QSAR descriptor selection.

Anita Rácz, Timothy B Dunn, Taewon D Kim, Dávid Bajusz, Ramón Alain Miranda-Quintana, Károly Héberger

doi:10.1007/s10822-022-00444-7

Abstract

Extended (or n-ary) similarity indices have been recently proposed to extend the comparative analysis of binary strings. Going beyond the traditional notion of pairwise comparisons, these novel indices allow comparing any number of objects at the same time. This results in a remarkable efficiency gain with respect to other approaches, since now we can compare N molecules in O(N) instead of the common quadratic O(N^2) timescale. This favorable scaling has motivated the application of these indices to diversity selection, clustering, phylogenetic analysis, chemical space visualization, and post-processing of molecular dynamics simulations. However, the current formulation of the n-ary indices is limited to vectors with binary or categorical inputs. Here, we present the further generalization of this formalism so it can be applied to numerical data, i.e., to vectors with continuous components. We discuss several ways to achieve this extension and present their analytical properties. As a practical example, we apply this formalism to the problem of feature selection in QSAR and prove that the extended continuous similarity indices provide a convenient way to discern between several sets of descriptors.
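The O(N) versus O(N^2) scaling claim can be illustrated with a small sketch. It is not the index defined in the paper: it only assumes that n-ary comparisons can be built from per-column counts, and it uses a simple stand-in quantity (the total number of on-bits shared across all pairs of binary fingerprints). The function names and the toy fingerprints are hypothetical. A column containing s on-bits contributes C(s, 2) coinciding pairs, so one pass over the data reproduces the result of visiting every pair.

```python
from itertools import combinations
from math import comb


def pairwise_shared_on_bits(fingerprints):
    """Brute force: visit every pair of fingerprints, O(N^2 * M) work."""
    total = 0
    for a, b in combinations(fingerprints, 2):
        total += sum(x & y for x, y in zip(a, b))
    return total


def column_sum_shared_on_bits(fingerprints):
    """Single pass over the data, O(N * M) work.

    A column with s on-bits is shared by exactly C(s, 2) pairs.
    """
    column_sums = [sum(column) for column in zip(*fingerprints)]
    return sum(comb(s, 2) for s in column_sums)


# Toy binary fingerprints (illustrative data, not from the paper).
fps = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 0, 1],
]

# Both routes give the same count; only the cost differs.
assert pairwise_shared_on_bits(fps) == column_sum_shared_on_bits(fps)
```

The same idea, aggregating each column once instead of comparing objects two at a time, is what makes a linear-time comparison of N objects plausible; how the paper extends it to continuous components is not shown here.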

Journal: Journal of Computer-Aided Molecular Design
Publication Date: Mar 1, 2022
Citations: 16

Similar Papers

Sampling and Mapping Chemical Space with Extended Similarity Indices.
Kenneth López-Pérez ... Ramón Alain Miranda-Quintana
Molecules (Basel, Switzerland) | VOL. 28
30 Aug 2023

Rough-Bayesian approach to select class-pair specific descriptors for HEp-2 cell staining pattern recognition
Debamita Kumar ... Pradipta Maji
Pattern Recognition | VOL. 117
08 Apr 2021

MPEG-7 descriptor selection using Localized Generalization Error Model with mutual information
Jun Wang ... Eric C.C Tsang
01 Jul 2008

Mode of action prediction of ligands of steroid hormone receptors
I.V Fedyushkina ... I.V Romero Reyes
Biomeditsinskaya Khimiya | VOL. 59
01 Jan 2013
