Alternative weighting schemes for fine‐tuned extended similarity indices

Kenneth López Pérez, Ramón Alain Miranda‐Quintana, Camila Gonzalez, Dávid Bajusz, Károly Héberger, Anita Rácz

doi:10.1002/cem.3558


Open Access

https://doi.org/10.1002/cem.3558


Journal: Journal of Chemometrics
Publication Date: May 11, 2024
License type: CC BY 4.0

Affiliation: University of Florida

Abstract

Extended similarity indices (i.e., generalizations of pairwise similarity to more than two objects) have recently gained importance because of their simplicity, fast computation, and superiority in tasks like diversity picking. However, they operate with several meta parameters that should be optimized. Earlier, we extended the binary similarity indices to “discrete non‐binary” and “continuous” data; here we continue by introducing and comparing multiple weighting functions. As a case study, the similarity of CYP enzyme inhibitors (4016 molecules after curation) was characterized by their extended similarities, based on 2D descriptors, MACCS and Morgan fingerprints. A statistical workflow based on sum of ranking differences (SRD) and analysis of variance (ANOVA) was used to find the optimal weight function(s). Overall, the best weighting function is the fraction (“frac”), which corresponds to the principle of parsimony. Optimal extended similarity indices were also found, and their differences across data sets are revealed. We intend this work to be a guideline for users of extended similarity indices regarding the various weighting options available. Source code for the calculations is available at https://github.com/mqcomplab/MultipleComparisons.
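For readers unfamiliar with SRD, the basic idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' workflow: it assumes the row-wise mean as the reference ranking (one common choice among several), omits the validation steps that usually accompany SRD, and uses hypothetical names (`average_ranks`, `srd`, `scheme_A`, ...) and made-up toy scores.

```python
from statistics import mean


def average_ranks(values):
    """Rank values in ascending order (1-based); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def srd(table):
    """Sum of ranking differences for each column of `table`.

    `table` maps a method name to its scores over the same rows.
    ASSUMPTION: the reference is the row-wise mean of all columns.
    """
    columns = list(table.values())
    n_rows = len(columns[0])
    reference = [mean(col[i] for col in columns) for i in range(n_rows)]
    ref_ranks = average_ranks(reference)
    return {
        name: sum(abs(r - q) for r, q in zip(average_ranks(col), ref_ranks))
        for name, col in table.items()
    }


# Hypothetical toy scores (rows = cases, columns = methods being compared).
toy = {
    "scheme_A": [1, 2, 3, 4],
    "scheme_B": [4, 3, 2, 1],
    "scheme_C": [1, 2, 3, 4],
}
result = srd(toy)
max_srd = len(toy["scheme_A"]) ** 2 // 2  # largest possible SRD for n rows
normalized = {name: 100 * value / max_srd for name, value in result.items()}
```

A smaller SRD means a column ranks the rows more like the reference does. Dividing by the maximum attainable value, floor(n²/2) for n rows, puts results from tables of different sizes on a common 0 to 100 scale.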
