Flexible Analog Search with Kernel PCA Embedded Molecule Vectors

Stefano Rensi, Russ B Altman

doi:10.1016/j.csbj.2017.03.003

Open Access

Abstract

Studying analog series to find structural transformations that enhance the activity and ADME properties of lead compounds is an important part of drug development. Matched molecular pair (MMP) search is a powerful tool for analog analysis that imitates researchers' ability to select pairs of compounds that differ only by small, well-defined transformations. Abstraction is a challenge for existing MMP search algorithms, which can omit relevant, inexact MMPs and include irrelevant, contextually dissimilar MMPs. In this work, we present a new method for MMP search that returns approximate results and enables flexible control over the abstraction of contextual information. We illustrate the concepts and mechanics of our method with a series of exemplar MMP queries, and then benchmark search accuracy using MMPs found by fragment indexing. We show that we can search for MMPs in a context-dependent manner, and accurately approximate context-independent, fragment-index-based MMP search over a range of fingerprint and dataset conditions. Our method can be used to search for pairwise correspondences among analog sets and to bolster MMP datasets where data is missing or incomplete.
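The abstract does not spell out the mechanics, so the following is only a minimal sketch of one plausible reading of "kernel PCA embedded molecule vectors", not the authors' implementation. It assumes binary fingerprints, a Tanimoto kernel, and an analogy-style search in which a transformation is represented by the difference between two embedded molecule vectors and candidate pairs are ranked by cosine similarity to the query difference. The toy fingerprints and the function names (`tanimoto_kernel`, `kernel_pca_embed`, `rank_pairs_by_transformation`) are hypothetical and chosen for illustration.

```python
import numpy as np


def tanimoto_kernel(X):
    """Pairwise Tanimoto similarity between rows of a binary matrix."""
    X = np.asarray(X, dtype=float)
    inter = X @ X.T                                  # shared "on" bits
    counts = X.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, inter / np.maximum(union, 1e-12), 1.0)


def kernel_pca_embed(K, n_components):
    """Embed items as vectors via kernel PCA on a precomputed kernel."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n              # centering matrix
    Kc = H @ K @ H                                   # centered kernel
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]    # largest eigenvalues first
    vals = np.clip(vals[order], 0.0, None)           # guard tiny negatives
    return vecs[:, order] * np.sqrt(vals)


def rank_pairs_by_transformation(Z, query, candidates):
    """Rank candidate pairs by cosine similarity of their difference
    vectors to the query pair's difference vector (an assumed scoring)."""
    qa, qb = query
    q = Z[qb] - Z[qa]
    scored = []
    for a, b in candidates:
        d = Z[b] - Z[a]
        denom = np.linalg.norm(q) * np.linalg.norm(d)
        score = float(q @ d / denom) if denom > 0 else 0.0
        scored.append(((a, b), score))
    return sorted(scored, key=lambda t: t[1], reverse=True)


# Hypothetical 8-bit fingerprints (not real molecules):
# bits 0-2 = scaffold A, bits 3-5 = scaffold B, bits 6 and 7 = two substituents.
fps = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0],   # 0: scaffold A
    [1, 1, 1, 0, 0, 0, 1, 0],   # 1: scaffold A + substituent 6
    [0, 0, 0, 1, 1, 1, 0, 0],   # 2: scaffold B
    [0, 0, 0, 1, 1, 1, 1, 0],   # 3: scaffold B + substituent 6
    [1, 1, 1, 0, 0, 0, 0, 1],   # 4: scaffold A + substituent 7
])

K = tanimoto_kernel(fps)
Z = kernel_pca_embed(K, n_components=4)              # keep all non-trivial components

# Query: "add substituent 6" as seen on scaffold A (molecule 0 -> 1).
ranked = rank_pairs_by_transformation(Z, (0, 1), [(2, 3), (0, 4), (3, 2)])
for pair, score in ranked:
    print(pair, round(score, 3))
```

In this toy setup the pair (2, 3), which applies the same substituent in a different scaffold context, ranks above (0, 4), which applies a different substituent in the same context, while the reversed pair (3, 2) scores negative. Retaining fewer components would change how much contextual detail the embedding keeps, and the ranking may change with it; how that maps onto the abstraction control described in the abstract is an assumption here.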

Highlights

Matched molecular pairs (MMPs) are a useful tool for studying analog relationships and local QSAR, but current MMP search methods are brittle compared to intuitive notions of what constitutes a matched analog pair.
Efficient index-based search methods enforce precisely defined, context-independent transformations and can miss near-MMPs relevant to an analysis.
Previous iterations of vector-based MMP search enforce strict context dependence and feature-set coupling, which can fail to group together transformations occurring in different contexts.

Summary

Introduction

Successful optimization of lead compounds requires the iterative application of structural modifications that yield favorable changes in target activity profiles and ADMET properties. This process was historically the sole domain of medicinal chemistry teams. Researchers assembled analog series data by hand, guided by their knowledge of compounds that had been synthesized and tested within their organization. They would generate hypotheses using techniques such as Free-Wilson analysis [1], Hansch analysis [2], Topliss schemes [3], and Craig plots [4], and combine data-driven insights with expertise to prioritize compounds for synthesis and testing in each design iteration. This challenge has driven innovation in the searching and indexing of analog sets in chemical libraries.

Journal: Computational and Structural Biotechnology Journal
Publication Date: Jan 1, 2017
Citations: 6
License type: CC BY
