Abstract

Background: In drug discovery and development, it is crucial to determine which conformers (instances) of a given molecule are responsible for its observed biological activity and, at the same time, to recognize the most representative subset of features (molecular descriptors). Because bioactive conformers are difficult to obtain experimentally, computational approaches such as machine learning techniques are much needed. Multiple Instance Learning (MIL) is a machine learning method capable of tackling this type of problem. In the MIL framework, each instance is represented as a feature vector, which usually resides in a high-dimensional feature space. The high dimensionality may provide significant information for learning tasks, but it may also include a large number of irrelevant or redundant features that negatively affect learning performance. Reducing the dimensionality of the data will hence facilitate the classification task and improve the interpretability of the model.

Results: In this work we propose a novel approach, named multiple instance learning via joint instance and feature selection. The iterative joint instance and feature selection is achieved using an instance-based feature mapping and 1-norm regularized optimization. The proposed approach was tested on four biological activity datasets.

Conclusions: The empirical results demonstrate that the selected instances (prototype conformers) and features (pharmacophore fingerprints) have competitive discriminative power and that the selection process converges quickly.

Highlights

  • In drug discovery and development, researchers are interested in detecting which molecules are active, and in determining which conformers of a given molecule are responsible for its observed biological activity

  • Fu et al. [3] applied multiple-instance learning via embedded instance selection (MILES) to study the biological activity of several sets of molecules interacting with different receptor targets, including glycogen synthase kinase-3 (GSK-3) [4], cannabinoid receptors (CBrs) [5], and P-glycoprotein (P-gp) [6].

Summary

Introduction

In drug discovery and development, it is crucial to determine which conformers (instances) of a given molecule are responsible for its observed biological activity and, at the same time, to recognize the most representative subset of features (molecular descriptors). In the MIL framework, each instance is represented as a feature vector, which usually resides in a high-dimensional feature space. Fu et al. observed that conformer and pharmacophore fingerprint features both reside in high-dimensional spaces, which motivates us to extend the MILES framework with joint instance and feature selection. The selected prototype conformers and pharmacophore fingerprints may facilitate understanding of the interaction mechanism between small flexible molecules and proteins, and may influence the design of new molecules with desired properties, which is the goal of drug discovery and development.
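To make the idea of an instance-based feature mapping followed by 1-norm regularized optimization more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the Gaussian max-similarity embedding commonly associated with MILES, and it swaps in an L1-regularized logistic loss solved by proximal gradient descent as a stand-in for whatever 1-norm regularized solver the paper uses. The toy bags, the names `similarity`, `embed_bags`, `soft_threshold` and `l1_logistic`, and all parameter values are hypothetical. Only the instance-selection side is shown; the feature-selection step of the proposed method is not reproduced here.

```python
import math

def similarity(bag, prototype, sigma=1.0):
    """Largest Gaussian similarity between any instance in the bag and a prototype."""
    best = 0.0
    for x in bag:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, prototype))
        best = max(best, math.exp(-sq_dist / sigma ** 2))
    return best

def embed_bags(bags, prototypes, sigma=1.0):
    """Map each bag to one vector with one coordinate per candidate prototype."""
    return [[similarity(bag, p, sigma) for p in prototypes] for bag in bags]

def soft_threshold(w, t):
    """Proximal operator of t * |w|: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def l1_logistic(X, y, lam=0.05, lr=0.5, epochs=500):
    """L1-regularized logistic regression via proximal gradient; y in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            coef = -yi / (1.0 + math.exp(margin))  # derivative of the logistic loss
            for j in range(d):
                gw[j] += coef * xi[j] / n
            gb += coef / n
        # Gradient step on the smooth loss, then shrinkage for the 1-norm penalty.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, gw)]
        b -= lr * gb  # the bias is not penalized
    return w, b

# Hypothetical toy data: each bag is a molecule, each instance a 2-D conformer descriptor.
bags = [
    [[0.0, 0.0], [1.0, 1.0]],  # labelled active
    [[0.9, 1.1], [3.0, 0.0]],  # labelled active
    [[0.0, 0.1], [3.0, 0.2]],  # labelled inactive
    [[2.9, 0.0], [0.1, 0.0]],  # labelled inactive
]
labels = [1, 1, -1, -1]

prototypes = [x for bag in bags for x in bag]  # every instance is a candidate prototype
X = embed_bags(bags, prototypes)
w, b = l1_logistic(X, labels)

# Prototypes whose weight survived the shrinkage are the "selected" instances.
selected = [k for k, wk in enumerate(w) if wk != 0.0]
```

In this sketch each column of the embedded matrix corresponds to one candidate prototype, so a weight driven exactly to zero by the 1-norm penalty removes that prototype from the model. Larger values of `lam` select fewer prototypes, and a sufficiently large value leaves none.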
