Abstract

Most contemporary drug discovery projects start with a ‘hit discovery’ phase, in which small chemicals are identified that can interact chemically with a protein target involved in a given disease. To assist and accelerate this initial phase, ‘virtual docking calculations’ are routinely performed, in which computational models of proteins and of small chemicals are evaluated for their capacity to bind together. In cutting-edge implementations of this process, several conformations of a protein target are assayed independently in parallel ‘ensemble docking’ calculations. A minority of these protein conformations will be capable of binding many chemicals, while the majority will not. The fact that only some of the conformations accessible to a protein will be ‘selected’ by chemicals is known in biology as ‘conformational selection’. This work describes a machine learning approach to characterize and identify the properties of the protein conformations that will be selected by (i.e., bind to) chemicals, so that they can be classified as potential drug-binding conformations and distinguished from the remaining non-binding conformations. This work also addresses the class imbalance problem through advanced machine learning techniques that maximize the prediction rate of potential drug-binding protein conformations for the test-case proteins ADORA2A (Adenosine A2a Receptor) and OPRK1 (Opioid Receptor Kappa 1), and thereby aims to reduce failure rates and hasten the drug discovery process.

Highlights

  • At the core of most drug discovery applications is a biological protein target that binds to a chemical to achieve a biological function

  • The long timelines and high costs of drug discovery are mostly due to (i) the time it takes in the early stage to identify drug candidates effective against a protein target, and (ii) failure rates of over 90% in the later clinical trial stages, where the identified drug candidates fail at various stages of clinical development

  • In drug discovery applications, the class imbalance problem is of great consequence: the smaller population of protein conformations that can successfully bind to drugs is at higher risk of being misclassified as non-drug-binding and therefore discarded


Summary

Introduction

At the core of most drug discovery applications is a biological protein target that binds to a chemical (known as a ‘ligand’) to achieve a biological function. The modern drug discovery and development process is complex: it starts with protein target identification and ends with FDA approval, takes an average of 12–15 years, and costs more than $1 billion by the time the finished product is launched. These time and cost issues are mostly due to (i) the time it takes in the early stage to identify drug candidates effective against a protein target, and (ii) failure rates of over 90% in the later clinical trial stages, where the identified drug candidates fail at various stages of clinical development. The remainder of this paper describes the proposed method, the design of the two-stage sampling-based classifier system, and its experimental efficacy.
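To make the class imbalance problem mentioned in the abstract concrete, the following is a minimal Python sketch. It is not the paper's two-stage sampling-based classifier, whose details are not given in this excerpt. The toy labels (95 non-binding, 5 binding conformations), the single made-up feature, and the helper names `random_oversample`, `accuracy`, and `recall` are all assumptions introduced for illustration. Random oversampling is used here only as a generic, well-known rebalancing technique.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are the same size."""
    rng = random.Random(seed)
    rows_by_class = {}
    for x, y in zip(features, labels):
        rows_by_class.setdefault(y, []).append(x)
    target = max(len(rows) for rows in rows_by_class.values())
    out_x, out_y = [], []
    for y, rows in rows_by_class.items():
        # Draw extra copies (with replacement) only for under-represented classes.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that were predicted as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: label 1 = binding conformation, 0 = non-binding.
labels = [0] * 95 + [1] * 5
features = [[float(i)] for i in range(100)]  # one made-up descriptor per conformation

# A model that always predicts the majority class looks accurate
# but finds none of the binding conformations.
majority_pred = [0] * len(labels)
baseline_accuracy = accuracy(labels, majority_pred)
baseline_recall = recall(labels, majority_pred)

# Rebalance the training set before fitting any classifier.
balanced_x, balanced_y = random_oversample(features, labels)
balanced_counts = Counter(balanced_y)

print(baseline_accuracy, baseline_recall, dict(balanced_counts))
```

The point of the sketch is that overall accuracy can hide a complete failure on the minority class, which is why recall on the binding class and a rebalanced training set are common starting points. In practice, oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.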
