Abstract

Drug-target interaction (DTI) prediction plays an important role in drug development. Experimental methods for identifying DTIs are time-consuming, expensive and challenging. Machine learning-based methods have been introduced to address these problems, but their performance is limited by how effectively features are extracted and how negative samples are chosen. In this work, electrotopological state (E-state) fingerprints for drugs and amphiphilic pseudo amino acid composition (APAAC) for target proteins are tested as features. E-state fingerprints capture both the electronic and the topological features of a molecule using the same metric. APAAC extends amino acid composition (AAC) by using the hydrophilic and hydrophobic properties of residues to capture sequence-order information. The prediction model is built with support vector machines on the combination of these two feature sets. To make the features more effective, a distance-based negative sampling scheme is proposed to obtain reliable negative samples. The area under the receiver operating characteristic curve (AUC) exceeds 98.5% on all three datasets used in this work. Comparison with state-of-the-art methods demonstrates the effectiveness and efficiency of the proposed method, which should be helpful for further drug development.
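The idea of distance-based negative sampling can be illustrated with a minimal sketch. The abstract does not state the distance metric or the selection rule, so the following assumes Euclidean distance and assumes that the unlabeled drug-target pairs farthest from every known positive pair are taken as reliable negatives. The function names and the toy vectors are hypothetical and stand in for concatenated E-state and APAAC features; this is not the authors' implementation.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def min_distance_to_positives(candidate: Vector, positives: List[Vector]) -> float:
    """Euclidean distance from a candidate pair to its nearest known positive pair."""
    return min(math.dist(candidate, p) for p in positives)


def select_reliable_negatives(
    positives: List[Vector], unlabeled: List[Vector], k: int
) -> List[int]:
    """Return indices of the k unlabeled pairs farthest from all positives.

    Assumption (not from the paper): "reliable" means a large minimum
    distance to the set of known interacting pairs.
    """
    scored = [
        (min_distance_to_positives(u, positives), i)
        for i, u in enumerate(unlabeled)
    ]
    # Sort by distance descending; break ties by original index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:k]]


# Toy 2-D feature vectors (hypothetical; real vectors would be much longer).
positives = [(0.0, 0.0), (1.0, 0.0)]
unlabeled = [(0.5, 0.1), (5.0, 5.0), (0.0, 3.0), (1.2, 0.2)]

negative_idx = select_reliable_negatives(positives, unlabeled, k=2)
print(negative_idx)
```

The selected pairs, together with the known positives, would then form the training set for a classifier such as a support vector machine.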

Highlights

  • Drug-target interaction (DTI) prediction is of great significance for pharmacology development [1,2]

  • The dataset was first introduced by Yamanishi et al. and can be divided into four sub-datasets: enzymes, G-protein coupled receptors (GPCRs), ion channels and nuclear receptors [13]

  • The drug descriptors, electrotopological state (E-state) fingerprints, are extracted with PaDEL-Descriptor, a free software package for generating compound descriptors [30]

Summary

Introduction

Drug-target interaction (DTI) prediction is of great significance for pharmacological development [1,2]. Because the relevant theoretical knowledge is incomplete, experimental methods are prone to high failure rates and are constrained by their high cost in money and time [3,4]. According to reports, it often takes decades for a new drug to be approved by the US Food and Drug Administration (FDA) [5]. As the relevant knowledge has improved, the assumption that a single drug corresponds to a single target has been relaxed, which makes the original DTI problem more complex [6]. Computational methods have therefore attracted growing attention in DTI research in recent years [4,7,8,9].

