Abstract

Drug–target interaction (DTI) prediction plays a vital role in probing new targets for breast cancer research. Because experimental identification of DTIs faces multifaceted challenges, in silico prediction of such interactions merits exploration. In this study, we develop a feature-based method, called PsePDC-DTIs, to infer unknown DTIs. It fuses protein sequence information extracted by the pseudo-position specific scoring matrix (PsePSSM) and the detrended cross-correlation analysis coefficient (DCCA coefficient) with an FP2-format molecular fingerprint descriptor of drug compounds. After Lasso dimensionality reduction, the synthetic minority oversampling technique (SMOTE) is employed to handle the imbalanced data. The processed feature vectors are then fed into a random forest classifier to predict DTIs on four gold standard datasets: nuclear receptors (NR), G-protein-coupled receptors (GPCR), ion channels (IC), and enzymes (E). Furthermore, we use PsePDC-DTIs to explore new targets for breast cancer treatment, based on risk genes identified in large-scale genome-wide genetic studies. Under five-fold cross-validation, the average accuracies on the NR, GPCR, IC, and E datasets are 95.28%, 96.19%, 96.74%, and 98.22%, respectively. The PsePDC-DTIs model yields 10 potential DTIs for breast cancer treatment, of which three are supported by direct or inferred evidence: erlotinib (DB00530) with FGFR2 (hsa2263), caffeine (DB00201) with KCNN4 (hsa3783), and afatinib (DB08916) with FGFR2 (hsa2263). The PsePDC-DTIs model achieves good prediction results, supporting the validity and superiority of the proposed method.
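The oversampling step can be illustrated with a minimal sketch, assuming the standard SMOTE interpolation rule: each synthetic sample is a random point on the segment between a minority-class sample and one of its k nearest minority-class neighbors. The function name `smote`, the value of `k`, and the toy data are assumptions made for illustration; the implementation and parameters used in the study are not stated here.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples from minority-class feature vectors.

    Illustrative sketch only; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base sample (excluding itself)
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class: four points on the corners of the unit square
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=6)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region the minority class already occupies, which is why the technique is typically applied to the training folds only.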

Highlights

  • Breast cancer is the most common gynecological malignant tumor in the world [1], with incidence rates that outdistance other cancers in both transitioned and transitioning countries [2]

  • Pseudo-position specific scoring matrix (PsePSSM) extracts features from protein sequences by converting the position specific scoring matrices (PSSMs) of different protein sequences, whose dimensions vary with sequence length, into feature vectors of the same dimension

  • We develop a novel method for predicting and identifying drug–target interactions (DTIs), called PsePDC-DTIs
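How PsePSSM maps PSSMs of different sizes to one dimension can be sketched as follows, assuming the commonly used formulation: the 20 column means of an L x 20 PSSM are concatenated with, for each lag from 1 to `max_lag`, the mean squared difference between rows that lie that many positions apart. The function name `pse_pssm`, the lag value, and the toy matrices are assumptions; PSSM normalization, which is usually applied first, is omitted.

```python
def pse_pssm(pssm, max_lag=2):
    """Map an L x 20 PSSM (list of rows) to a vector of 20 + 20 * max_lag values.

    Illustrative sketch only; max_lag=2 is an assumed setting.
    """
    length = len(pssm)
    if length <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_cols = len(pssm[0])
    # Column means: average score of each amino-acid type over the sequence
    features = [sum(row[j] for row in pssm) / length for j in range(n_cols)]
    # Lagged terms: mean squared difference between rows `lag` positions apart
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2 for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


# Toy PSSMs of different lengths (made-up values, not real PSI-BLAST output)
short_pssm = [[float((i + j) % 5) for j in range(20)] for i in range(7)]
long_pssm = [[float((i * j) % 3) for j in range(20)] for i in range(30)]
vec_short = pse_pssm(short_pssm)
vec_long = pse_pssm(long_pssm)
```

Both vectors have the same length regardless of sequence length, so proteins of any size can be fed to the same classifier.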

Introduction

Breast cancer is the most common gynecological malignant tumor in the world [1], with incidence rates that outdistance other cancers in both transitioned and transitioning countries [2]. The global incidence of breast cancer is reported to have increased at a rate of 0.5% annually [3]. Hereditary and genetic factors account for 5% to 10% of breast cancer cases [2]. Approximately 100 breast cancer risk loci have been identified in a genome-wide association study (GWAS) [4]. However, only a few targets are available for the development of new breast cancer drugs. Predicting new drug–target interactions (DTIs) is therefore a promising way to explore new targets for breast cancer treatment. We explore new DTIs using FDA-approved drugs and the breast cancer risk genes identified from large-scale genome-wide genetic studies [8,9,10,11].

