MML-DTI: Multimanifold Learning with Hyperbolic Graph Neural Networks for Enhanced Drug-Target Interaction Prediction.

Abstract

Accurately predicting drug-target interactions (DTIs) is crucial for drug discovery and repositioning. However, most deep learning-based DTI models are designed in Euclidean space, making it difficult to effectively represent the hierarchical and scale-free characteristics of biological data. Due to its unique negatively curved geometric properties, hyperbolic space can more effectively represent hierarchical relationships within data. Therefore, we propose a multimanifold learning framework that integrates multimodal features in hyperbolic and Euclidean spaces for drug-target interaction prediction. Specifically, we employ a Hyperbolic Graph Neural Network (HGNN) to extract features from molecular graphs of small-molecule drugs, thereby effectively capturing the hierarchical structural information within these graphs. To integrate heterogeneous information, a Multi-Manifold Feature Fusion Module combines structural features from the HGNN, chemical fingerprints, and semantic embeddings derived from pretrained language models. Extensive experiments on benchmark data sets demonstrate that our framework achieves superior performance compared with state-of-the-art Euclidean-based methods. The experimental results demonstrate that hyperbolic geometry offers significant advantages in extracting hierarchical features from non-Euclidean data and also highlight the promising potential of multimanifold feature fusion in the field of drug-target interaction prediction.
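
The claim that negative curvature suits hierarchical data can be illustrated with a distance computation. The sketch below assumes the Poincaré ball model with curvature -1; the abstract does not say which hyperbolic model the HGNN uses, so the model choice and the function names here are illustrative assumptions, not the authors' implementation.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    Assumption: curvature -1 and the Poincare ball model (not stated in the abstract).
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def euclidean_distance(u, v):
    """Ordinary Euclidean distance, for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    near_origin = ([0.0, 0.0], [0.1, 0.0])
    near_boundary = ([0.85, 0.0], [0.95, 0.0])
    for a, b in (near_origin, near_boundary):
        print(euclidean_distance(a, b), poincare_distance(a, b))
```

Both pairs are about 0.1 apart in Euclidean terms, but the pair near the boundary is much farther apart in hyperbolic terms. This growth of available "room" toward the boundary is what lets tree-like structures embed with low distortion.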

Similar Papers
  • Dissertation
  • Cite Count: 2
  • 10.32657/10356/75771
Challenges and solutions in drug-target interaction prediction
  • Jan 1, 2018
  • Ali Ezzat

When a drug is developed, it is designed so that it interacts with a specific target of interest in order to achieve the desired therapeutic effect. However, it is quite common to later find that the developed drug also interacts with multiple other targets that were not intended during its development. This is interesting because if a drug can interact with multiple targets, then it may have more than one therapeutic effect. Therefore, this provides a clear motivation for discovering new interactions for existing drugs. In drug discovery, an important task called drug-target interaction prediction detects such interactions on a large scale by screening many drugs and targets simultaneously. While there are wet-lab techniques for discovering these interactions, the focus of this thesis is particularly on computational drug-target interaction prediction. Specifically, we investigate methods that discover new interactions based on prior knowledge of existing drugs and their experimentally confirmed targets (i.e. machine learning). Throughout this thesis, we identified and addressed 4 outstanding problems in drug-target interaction (DTI) prediction. Having addressed these problems, we were able to enhance the prediction performance and outperform relevant state-of-the-art methods. Firstly, DTI prediction methods have difficulty predicting interactions involving new drugs or targets for which there are no known interactions. To predict such interactions, we developed two matrix factorization methods that utilize graph regularization. In addition, considering that many of the non-occurring edges in the bipartite DTI network are actually unknown or missing cases, we developed a preprocessing step to enhance predictions in the “new drug” and “new target” cases by adding edges with intermediate interaction likelihood scores. In our experiments, our methods performed better than the state-of-the-art methods and were found to predict interactions reasonably well.
Secondly, class imbalance is an issue that is prevalent across all DTI datasets. Class imbalance can be divided into two sub-problems, namely between-class and within-class imbalance. Between-class imbalance refers to the imbalance ratio between interacting and non-interacting drug-target pairs; this degrades prediction performance due to the bias in prediction results towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Within-class imbalance refers to the imbalance between the sizes of sub-groups (types) of interactions; this biases the predictions towards the bigger and more well-represented sub-groups, leading to more errors in the smaller groups. Here, we developed an ensemble learning method that incorporates techniques to address the issues of between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over 4 state-of-the-art methods. Thirdly, there are DTI datasets where the feature sets for representing the drugs and targets (and, by extension, the drug-target pairs) are of a high dimensionality. High dimensionality of the data may lead to much longer running times for the prediction models. Furthermore, there may be redundancy in the features which may also lead to degradation in prediction performance. In this work, we used dimensionality reduction to deal with both of these issues, and we additionally used ensemble learning to improve the prediction performance further. As base learners for the ensemble, we selected two classifiers, namely Decision Tree and Kernel Ridge Regression, resulting in two variants of ensemble models, EnsemDT and EnsemKRR, respectively. Experimental results show that our proposed methods are indeed successful. Lastly, there is a concept called differential representation bias that has an impact on the prediction performance of DTI prediction methods. Specifically, differential representation bias refers to how much a drug (or target) appears in the positive training data as opposed to the negative data. Bearing this concept in mind, we experimented with the way that the negative training data is sampled prior to training the prediction model. We found that our modified sampling procedure produced significant improvements in DTI prediction performance.

  • Research Article
  • Cite Count: 127
  • 10.1186/s12859-017-1460-z
Link prediction in drug-target interactions network using similarity indices
  • Jan 17, 2017
  • BMC Bioinformatics
  • Yiding Lu + 2 more

Background: In silico drug-target interaction (DTI) prediction plays an integral role in drug repositioning: the discovery of new uses for existing drugs. One popular method of drug repositioning is network-based DTI prediction, which uses complex network theory to predict DTIs from a drug-target network. Currently, most network-based DTI prediction is based on machine learning methods such as Restricted Boltzmann Machines (RBM) or Support Vector Machines (SVM). These methods require additional information about the characteristics of drugs, targets and DTIs, such as chemical structure, genome sequence, binding types, causes of interactions, etc., and do not perform satisfactorily when such information is unavailable. To address this problem, we propose a new, alternative method for DTI prediction that makes use of only network topology information. Results: We compare our method for DTI prediction against the well-known RBM approach. We show that when applied to the MATADOR database, our approach based on node neighborhoods yields higher precision for high-ranking predictions than RBM when no information regarding DTI types is available. Conclusion: This demonstrates that approaches purely based on network topology provide a more suitable approach to DTI prediction in the many real-life situations where little or no prior knowledge is available about the characteristics of drugs, targets, or their interactions.
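
To make "node neighborhoods" concrete, here is a minimal sketch of a topology-only score. The abstract does not name the similarity indices it uses, so the Jaccard index and the way it is adapted to a bipartite drug-target network below are assumptions made for illustration, and `score_pair` is a hypothetical helper rather than the paper's method.

```python
def jaccard(a, b):
    """Jaccard index of two neighbor sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_pair(drug, target, drug_targets):
    """Score an unobserved (drug, target) pair from network topology alone.

    Hypothetical adaptation to a bipartite graph: sum the Jaccard similarity
    between `drug` and every other drug already known to hit `target`, where
    each drug is described only by its set of known targets.
    """
    own = drug_targets[drug]
    return sum(
        jaccard(own, targets)
        for other, targets in drug_targets.items()
        if other != drug and target in targets
    )


if __name__ == "__main__":
    # Toy network: drug -> set of known targets (made-up identifiers).
    known = {
        "d1": {"t1", "t2"},
        "d2": {"t1", "t2", "t3"},
        "d3": {"t4"},
    }
    print(score_pair("d1", "t3", known))  # d1 resembles d2, which hits t3
    print(score_pair("d1", "t4", known))  # d1 shares nothing with d3
```

No chemical structure, sequence, or interaction-type information enters the score, which is the property the abstract emphasizes.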

  • Research Article
  • Cite Count: 87
  • 10.1093/bib/bbac109
DTI-HETA: prediction of drug-target interactions based on GCN and GAT on heterogeneous graph.
  • Apr 4, 2022
  • Briefings in Bioinformatics
  • Kanghao Shao + 5 more

Drug-target interaction (DTI) prediction plays an important role in drug repositioning, drug discovery and drug design. However, due to the large size of the chemical and genomic spaces and the complex interactions between drugs and targets, experimental identification of DTIs is costly and time-consuming. In recent years, the emerging graph neural network (GNN) has been applied to DTI prediction because DTIs can be represented effectively using graphs. However, some of these methods are only based on homogeneous graphs, and some consist of two decoupled steps that cannot be trained jointly. To further explore GNN-based DTI prediction by integrating heterogeneous graph information, this study regards DTI prediction as a link prediction problem and proposes an end-to-end model based on HETerogeneous graph with Attention mechanism (DTI-HETA). In this model, a heterogeneous graph is first constructed based on the drug-drug and target-target similarity matrices and the DTI matrix. Then, the graph convolutional neural network is utilized to obtain the embedded representation of the drugs and targets. To highlight the contribution of different neighborhood nodes to the central node in aggregating the graph convolution information, a graph attention mechanism is introduced into the node embedding process. Afterward, an inner product decoder is applied to predict DTIs. To evaluate the performance of DTI-HETA, experiments are conducted on two datasets. The experimental results show that our model is superior to the state-of-the-art methods. Also, the identification of novel DTIs indicates that DTI-HETA can serve as a powerful tool for integrating heterogeneous graph information to predict DTIs.
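
The final decoding step can be sketched in a few lines. This assumes the common form of an inner product decoder, in which each drug-target score is the sigmoid of the dot product of the two node embeddings; the embeddings below are made-up numbers, and the GCN and attention layers that would produce them are not shown.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def inner_product_decoder(drug_emb, target_emb):
    """Return a drugs x targets matrix of interaction probabilities.

    Assumption: score(d, t) = sigmoid(<z_d, z_t>), the usual link-prediction decoder.
    """
    return [
        [sigmoid(sum(a * b for a, b in zip(d, t))) for t in target_emb]
        for d in drug_emb
    ]


if __name__ == "__main__":
    drugs = [[1.0, 0.5], [-0.3, 0.8]]    # hypothetical 2-d drug embeddings
    targets = [[0.9, 0.4], [-1.0, 0.2]]  # hypothetical 2-d target embeddings
    for row in inner_product_decoder(drugs, targets):
        print([round(p, 3) for p in row])
```

Because the decoder has no parameters of its own, all learning happens in the embeddings, which fits the end-to-end training the abstract describes.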

  • Research Article
  • Cite Count: 1
  • 10.1038/s41467-025-66915-1
TAPB: an interventional debiasing framework for alleviating target prior bias in drug-target interaction prediction.
  • Dec 2, 2025
  • Nature communications
  • Gaoming Lin + 6 more

Drug Target Interaction (DTI) prediction is vital for drug repurposing. Previous DTI studies on BioSNAP and BindingDB datasets often attribute biased predictions to "drug bias," while our work reveals "target prior bias" as the predominant issue. This bias stems from the "prior tendency," characterized by the imbalanced label distribution of targets in the training data. From a causal lens, target "prior tendency" is a confounder, causing models trained with P(Y∣D,T) to learn spurious associations between targets and labels rather than genuine interaction mechanisms. In this study, we introduce alleviating Target Prior Bias in Drug-Target Interaction Prediction (TAPB), a novel debiasing framework that employs amino acid randomization, confounder alignment module (CAM), and interventional training to compute P(Y∣D,do(T)) via backdoor adjustment, thereby addressing this bias. TAPB achieves competitive performance over existing approaches, demonstrating enhanced generalization and providing interpretable insights into DTIs.
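
For readers unfamiliar with the notation, the generic backdoor adjustment can be written as follows. The symbol C for the confounder (the target "prior tendency") is introduced here for illustration, and the formula is the textbook form under the assumption that C satisfies the backdoor criterion; it is not necessarily the exact estimator TAPB computes.

```latex
% Generic backdoor adjustment, assuming C blocks every backdoor path from T to Y
% and is not a descendant of T. C is a hypothetical symbol for the confounder.
P\bigl(Y \mid D, \mathrm{do}(T)\bigr)
  = \sum_{c} P\bigl(Y \mid D, T, C = c\bigr)\, P\bigl(C = c \mid D\bigr)
% If C is additionally independent of D, the weight reduces to P(C = c).
```

The contrast with the observational P(Y∣D,T) is that the confounder is averaged over its own distribution rather than the distribution induced by conditioning on T.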

  • Conference Article
  • Cite Count: 1
  • 10.1109/bibm.2018.8621514
Drug Target Interaction Prediction with Non-random Missing Labels
  • Dec 1, 2018
  • Sheng Ni + 3 more

Drug-Target Interaction (DTI) prediction plays an important role in drug discovery and drug repurposing. DTI prediction is usually modeled as a binary classification problem. Unlike previous studies which label unknown DTIs as negative samples, we assume the unknown DTIs are labels that are missing not at random. For example, negative DTI labels are more likely to be missing because biomedical researchers prioritize studying DTIs that are more likely to be positive. We introduce a novel probabilistic model, Factorization with Non-random Missing Labels (FNML), for DTI prediction. FNML models the generative process for the DTI labels (i.e. the labels are positive or negative) and responses (i.e. the labels are observed or missing). In particular, the probability of observing or missing a label is associated with the sign of the label. We also conduct comprehensive experiments to validate the robust performance of the proposed models.

  • Research Article
  • Cite Count: 100
  • 10.2174/0929867327666200907141016
Deep Learning in Drug Target Interaction Prediction: Current and Future Perspectives.
  • Sep 7, 2020
  • Current Medicinal Chemistry
  • Karim Abbasi + 4 more

Drug-target Interactions (DTIs) prediction plays a central role in drug discovery. Computational methods in DTIs prediction have gained more attention because carrying out in vitro and in vivo experiments on a large scale is costly and time-consuming. Machine learning methods, especially deep learning, are widely applied to DTIs prediction. In this study, the main goal is to provide a comprehensive overview of deep learning-based DTIs prediction approaches. Here, we investigate the existing approaches from multiple perspectives. We explore these approaches to find out which deep network architectures are utilized to extract features from drug compound and protein sequences. Also, the advantages and limitations of each architecture are analyzed and compared. Moreover, we explore the process of how to combine descriptors for drug and protein features. Likewise, a list of datasets that are commonly used in DTIs prediction is investigated. Finally, current challenges are discussed and a short future outlook of deep learning in DTI prediction is given.

  • Research Article
  • Cite Count: 1
  • 10.1021/acs.jcim.5c01250
SGcCA: Deciphering Drug-Target Interactions through an End-to-End Model with Spatial and Channel Reconstruction Convolution and Cross-Efficient-Additive Attention.
  • Oct 9, 2025
  • Journal of chemical information and modeling
  • Lihong Peng + 5 more

Drug-Target Interaction (DTI) prediction is an indispensable process in drug repositioning. Wet-lab experiments for potential DTI identification are reliable but expensive, labor-intensive, and time-consuming. Deep learning demonstrates the superior representation learning capability in the DTI prediction. However, there is still debate about how to accurately learn drug and protein features and further effectively fuse these features. To address the above issues, this work introduces SGcCA, an end-to-end DTI prediction framework by incorporating Spatial and Channel reconstruction Convolution (SCConv), Graph convolutional Network (GCN), and Cross-efficient-additive Attention (CEAA). First, an SCConv module is proposed to encode drug features from their SMILES strings and protein features from their amino acid sequences by reducing spatial and channel redundancies. Next, GCN is employed to encode drug features from their 2D molecular graphs. Subsequently, a CEAA block is devised to fuse the learned drug and protein features. Finally, the fused features are taken as the inputs and all unobserved drug-target pairs are classified through a multilayer perceptron. Using accuracy, F1-score, MCC, AUROC, and AUPRC as evaluation metrics, SGcCA outperformed six popular DTI prediction models (i.e., CPI-GNN, MolTrans, BACPI, CPGL, GIFDTI, and FOTF-CPI) under four different experimental scenarios on four publicly available DTI data sets (Human, C.elegans, BindingDB, and DrugBank), showcasing its better interpretability and generalization ability. Ablation study further underscored the importance of SCConv, CEAA, and GCN. Moreover, visualization of the fused features along with case study and molecular docking outcomes ensured that the predicted DTIs matched closely with the real interactions, further proving the greater performance of SGcCA. As an open-source tool, SGcCA is poised to provide support for drug repositioning. 
The source codes and data are freely available: https://github.com/plhhnu/SGcCA.

  • Research Article
  • Cite Count: 51
  • 10.1371/journal.pone.0226484
Drug-target interaction prediction using Multi Graph Regularized Nuclear Norm Minimization
  • Jan 16, 2020
  • PLOS ONE
  • Aanchal Mongia + 1 more

The identification of potential interactions between drugs and target proteins is crucial in pharmaceutical sciences. The experimental validation of interactions in genomic drug discovery is laborious and expensive; hence, there is a need for efficient and accurate in-silico techniques which can predict potential drug-target interactions to narrow down the search space for experimental verification. In this work, we propose a new framework, namely, Multi-Graph Regularized Nuclear Norm Minimization, which predicts the interactions between drugs and target proteins from three inputs: known drug-target interaction network, similarities over drugs and those over targets. The proposed method focuses on finding a low-rank interaction matrix that is structured by the proximities of drugs and targets encoded by graphs. Previous works on Drug Target Interaction (DTI) prediction have shown that incorporating drug and target similarities helps in learning the data manifold better by preserving the local geometries of the original data. But, there is no clear consensus on which kind and what combination of similarities would best assist the prediction task. Hence, we propose to use various multiple drug-drug similarities and target-target similarities as multiple graph Laplacian (over drugs/targets) regularization terms to capture the proximities exhaustively. Extensive cross-validation experiments on four benchmark datasets using standard evaluation metrics (AUPR and AUC) show that the proposed algorithm improves the predictive performance and outperforms recent state-of-the-art computational methods by a large margin. Software is publicly available at https://github.com/aanchalMongia/MGRNNMforDTI.

  • Research Article
  • Cite Count: 849
  • 10.1371/journal.pcbi.1002503
Prediction of Drug-Target Interactions and Drug Repositioning via Network-Based Inference
  • May 10, 2012
  • PLoS Computational Biology
  • Feixiong Cheng + 8 more

Drug-target interaction (DTI) is the basis of drug discovery and design. It is time consuming and costly to determine DTI experimentally. Hence, it is necessary to develop computational methods for the prediction of potential DTI. Based on complex network theory, three supervised inference methods were developed here to predict DTI and used for drug repositioning, namely drug-based similarity inference (DBSI), target-based similarity inference (TBSI) and network-based inference (NBI). Among them, NBI performed best on four benchmark data sets. Then a drug-target network was created with NBI based on 12,483 FDA-approved and experimental drug-target binary links, and some new DTIs were further predicted. In vitro assays confirmed that five old drugs, namely montelukast, diclofenac, simvastatin, ketoconazole, and itraconazole, showed polypharmacological features on estrogen receptors or dipeptidyl peptidase-IV, with half maximal inhibitory or effective concentrations ranging from 0.2 to 10 µM. Moreover, simvastatin and ketoconazole showed potent antiproliferative activities on human MDA-MB-231 breast cancer cell line in MTT assays. The results indicated that these methods could be powerful tools in prediction of DTIs and drug repositioning.
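
Network-based inference can be illustrated with a two-step resource diffusion on the bipartite drug-target graph. The sketch below follows the general resource-allocation idea (targets pass resource to their drugs, which pass it back to their targets) and is written in the spirit of NBI rather than as a reproduction of the paper's exact formulation; the function name and toy matrix are made up.

```python
def nbi_scores(adj):
    """Two-step resource diffusion on a drugs x targets 0/1 adjacency matrix.

    For each drug, its known targets start with one unit of resource each.
    Step 1 spreads each target's resource evenly over the drugs linked to it;
    step 2 spreads each drug's resource evenly over its targets. The resource
    that lands on a target is the score for that (drug, target) pair.
    """
    n_drugs, n_targets = len(adj), len(adj[0])
    drug_deg = [sum(row) for row in adj]
    target_deg = [sum(adj[i][j] for i in range(n_drugs)) for j in range(n_targets)]
    scores = []
    for i in range(n_drugs):
        # Step 1: targets -> drugs.
        on_drugs = [
            sum(adj[o][l] * adj[i][l] / target_deg[l]
                for l in range(n_targets) if target_deg[l])
            for o in range(n_drugs)
        ]
        # Step 2: drugs -> targets.
        on_targets = [
            sum(adj[o][j] * on_drugs[o] / drug_deg[o]
                for o in range(n_drugs) if drug_deg[o])
            for j in range(n_targets)
        ]
        scores.append(on_targets)
    return scores


if __name__ == "__main__":
    # Rows are drugs, columns are targets (toy data).
    adjacency = [
        [1, 1, 0],
        [1, 0, 1],
    ]
    for row in nbi_scores(adjacency):
        print(row)
```

In this toy network the first drug receives a non-zero score for the third target only because it shares the first target with the second drug, so the prediction comes purely from topology. Unobserved pairs would then be ranked by score.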

  • Conference Article
  • 10.1109/aims52415.2021.9466059
One-Dimensional Convolutional Neural Network Method as The Predicting Model for Interactions Between Drug and Protein on Heterogeneous Network
  • Apr 28, 2021
  • Iswahyuli + 3 more

The prediction of drug-target interactions (DTI) is an important step in drug development and repositioning. Experimental identification of drug-target interactions is expensive and time-consuming. Therefore, computational approaches for predicting drug-target interactions are being developed to alleviate work in drug development, and many such approaches have appeared in recent years. Among the most popular recent models for predicting drug-target interactions are machine learning-based approaches that use homogeneous network information. However, the accuracy and efficiency of these methods still need to be improved. Therefore, this research proposes a deep learning-based prediction model for DTI implemented on heterogeneous networks. We use 12,015 nodes and 1,895,445 edges extracted from several databases to build the heterogeneous network. The proposed DTI prediction model implements the random walk with restart (RWR) algorithm to build a heterogeneous network of drugs and protein targets, and utilizes the diffusion component analysis (DCA) algorithm to obtain low-dimensional vectors. Furthermore, a one-dimensional convolutional neural network (1D-CNN) is used as the predictive model between drug and target. The results show that the proposed model performs well, with a mean AUROC of 0.9332 and a mean AUPR of 0.9402.
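
Random walk with restart is a standard algorithm, and a minimal version is sketched below. The restart probability, tolerance, and toy graph are illustrative assumptions, since the abstract gives no parameter values, and the DCA and 1D-CNN stages are not shown.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * e_seed until convergence.

    `adj` is a square non-negative adjacency matrix; W is its column-normalised
    form, so each step moves probability mass along edges. `restart` is an
    assumed value, not one taken from the paper.
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    p = start[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


if __name__ == "__main__":
    # Path graph 0 - 1 - 2, walker restarting at node 0.
    path = [
        [0, 1, 0],
        [1, 0, 1],
        [0, 1, 0],
    ]
    print(random_walk_with_restart(path, seed=0))
```

The resulting vector ranks every node by proximity to the seed. Running it once per node gives one diffusion profile per drug or protein, which a dimensionality-reduction step can then compress.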

  • Research Article
  • Cite Count: 16
  • 10.1093/bioinformatics/btad774
GeNNius: an ultrafast drug-target interaction inference method based on graph neural networks.
  • Dec 22, 2023
  • Bioinformatics
  • Uxía Veleiro + 9 more

Drug-target interaction (DTI) prediction is a relevant but challenging task in the drug repurposing field. In-silico approaches have drawn particular attention as they can reduce associated costs and time commitment of traditional methodologies. Yet, current state-of-the-art methods present several limitations. First, existing DTI prediction approaches are computationally expensive, thereby hindering the ability to use large networks and exploit available datasets. Second, the generalization of DTI prediction methods to unseen datasets remains unexplored, although studying it could potentially improve the development processes of DTI inferring approaches in terms of accuracy and robustness. In this work, we introduce GeNNius (Graph Embedding Neural Network Interaction Uncovering System), a Graph Neural Network (GNN)-based method that outperforms state-of-the-art models in terms of both accuracy and time efficiency across a variety of datasets. We also demonstrated its prediction power to uncover new interactions by evaluating not previously known DTIs for each dataset. We further assessed the generalization capability of GeNNius by training and testing it on different datasets, showing that this framework can potentially improve the DTI prediction task by training on large datasets and testing on smaller ones. Finally, we investigated qualitatively the embeddings generated by GeNNius, revealing that the GNN encoder maintains biological information after the graph convolutions while diffusing this information through nodes, eventually distinguishing protein families in the node embedding space. GeNNius code is available at https://github.com/ubioinformat/GeNNius.

  • Research Article
  • Cite Count: 29
  • 10.1093/bioinformatics/btaa038
Secure multiparty computation for privacy-preserving drug discovery.
  • Jan 17, 2020
  • Bioinformatics
  • Rong Ma + 6 more

Quantitative structure-activity relationship (QSAR) and drug-target interaction (DTI) prediction are both commonly used in drug discovery. Collaboration among pharmaceutical institutions can lead to better performance in both QSAR and DTI prediction. However, the drug-related data privacy and intellectual property issues have become a noticeable hindrance for inter-institutional collaboration in drug discovery. We have developed two novel algorithms under secure multiparty computation (MPC), including QSARMPC and DTIMPC, which enable pharmaceutical institutions to achieve high-quality collaboration to advance drug discovery without divulging private drug-related information. QSARMPC, a neural network model under MPC, displays good scalability and performance and is feasible for privacy-preserving collaboration on large-scale QSAR prediction. DTIMPC integrates drug-related heterogeneous network data and accurately predicts novel DTIs, while keeping the drug information confidential. Under several experimental settings that reflect the situations in real drug discovery scenarios, we have demonstrated that DTIMPC possesses significant performance improvement over the baseline methods, generates novel DTI predictions with supporting evidence from the literature and shows the feasible scalability to handle growing DTI data. All these results indicate that QSARMPC and DTIMPC can provide practically useful tools for advancing privacy-preserving drug discovery. The source codes of QSARMPC and DTIMPC are available on the GitHub: https://github.com/rongma6/QSARMPC_DTIMPC.git. Supplementary data are available at Bioinformatics online.

  • Research Article
  • 10.1504/ijdmb.2020.10032430
Drug target interaction prediction via multi-task co-attention
  • Jan 1, 2020
  • International Journal of Data Mining and Bioinformatics
  • Yun Liang + 4 more

Drug-Target Interaction (DTI) prediction is a key step in drug discovery and drug repurposing. A variety of machine learning models are considered to be effective means of predicting DTI. Most current studies regard DTI prediction as a classification task (that is, negative or positive labels are applied to indicate the intensity of interaction) or a regression task (a numerical value is used to measure detailed DTI). In this article, we explore how to balance bias and variance through a multi-task learning framework, because classifiers are more likely to produce higher bias, while regression models are more prone to significant variance and to overfitting the training data. We propose a novel model, named Multi-DTI, that can predict the precise value and determine the correct labels of positive or negative interactions. Besides, these two tasks are performed with similar feature representations of CNN, which is adopted with a co-attention mechanism. Detailed experiments show that Multi-DTI is superior to state-of-the-art methods.

  • Research Article
  • Cite Count: 1
  • 10.1021/acs.jcim.5c01753
HitScreen: A Sequence-Based Drug Virtual Screening Approach Using Data Augmentation and Protein Language Models.
  • Sep 16, 2025
  • Journal of chemical information and modeling
  • Geng Chen + 7 more

Sequence-based drug-target interaction (DTI) prediction is an effective approach for identifying potential drug candidates without relying on three-dimensional protein structures. However, current sequence-based methods often suffer from limited generalization to novel targets and fail to capture essential spatial interaction features. As a result, they exhibit a significant performance gap compared with structure-based methods. To bridge this gap, we propose HitScreen, a robust deep learning framework specifically designed for sequence-based DTI prediction, applied to virtual screening scenarios. We introduce a conditional label inversion strategy to address class imbalance, annotation biases, and ligand biases in the data sets. HitScreen integrates multiple pretrained protein language models (Ankh, ESM-2, ProtT5) alongside the molecular pretrained model Uni-Mol to encode spatial information. Additionally, HitScreen utilizes a cross-attention mechanism to capture local intermolecular interactions between drug molecules and protein sequences. Rigorous benchmarking on independent data sets (DEKOIS2.0 and DUD-E) demonstrates that HitScreen achieves performance comparable to or surpassing state-of-the-art structure-based methods, while relying solely on protein sequence information. Comprehensive interpretability analyses further validate how the model accurately identifies biologically relevant molecular interactions, providing valuable insights into rational drug design. In summary, these findings demonstrate HitScreen as a robust, interpretable, and broadly applicable framework for DTI prediction with applications in sequence-based drug virtual screening.

  • Research Article
  • 10.1371/journal.pone.0331037
KG-MACNF: A nonlinear cross-modal fusion model for predicting drug-target interactions via multi-relational embedding and fine-grained structure
  • Sep 9, 2025
  • PLOS One
  • Yihan Feng + 6 more

Drug-target interaction (DTI) prediction is essential for the development of novel drugs and the repurposing of existing ones. However, when the features of drugs and targets are applied to biological networks, the relational features of drug-target interactions are often not captured. Moreover, the corresponding multimodal models mainly depend on shallow fusion strategies, which results in suboptimal performance when trying to capture complex interaction relationships. Therefore, this study proposes a novel framework named KG-MACNF. This framework utilizes knowledge graph embedding (KGE) techniques to capture multi-level relational features of entities in large-scale biological networks. Simultaneously, our innovative PoolGAT network, along with CTD descriptors, is employed to extract drug structural features and protein sequence information. Finally, by employing our innovative nonlinear-driven cross-modal attention fusion network, the framework efficiently integrates these multimodal data and generates the final DTI prediction results. Experiments on two publicly available datasets, Yamanishi_08’s and BioKG, demonstrate the substantial advantages of KG-MACNF in DTI prediction. KG-MACNF demonstrates robust stability, especially under imbalanced data conditions. This study successfully overcomes the bottlenecks of prior models in utilizing modality information and feature complementarity, providing a more accurate tool for drug discovery and DTI prediction.
