Accurate prediction of drug–target binding affinity (DTA) is essential in drug discovery. Researchers have recently turned to artificial intelligence to screen out large numbers of ineffective compounds, reducing labor and financial costs. Although graph neural networks (GNNs) have been applied to DTA, existing GNNs are limited in their ability to extract substructural features of varying sizes. Functional groups play a crucial role in modulating molecular properties, yet existing GNNs struggle to extract features from certain motifs because of scale mismatches. In addition, sequence-based models of target proteins do not integrate structural information. To address these limitations, we present SSR-DTA, a multi-layer graph network that adapts to diverse structural sizes and extracts richer biological features, thereby improving the robustness and accuracy of predictions. Its multi-layer GNNs capture molecular motifs at different scales, from atomic to macrocyclic motifs. Furthermore, we introduce BiGNN to learn sequence and structural information simultaneously: sequence information corresponds to the primary structure of a protein, while graph information represents its tertiary structure. BiGNN captures richer information than sequence-based methods while mitigating the impact of errors in predicted structures, resulting in more accurate predictions. In experiments on four benchmark datasets, SSR-DTA outperforms state-of-the-art models. In particular, it reduces mean squared error by 20% on the Davis dataset and by 5% on the KIBA dataset relative to state-of-the-art models, underscoring its potential as a valuable tool for advancing DTA prediction.