Accurate measurement of the binding affinity between drugs and their targets is of great importance in drug discovery, as it provides key information about drug action. However, the diverse representations of compounds and proteins make this task difficult. Recently, numerous studies have applied a range of deep learning methods and architectures to Drug-Target Affinity (DTA) prediction, steadily improving performance. In this work, we present MGDTA, a multimodal late-stage fusion prediction model that combines the Mamba architecture with a hybrid architecture built from graph transformer and transformer layers (GTT). For molecular compounds, structural features are extracted using GTT layers. In addition, molecular features are obtained from derived molecular fingerprints, and a feature integration module combines the two into a unified representation of the drug molecule. For proteins, we adopt Mamba, a state-space model (SSM) module, to learn feature representations from target protein sequences. This represents a preliminary exploration of the Mamba module for protein sequence feature extraction in DTA tasks. Finally, we predict the affinity by integrating the feature representations of the molecule and the protein. Our results indicate that the proposed model achieves superior performance on benchmark datasets, supporting the feasibility of using the Mamba architecture for DTA tasks. Furthermore, sequence feature visualization shows that the Mamba block can focus on binding-site residues within drug targets.
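
The late-stage fusion data flow described above can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the MGDTA implementation: the function names `integrate_drug_features` and `late_fusion_predict` are hypothetical, plain concatenation stands in for the feature integration module, a single linear layer stands in for the prediction head, and the toy vectors stand in for the outputs of the GTT, fingerprint, and Mamba branches.

```python
from typing import List, Sequence

Vector = List[float]


def integrate_drug_features(graph_feat: Sequence[float],
                            fingerprint_feat: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the feature integration module.

    Assumption: the two drug views are merged by plain concatenation.
    """
    return list(graph_feat) + list(fingerprint_feat)


def late_fusion_predict(drug_feat: Sequence[float],
                        protein_feat: Sequence[float],
                        weights: Sequence[float],
                        bias: float = 0.0) -> float:
    """Late fusion: each modality is encoded separately, then the two
    representations are joined and mapped to one affinity score.

    Assumption: a single linear layer replaces the real prediction head.
    """
    fused = list(drug_feat) + list(protein_feat)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    return sum(w * x for w, x in zip(weights, fused)) + bias


# Toy vectors standing in for the three branch outputs (made-up values).
graph_feat = [0.5, -1.0]             # would come from the GTT layers
fingerprint_feat = [1.0, 0.0, 1.0]   # would come from molecular fingerprints
protein_feat = [2.0, 0.5]            # would come from the Mamba block

drug_feat = integrate_drug_features(graph_feat, fingerprint_feat)
weights = [1.0] * (len(drug_feat) + len(protein_feat))
affinity = late_fusion_predict(drug_feat, protein_feat, weights, bias=0.5)
print(affinity)
```

The point of the sketch is the ordering: the drug and protein branches never interact until their final representations are joined, which is what distinguishes late fusion from schemes that mix modalities inside the encoders.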