Sea surface temperature (SST) prediction has received increasing attention in recent years because of its importance across many fields of oceanography. Existing studies have shown that neural networks can make accurate SST predictions by capturing the spatiotemporal dependencies in SST data. Among these models, the ConvLSTM framework is especially prominent. It combines convolutional neural networks (CNNs) with recurrent neural networks (RNNs), which allows it to capture spatial and temporal dependencies within a single computational framework. However, CNNs primarily capture local spatial information. To overcome this limitation, we propose a novel model named DatLSTM that integrates a deformable attention transformer (DAT) module into the ConvLSTM framework, enhancing its ability to model more complex spatial relationships. Specifically, the DAT module adaptively focuses on salient spatial features, while ConvLSTM captures the temporal dependencies of the spatial correlations in the SST data. In this way, DatLSTM can adaptively capture complex spatiotemporal dependencies between the preceding and current states within ConvLSTM. To evaluate DatLSTM, we conducted short-term SST forecasts in the Bohai Sea region with lead times ranging from 1 to 10 days and compared its performance with that of several benchmark models: ConvLSTM, PredRNN, TCTN, and SwinLSTM. Our experimental results show that the proposed model outperforms all of these models across multiple evaluation metrics for short-term SST prediction. The proposed model offers a new predictive learning method for improving the accuracy of spatiotemporal predictions in various domains, including meteorology, oceanography, and climate science.
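
For readers unfamiliar with the ConvLSTM baseline that DatLSTM builds on, the recurrence can be sketched as follows. This is a minimal NumPy illustration of a standard ConvLSTM step with peephole terms omitted; it is not the DatLSTM implementation, and the DAT module is not shown. The function names (`conv2d_same`, `convlstm_step`), the tensor shapes, the random weights, and the synthetic input frames are all assumptions made for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv2d_same(x, w):
    """Zero-padded 'same' 2-D cross-correlation.

    x: (C_in, H, W) input; w: (C_out, C_in, k, k) kernel with odd k.
    Returns an array of shape (C_out, H, W).
    """
    k = w.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    # windows[c, i, j] is the k x k patch centred on pixel (i, j) of channel c
    windows = np.lib.stride_tricks.sliding_window_view(xp, (k, k), axis=(1, 2))
    return np.einsum("cijuv,ocuv->oij", windows, w)


def convlstm_step(x, h_prev, c_prev, w, b):
    """One ConvLSTM update (hypothetical helper, peephole terms omitted).

    x: (C_in, H, W); h_prev, c_prev: (C_hid, H, W)
    w: (4 * C_hid, C_in + C_hid, k, k); b: (4 * C_hid,)
    """
    # A single convolution over [input, previous hidden state] yields all gates.
    z = conv2d_same(np.concatenate([x, h_prev], axis=0), w) + b[:, None, None]
    i, f, g, o = np.split(z, 4, axis=0)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)  # cell state update
    h = sigmoid(o) * np.tanh(c)                        # new hidden state
    return h, c


# Roll the cell over a short sequence of synthetic frames (a stand-in for SST fields).
rng = np.random.default_rng(0)
c_in, c_hid, k, height, width = 1, 4, 3, 16, 16
w = rng.normal(scale=0.1, size=(4 * c_hid, c_in + c_hid, k, k))
b = np.zeros(4 * c_hid)
frames = rng.normal(size=(10, c_in, height, width))

h = np.zeros((c_hid, height, width))
c = np.zeros_like(h)
for x in frames:
    h, c = convlstm_step(x, h, c, w, b)

print(h.shape)  # (4, 16, 16)
```

Because every gate here is computed by a small convolution kernel, each update mixes information only within a local neighbourhood. This is the locality limitation the abstract refers to, and it is the part of the cell that the DAT module is intended to address.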