Abstract The accuracy of polar motion prediction significantly affects coordinate frame transformation, satellite orbit determination, and deep space exploration. This study develops two short-term forecasting models based on the EOP 14C04 series. The first is a hybrid model that combines convolutional neural networks (CNN) and long short-term memory (LSTM) networks with an attention mechanism; the second is a baseline model comprising only CNN and LSTM. The attention module in the first model allows the temporal information at each time step to be integrated more comprehensively. In the first short-term forecasting experiment, we conducted 360 repeated predictions and found that parameters suitable for forecasting the x component of polar motion (PMX) are not necessarily suitable for the y component (PMY). In the second experiment, the two models generated a total of 500 forecasts, each covering lead times of 1 to 30 days. The first model achieves a mean absolute error (MAE) ranging from 0 to 7.72 mas for PMX and from 0 to 4.73 mas for PMY, while the MAE of the second model ranges from 0 to 7.88 mas for PMX and from 0 to 4.78 mas for PMY. Across the two experiments, the first model shows marginally better predictive accuracy than the second. The study also supports the robustness of both models in short-term prediction and the value of assigning distinct weights to past time intervals in forecasting, offering a new perspective for polar motion prediction research.
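The MAE ranges quoted in the abstract can be read as one MAE value per lead day, with the range spanning the smallest and largest of those values. The abstract does not state how the averaging was done, so the following is only a minimal sketch under that assumption: for lead day k, MAE_k is the mean over all forecast runs of |predicted - observed|. The function name `mae_per_lead_day` and the toy numbers are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def mae_per_lead_day(
    forecasts: Sequence[Sequence[float]],
    observations: Sequence[Sequence[float]],
) -> List[float]:
    """Return one MAE per lead day, averaged over all forecast runs.

    forecasts[i][k] is the predicted value (in mas) of run i at lead day k + 1;
    observations[i][k] is the matching observed value (in mas).
    """
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("need the same non-zero number of forecast and observation runs")

    horizon = len(forecasts[0])
    abs_error_sums = [0.0] * horizon

    for predicted, observed in zip(forecasts, observations):
        if len(predicted) != horizon or len(observed) != horizon:
            raise ValueError("every run must cover the same number of lead days")
        for k in range(horizon):
            abs_error_sums[k] += abs(predicted[k] - observed[k])

    return [total / len(forecasts) for total in abs_error_sums]


# Toy data (hypothetical, NOT from the study): 3 runs, 4 lead days each.
toy_forecasts = [
    [10.0, 11.0, 12.0, 13.0],
    [20.0, 21.0, 22.0, 23.0],
    [30.0, 31.0, 32.0, 33.0],
]
toy_observed = [
    [10.0, 10.0, 14.0, 10.0],
    [20.0, 22.0, 22.0, 26.0],
    [30.0, 30.0, 31.0, 36.0],
]

mae = mae_per_lead_day(toy_forecasts, toy_observed)
print(mae)                 # [0.0, 1.0, 1.0, 3.0]
print(min(mae), max(mae))  # 0.0 3.0
```

In this reading, a reported range such as "0 to 7.72 mas" would correspond to `min(mae)` and `max(mae)` computed over lead days 1 to 30 for one polar motion component.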