Since the inception of the China–Europe Railway Express (CRE), rail has emerged as a predominant means of transporting goods across international borders. The quality and reliability of railway transport services depend on how accurately train travel times can be predicted. However, the real-time data available for CRE trains are limited because of the volume and duration of CRE operations, which makes forecasting CRE train travel times more difficult. To improve prediction accuracy, this study introduces a novel two-stage transfer learning model with Long Short-Term Memory attention (T.R2_LSTM_A). By combining an attention mechanism, a Long Short-Term Memory (LSTM) network module, the two-stage TrAdaBoost.R2 algorithm, and transfer learning (TL), the model captures the inherent characteristics of time series data and attains high prediction accuracy. This approach mitigates the limitations of current transfer learning methods, which are primarily caused by insufficient sample data. In the model, an instance-based TL method addresses the shortage of data, and an attention-based LSTM serves as the base learner to overcome the problems of vanishing and exploding gradients. Additionally, LSTM Attention (LSTM_A), a novel attention mechanism, is introduced to improve the model's ability to capture complex characteristics. Using a real-life CRE case study, the T.R2_LSTM_A model is systematically evaluated and shown to outperform Random Forest (RF), Support Vector Regression (SVR), LSTM, convolution-based LSTM (CLSTM), AdaBoost_LSTM, LSTM with attention mechanism (LSTM_Attention), and TrAdaBoost.R2_RF (T.R2_RF) in predicting train travel time. The MSE, RMSE, MAPE, and MAE values for the T.R2_LSTM_A model are 5.572, 2.361, 2.578, and 1.767, respectively.
These error values are lower than those of the other widely used prediction methods, indicating that the T.R2_LSTM_A model is well suited to predicting CRE train travel times.
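For readers unfamiliar with the four error measures reported above, the following is a minimal sketch of how MSE, RMSE, MAPE, and MAE are conventionally computed from paired actual and predicted travel times. The function name `regression_errors`, the sample values, and the choice to express MAPE as a percentage are assumptions made for illustration; they are not taken from the study.

```python
import math


def regression_errors(actual, predicted):
    """Return MSE, RMSE, MAPE (as a percentage), and MAE for paired values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mse = sum(r * r for r in residuals) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(r) for r in residuals) / n
    # MAPE is undefined when an actual value is zero; travel times are positive.
    mape = 100.0 * sum(abs(r / a) for r, a in zip(residuals, actual)) / n
    return {"MSE": mse, "RMSE": rmse, "MAPE": mape, "MAE": mae}


# Hypothetical travel times in hours, purely illustrative (not from the study).
actual_hours = [100.0, 120.0, 80.0, 150.0]
predicted_hours = [98.0, 123.0, 81.0, 146.0]

errors = regression_errors(actual_hours, predicted_hours)
for name, value in errors.items():
    print(f"{name}: {value:.3f}")
```

All four measures are errors, so lower values indicate better predictions. Because RMSE is the square root of MSE, the reported RMSE of 2.361 agrees with the reported MSE of 5.572 up to rounding.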