To construct reduced-order models, we propose a hybrid of a convolutional autoencoder and a convolutional LSTM (CAE-ConvLSTM). By using convolutional LSTMs, the model preserves spatial information at the level of representative features, which are extracted by convolutional autoencoding. We further propose a block attention algorithm that improves the ability of the convolutional layers to capture long-range dependencies among features; the results show that the concatenated version of this attention mechanism improves model performance. We also modify CAE-ConvLSTM with a first- and second-order time-derivative architecture, so that forecasting is conditioned not only on the current state but also on dynamic representations of the input, allowing better features to be captured. The results indicate that these modifications significantly improve the model's accuracy: accuracy gains of more than 3000-fold and 30-fold were obtained on the cylinder and dam-break test cases, respectively, along with the ability to predict the next 20 steps.
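The two ingredients named above can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes a single-channel 2D field, a naive "same"-padded convolution, and the common interpretation of a first- and second-order time-derivative input as finite differences of consecutive snapshots stacked as extra channels. All function names (`convlstm_step`, `derivative_channels`) are illustrative.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D cross-correlation of a single-channel field."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, Wx, Wh, b):
    """One ConvLSTM update: gates are convolutions, so the hidden and cell
    states h, c remain spatial maps and spatial structure is preserved."""
    gates = [conv2d_same(x, Wx[g]) + conv2d_same(h, Wh[g]) + b[g] for g in range(4)]
    i, f, o = sigmoid(gates[0]), sigmoid(gates[1]), sigmoid(gates[2])
    g = np.tanh(gates[3])
    c_new = f * c + i * g          # standard LSTM cell update, elementwise
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def derivative_channels(u_tm2, u_tm1, u_t):
    """Stack the current field with finite-difference estimates of its first
    and second time derivatives (assumed form of the derivative architecture)."""
    d1 = u_t - u_tm1                      # first-order difference
    d2 = u_t - 2.0 * u_tm1 + u_tm2        # second-order difference
    return np.stack([u_t, d1, d2])        # shape: (3, H, W)
```

In this reading, `derivative_channels` supplies the "dynamic representation of the input": the predictor sees the state together with its rate of change, rather than the current snapshot alone.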