History matching can estimate the parameter of spatially varying geological properties and provide reliable numerical models for reservoir development and management. However, in practice, high-dimension, multiple-solutions and computational cost are key issues that restrict the application of history matching methods. Recently, the combination of deep-learning-based surrogate model and sampling algorithm has been widely studied in history matching to overcome the limitations. Considering that real-world large-scale reservoirs often have hundreds of thousands or even millions of grid-based uncertain parameters, extracting spatial features using convolutional neural networks requires a lot of computational cost and storage requirements. Therefore, in this work, we mainly study how to use the recurrent neural network (RNN) to construct the surrogate model for history matching. Specifically, we propose a multilayer RNN surrogate model based on a vector-to-sequence modeling framework. The multilayer RNN surrogate model with gated recurrent unit (GRU), termed MLGRU, is developed to approximate the mapping from feature vector of geological realizations to the production data. The feature vector is the low-dimensional representation of geological parameter fields after using the re-parameterization method, while production data are the simulation results of historical period. In addition, we design a log-transformation-based windowed normalization (LTWN) method for the production data, which can enhance the learnability and features of production data. The MLGRU model is incorporated into a multimodal estimation of distribution algorithm (MEDA) to formulate a history matching workflow. The hyperparameters and performance of the proposed MLGRU model are analyzed by numerical experiments on a 2D reservoir model. Furthermore, numerical experiments performed on the Brugge benchmark model, a large-scale 3D reservoir model, demonstrated the performance of the proposed surrogate model and history matching method.