With the growing energy demand and stricter environmental protection requirements placed on heating systems, natural gas (NG), an efficient and clean alternative to coal, plays an increasingly important role in district heating systems (DHSs). Accurate prediction of city-wide DHS consumption of NG helps in formulating an efficient energy scheduling plan. However, because a DHS is a complex non-linear system with spatiotemporal delay effects and multiple influencing factors, traditional algorithms fall short in prediction accuracy for the NG consumption of a city-level DHS. To achieve accurate prediction, a gas consumption prediction algorithm based on the attention gated recurrent unit (AGRU) model is proposed, which extracts high-level features from the historical data that affect gas consumption. In addition, the attention mechanism helps to select the more critical feature inputs and thereby improves the prediction accuracy. Detailed comparative experiments were performed between the proposed AGRU and state-of-the-art NG consumption prediction algorithms, including the recurrent neural network (RNN), long short-term memory (LSTM), GRU, attention RNN (ARNN), attention LSTM (ALSTM), support vector regression (SVR), random forest regression (RFR), gradient boosting regression (GBR) and decision tree regression (DTR). An analysis of the experimental results shows that the proposed AGRU algorithm achieves a prediction accuracy of 95.3%, which is significantly better than that of the other algorithms. The hyperparameters of the AGRU algorithm are also tested in detail to guide their selection.
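As a rough illustration of how an attention mechanism can emphasize the more critical inputs, the sketch below pools a sequence of recurrent hidden states into a single context vector using softmax-normalized weights. It is a minimal sketch under stated assumptions, not the AGRU model itself: the dot-product scoring rule, the function names `softmax` and `attention_pool`, the `query` vector, and the sample hidden states are all hypothetical, and the abstract does not specify which attention form the proposed model uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Score each time step by a dot product with `query`, normalize the
    scores with softmax, and return the weighted sum of hidden states."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states)) for i in range(dim)
    ]
    return context, weights


# Hypothetical hidden states for three time steps; in a full model these
# would be the outputs of a GRU run over the historical input sequence.
hidden_states = [[0.1, 0.0], [0.5, 0.2], [0.9, -0.3]]
query = [1.0, 0.0]  # hypothetical scoring vector (learned in a real model)

context, weights = attention_pool(hidden_states, query)
print(weights)
print(context)
```

In a trained model the scoring vector would be learned jointly with the recurrent layer, and the resulting context vector would feed a regression layer that outputs the predicted gas consumption.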