Accurate early-stage production prediction for shale gas wells is key to optimizing development plans and extending the life cycle of gas wells. Existing models pay little attention to wells with a short early production history, and they do not account for long shut-in intervals during early production. In addition, the uncertainty of production forecasts is often not reported because it is difficult to characterize. In this paper, a new early production prediction model is developed that integrates the long short-term memory (LSTM) model and the deep learning autoregressive (DeepAR) model. The LSTM model makes a deterministic prediction of early production, while a bidirectional LSTM (Bi-LSTM) model is applied to interpolate the data in long shut-in intervals. Because LSTM predictions tend to lag and handle abrupt production changes poorly, the DeepAR model is used to produce probabilistic predictions that quantify the uncertainty in early production prediction. To verify the performance of the proposed model, the production time series of case wells are analyzed. The experimental results indicate that the proposed early production prediction model can effectively track production dynamics with limited production data and can reliably quantify the uncertainty of early production prediction.
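To illustrate what quantifying uncertainty with a probabilistic forecast can look like in practice, the following is a minimal sketch, not the model developed in this paper. It assumes that a probabilistic forecaster such as DeepAR has already produced a set of sample paths (simulated future production trajectories) and summarizes them as a per-step prediction band whose empirical coverage can be checked against observed production. The function names (`quantile`, `prediction_band`, `coverage`), the 10% and 90% quantile levels, and all numeric values are hypothetical choices made for illustration only.

```python
import random


def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def prediction_band(sample_paths, lower_q=0.1, upper_q=0.9):
    """Return one (lower, median, upper) tuple per forecast step.

    sample_paths is a list of equal-length trajectories, e.g. draws from a
    probabilistic forecaster (assumed here; any source of samples works).
    """
    horizon = len(sample_paths[0])
    band = []
    for t in range(horizon):
        step_values = [path[t] for path in sample_paths]
        band.append((
            quantile(step_values, lower_q),
            quantile(step_values, 0.5),
            quantile(step_values, upper_q),
        ))
    return band


def coverage(band, observed):
    """Fraction of observed values that fall inside the prediction band."""
    inside = sum(1 for (lo, _, hi), y in zip(band, observed) if lo <= y <= hi)
    return inside / len(observed)


if __name__ == "__main__":
    # Hypothetical data: a declining production trend plus Gaussian noise
    # stands in for sample paths from a trained probabilistic model.
    rng = random.Random(0)
    trend = [10.0, 9.5, 9.0, 8.6, 8.2]
    sample_paths = [
        [max(0.0, level + rng.gauss(0.0, 0.5)) for level in trend]
        for _ in range(200)
    ]
    band = prediction_band(sample_paths)
    observed = [10.2, 9.4, 9.1, 8.0, 8.3]  # made-up observations
    for step, (lo, med, hi) in enumerate(band, start=1):
        print(f"step {step}: {lo:.2f} <= median {med:.2f} <= {hi:.2f}")
    print("empirical coverage:", coverage(band, observed))
```

If the band is well calibrated, the empirical coverage over many forecast steps should be close to the nominal level (80% for the 10% to 90% band assumed here); a much lower value would suggest that the forecast understates its uncertainty.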