The global background concentration of ozone (O₃) has shown a rising trend in recent years. Among the available methods, ground-based monitoring of O₃ concentrations is highly reliable for research analysis. However, capturing the spatial characteristics of O₃ concentrations requires a sufficiently dense network of ground monitoring sites. In recent years, many researchers have used machine learning models to estimate surface O₃ concentrations, but these models cannot fully exploit the spatial and temporal information contained in a sample dataset. To address this problem, the current study used a deep learning model, the Residual connection Convolutional Long Short-Term Memory network (R-ConvLSTM), to estimate the daily maximum 8-hr average (MDA8) O₃ over Jiangsu province, China, during 2020. The R-ConvLSTM model not only captures the spatiotemporal information of MDA8 O₃ but also uses residual connections to mitigate the exploding and vanishing gradients that arise as the network deepens. We used the TROPOMI total O₃ column retrieved from Sentinel-5 Precursor, ERA5 reanalysis meteorological data, and other supplementary data to build the training dataset. The R-ConvLSTM model achieved an overall sample-based cross-validation (CV) R² of 0.955 with a root mean square error (RMSE) of 9.372 µg/m³, and a city-based CV R² of 0.896 with an RMSE of 14.029 µg/m³. The estimated MDA8 O₃ was highest in spring (122.60 ± 31.60 µg/m³) and lowest in winter (69.93 ± 18.48 µg/m³).
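
For readers unfamiliar with the quantities reported above, the following is a minimal sketch of how an MDA8 value and the two evaluation metrics could be computed. It is an illustration only, not the study's code. The function names `mda8`, `rmse`, and `r2` are hypothetical. The sketch assumes that MDA8 is taken over the 8-hour windows lying entirely within one day of 24 hourly values (regulatory definitions differ in how windows crossing midnight are handled) and that R² is the coefficient of determination (some studies instead report the squared Pearson correlation).

```python
from math import sqrt


def mda8(hourly):
    """Daily maximum 8-hour average from one day of 24 hourly values.

    Assumption: only windows fully inside the day are used (17 windows).
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    window = 8
    # Mean of every run of 8 consecutive hours within the day.
    means = [sum(hourly[i:i + window]) / window
             for i in range(24 - window + 1)]
    return max(means)


def rmse(obs, est):
    """Root mean square error between observations and estimates."""
    return sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def r2(obs, est):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(obs, est))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot
```

In a sample-based cross-validation, `obs` would hold the held-out station measurements and `est` the corresponding model estimates. In a city-based cross-validation, the held-out set would instead consist of all samples from the withheld cities.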