Spatial-temporal forecasting of urban sensor data is important for many urban systems, such as intelligent transportation and smart cities. However, hardware or network failures leave missing values, or entirely missing monitoring sensors, that need to be interpolated. Recent deep learning research has made substantial progress on the imputation problem, especially its temporal aspect (i.e., time series imputation), while little attention has been paid to the spatial aspect (both dynamic and static) and to long-term temporal dependencies. In this article, we propose a spatial-temporal imputation model, named Long Short-Term Graph Convolution Networks (LSTGCN), which includes a gated temporal extraction (GTE) module, a multi-head attention-based temporal capture (MHAT) module, a long-term periodic temporal encoding (LPTE) module, and a bidirectional spatial graph convolution (BSGC) module. The GTE adopts a gated mechanism to filter short-term temporal information, while the MHAT uses position encoding to sharpen the distinction between timestamps and then applies multi-head attention to capture short-term temporal dependencies. The BSGC handles spatial relationships between sensor nodes, and a periodic encoding technique is designed to process long-term temporal dependencies. Our experimental analysis covers completion and forecasting tasks, as well as transfer and ablation analyses. The results show that the proposed model outperforms state-of-the-art baselines on real-world datasets.
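To make two of the ingredients named above more concrete, the following is a minimal sketch, not the LSTGCN implementation. The abstract gives no formulas, so the sketch assumes a sinusoidal position encoding of the kind commonly used with multi-head attention, and a tanh-sigmoid gate as one common form of gated mechanism. The function names `sinusoidal_position_encoding` and `gated_filter` are hypothetical and chosen only for illustration.

```python
import math


def sinusoidal_position_encoding(num_steps, dim):
    """Return a num_steps x dim table of sinusoidal position codes.

    Assumption: the standard sine/cosine scheme; the encoding actually
    used by the MHAT module may differ.
    """
    table = []
    for pos in range(num_steps):
        row = []
        for i in range(dim):
            # Each pair of dimensions (2k, 2k+1) shares one frequency.
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_filter(candidate, gate):
    """Element-wise tanh(candidate) * sigmoid(gate).

    Assumption: a generic gating form; a gate value near 0 suppresses
    the corresponding feature, a value near 1 lets it pass.
    """
    return [math.tanh(c) * sigmoid(g) for c, g in zip(candidate, gate)]


# Every timestamp receives a distinct code, which attention can then use
# to tell otherwise similar readings apart.
codes = sinusoidal_position_encoding(num_steps=4, dim=6)

# A strongly negative gate input filters the feature out; a strongly
# positive one keeps it almost unchanged.
filtered = gated_filter([0.5, 0.5], [-50.0, 50.0])
```

In a trained model the candidate and gate inputs would come from learned projections of the sensor series; here they are passed in directly to keep the sketch self-contained.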