The local climate zone (LCZ) scheme was originally proposed to provide an interdisciplinary taxonomy for urban heat island (UHI) studies. In recent years, the scheme has also become a starting point for the development of higher-level products, as the LCZ classes can help provide a generalized understanding of urban structures and land uses. LCZ mapping can therefore theoretically aid in fostering a better understanding of spatio-temporal dynamics of cities on a global scale. However, reliable LCZ maps are not yet available globally. As a first step toward automatic LCZ mapping, this work focuses on LCZ-derived land cover classification, using multi-seasonal Sentinel-2 images. We propose a recurrent residual network (Re-ResNet) architecture that is capable of learning a joint spectral-spatial-temporal feature representation within a unitized framework. To this end, a residual convolutional neural network (ResNet) and a recurrent neural network (RNN) are combined into one end-to-end architecture. The ResNet is able to learn rich spectral-spatial feature representations from single-seasonal imagery, while the RNN can effectively analyze temporal dependencies of multi-seasonal imagery. Cross validations were carried out on a diverse dataset covering seven distinct European cities, and a quantitative analysis of the experimental results revealed that the combined use of the multi-temporal information and Re-ResNet results in an improvement of approximately 7 percent points in overall accuracy. The proposed framework has the potential to produce consistent-quality urban land cover and LCZ maps on a large scale, to support scientific progress in fields such as urban geography and urban climatology.