Tropical cyclones (TCs) exert a profound impact on cities, causing extensive damage and losses. TC intensity prediction is therefore crucial for creating sustainable cities, as it enables proactive measures such as evacuation planning, infrastructure reinforcement, and emergency response coordination. In this study, we propose a Deep learning-powered TC Intensity Prediction (Deep-TCP) framework. Deep-TCP contains a data constraint module that fuses features from multiple data sources and establishes a unified global representation. To capture spatiotemporal attributes, a Spatial-Temporal Attention (ST-Attention) module distills insights from environmental variables. To improve the robustness and stability of the predictions, an encoder-decoder module built on the ConvGRU unit enhances the feature maps. A novel feature enhancement module is then introduced to bolster generalization capability and mitigate dependency attenuation. The results demonstrate that Deep-TCP significantly outperforms various benchmarks. It also effectively predicts multiple TC categories within the 6–24 h timeframe and shows strong capability in capturing changing intensity trends. The reliable predictions are potentially beneficial for disaster management and urban planning, and could enhance urban sustainability by improving preparedness and response strategies.