Current remote sensing techniques cannot observe or generate large-scale multi-layer soil moisture (SM) because of inherent limitations of satellite sensors. The resulting lack of a comprehensive understanding of multi-layer SM hinders the sustainable development of agriculture, hydrology, and food security. To overcome the depth barrier of traditional SM assimilation and downscaling methods, we developed a Two-step Multi-layer SM Downscaling (TMSMD) framework that fuses multi-source remotely sensed, reanalysis, and in-situ data through both machine learning and state-of-the-art deep learning models to generate multi-layer SM. The produced multi-layer SM was characterized by high resolution (1 km), high spatio-temporal continuity (cloud-free and daily), and high accuracy (hereafter, 3H data). First, the coarse-resolution SMAP SM was downscaled to 1 km spatial resolution using LightGBM to reduce the effects of scale mismatch and to provide high-resolution input for the subsequent calibration. Results indicated that the downscaled SMAP SM remained highly consistent with the original SMAP SM product. Second, we calibrated the downscaled SMAP SM against multi-layer in-situ SM using a state-of-the-art attention-based LSTM. Results demonstrated that the average PCC, RMSE, ubRMSE, and MAE improved by 22.3 %, 50.7 %, 26.2 %, and 56.7 % relative to SMAP L4 SM, and by 38.5 %, 52.1 %, 29.5 %, and 58.7 % relative to the downscaled SMAP SM. Further spatio-temporal and comparative analyses confirmed that the multi-layer SM produced by the TMSMD framework performed well in capturing spatial and temporal dynamics. In conclusion, the proposed TMSMD framework successfully generated 3H multi-layer SM data and is promising for accurate assessment and monitoring in agriculture, water resources, and environmental domains.
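
The four evaluation metrics named above are not expanded in this abstract. The following is a minimal sketch of how they might be computed, assuming the conventional definitions (PCC as the Pearson correlation coefficient, RMSE as root-mean-square error, ubRMSE as the RMSE after removing the mean bias, and MAE as mean absolute error). The function name `sm_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import numpy as np

def sm_metrics(estimated, observed):
    """Compute PCC, RMSE, ubRMSE, and MAE under their conventional definitions.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    diff = est - obs
    bias = diff.mean()                              # mean error
    rmse = np.sqrt(np.mean(diff ** 2))              # root-mean-square error
    ubrmse = np.sqrt(np.mean((diff - bias) ** 2))   # RMSE with the mean bias removed
    mae = np.mean(np.abs(diff))                     # mean absolute error
    pcc = np.corrcoef(est, obs)[0, 1]               # Pearson correlation coefficient
    return {"PCC": pcc, "RMSE": rmse, "ubRMSE": ubrmse, "MAE": mae, "bias": bias}

# Illustrative values only (volumetric SM, m^3/m^3); not data from the study.
observed = [0.20, 0.25, 0.30, 0.35]
estimated = [0.22, 0.27, 0.32, 0.37]
metrics = sm_metrics(estimated, observed)
```

In this toy case the estimate differs from the observation by a constant offset, so the correlation is perfect and the ubRMSE is essentially zero, while RMSE and MAE both equal the offset. This is one reason ubRMSE is commonly reported alongside RMSE: under these definitions, RMSE² = ubRMSE² + bias².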