Abstract

Climate change in Arctic regions poses a threat to permafrost stability, potentially leading to issues such as infrastructure damage, altered ecosystems, and increased greenhouse gas emissions. The challenge of monitoring permafrost temperatures and identifying critical environmental factors is accentuated by the scarcity of subsurface thermal data in remote Arctic locales. To alleviate this lack of soil thermal data, we combine in situ observed soil temperatures with MERRA-2 reanalysis data and machine learning (ML) to develop local and season-specific subsurface soil temperature predictions in Alaska. First, reanalysis features are selected based on surface energy budget physics. These features, coupled with Julian calendar dates, are preprocessed and then fed to the ML prediction models. We conducted a series of computational experiments and tested the results against performance benchmarks to determine the optimal prediction models, the length of the look-back period of model inputs, and the training set size. Six conventional ML models (e.g., GBDT, RF, SVR) and five statistical baselines (e.g., ARIMA) are considered for the prediction task. The models are trained on in situ soil temperature time series at depths between 0 and 1 m, spanning more than 16 years, from field sites at Deadhorse and Toolik Lake, Alaska. All prediction performances are assessed using root mean squared error (RMSE) and mean absolute error (MAE). Results show that locally trained ML models can estimate shallow soil temperatures with an average error of $$RMSE = 1.308\,^{\circ}\mathrm{C}$$.
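To make the workflow summarized above concrete, the following is a minimal sketch, not the authors' actual pipeline, of the prediction task: a gradient-boosted regressor is trained on a look-back window of reanalysis-style surface features plus a Julian-date feature to predict shallow soil temperature, and skill is scored with RMSE and MAE. The variable names, the 7-day look-back length, and the synthetic data are illustrative assumptions.

```python
# Minimal sketch of the lagged-feature soil temperature prediction task (illustrative only).
# Assumptions: synthetic stand-ins for MERRA-2 surface features and in situ soil
# temperature; a 7-day look-back window; GBDT as the regressor.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error

rng = np.random.default_rng(0)
n_days, lookback = 2000, 7

# Synthetic daily "reanalysis" features: air temperature, net radiation, snow depth.
julian_day = np.arange(n_days) % 365 + 1
air_temp = -12 + 18 * np.sin(2 * np.pi * julian_day / 365) + rng.normal(0, 2, n_days)
net_rad = 80 * np.clip(np.sin(2 * np.pi * julian_day / 365), 0, None) + rng.normal(0, 5, n_days)
snow = 0.4 * np.clip(-np.sin(2 * np.pi * julian_day / 365), 0, None)

# Synthetic target: shallow soil temperature lagging and damping the seasonal air signal.
soil_temp = -6 + 10 * np.sin(2 * np.pi * (julian_day - 30) / 365) + rng.normal(0, 1, n_days)

# Build samples: each row stacks the previous `lookback` days of features
# plus the Julian date of the prediction day.
features = np.column_stack([air_temp, net_rad, snow])
X = np.array([np.r_[features[i - lookback:i].ravel(), julian_day[i]]
              for i in range(lookback, n_days)])
y = soil_temp[lookback:]

# Chronological train/test split (no shuffling for time series).
split = int(0.8 * len(X))
model = GradientBoostingRegressor(random_state=0)
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])

# Evaluate with the two metrics used in the study: RMSE and MAE.
rmse = mean_squared_error(y[split:], pred) ** 0.5
mae = mean_absolute_error(y[split:], pred)
print(f"RMSE = {rmse:.3f} °C, MAE = {mae:.3f} °C")
```

In this sketch the chronological split and the look-back window mirror the abstract's emphasis on time-series inputs and on tuning the look-back length; the feature set and model hyperparameters are placeholders, not the study's configuration.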