Robot technology equipped with a vision system is expected to enable deep, in-hole, in-situ, nondestructive recognition of water content in loess slopes. This advancement could considerably aid the study of the spatial and temporal evolution of loess water content. However, widespread adoption of this technology requires a robust image recognition method that can accurately recognize loess water content across different regions. We therefore collected loess samples from the western (Lanzhou), central (Lantian), and eastern (Yan’an) parts of the Chinese Loess Plateau as a case study to explore the feasibility of a cross-regional intelligent recognition method for loess water content. We first designed an image collection platform that simulates the environmental conditions encountered during in-hole recognition, and then prepared a loess water content dataset comprising 32,940 images from these three regions. Building on domain adaptation, we propose a deep learning recognition method, namely self-attention-based domain adaptation with Deep EXpectation (DA-DEX Swin). The proposed method estimates loess water content across regions with reasonable accuracy, achieving a mean absolute error of 0.807%–1.137%, a mean absolute percentage error of 0.074–0.154, and a root mean square error of 0.939%–1.546%. The findings of this study provide valuable insights for addressing cross-regional issues, and DA-DEX Swin shows promise for application in robot detection technology to monitor variations and distributions of water content within slopes.
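
For readers less familiar with the three error metrics quoted above, the following is a minimal sketch of their standard definitions, not the evaluation code used in the study. It assumes water content is expressed in percent, so that the mean absolute error and root mean square error come out in percentage points, and that the mean absolute percentage error is expressed as a fraction, which is how the values above appear to be reported. The function name `regression_errors` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, MAPE, RMSE) for paired true and predicted water contents.

    Inputs are in percent (12.5 means 12.5 %). MAE and RMSE are returned in
    the same unit; MAPE is a unitless fraction (0.10 means 10 %).
    Assumption: every true value is nonzero, otherwise MAPE is undefined.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mae, mape, rmse


# Hypothetical values for illustration only (not data from the study).
true_wc = [8.0, 12.0, 16.0, 20.0]
pred_wc = [9.0, 11.0, 17.0, 18.0]
mae, mape, rmse = regression_errors(true_wc, pred_wc)
print(f"MAE={mae:.3f}  MAPE={mape:.3f}  RMSE={rmse:.3f}")
```

Because the root mean square error squares each residual before averaging, it is never smaller than the mean absolute error on the same data, which is consistent with the ranges reported above.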