Optimizing agricultural water resource management is crucial for food production, because effective water management can significantly improve irrigation efficiency and crop yields. Accurate forecasting and management of agricultural water demand have become key research focuses; however, existing methods often fail to capture complex spatial and temporal dependencies. To address this, we propose a deep learning framework that combines remote sensing data with a UNet-ConvLSTM (UCL) model to integrate spatial and temporal features from the MODIS and GLDAS datasets. The model uses UNet to extract high-resolution spatial features and ConvLSTM to capture temporal dependencies, which together improve prediction accuracy. Experimental results demonstrate that the UCL model achieves the best $R^2$ among the compared methods, reaching 0.927 on the MODIS dataset and 0.935 on the GLDAS dataset. This approach highlights the potential of AI and remote sensing technologies for addressing critical challenges in agricultural water management, contributing to more sustainable and efficient food production systems.
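
For reference, the reported metric is the coefficient of determination. The text above does not state which variant was computed, so the standard form is assumed here, with $y_i$ denoting an observed value, $\hat{y}_i$ the corresponding model prediction, $\bar{y}$ the mean of the observations, and $n$ the number of samples (all symbols are illustrative and not taken from the original):

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

Under this definition, a value of 1 corresponds to a perfect fit, and a value of 0 corresponds to a model that performs no better than always predicting $\bar{y}$.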