Plant nitrogen concentration (PNC) is a key indicator of plant growth and development status. Timely and accurate monitoring of PNC is therefore of great significance for the refined management of crop nutrition in the field, and rapidly developing sensor technology provides a powerful means of achieving it. RGB images carry rich spatial information but lack the red-edge and near-infrared bands, which are more sensitive to vegetation. Conversely, multispectral (MS) images offer superior spectral resolution but typically lag behind RGB images in spatial detail. The purpose of this study is therefore to improve the accuracy and efficiency of crop PNC monitoring by combining the advantages of RGB and MS images through image-fusion technology. Unmanned aerial vehicle (UAV) RGB and MS data were acquired synchronously at the booting, heading, and early-filling stages of winter wheat. Fused images were generated with the Gram–Schmidt (GS) and principal component (PC) image-fusion methods and evaluated with multiple image-quality indicators. Subsequently, models for predicting wheat PNC were constructed using machine-learning algorithms such as Random Forest (RF), Gaussian Process Regression (GPR), and eXtreme Gradient Boosting (XGB). The results show that the RGB_B1 band contains richer information and more image detail than the other bands. The GS image-fusion method is superior to the PC method, and fusing the high-resolution RGB_B1 band with the MS images using the GS method gives the best performance. After image fusion, the correlation between vegetation indices (VIs) and wheat PNC increased to varying degrees in the different growth periods, significantly strengthening the response of the spectral information to wheat PNC.
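To make the fusion step concrete, the following is a minimal sketch of a Gram–Schmidt-style component-substitution fusion, not the implementation used in this study. It assumes that the MS bands have already been resampled to the grid of the high-resolution band, that the simulated low-resolution intensity is the equal-weight mean of the MS bands, and that each image is stored as a flat list of pixel values. The function names `covariance` and `gs_fuse` are hypothetical.

```python
from statistics import fmean, pvariance


def covariance(x, y):
    """Population covariance of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    return fmean([(a - mx) * (b - my) for a, b in zip(x, y)])


def gs_fuse(ms_bands, pan):
    """Gram-Schmidt-style fusion sketch (hypothetical helper).

    ms_bands: list of MS bands, each a flat list of pixel values that has
              already been resampled to the grid of `pan` (assumption).
    pan:      flat list of pixel values of the high-resolution band
              (for example, a single RGB band).
    Returns the fused bands in the same layout as `ms_bands`.
    """
    # Simulated low-resolution intensity: equal-weight band mean (assumption).
    intensity = [fmean(pixel) for pixel in zip(*ms_bands)]
    mu_i, var_i = fmean(intensity), pvariance(intensity)
    mu_p, var_p = fmean(pan), pvariance(pan)

    # Match the mean and standard deviation of the high-resolution band
    # to those of the simulated intensity.
    scale = (var_i / var_p) ** 0.5 if var_p > 0 else 1.0
    pan_matched = [(p - mu_p) * scale + mu_i for p in pan]

    # Spatial detail to inject into every MS band.
    detail = [p - i for p, i in zip(pan_matched, intensity)]

    fused = []
    for band in ms_bands:
        # Injection gain: covariance of the band with the intensity,
        # divided by the variance of the intensity.
        gain = covariance(band, intensity) / var_i if var_i > 0 else 0.0
        fused.append([b + gain * d for b, d in zip(band, detail)])
    return fused
```

In this formulation each MS band receives the detail of the high-resolution band in proportion to how strongly the band covaries with the simulated intensity, so bands that are weakly related to the high-resolution band are altered less.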
To comprehensively assess the potential of fused images for estimating wheat PNC, this study compared the performance of PNC models before and after fusion using machine-learning algorithms such as Random Forest (RF), Gaussian Process Regression (GPR), and eXtreme Gradient Boosting (XGB). The results show that models built on the fused images are stable and accurate for single growth periods, multiple growth periods, different varieties, and different nitrogen treatments, and that they perform significantly better than models built on the MS images. The largest improvements were observed across the booting to early-filling stages, particularly with the RF algorithm, which achieved an 18.8% increase in R², a 26.5% increase in RPD, and a 19.7% decrease in RMSE. This study provides an effective technical means for the dynamic monitoring of crop nutritional status and strong technical support for the precise management of crop nutrition.
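The three accuracy metrics reported above can be sketched as follows. This is an illustrative calculation under stated assumptions, not the evaluation code of this study: R² is taken as the coefficient of determination, RPD as the sample standard deviation of the observed values divided by RMSE, and the function names `regression_metrics` and `percent_change` are hypothetical.

```python
from math import inf, sqrt
from statistics import fmean, stdev


def regression_metrics(observed, predicted):
    """Return R2, RMSE, and RPD for paired observed/predicted values.

    Assumptions: R2 is the coefficient of determination, and RPD is the
    sample standard deviation of the observations divided by RMSE.
    """
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = sqrt(ss_res / len(observed))
    r2 = 1 - ss_res / ss_tot
    rpd = stdev(observed) / rmse if rmse > 0 else inf
    return {"r2": r2, "rmse": rmse, "rpd": rpd}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100
```

Calling `regression_metrics` once on the predictions of the MS-based model and once on those of the fused-image model, and then passing each pair of metric values to `percent_change`, yields relative gains of the kind quoted above; a positive change in R² and RPD and a negative change in RMSE indicate an improvement.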