In practical farmland applications, soil organic matter prediction models built on traditional near-infrared (NIR) spectroscopy are affected by factors such as soil texture, which seriously degrades their accuracy. To improve robustness and prediction accuracy, a prediction model based on the fusion of NIR spectroscopy and images is proposed. A 1D-CNN organic matter prediction model (based on NIR spectroscopy) was established using eight extracted characteristic wavelengths of soil organic matter (932 nm, 999 nm, 1083 nm, 1191 nm, 1316 nm, 1356 nm, 1583 nm, and 1626 nm) as spectral information. A 2D-CNN organic matter prediction model was established using soil RGB images as image information. The 1D-CNN and 2D-CNN models were then fused based on the idea of model weight fusion. With small convolutional kernels (kernel sizes of 3×3, 1×1, and 1×1 in the three convolutional layers) and a 1D-CNN:2D-CNN weight ratio of 6:4, the fusion model achieved the highest prediction accuracy (R² = 0.872). The optimal fusion model was embedded into the detection system and tested in the laboratory and in the field. Under laboratory conditions, the R² values of the 1D-CNN, 2D-CNN, and fusion models were 0.838, 0.781, and 0.869, respectively, with root mean square errors of 3.005, 3.546, and 2.678. These results indicate that the fusion model is more accurate than either model built on a single information source. In the field test, the R² values of the 1D-CNN, 2D-CNN, and fusion models were 0.809, 0.731, and 0.835, respectively, with root mean square errors of 3.466, 3.828, and 2.973. The results show that the fusion model improves prediction accuracy and robustness, and that the detection system can meet the needs of soil nutrient detection in farmland.
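The 6:4 weighting of the two models can be illustrated with a minimal sketch. It assumes that "model weight fusion" means a weighted average of the two models' predicted organic matter values; the abstract does not spell out the fusion rule, so this reading is an assumption. The function names (`fuse_predictions`, `r2_score`, `rmse`) and the sample values are hypothetical and are not data from the study.

```python
import math

def fuse_predictions(pred_1d, pred_2d, w_1d=0.6, w_2d=0.4):
    """Weighted average of two models' predictions (assumed fusion rule).

    The defaults mirror the 1D-CNN:2D-CNN = 6:4 ratio; weights are
    normalised so any positive ratio can be passed in.
    """
    if len(pred_1d) != len(pred_2d):
        raise ValueError("prediction lists must have the same length")
    total = w_1d + w_2d
    return [(w_1d * a + w_2d * b) / total for a, b in zip(pred_1d, pred_2d)]

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(y_true))

# Illustrative values only (not measurements from the paper).
measured = [10.0, 20.0, 30.0]   # reference organic matter content
spectral = [12.0, 18.0, 33.0]   # hypothetical 1D-CNN (NIR) predictions
image = [8.0, 23.0, 28.0]       # hypothetical 2D-CNN (RGB) predictions

fused = fuse_predictions(spectral, image)
for name, pred in [("1D-CNN", spectral), ("2D-CNN", image), ("fusion", fused)]:
    print(f"{name}: R2={r2_score(measured, pred):.3f}, RMSE={rmse(measured, pred):.3f}")
```

In this toy case the errors of the two models partly cancel, so the fused prediction has a lower RMSE than either input. That is the effect the fusion model is designed to exploit, though the size of the gain depends on the data.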