Crop model data assimilation is a crucial technique for improving the accuracy and precision of crop models by integrating observational data into model simulations. Although conventional data assimilation methods such as Kalman filtering and variational methods have been widely applied, they often face limitations related to data quality, model bias, and high computational cost. This study explored the potential of reinforcement learning (RL) for crop model assimilation, which has the advantage of not requiring large training datasets. Based on the WOFOST crop model, two RL environments were constructed: a Daily-Data Driven approach and a Time-Series Driven approach. Agents were trained in these environments with the Proximal Policy Optimization (PPO) algorithm for 100,000 iterations. The assimilation results were compared with those of the commonly used SUBPLEX optimization algorithm on four years of field measurements and on a public dataset with added random errors. Our results demonstrate that the Time-Series Driven RL model achieved assimilation accuracy comparable to the SUBPLEX optimization algorithm, with an average MAE of 0.65 versus 0.76 for SUBPLEX and a slight decrease in RMSE, while reducing the computational burden 365-fold. In a multi-year stability test, the Time-Series Driven RL model and SUBPLEX showed similar assimilation performance. This study demonstrates the potential of RL for crop model assimilation, providing a novel approach to overcoming the limitations of conventional assimilation algorithms. The findings suggest that RL-based crop model assimilation can improve model accuracy and efficiency, with potential for practical applications in precision agriculture.
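The Time-Series Driven formulation described above can be illustrated with a minimal sketch. This is not the study's WOFOST setup: a toy logistic growth curve stands in for the crop model, a single hypothetical `growth_rate` parameter stands in for the assimilated model parameters, and the class and attribute names are illustrative assumptions. The observation is the full simulated LAI time series, the action is a parameter adjustment, and the reward is the negative MAE against observations, which is the kind of interface a PPO agent would be trained against.

```python
# Minimal sketch of a Time-Series Driven assimilation environment (toy example,
# not the WOFOST configuration from the study).
import math

class ToyAssimilationEnv:
    DAYS = 120        # length of the simulated growing season (assumption)
    TRUE_RATE = 0.08  # hidden growth rate that generated the "observations"

    def __init__(self):
        # Synthetic observations from the hidden true parameter.
        self.observed = self._simulate(self.TRUE_RATE)
        self.growth_rate = None

    def _simulate(self, rate):
        # Toy crop model: logistic LAI curve, LAI(t) = 6 / (1 + exp(-rate*(t-60))).
        return [6.0 / (1.0 + math.exp(-rate * (t - 60))) for t in range(self.DAYS)]

    def reset(self):
        self.growth_rate = 0.05  # deliberately biased initial parameter guess
        # Observation is the whole simulated time series (Time-Series Driven).
        return self._simulate(self.growth_rate)

    def step(self, action):
        # Action: a small additive adjustment to the model parameter.
        self.growth_rate += action
        simulated = self._simulate(self.growth_rate)
        mae = sum(abs(s - o) for s, o in zip(simulated, self.observed)) / self.DAYS
        reward = -mae       # better fit -> reward closer to zero
        done = mae < 0.05   # stop once the assimilation is "good enough"
        return simulated, reward, done

env = ToyAssimilationEnv()
obs = env.reset()
# A trained PPO policy would pick the action; here a fixed nudge for illustration:
obs, reward, done = env.step(0.01)
```

In the actual study the simulator is WOFOST and the policy is learned with PPO; the sketch only shows the observation/action/reward loop that such an environment exposes.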