Identifying the best-performing fertilizer practices with on-farm trials is challenging, particularly in rainfed farming, because of weather uncertainty. Because the performance of candidate practices is not known beforehand, it nevertheless remains crucial to test a range of viable practices, including practices that may yield inferior results compared with the best available practice(s). To identify a best management practice, an “intuitive strategy” typically sets up multi-year, multi-location field trials in which each practice is tested in equal proportion over a set number of years. Our objective was to provide an identification strategy for nitrogen fertilizer management by designing a bandit learning algorithm. We aimed for the bandit algorithm to be better than the “intuitive strategy”, which we formalized as the Explore-Then-Commit strategy, at minimizing the losses farmers incur from testing management practices that do not perform best. Our case study was maize production in southern Mali. The bandit framework is a machine learning approach in which an agent learns from feedback over time and selects actions accordingly to maximize its long-term cumulative reward. To mimic maize responses to nitrogen fertilization, we used the Decision Support System for Agrotechnology Transfer (DSSAT) crop model. We compared nitrogen fertilizer practices using a risk-aware measure, the Conditional Value-at-Risk (CVaR), and a novel agronomic metric, the Yield Excess (YE), which accounts for both grain yield and agronomic nitrogen use efficiency. The bandit algorithm performed better than the intuitive strategy: it minimized farmers’ yield losses during the identification process. This study is a methodological step that opens up new horizons for the risk-aware identification of the performance of a range of crop management practices in real conditions.
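To make the Explore-Then-Commit baseline and the CVaR measure named above more concrete, the following is a minimal sketch, not the study's implementation. It assumes that CVaR is estimated as the average of the worst alpha-fraction of observed rewards, and that the commit step picks the practice with the highest empirical CVaR. The function names, the `sample` callback, and the reward parameters in the demo are all hypothetical; in the study, rewards come from DSSAT simulations rather than from a normal distribution.

```python
import numpy as np


def empirical_cvar(rewards, alpha):
    """Average of the worst alpha-fraction of observed rewards (lower tail)."""
    ordered = np.sort(np.asarray(rewards, dtype=float))
    # Keep at least one observation so the tail is never empty.
    k = max(1, int(np.ceil(alpha * len(ordered))))
    return float(ordered[:k].mean())


def explore_then_commit(sample, n_arms, m, horizon, alpha):
    """Test each arm m times in turn, then commit to the best empirical CVaR.

    `sample(arm)` is a hypothetical callback returning one observed reward.
    Assumes m >= 1. Returns the list of arms chosen at each step.
    """
    history = [[] for _ in range(n_arms)]
    choices = []
    committed = None
    for t in range(horizon):
        if t < m * n_arms:
            arm = t % n_arms  # exploration: round-robin, equal share per arm
        else:
            if committed is None:
                scores = [empirical_cvar(h, alpha) for h in history]
                committed = int(np.argmax(scores))
            arm = committed  # commitment: no further testing of other arms
        history[arm].append(sample(arm))
        choices.append(arm)
    return choices


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical (mean, std) of the reward for three practices; not study data.
    arms = [(2.0, 0.3), (2.6, 1.2), (2.4, 0.4)]
    picks = explore_then_commit(
        lambda a: rng.normal(*arms[a]), n_arms=3, m=5, horizon=30, alpha=0.3
    )
    print("committed practice:", picks[-1])
```

In this sketch every practice receives the same number of trials during exploration, including practices that perform poorly, which is the source of the losses a bandit algorithm tries to reduce by adapting its choices as feedback arrives.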