Abstract
Scientific and technical changes to flood forecasting models are implemented to improve forecasts. However, responses to such changes are complex, particularly in global models, and evaluation of improvements remains focussed on generalised skill assessments rather than on the outcomes most relevant to those taking decisions. Recently, the Global Flood Awareness System (GloFAS) flood forecasting model was upgraded from version 2.1 to 3.1, with a significant change to its hydrological model structure. Version 3.1 adopts a single fully configured hydrological model (LISFLOOD), including groundwater and river routing processes, in place of the two coupled models (a land surface model and a simplified hydrological model) used in version 2.1. This study evaluates changes in the simulated behaviour of floods and in the forecast skill of the two GloFAS versions based on different decision criteria for early action. We evaluate GloFAS reforecasts for the Brahmaputra and the Ganges Rivers in Bangladesh for the period 1999–2018. For the Brahmaputra River, the older GloFAS version 2.1 performs better than version 3.1, especially in predicting low‐level (90th percentile) and medium‐level (95th percentile) floods. For the Ganges, GloFAS 3.1 shows improved probability of detection of low‐ to medium‐level floods compared to version 2.1, especially for lead times longer than 10 days. Both versions show limited skill for more extreme floods (99th percentile), although results for these less frequent floods are less robust because fewer events are available. Using lead‐time dependent thresholds improves the false alarm ratio while reducing the probability of detection. The changes in model structure influence model performance in a complex and varied way, and forecast skill needs further investigation across regions and decision‐making criteria.
Understanding the skill changes between different model versions is important for decision‐makers; however, focused case studies such as this should also be used by model developers to guide future changes to the system to ensure that they lead to improvements in decision‐making ability.
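For readers unfamiliar with the two categorical scores named above, the following is a minimal sketch of how probability of detection (POD) and false alarm ratio (FAR) can be computed from paired observed and forecast discharge series once a flood threshold is fixed. It is an illustration only, not the evaluation code used in this study: the function name `pod_far`, the synthetic discharge values, and the choice to take the 90th percentile of each series as its own threshold are all assumptions made for the example.

```python
import numpy as np


def pod_far(observed, forecast, obs_threshold, fc_threshold):
    """Return (POD, FAR) for threshold-exceedance events.

    Hypothetical helper for illustration. An "event" is a time step at
    which discharge is at or above the given threshold.
    """
    obs_event = np.asarray(observed, dtype=float) >= obs_threshold
    fc_event = np.asarray(forecast, dtype=float) >= fc_threshold

    hits = int(np.sum(obs_event & fc_event))           # flood forecast and observed
    misses = int(np.sum(obs_event & ~fc_event))        # flood observed, not forecast
    false_alarms = int(np.sum(~obs_event & fc_event))  # flood forecast, not observed

    # POD = hits / (hits + misses); undefined if no flood was observed.
    pod = hits / (hits + misses) if (hits + misses) > 0 else float("nan")
    # FAR = false alarms / (hits + false alarms); undefined if no flood was forecast.
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) > 0 else float("nan")
    return pod, far


# Synthetic discharge values (assumed, arbitrary units), one pair per time step.
observed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
forecast = [1, 2, 3, 4, 5, 6, 7, 10, 8, 9.5]

# Assumption: each series is compared against its own 90th percentile,
# mirroring the "low-level flood" threshold described in the abstract.
obs_p90 = np.percentile(observed, 90)
fc_p90 = np.percentile(forecast, 90)

pod, far = pod_far(observed, forecast, obs_p90, fc_p90)
print(f"POD={pod:.2f}, FAR={far:.2f}")
```

A lead‐time dependent threshold would, under the same assumptions, amount to calling `pod_far` once per lead time with a forecast threshold derived from the forecasts at that lead time only.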