Abstract. Despite the application of machine learning (ML) models to flood prediction, their transferability to out-of-sample data has not been explored. This paper developed an ML-based model for hindcasting maximum river water depths during major events in coastal watersheds and evaluated its transferability to other (out-of-sample) events. The model considered the spatial distribution of influential factors that explain the underlying physical processes to hindcast maximum river water depths. Our model evaluations in a six-digit hydrologic unit code (HUC6) watershed in the northeastern USA showed that the model satisfactorily hindcasted maximum water depths at 116 stream gauges during a major flood event, Hurricane Ida (R² of 0.94). The pre-trained, validated model was successfully transferred to three other major flood events, Hurricanes Isaias, Sandy, and Irene (R² > 0.70). Our results showed that ML-based models can be transferable for hindcasting maximum river water depths across events when informed by the spatial distribution of pertinent features, their interactions, and underlying physical processes in coastal watersheds.