Recent destructive hurricane seasons and concerns about the future influence of climate change have increased the relevance of coastal storm hazards and, in particular, of storm surge hazard estimation when discussing the resilience of coastal communities. This hazard is generally represented as surge inundation probabilities at a large number of individual locations in the geographic domain of interest, and is typically assessed using an ensemble of storm scenarios (i.e., storm events) that are representative of the regional climatology. This paper investigates storm ensemble selection within this setting, with the objective of identifying a small number of storm scenarios that are consistent with some chosen hazard descriptions over a large geographic region. Beyond the selection of storm events, their occurrence rates (i.e., weights) are also updated. Following past works, a two-stage optimization is adopted for the storm selection. The inner loop identifies the occurrence rates for a given storm subset, formulating the problem as a linear program that minimizes the sum of absolute deviations of the predicted hazard from the chosen hazard description. The outer loop searches for the best subset with the desired number of storms, adopting a genetic algorithm for integer optimization to minimize the aforementioned deviation. This work generalizes this implementation to extended coastal regions with many locations of interest. In this case, it is computationally intractable to consider all the locations in the domain within the linear programming formulation; for this reason, a subset of representative locations is chosen through cluster analysis, and the hazard description for only these locations is used in the storm ensemble selection. For clustering, different strategies are examined, using correlation between locations based on geospatial information, surge response, or a combination of both.
Additionally, the correlation in the hazard description is better integrated into the storm selection by modifying the objective function adopted for the outer optimization loop. Applications to different North Atlantic coastal domains are presented as case studies.
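
As a sketch of the inner-loop problem, a least-absolute-deviation fit of the occurrence rates can be cast as a linear program by introducing one auxiliary variable per location. The notation below is assumed for illustration and is not taken from the paper: $w_i$ is the occurrence rate of storm $i$ in the candidate subset $S$, $a_{ji}$ is the contribution of storm $i$ to the hazard at representative location $j$ (for example, an indicator that its surge exceeds a threshold there), $h_j$ is the target hazard value at that location, $m$ is the number of representative locations, and $u_j$ bounds the absolute deviation at location $j$.

```latex
\begin{align}
\min_{w,\,u} \quad & \sum_{j=1}^{m} u_j \\
\text{s.t.} \quad & -u_j \le \sum_{i \in S} a_{ji}\, w_i - h_j \le u_j,
    \qquad j = 1,\dots,m, \\
& w_i \ge 0, \qquad i \in S .
\end{align}
```

At the optimum each $u_j$ equals $\bigl|\sum_{i \in S} a_{ji} w_i - h_j\bigr|$, so the objective value is the sum of absolute deviations. Under this reading, the outer loop would compare candidate subsets $S$ by this optimal value.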