Urban water plays a significant role in the urban ecosystem, but extracting it remains a challenging task in the automatic interpretation of synthetic aperture radar (SAR) images. Radar shadows and strong scatterers in urban areas can cause misclassification during urban water extraction. Moreover, the local features captured by convolutional layers in convolutional neural networks (CNNs) are often redundant, and convolutions alone cannot make effective use of global information to guide the prediction of water pixels. To emphasize identifiable water characteristics and fully exploit the global information of SAR images, this paper proposes a modified Unet based on a hybrid attention mechanism to improve urban water extraction. To combine strong feature extraction with global modeling capability in SAR image segmentation, both the Channel and Spatial Attention Module (CSAM) and the Multi-head Self-Attention Block (MSAB) are introduced into the proposed Hybrid Attention Unet (HA-Unet). Resnet50 is adopted as the backbone of HA-Unet to extract multi-level features from SAR images. During feature extraction, CSAM, which is based on local attention, adaptively enhances meaningful water features and suppresses unnecessary ones in the feature maps of the two shallow layers. In the last two layers of the backbone, MSAB captures the global information of SAR images to generate global attention. In addition, the two global attention maps generated by MSAB are aggregated to reconstruct the spatial feature relationships of SAR images from high-resolution feature maps. Experimental results on Sentinel-1A SAR images show that the proposed method has a strong ability to extract water bodies in complex urban areas. 
The ablation experiments and visualization results indicate that both CSAM and MSAB contribute significantly to accurate and effective urban water extraction.
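
To make the two attention types named above more concrete, the following is a minimal NumPy sketch, not the HA-Unet implementation. It assumes a CBAM-style design for the local channel and spatial attention and standard scaled dot-product attention for the multi-head self-attention. The function names, the weight matrices `w1`, `w2`, `wq`, `wk`, `wv`, and the simplified spatial gate (a sum of pooled maps instead of a learned convolution) are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_spatial_attention(feat, w1, w2):
    """Local attention sketch (assumed CBAM-style).

    feat: (C, H, W) feature map; w1: (C // r, C); w2: (C, C // r).
    """
    # Channel gate: a shared two-layer MLP applied to the average-pooled
    # and max-pooled channel descriptors.
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)

    channel_gate = sigmoid(mlp(avg) + mlp(mx))            # (C,)
    feat = feat * channel_gate[:, None, None]

    # Spatial gate: simplified assumption, the pooled maps are summed
    # rather than passed through a learned convolution.
    spatial_gate = sigmoid(feat.mean(axis=0) + feat.max(axis=0))  # (H, W)
    return feat * spatial_gate[None, :, :]


def multi_head_self_attention(feat, wq, wk, wv, heads):
    """Global attention sketch: every pixel attends to every other pixel.

    feat: (C, H, W); wq, wk, wv: (C, C); C must be divisible by heads.
    Returns the attended feature map (C, H, W) and the attention
    weights (heads, H * W, H * W).
    """
    c, h, w = feat.shape
    n, d = h * w, c // heads
    tokens = feat.reshape(c, n).T                         # (N, C)

    def split(x):
        # (N, C) -> (heads, N, d)
        return x.reshape(n, heads, d).transpose(1, 0, 2)

    q, k, v = split(tokens @ wq), split(tokens @ wk), split(tokens @ wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d), axis=-1)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, c)     # merge heads
    return out.T.reshape(c, h, w), attn
```

In this sketch the local gates only rescale each activation by a factor between 0 and 1, whereas the self-attention weights span all `H * W` positions. That quadratic cost in the number of pixels is one plausible reason to apply such a block only to the low-resolution feature maps of the last backbone layers.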