This paper studies the use of a Fully Convolutional Network (FCN) model to extract water bodies from very high spatial resolution (VHR) optical images when training samples are limited. Two GaoFen-2 images acquired in different seasons, each with a spatial resolution of 0.8 m and covering the south of the Beijing metropolitan area, were used to extensively validate the FCN model. Four key factors related to the performance of the FCN model (input features, training data, transfer learning, and data augmentation) were empirically analyzed using 36 combinations of parameter settings. Our findings indicate that the FCN-based method can serve as a robust and cost-effective tool for extracting water bodies from VHR images. Trained on a small amount of labeled L1A data, the FCN-based method can also significantly outperform the Normalized Difference Water Index (NDWI) based method, the Support Vector Machine (SVM) based method, and the Sparsity Model (SM) based method, even when radiometric normalization and spatial contexts are introduced to preprocess the input data for the latter three methods. The advantages of the FCN-based method are mainly due to its ability to exploit spatial contexts in the image, especially in urban areas where water and shadows are mixed. Although the settings of the four key factors significantly affect the performance of the FCN-based method, choosing a suitable setting for the FCN model is not difficult. The lessons learned from the successful use of the FCN model for water extraction from VHR images can be extended to the extraction of other land-cover types.
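For readers unfamiliar with the NDWI baseline named above, the index is commonly defined per pixel as (Green - NIR) / (Green + NIR), and water is taken as the pixels whose index exceeds a threshold. The following is a minimal sketch of that idea only. The function names, the zero threshold, and the toy band values are illustrative assumptions and do not reproduce the preprocessing or thresholds used in this study.

```python
import numpy as np


def ndwi(green, nir, eps=1e-12):
    """Per-pixel NDWI = (Green - NIR) / (Green + NIR).

    Pixels where Green + NIR is (near) zero are assigned an index of 0
    to avoid division by zero (an assumption made for this sketch).
    """
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    denom = green + nir
    empty = np.abs(denom) < eps
    safe_denom = np.where(empty, 1.0, denom)
    return np.where(empty, 0.0, (green - nir) / safe_denom)


def water_mask(green, nir, threshold=0.0):
    """Label a pixel as water when its NDWI exceeds the threshold.

    The default threshold of 0 is a common starting point, assumed here;
    in practice it is usually tuned per scene.
    """
    return ndwi(green, nir) > threshold


# Toy 2x2 green and near-infrared bands (hypothetical digital numbers).
green = np.array([[120, 60], [0, 200]])
nir = np.array([[40, 180], [0, 50]])

index = ndwi(green, nir)
mask = water_mask(green, nir)
```

Because the index is computed independently for each pixel, it cannot use the spatial context that the FCN-based method exploits, which is consistent with the difficulty of separating water from shadows in urban areas noted above.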