This paper proposes a novel deep learning-based image inpainting framework that first restores the image structure and then reconstructs the image details of a corrupted image. Most image inpainting methods in the literature aim to restore image details, outlines, and colors simultaneously, and may therefore suffer from blurring, deformation, and implausible content recovery caused by interference among these different kinds of information. To address these problems, a two-stage image inpainting deep neural network based on the GAN (generative adversarial network) architecture is proposed. The framework consists of two stages: (1) the first, called the structure-aware learning stage, learns a GAN-based structure restoration network that focuses on recovering the low-frequency image component, including the colors and outlines of the missing regions of the corrupted input image; and (2) the second, called the texture-aware learning stage, learns a GAN-based detail refinement network that focuses on rebuilding the high-frequency image details and texture information. In particular, we also propose removing details from the training images used for the structure restoration network, so that rich image textures do not cause inadequate structure recovery; the detail reconstruction task is left to the second stage. This strategy balances the workload between the two stages, and the image quality is progressively enhanced through them. Experimental results show that the proposed deep inpainting framework achieves state-of-the-art performance, both quantitatively and qualitatively, on the well-known CelebA, Places2, and ImageNet datasets, compared with existing deep learning-based image inpainting approaches. 
More specifically, in terms of two well-known image quality assessment metrics, PSNR (peak signal-to-noise ratio) and SSIM (structural similarity), the improvement of the proposed method over the baseline approach ranges from 3.23% to 11.12% and from 1.95% to 13.39%, respectively. The proposed method is shown to stably and significantly outperform the compared state-of-the-art methods for most types of inpainting mask. We also show that the proposed method is applicable to image editing in the form of object removal from a single image.
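
For readers unfamiliar with the reported figures, the following minimal sketch shows how PSNR is conventionally computed and how a relative improvement over a baseline could be expressed as a percentage. The function names `psnr` and `improvement_percent` are hypothetical, images are represented as flat sequences of 8-bit pixel values, and the relative-improvement formula is an assumption; the abstract does not state how the percentages were derived.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Standard definition: 10 * log10(MAX^2 / MSE). Assumes both inputs
    have the same non-zero length and share the same value range.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def improvement_percent(baseline_score, proposed_score):
    """Relative improvement of a score over a baseline, in percent.

    Assumed formula (not confirmed by the paper):
    100 * (proposed - baseline) / baseline.
    """
    return 100.0 * (proposed_score - baseline_score) / baseline_score
```

Under this assumed definition, a baseline PSNR of 25.0 dB and a proposed PSNR of 27.5 dB would correspond to `improvement_percent(25.0, 27.5)`, a 10% gain. SSIM could be compared in the same way once computed with a standard implementation.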