Crosshole ground penetrating radar (GPR) is an efficient method for quality assurance of retaining structures without the need for excavation. However, interpreting crosshole GPR data is time-consuming and prone to inaccuracies. To address this challenge, we propose a novel three-dimensional (3D) reconstruction method based on a generative adversarial network (GAN) that recovers 3D permittivity distributions from crosshole GPR images. The framework, named CGPR2VOX, integrates a fully connected layer, a residual network, and a specialized 3D decoder in the generator to translate crosshole GPR data into 3D permittivity voxels. The discriminator is designed to improve the generator’s performance by enforcing the physical plausibility and accuracy of the reconstructed models. This adversarial training mechanism enables the network to learn the non-linear relationships between crosshole GPR data and subsurface permittivity distributions. CGPR2VOX was trained on a dataset generated through finite-difference time-domain (FDTD) simulations and achieved a precision, recall, and F1-score of 91.43%, 96.97%, and 94.12%, respectively. Model experiments show that the relative errors of the estimated defect positions were 1.67%, 1.65%, and 1.30% in the X-, Y-, and Z-directions, respectively. The method also generalizes well under complex conditions, including condition variations, heterogeneous materials, and electromagnetic noise, highlighting its reliability and effectiveness for practical quality assurance of retaining structures.
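As a brief consistency check on the reported metrics, the F1-score is by definition the harmonic mean of precision and recall. The minimal Python sketch below recomputes it from the reported precision and recall. The function name `f1_from_precision_recall` is a hypothetical helper introduced only for illustration, and the sketch makes no assumption about how the underlying true and false positives were counted.

```python
def f1_from_precision_recall(precision, recall):
    """Return the harmonic mean of precision and recall.

    Hypothetical helper for illustration. Inputs and output share the
    same units (here, percent).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_precision = 91.43
reported_recall = 96.97

reported_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"F1 from reported precision and recall: {reported_f1:.2f}%")
```

The recomputed value rounds to 94.12%, which agrees with the reported F1-score.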