Modern physical design flows depend heavily on design space exploration to find the clock tree synthesis (CTS) parameters of commercial tools that yield optimized clock trees. However, such exploration is often time-consuming and computationally inefficient. In this article, we overcome this drawback by proposing a novel framework named GAN-CTS, which uses a conditional generative adversarial network (GAN) to predict and optimize CTS outcomes. Our framework is built upon three sequential learning stages. First, to precisely characterize distinct designs, we leverage transfer learning to extract netlist features directly from placement images. Second, we perform regression learning using various methods to predict the target CTS outcomes and demonstrate that the proposed multitask learning approach achieves better accuracy than the meta-modeling method adopted in previous work. To fully benefit from the predictions made by our framework, we further quantitatively interpret the importance of each CTS input parameter under various design objectives through attribution-based learning. Finally, we leverage generative adversarial learning to optimize the target clock metrics under the guidance of the pretrained regression model. To substantiate the generality of our framework, we validate it on four unseen netlists that are not used in the training process. Experimental results on real-world designs demonstrate that our framework: 1) achieves an average prediction error of 3%; 2) improves the commercial tool’s auto-generated clock tree by 20.7% in clock power, 21.5% in clock wirelength, and 36.1% in worst skew; and 3) reaches an F1-score of 0.93 in the classification task of distinguishing successful from failed CTS runs.
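
For readers unfamiliar with the classification metric quoted above, the F1-score is the harmonic mean of precision and recall. The following is a minimal Python sketch of that standard definition only. The function name `f1_score`, the choice of which class counts as positive, and the confusion counts are illustrative assumptions and do not come from the article or its data.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    tp, fp, fn are true-positive, false-positive, and false-negative counts
    for whichever class is treated as positive (an assumption here, e.g.
    "failed CTS run").
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined),
        # so the score is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion counts, for illustration only (not the paper's data).
score = f1_score(tp=40, fp=5, fn=5)
print(f"F1 = {score:.3f}")
```

Because F1 ignores true negatives, it stays informative when one outcome (for example, failed runs) is much rarer than the other, which is a common reason to report it instead of plain accuracy.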