Terahertz communications are envisioned as a promising technology for sixth-generation (6G) and beyond wireless systems, capable of supporting wireless links with terabit-per-second (Tbps) data rates. As the foundation for designing terahertz communication systems, channel modeling and characterization are crucial for scrutinizing the potential of this spectrum. However, current channel modeling in the terahertz band relies heavily on time-consuming and costly measurements. Here, we propose a transfer-learning-enabled, transformer-based generative adversarial network (GAN) to mitigate this problem in terahertz channel modeling. Specifically, a GAN serves as the fundamental building block and is used to generate channel parameters. To improve accuracy, a transformer structure with a self-attention mechanism is incorporated into the GAN. Because the resulting network still incurs errors relative to ground-truth measurements, transfer learning is applied to resolve the mismatch between the formulated network and the measurements. The proposed method achieves high accuracy in channel modeling while requiring only a limited amount of measurement data, making it a promising complement to current channel modeling techniques.
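For readers unfamiliar with the self-attention mechanism mentioned above, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration only and not the network proposed here: the function names, the projection matrices `w_q`, `w_k`, `w_v`, and the idea of treating each input row as the feature vector of one multipath component are all assumptions made for this example.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x   : n x d input, one row per token (hypothetically, one multipath
          component described by d channel parameters).
    w_* : d x d_k projection matrices (hypothetical, normally learned).
    Returns the n x d_k attended output and the n x n attention weights.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    # Scale by sqrt(d_k), then normalize each row so its weights sum to 1.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights
```

Calling `self_attention(x, w_q, w_k, w_v)` on a small input returns one output row per input row, each a weighted average of the value vectors. Each row of the returned weight matrix sums to one, which is what lets every generated parameter depend on all the others.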