Time-dependent diffusion magnetic resonance imaging (TDDMRI) is useful for the non-invasive characterization of tissue microstructure. However, time-dependent diffusion models require densely sampled q-t space data for microstructural fitting, leading to very time-consuming acquisition protocols. To overcome this problem, we present a joint q-t space model, tDKI-Net, to estimate diffusion-time-dependent kurtosis and the transmembrane exchange time from downsampled q-t space data. The tDKI-Net is composed of several q-Encoders and a t-Encoder, designed based on the extragradient mechanism, each integrated with its own mapping network. In the tDKI-Net, the two types of encoders and their mapping networks are applied sequentially: the q-Encoders generate kurtosis at individual diffusion times, and the t-Encoder fits the transmembrane exchange time (τm) from the time-dependent kurtosis according to the Kärger model. To match the network structure, we propose a three-stage training strategy consisting of physics-informed self-supervised pretraining, DKI warm-up, and joint training. Our results demonstrate that the proposed tDKI-Net can effectively accelerate tDKI scans with lower estimation error than other methods. The three-stage training strategy outperformed training from scratch, e.g., the normalized root mean square error (NRMSE) of τm decreased by up to 1.4%. We also investigated the effect of training data size and found that, although we used one-subject training, the network achieved lower NRMSEs for Kavg, K0 and τm (2.50%, 3.04%, 10.86%) than previous work that used three-subject training (3.8%, 9.5%, 12.1%). tDKI-Net can reduce the scan time 10.5-fold by jointly downsampling the q-t space data without compromising estimation accuracy.
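To illustrate the final fitting step, the sketch below evaluates the Kärger-model expression for time-dependent kurtosis in the form commonly used in the tDKI literature, K(t) = K0 · (2τm/t) · [1 − (τm/t)(1 − exp(−t/τm))], and recovers K0 and τm from noise-free synthetic values. It is not the tDKI-Net: the grid-search least-squares fit stands in for the learned t-Encoder, and the function names, diffusion times, and parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def karger_kurtosis(t, k0, tau_m):
    """Kaerger-model kurtosis K(t) for diffusion time t and exchange time tau_m.

    t and tau_m must share the same unit (milliseconds here, by assumption).
    """
    x = np.asarray(t, dtype=float) / tau_m
    return k0 * (2.0 / x) * (1.0 - (1.0 - np.exp(-x)) / x)


def fit_karger(t, k_meas, tau_grid):
    """Least-squares fit of (K0, tau_m) by grid search over tau_m.

    K(t) is linear in K0, so for each candidate tau_m the best K0 has a
    closed form; only tau_m needs to be searched.
    """
    k_meas = np.asarray(k_meas, dtype=float)
    best_sse, best_k0, best_tau = np.inf, None, None
    for tau in tau_grid:
        shape = karger_kurtosis(t, 1.0, tau)       # K(t) / K0 for this tau_m
        k0 = shape @ k_meas / (shape @ shape)      # closed-form optimal K0
        sse = np.sum((k_meas - k0 * shape) ** 2)   # sum of squared residuals
        if sse < best_sse:
            best_sse, best_k0, best_tau = sse, k0, tau
    return best_k0, best_tau


# Hypothetical diffusion times (ms) and ground-truth parameters.
t_ms = np.array([20.0, 50.0, 100.0, 200.0, 400.0])
k_true = karger_kurtosis(t_ms, k0=1.0, tau_m=100.0)

tau_grid = np.linspace(10.0, 500.0, 4901)          # 0.1 ms spacing
k0_hat, tau_hat = fit_karger(t_ms, k_true, tau_grid)
print(f"K0 = {k0_hat:.3f}, tau_m = {tau_hat:.1f} ms")
```

In this form K(t) approaches K0 as t approaches 0 and decays as exchange averages out the compartments at long diffusion times, which is why sampling several diffusion times is needed to separate K0 from τm.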