As an airborne electromagnetic method induced by natural sources, the Z-axis tipper electromagnetic (ZTEM) system can primarily recover near-surface shallow structures, due to band-limited frequencies (usually 30–720 Hz) of the airborne survey and high sample rate acquisition along the terrain. In contrast, traditional ground magnetotellurics (MT) allows better recovery of deep structures as the data acquired are typical of large site intervals (usually higher than 1 km) and lower frequencies (usually lower than 400 Hz). High-resolution MT surveys allow for shallow small and deep large anomalies to be adequately interpreted but need large site intervals and broadband frequency range, which are seldom used as they are quite costly and laborious. ZTEM data are tippers that relate local vertical to orthogonal horizontal fields, measured at a reference station on the ground. As the 1D structures produce zero vertical magnetic fields, ZTEM is not sensitive to background resistivity. Thus, in general, ZTEM can only reveal relative resistivities and not real resistivities. A combination of the ZTEM and MT methods can be an effective technique, alleviating the shortcomings of the individual methods. At present, complex underground structures and topography introduce difficulties for data inversion and interpretation, as conventional ZTEM and MT forward modeling are generally used on structured grids with limited accuracy. To effectively recover complex underground structures with topography, we developed a 3D framework for joint MT and ZTEM inversion with unstructured tetrahedral grids. The finite element method is used for the forward problem because of its flexibility with unstructured tetrahedral meshes. The limited-memory quasi-Newton algorithm (L-BFGS) for optimization is used to solve the joint inverse problem, which saves memory and computational time by avoiding the explicit calculation of the Hessian matrix. To validate our joint inversion algorithm, we run numerical experiments on two synthetic models. The first synthetic model uses two conductive anomalous bodies of different sizes and depths. At the same time, a simple quadrangular is used for comparing the inversions with and without topography. In contrast, the second synthetic model represents a realistic topography with two different conductivities and the same depth. Both single-domain and joint inversions of the ZTEM and MT data are carried out for the two synthetic models to demonstrate the complementary advantages of joint inversion, while the second model is also used to test the adaptability of the joint inversion to complex topography. The results demonstrate the effectiveness of the finite element method with unstructured tetrahedral grids and the L-BFGS method for joint MT and ZTEM inversion. In addition, the inversion results prove that joint MT and ZTEM inversion can recover deep structures from the MT data and small near-surface structures from the ZTEM data by alleviating the weaknesses of the individual methods.