Synthetic aperture radar tomography (TomoSAR) has proven useful for reconstructing the vertical structure of forest areas from P-band images, owing to its three-dimensional imaging capability. When only a small number of non-uniformly distributed acquisitions are available, compressive sensing (CS) is generally adopted in TomoSAR. However, the performance of CS depends on the choice of a hyperparameter, which is closely related to the noise level of each pixel. To overcome this limitation, we propose a sparse iterative covariance-based estimation (SPICE) approach based on a wavelet and orthogonal sparse basis (W&O-SPICE) for application over forest areas. SPICE is a sparse spectral estimation method that achieves high vertical resolution and accounts for the noise adaptively in each resolution cell, so the user does not need to select a hyperparameter. Furthermore, the sparse basis used not only ensures the sparsity of the forest canopy scattering contribution, but also preserves the original sparsity of the ground contribution. The proposed method was tested in simulated experiments, and the results demonstrated that W&O-SPICE can successfully reconstruct the vertical structure of a forest. Moreover, three P-band fully polarimetric airborne SAR images with non-uniformly distributed baselines were used to reconstruct the vertical structure of a tropical forest in Mabounie, Gabon. The underlying topography and forest height were estimated, with root-mean-square errors (RMSEs) of 6.40 m and 4.50 m with respect to the LiDAR digital terrain model (DTM) and canopy height model (CHM), respectively. In addition, W&O-SPICE performed better than W&O-CS, beamforming, Capon, and the iterative adaptive approach (IAA).
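
To make the tomographic setting concrete, the following minimal sketch evaluates the beamforming spectrum, the simplest of the reference estimators named above, for a stack with non-uniformly distributed baselines. It is not an implementation of W&O-SPICE. The function names, the vertical wavenumbers `kz`, the scatterer heights and powers, and the noise level are all hypothetical values chosen for illustration, and the covariance is built from an idealized model of uncorrelated scatterers rather than from SAR data.

```python
import cmath


def steering_vector(kz, z):
    """Phase terms exp(j * kz_n * z) of a scatterer at height z, one per acquisition."""
    return [cmath.exp(1j * k * z) for k in kz]


def model_covariance(kz, scatterers, noise_power):
    """Idealized covariance R = sum_s p_s a(z_s) a(z_s)^H + noise_power * I.

    `scatterers` is a list of (height, power) pairs assumed to be uncorrelated.
    """
    n = len(kz)
    cov = [[0j] * n for _ in range(n)]
    for height, power in scatterers:
        a = steering_vector(kz, height)
        for i in range(n):
            for j in range(n):
                cov[i][j] += power * a[i] * a[j].conjugate()
    for i in range(n):
        cov[i][i] += noise_power
    return cov


def beamforming_spectrum(cov, kz, heights):
    """Beamforming estimate P(z) = a(z)^H R a(z) / N^2 on a grid of heights."""
    n = len(kz)
    spectrum = []
    for z in heights:
        a = steering_vector(kz, z)
        value = sum(
            a[i].conjugate() * cov[i][j] * a[j]
            for i in range(n)
            for j in range(n)
        )
        spectrum.append(value.real / n**2)
    return spectrum


if __name__ == "__main__":
    # Hypothetical non-uniform vertical wavenumbers (rad/m), not taken from the paper.
    kz = [0.0, 0.05, 0.13, 0.22, 0.30, 0.41]
    # Hypothetical ground (0 m) and canopy (25 m) contributions.
    cov = model_covariance(kz, [(0.0, 1.0), (25.0, 0.6)], noise_power=0.1)
    heights = [float(h) for h in range(0, 41)]
    for z, p in zip(heights, beamforming_spectrum(cov, kz, heights)):
        print(f"{z:5.1f} m  {p:.3f}")
```

With few, irregularly spaced acquisitions, this estimator has coarse vertical resolution and high sidelobes, which is the limitation that motivates sparse methods such as CS and SPICE.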