Abstract Higher mode surface waves, which complement fundamental modes by providing additional constraints on subsurface structure in surface-wave tomography, have been observed in ambient noise cross-correlation functions (CCFs) in sedimentary basins in oceans or near coastlines. However, few studies have shown that higher mode surface waves can be observed in, and extracted directly from, ambient noise CCFs in inland basins. In this study, we report observations of high signal-to-noise ratio fundamental-mode and first higher mode Rayleigh waves at periods of 0.2–1.90 s and 0.2–1.35 s, respectively, from ambient noise CCFs in the southeastern margin of the Tarim basin, the largest inland basin in China. We confirm the credibility of the first higher mode surface waves by showing that the observed first higher mode dispersion curves match those predicted from S-wave velocity models constrained solely by fundamental-mode dispersion curves. We then demonstrate that including the first higher mode dispersion curves helps image deeper structures, with the average imaged depth increasing from ∼0.73 to ∼1.24 km, which will benefit future exploration of deep oil and gas resources in the Tarim basin.