Urban land surface temperature (ULST) is a key surface parameter in urban heat island studies. However, conventional land surface temperature (LST) retrieval methods usually neglect geometry and adjacency effects. In this study, a new urban canopy multiple-scattering thermal radiative transfer (UCM-RT) model and an urban temperature-emissivity separation (UTES) algorithm that account for these effects were developed for ULST retrieval from thermal infrared (TIR) images of the Chinese GaoFen-5 (GF-5) satellite. The UCM-RT model and the UTES algorithm incorporate multiple scattering within the urban building canopy and consider the thermal radiance contributions from adjacent pixels and the atmosphere. Two schemes were designed to evaluate the geometry and adjacency effects quantitatively: (1) evaluating their impact on the ground temperature and the top-of-atmosphere (TOA) brightness temperature in the four GF-5 TIR bands using a simulation dataset, and (2) comparing the temperatures retrieved by the UTES algorithm with those retrieved by the conventional temperature-emissivity separation (TES) algorithm, which does not consider geometry and adjacency effects. For a specific simulation case, the magnitude of the geometry effects on urban ground temperature was found to be approximately 3.2 K, 4.2 K, 4.0 K, and 3.9 K for the four GF-5 TIR bands, while the effects on the TOA brightness temperature reached 2.4 K, 3.3 K, 3.3 K, and 3.0 K for those bands, respectively. These results show that the geometry effects have a significant influence on the thermal radiative transfer process. Over urban surfaces, the temperature error of the UTES algorithm was generally no more than 0.8 K, much lower than that of the conventional TES algorithm (~2.2 K).
The results indicate that the UTES algorithm is suitable for ULST retrieval. The difference between the temperatures retrieved by the UTES algorithm and by the conventional TES algorithm ranges from 0.1 K to 2.1 K and is particularly evident for pixels with dense buildings. Additionally, the thermal radiation from adjacent pixels in high-spatial-resolution TIR images has a significant influence on ULST retrieval. The comparison between the ULST algorithm for urban land surfaces and the conventional TES algorithm for flat surfaces indicates that the three-dimensional structure inside a pixel has a significant influence on ULST retrieval. These findings indicate that the geometry and adjacency effects must be considered to obtain highly accurate ULST retrievals.
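To give a feel for why multiple scattering inside the building canopy changes the observed brightness temperature, the following is a minimal toy sketch and not the UCM-RT model itself. It assumes an isothermal, Lambertian street canyon whose facets see the sky with a single sky view factor, sums the infinite series of inter-facet reflections in closed form, and converts the result to a brightness temperature with the Planck function. The function names, the 10 µm wavelength, the emissivity of 0.92, and the sky view factor of 0.5 are all illustrative assumptions, not values from the study.

```python
import math

# Physical constants (SI, exact values in the 2019 SI definition).
H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m s-1
K_B = 1.380649e-23    # Boltzmann constant, J K-1


def planck_radiance(wavelength_um, temperature_k):
    """Blackbody spectral radiance in W m-2 sr-1 m-1."""
    lam = wavelength_um * 1e-6
    c1 = 2.0 * H * C ** 2
    c2 = H * C / K_B
    return c1 / (lam ** 5 * math.expm1(c2 / (lam * temperature_k)))


def brightness_temperature(wavelength_um, radiance):
    """Invert the Planck function: spectral radiance -> temperature in K."""
    lam = wavelength_um * 1e-6
    c1 = 2.0 * H * C ** 2
    c2 = H * C / K_B
    return c2 / (lam * math.log1p(c1 / (lam ** 5 * radiance)))


def canyon_leaving_radiance(wavelength_um, temperature_k, emissivity,
                            sky_view_factor, sky_radiance=0.0):
    """Radiance leaving a facet of a hypothetical isothermal canyon.

    Assumption: each facet receives a fraction (1 - sky_view_factor) of its
    incoming radiance from other facets and sky_view_factor from the sky, so
    L = eps*B + (1 - eps) * ((1 - V) * L + V * L_sky), solved for L.
    """
    b = planck_radiance(wavelength_um, temperature_k)
    rho = 1.0 - emissivity
    numerator = emissivity * b + rho * sky_view_factor * sky_radiance
    return numerator / (1.0 - rho * (1.0 - sky_view_factor))


if __name__ == "__main__":
    wl, t_surf, eps = 10.0, 300.0, 0.92   # illustrative values only
    flat = canyon_leaving_radiance(wl, t_surf, eps, sky_view_factor=1.0)
    canyon = canyon_leaving_radiance(wl, t_surf, eps, sky_view_factor=0.5)
    print("flat surface BT (K):", brightness_temperature(wl, flat))
    print("canyon BT (K):     ", brightness_temperature(wl, canyon))
```

With a sky view factor of 1 the expression reduces to the flat-surface case, while smaller sky view factors push the effective emissivity toward 1, so the canyon appears warmer than a flat surface of the same material and temperature. A retrieval that assumes a flat surface would attribute that extra radiance to temperature or emissivity, which is the kind of bias the geometry-aware approach described above is meant to reduce.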