ABSTRACT The foundational step in constructing a digital twin is transforming the physical world into an immersive 3D virtual environment. Unmanned aerial vehicle (UAV)-based photogrammetry, mobile mapping systems (MMS), and terrestrial laser scanning (TLS) are powerful tools for 3D reconstruction of urban environments. Because each sensor has distinctive strengths and weaknesses, multi-source point cloud registration is needed to obtain a detailed and precisely geolocated 3D urban model. We therefore developed a coarse-to-fine approach consisting of two algorithms, the building exterior wall-based (BEWB) algorithm and the building outline-based (BOB) algorithm, to register point clouds captured by diverse sensors in urban scenes. The BEWB algorithm coarsely registers urban point clouds by extracting building exterior walls, establishing correspondence points, and removing outliers from this correspondence point set. The BOB algorithm then precisely registers the urban point clouds acquired from multiple sensors by leveraging building outlines and the ground points of the point clouds. To validate the proposed algorithms, we first registered a UAV-based photogrammetry point cloud with the MMS point cloud and subsequently registered the TLS point clouds. Comprehensive quantitative and qualitative analyses of the results demonstrate that our algorithms outperform existing methods in achieving precise registration. By successfully registering three distinct point clouds, we generated a comprehensive urban scene point cloud characterized by enhanced precision, point density, and geolocation accuracy.
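To make the coarse registration idea concrete, the following is a minimal, generic sketch of estimating a rigid transform from correspondence points while rejecting outliers. It is not the BEWB algorithm itself: it assumes correspondences are already given, works in 2D plan view for brevity, and uses a plain RANSAC loop as a stand-in for the outlier removal step. All function names, the threshold, and the iteration count are hypothetical choices made for illustration.

```python
import math
import random


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # centered source point
        bx, by = qx - dx, qy - dy  # centered target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # closed-form optimal rotation in 2D
    c, s = math.cos(theta), math.sin(theta)
    # Translation moves the rotated source centroid onto the target centroid.
    return theta, (dx - (c * sx - s * sy), dy - (s * sx + c * sy))


def apply_rigid_2d(theta, t, p):
    """Rotate point p by theta, then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + t[0], s * p[0] + c * p[1] + t[1])


def ransac_rigid_2d(src, dst, threshold=0.5, iterations=200, seed=0):
    """Fit a rigid transform from correspondences, ignoring outlier pairs.

    threshold (same units as the points) and iterations are illustrative
    values, not parameters taken from the paper.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # Two correspondences are the minimal sample for a 2D rigid transform.
        i, j = rng.sample(range(len(src)), 2)
        theta, t = fit_rigid_2d([src[i], src[j]], [dst[i], dst[j]])
        inliers = [
            k for k in range(len(src))
            if math.dist(apply_rigid_2d(theta, t, src[k]), dst[k]) < threshold
        ]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("too few inliers to estimate a transform")
    # Refit on all inliers of the best hypothesis.
    theta, t = fit_rigid_2d([src[k] for k in best], [dst[k] for k in best])
    return theta, t, best
```

In a coarse-to-fine pipeline, a transform estimated this way would only provide the initial alignment; a fine registration stage would then refine it.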