The surge of data traffic in wireless networks demands high-quality data services that meet users’ expectations. However, the limited spectral resources of current network infrastructure and the difficulty of maintaining reliable line-of-sight (LoS) links to ground users (GUs) in urban environments often disrupt service delivery. This paper addresses frequent handover (HO) failures and disrupted communication services for mobile GUs by deploying an unmanned aerial vehicle as a flying base station (UAV-BS) in heterogeneous networks (HetNets). A channel model is investigated that accounts for both LoS and non-line-of-sight (NLoS) paths in three-dimensional (3D) air-to-ground (A2G) links, using a detailed mathematical model parameterized by urban infrastructure characteristics such as building density and building heights. In addition, a reinforcement learning (RL) algorithm is presented that optimizes UAV trajectories in response to the mobility of GUs in order to improve LoS connectivity. The proposed algorithm dynamically adjusts the UAV positions and improves the transmission channels by identifying both LoS and NLoS paths. Simulation results demonstrate that, through learning-based adaptive control of UAV mobility, the proposed algorithm outperforms existing benchmarks, ensuring ubiquitous network connectivity for GUs and reducing HO failures in HetNets.
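To make the LoS/NLoS trade-off concrete, the following is a minimal sketch of a probabilistic A2G path-loss computation. It is not the model developed in this paper: it assumes the widely used sigmoid approximation of LoS probability as a function of elevation angle, in which the constants `a` and `b` summarize the urban environment (building density and heights). The function names, the default constants, the carrier frequency, and the excess-loss values are illustrative assumptions only.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def los_probability(elevation_deg, a=9.61, b=0.16):
    """Sigmoid approximation of LoS probability vs. elevation angle.

    a and b are environment-dependent fitting constants; the defaults
    are illustrative assumptions, not values taken from this paper.
    """
    return 1.0 / (1.0 + a * math.exp(-b * (elevation_deg - a)))


def mean_path_loss_db(horizontal_m, altitude_m, fc_hz=2e9,
                      eta_los_db=1.0, eta_nlos_db=20.0, a=9.61, b=0.16):
    """Expected A2G path loss: free-space loss plus the
    probability-weighted excess loss of the LoS and NLoS cases.

    fc_hz, eta_los_db and eta_nlos_db are assumed placeholder values.
    """
    distance = math.hypot(horizontal_m, altitude_m)  # 3D link length
    elevation = math.degrees(math.atan2(altitude_m, horizontal_m))
    fspl = 20.0 * math.log10(4.0 * math.pi * fc_hz * distance / SPEED_OF_LIGHT)
    p_los = los_probability(elevation, a, b)
    return fspl + p_los * eta_los_db + (1.0 - p_los) * eta_nlos_db


if __name__ == "__main__":
    # Compare two UAV altitudes for a GU 100 m away horizontally.
    for altitude in (10.0, 100.0):
        loss = mean_path_loss_db(100.0, altitude)
        print(f"altitude={altitude:5.1f} m  mean path loss={loss:6.2f} dB")
```

Under these assumptions, raising the UAV increases the link distance but also the elevation angle, so the higher LoS probability can outweigh the extra free-space loss. A trajectory-learning agent could use a quantity of this kind as part of its reward signal.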