This study presents a real-time urban microclimate simulation strategy based on a large-eddy simulation (LES) solver deployed on a multi-GPU architecture. The LES solver uses a monolithic projection-based method with staggered time discretization, which allows efficient computation at a high CFL number. It runs on a multi-GPU platform using CUDA Fortran and CUDA-aware MPI, which enhances its performance. To capture the complexity of urban geometries, we used a building-resolved, wall-modeled LES augmented by an immersed boundary method. First, we validated the solver for the flow around an isolated building by comparing the results with wind tunnel profiles. Next, we evaluated the solver using an idealized street array. Two evaluation metrics, FAC2 and hit rate, were employed to determine the optimal grid configuration that produces physically valid results. Moreover, we propose a real-time indicator that describes the interplay among grid resolution, simulation domain size, and the number of GPUs, which facilitates feasibility assessments for real-time simulations. The performance of the LES solver was further tested on real urban geometry, in a simulation covering an area of 10.49 km² in Seoul at a grid resolution of 4 m. The simulation results captured variations in wind patterns with altitude, demonstrating the solver's potential to provide useful wind information for pedestrian-level wind comfort assessment and urban air mobility.
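For readers unfamiliar with the two validation metrics named above, the following is a minimal sketch of how FAC2 and hit rate are commonly computed from paired observed and predicted values. It assumes the widely used definitions: FAC2 is the fraction of points whose predicted-to-observed ratio lies between 0.5 and 2, and hit rate is the fraction of points that meet either a relative or an absolute error threshold. The function names and the default thresholds (`rel_tol=0.25`, `abs_tol=0.05`) are illustrative assumptions; the abstract does not state the values used in this study.

```python
def fac2(observed, predicted):
    """Fraction of points with 0.5 <= predicted/observed <= 2.

    Points with a zero observation are skipped, because the ratio is
    undefined there (an assumption of this sketch).
    """
    pairs = [(o, p) for o, p in zip(observed, predicted) if o != 0]
    if not pairs:
        raise ValueError("FAC2 needs at least one nonzero observation")
    hits = sum(1 for o, p in pairs if 0.5 <= p / o <= 2.0)
    return hits / len(pairs)


def hit_rate(observed, predicted, rel_tol=0.25, abs_tol=0.05):
    """Fraction of points within a relative OR an absolute error threshold.

    rel_tol and abs_tol are hypothetical defaults, not values taken from
    the study; abs_tol is in the units of the compared quantity.
    """
    if len(observed) == 0:
        raise ValueError("hit rate needs at least one data point")
    hits = 0
    for o, p in zip(observed, predicted):
        abs_err = abs(p - o)
        rel_ok = o != 0 and abs_err / abs(o) <= rel_tol
        if rel_ok or abs_err <= abs_tol:
            hits += 1
    return hits / len(observed)


# Made-up normalized wind speeds, for illustration only.
obs = [1.0, 0.8, 0.5, 0.2]
pred = [1.1, 0.9, 1.2, 0.21]
print(fac2(obs, pred), hit_rate(obs, pred))
```

In this made-up sample, only the third point (0.5 observed against 1.2 predicted) fails both metrics, so each evaluates to 0.75.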