(1) Background: A three-dimensional (3D) real scene is a digital representation of the multidimensional, dynamic real world that renders actual scenes realistically and stereoscopically, and it is an important tool for refined urban management. The above-ground biomass (AGB) of urban forests is an important indicator of the urban ecological environment, so accurate AGB estimation is of great significance for evaluating urban ecological functions. (2) Methods: In this study, multiangle aerial photographs of urban street trees were acquired with an unmanned aerial vehicle (UAV) in single-lens five-way flights, from five directions (0°, 0°, 90°, 180°, and 270°). The multi-view stereo (MVS) algorithm was used to construct 3D real-scene models of two tree species, ginkgo and camphor. Structural parameters such as tree height, crown diameter, and crown volume were then estimated from the 3D real-scene models. Lastly, single-tree AGB models were developed from these structural parameters. (3) Results: (A) The UAV visible-light 3D real-scene model had clear texture and faithfully reflected the structural characteristics of both species. (B) Tree height, crown diameter, and crown volume obtained from the 3D real-scene model were significantly correlated with the measured values; R² was 0.90 for ginkgo height, 0.87 for camphor crown diameter, and 0.89 for ginkgo crown volume.
(C) AGB estimation models built with tree height and crown volume as variables were generally more accurate than those built with tree height and crown diameter. The most accurate model for ginkgo was the linear model, with a validation R² of 0.96 and an RMSE of 8.21 kg; the most accurate model for camphor was the quadratic polynomial model, with a validation R² of 0.92 and an RMSE of 27.74 kg. (4) Conclusions: This study demonstrated that the UAV 3D real-scene model can estimate single-tree biomass in urban forests with high accuracy. For both tree species, the AGB estimates based on the UAV 3D real scene did not differ significantly from the LiDAR-based estimates or from the measured AGB. The approach therefore offers a new technical route for urban forest resource monitoring and for evaluating ecological environment functions.
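As an illustration of the kind of model comparison reported in (C), the following is a minimal sketch of fitting a linear or quadratic AGB model to tree height and crown volume and scoring it with R² and RMSE on held-out trees. The function names, the additive model form, and the noise-free synthetic data are assumptions made for illustration only; they are not the study's data, coefficients, or exact model specification.

```python
import numpy as np

def design_matrix(height, crown_volume, degree=1):
    """Build regression columns: intercept, height, crown volume (+ squares if degree == 2)."""
    h = np.asarray(height, dtype=float)
    v = np.asarray(crown_volume, dtype=float)
    cols = [np.ones_like(h), h, v]
    if degree == 2:
        cols += [h**2, v**2]  # assumed quadratic form, without interaction terms
    return np.column_stack(cols)

def fit_agb(height, crown_volume, agb, degree=1):
    """Ordinary least-squares fit of AGB on the structural parameters."""
    X = design_matrix(height, crown_volume, degree)
    coef, *_ = np.linalg.lstsq(X, np.asarray(agb, dtype=float), rcond=None)
    return coef

def predict_agb(coef, height, crown_volume, degree=1):
    """Predict AGB from fitted coefficients."""
    return design_matrix(height, crown_volume, degree) @ coef

def r2_rmse(observed, predicted):
    """Coefficient of determination and root-mean-square error."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = observed - predicted
    ss_res = np.sum(residual**2)
    ss_tot = np.sum((observed - observed.mean())**2)
    return 1.0 - ss_res / ss_tot, float(np.sqrt(np.mean(residual**2)))

# Synthetic, noise-free example values (hypothetical, NOT measurements from the study).
rng = np.random.default_rng(0)
height = rng.uniform(5.0, 20.0, 40)          # tree height, m
crown_volume = rng.uniform(10.0, 200.0, 40)  # crown volume, m^3
agb = 2.0 + 3.0 * height + 0.5 * crown_volume  # made-up linear relationship, kg

# Fit on the first 30 trees and validate on the remaining 10.
train, valid = slice(0, 30), slice(30, 40)
coef = fit_agb(height[train], crown_volume[train], agb[train], degree=1)
r2, rmse = r2_rmse(agb[valid], predict_agb(coef, height[valid], crown_volume[valid], degree=1))
```

Passing `degree=2` to both `fit_agb` and `predict_agb` gives the quadratic variant, so the two model forms can be compared on the same validation trees.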