Individual tree canopy extraction plays an important role in downstream studies such as plant phenotyping, panoptic segmentation, and growth monitoring, and canopy volume calculation is an essential part of these studies. However, existing volume calculation methods based on LiDAR or on UAV RGB imagery cannot balance accuracy and real-time performance. We therefore propose a two-step volumetric modeling method for individual trees: first, we use RGB remote sensing images to obtain two-dimensional crown information; then, we use spatially aligned point cloud data to obtain the height information, which automates the calculation of the crown volume. After introducing the point cloud information, our method achieves better volumetric accuracy than the RGB-image-only method in 62.5% of cases, and the absolute error of the tree crown volume decreases by 8.304. Compared with the traditional 2.5D volume calculation method that uses point cloud data only, the absolute error of the proposed method decreases by 93.306. Our method also achieves fast extraction of vegetation over a large area. Moreover, the proposed YOLOTree model is more comprehensive than the existing YOLO series in tree detection, with a 0.81% improvement in precision, and it ranks second in the whole series on the mAP50-95 metric. We sample and open-source the TreeLD dataset to support the transfer of this research to further studies.
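To make the two-step idea concrete, the following is a minimal sketch, not the method evaluated in this work. It assumes that the RGB step yields a boolean crown mask for one tree and that the point cloud has already been rasterized into a height grid aligned with that mask. The function names `crown_volume` and `absolute_error`, the parameters `cell_area` and `crown_base_height`, and the simple column-summation rule are all illustrative assumptions.

```python
import numpy as np


def crown_volume(crown_mask, canopy_height, cell_area, crown_base_height=0.0):
    """Approximate one crown's volume by summing height columns inside its mask.

    Assumed inputs (hypothetical, not taken from the paper):
      crown_mask        -- 2D boolean array, True where the RGB step found the crown
      canopy_height     -- 2D float array of heights from the aligned point cloud
      cell_area         -- ground area covered by one grid cell
      crown_base_height -- height at which the crown is assumed to start
    """
    mask = np.asarray(crown_mask, dtype=bool)
    height = np.asarray(canopy_height, dtype=float)
    if mask.shape != height.shape:
        raise ValueError("mask and height grid must be spatially aligned (same shape)")

    # Column height above the assumed crown base; negative values are clipped to zero.
    column = np.clip(height - crown_base_height, 0.0, None)
    # Cells without any point cloud return (NaN) contribute no volume.
    column = np.where(np.isnan(column), 0.0, column)

    return float(np.sum(column[mask]) * cell_area)


def absolute_error(estimated, reference):
    """Absolute difference between an estimated and a reference volume."""
    return abs(estimated - reference)


if __name__ == "__main__":
    mask = [[True, True], [False, True]]
    height = [[2.0, 3.0], [5.0, float("nan")]]
    volume = crown_volume(mask, height, cell_area=0.25)
    print(volume, absolute_error(volume, 1.0))
```

The mask limits the summation to cells that belong to the crown, so height values from neighboring trees or the ground do not enter the estimate; this is the role the RGB step plays in the sketch, while the height grid supplies the third dimension.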