Semantic maps play a crucial role in smart agriculture: they provide practical three-dimensional fruit tree data for orchard management, helping to optimize management strategies and improve economic benefits. However, previous mapping studies have focused mainly on geometric features and have lacked semantic information, limiting robots' ability to reason about useful information in complex tasks and to support human–machine interaction. Furthermore, most existing map reconstruction methods run offline or in non-real time, making it difficult to meet the needs of real-time decision-making and planning in agricultural scenarios. Therefore, this paper proposes a real-time localization and semantic map reconstruction method for unstructured citrus orchards that integrates the visual-inertial SLAM framework VINS-RGBD with the semantic segmentation algorithm BiSeNetV1. A 3D semantic point cloud map is reconstructed by performing semantic segmentation on 2D RGB images and mapping the results to point clouds. A statistical outlier removal filter and OctoMap are introduced for postprocessing to remove outliers and estimate obstacles in 3D space, yielding a more accurate, efficient and flexible map. The experimental results show that the proposed method achieved a semantic segmentation mIoU of 79.31% on a self-built citrus dataset, and a citrus recall relative error of 11.29% and a mean translational localization error of 1.917 m for the map constructed in an unstructured orchard scenario. Additionally, the average memory saving rate was 10.36% for the statistical outlier removal filter and 97.39% for OctoMap. Real-time front-end feature detection and tracking took 11.14 ms per frame, and the deployed semantic segmentation network BiSeNetV1 took 7.35 ms per frame.
These results indicate that the proposed method can achieve both high accuracy and real-time performance in semantic map reconstruction. This exploratory work provides theoretical and technical references for future research on more precise localization and more complete semantic mapping, and it has broad application potential as technical support for intelligent agriculture.
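
As a minimal illustration of the two map-building steps summarized above (projecting per-pixel segmentation labels onto a point cloud, then removing statistical outliers), the following Python sketch may be helpful. It is not the implementation used in this work: the function names, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, and the parameters `k` and `std_ratio` are assumptions introduced for illustration, the points remain in the camera frame (no SLAM pose is applied), and the outlier filter uses a brute-force neighbor search that is only practical for small clouds.

```python
import numpy as np


def backproject_semantic(depth, labels, fx, fy, cx, cy):
    """Lift pixels with valid depth to 3D camera-frame points, keeping their labels.

    depth  : (H, W) array of metric depth, 0 where depth is missing
    labels : (H, W) array of per-pixel class ids from a 2D segmentation network
    fx, fy, cx, cy : assumed pinhole camera intrinsics
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column / row indices
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    return points, labels[valid]


def statistical_outlier_mask(points, k=8, std_ratio=1.0):
    """Return a boolean mask that is True for points to keep.

    For each point, the mean distance to its k nearest neighbors is computed.
    Points whose mean distance exceeds (global mean + std_ratio * global std)
    are flagged as outliers. Brute force, O(N^2) memory.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    dist.sort(axis=1)                      # column 0 is the zero self-distance
    mean_knn = dist[:, 1:k + 1].mean(axis=1)
    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    return mean_knn <= threshold


if __name__ == "__main__":
    # Tiny synthetic frame: a flat surface 2 m away, with one missing depth pixel.
    depth = np.full((4, 4), 2.0)
    depth[0, 0] = 0.0
    labels = np.ones((4, 4), dtype=int)    # hypothetical class id 1, e.g. "citrus"
    pts, lab = backproject_semantic(depth, labels, fx=2.0, fy=2.0, cx=1.5, cy=1.5)

    # Append one far-away spurious point and filter it out.
    noisy = np.vstack([pts, [[100.0, 100.0, 100.0]]])
    keep = statistical_outlier_mask(noisy, k=4, std_ratio=1.0)
    print(noisy.shape[0], "points before filtering,", int(keep.sum()), "after")
```

In a full pipeline, the filtered, labeled points would typically be transformed into the world frame with the estimated camera pose before being inserted into an occupancy structure such as OctoMap; that step is omitted here.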