Accurately detecting the maturity and 3D position of flowering Chinese cabbage (Brassica rapa var. chinensis) in natural environments is vital for autonomous robot harvesting in unstructured farms. The challenge lies in dense planting, small flower buds, similar colors and occlusions. This study proposes a YOLOv8-Improved network integrated with the ByteTrack tracking algorithm to achieve multi-object detection and 3D positioning of flowering Chinese cabbage plants in fields. In this study, C2F-MLCA is created by adding a lightweight Mixed Local Channel Attention (MLCA) with spatial awareness capability to the C2F module of YOLOv8, which improves the extraction of spatial feature information in the backbone network. In addition, a P2 detection layer is added to the neck network, and BiFPN is used instead of PAN to enhance multi-scale feature fusion and small target detection. Wise-IoU in combination with Inner-IoU is adopted as a new loss function to optimize the network for different quality samples and different size bounding boxes. Lastly, ByteTrack is integrated for video tracking, and RGB-D camera depth data are used to estimate cabbage positions. The experimental results show that YOLOv8-Improve achieves a precision (P) of 86.5% and a recall (R) of 86.0% in detecting the maturity of flowering Chinese cabbage. Among them, mAP50 and mAP75 reach 91.8% and 61.6%, respectively, representing an improvement of 2.9% and 4.7% over the original network. Additionally, the number of parameters is reduced by 25.43%. In summary, the improved YOLOv8 algorithm demonstrates high robustness and real-time detection performance, thereby providing strong technical support for automated harvesting management.