Detecting the water deficit status of vertical greenery plants rapidly and accurately is a significant challenge in the process of cultivating and planting greenery plants. Currently, the mainstream method involves utilizing a single target detection algorithm for this task. However, in complex real-world scenarios, the accuracy of detection is influenced by factors such as image quality and background environment. Therefore, we propose a multi-stage progressive detection method aimed at enhancing detection accuracy by gradually filtering, processing, and detecting images through a multi-stage architecture. Additionally, to reduce the additional computational load brought by multiple stages and improve overall detection efficiency, we introduce a Swin Transformer based on mobile windows and hierarchical representations for feature extraction, along with global feature modeling through a self-attention mechanism. The experimental results demonstrate that our multi-stage detection approach achieves high accuracy in vertical greenery plants detection tasks, with an average precision of 93.5%. This represents an improvement of 19.2%, 17.3%, 13.8%, and 9.2% compared to Mask R-CNN (74.3%), YOLOv7 (76.2%), DETR (79.7%), and Deformable DETR (84.3%), respectively.
Read full abstract