Abstract

Inference speed is a crucial metric for neural network models that perform object detection on mobile and embedded devices. This paper introduces YOLO-FLNet, a lightweight algorithm for detecting people in open scenes. The model uses the DFEM structure to capture and process both high-frequency and low-frequency information in the feature map. In addition, the VoV-DFEM structure, based on the concept of one-shot aggregation, strengthens the aggregation of features from different scales and frequencies in the backbone network. To validate its performance, experiments were conducted on publicly available datasets using a computer with dedicated GPUs. Compared with YOLOv7-tiny, YOLO-FLNet improved mAP@0.5 by 0.3%, reduced parameter size by 52.9%, and increased inference speed by 30.2%. These characteristics make it valuable for person detection in engineering applications and provide theoretical guidance for lightweight models in edge computing.
