Autonomous vehicle scenarios often involve occluded and distant pedestrians, which cause missed and false detections or require models too large to deploy. To address these issues, this study proposed a lightweight model based on YOLOv8s. The feature extraction and fusion networks were redesigned, and the detection layers were optimized for better performance. In the backbone network, DualConv and ELAN were combined to create the EDLAN module; together with an optimized SPPF-LSKA, it improved small-scale pedestrian feature extraction in complex backgrounds while reducing parameters and computation. In the neck network, BiFPN and VoVGSCSP enhanced pedestrian features and improved detection. In addition, the WIoU loss function addressed target imbalance to enhance generalization and overall performance. The enhanced YOLOv8s was trained and validated on the CityPersons dataset. Compared with YOLOv8s, it improved precision, recall, F1 score, and mAP@50 by 5.2%, 7.2%, 6.8%, and 6.8%, respectively, while reducing parameters by 68% and compressing model size by 67%. Further validation experiments on the Caltech and BDD100K datasets showed that precision increased by 3.4% and 1.1%, and mAP@50 increased by 7.6% and 2.8%, respectively. The modified model reduces parameters and model size while effectively improving detection accuracy, making it highly valuable for autonomous driving scenarios.
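As a rough illustration (not taken from the paper itself), the DualConv idea underlying the EDLAN module pairs a grouped 3x3 convolution with a pointwise 1x1 convolution applied to the same input, trading full dense convolutions for cheaper parallel branches. The sketch below is a minimal PyTorch version under that assumption; the class name, group count, and activation choice are illustrative, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class DualConv(nn.Module):
    """Illustrative DualConv-style block: a grouped 3x3 branch (cheap spatial
    filtering) plus a pointwise 1x1 branch (full cross-channel mixing),
    summed before normalization and activation."""

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, groups: int = 4):
        super().__init__()
        # Grouped 3x3 branch: spatial context at a fraction of the parameter cost.
        self.conv3x3 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                                 stride=stride, padding=1, groups=groups, bias=False)
        # Pointwise 1x1 branch: mixes information across all channels.
        self.conv1x1 = nn.Conv2d(in_channels, out_channels, kernel_size=1,
                                 stride=stride, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.conv3x3(x) + self.conv1x1(x)))


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)        # e.g. a backbone feature map
    block = DualConv(64, 128)             # channel counts must be divisible by `groups`
    print(block(x).shape)                 # torch.Size([1, 128, 80, 80])
```

Summing the two branches keeps the output shape of a standard convolution while cutting parameters, which is consistent with the parameter and model-size reductions reported in the abstract.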