Pedestrian Dataset Research Articles

Video prediction has emerged as a critical task in the field of computer vision. While conventional deep learning-based video prediction models excel in accuracy, they often impose significant computational overhead. Moreover, they tend to disregard crucial factors like inference latency and memory consumption, which may render them unsuitable for real-time video prediction applications. This issue is further exacerbated by the necessity to deploy many of these applications on resource-limited embedded devices. In this study, we introduce an innovative faster yet accurate video prediction (FAVP) model designed to mitigate inference latency and memory footprint, all the while maintaining predictive accuracy. Our model employs an encoder-convertor-decoder architecture with a novel multi-layer convertor. Each layer within the convertor uses consistent kernel sizes to precisely capture dimensional features, while varying kernel sizes between layers effectively capture both local and global features. This combination of uniformity within layers and diversity between layers enhances the model’s ability to grasp dynamic changes. Furthermore, we demonstrate that replacing conventional large-kernel convolutions with involutions significantly trims the model’s parameter count and inference latency, without compromising prediction accuracy. Through comprehensive experiments conducted on resource-adequate x86 platforms, utilizing KTH, TrafficBJ, Human3.6M, Moving MNIST, and KTH-Enhanced datasets, we showcase our model’s exceptional prediction performance, particularly in terms of inference latency. Additionally, we evaluate the model on resource-constrained NVIDIA Jetson Nano platform using KITTI & CalTech Pedestrian dataset. The results underscore our model’s superior inference speed and its competence in meeting real-time video prediction requirements, even under stringent resource limitations.

Read full abstract

Due to the global population increase and the recovery of agricultural demand after the COVID-19 pandemic, the importance of agricultural automation and autonomous agricultural vehicles is growing. Fallen person detection is critical to preventing fatal accidents during autonomous agricultural vehicle operations. However, there is a challenge due to the relatively limited dataset for fallen persons in off-road environments compared to on-road pedestrian datasets. To enhance the generalization performance of fallen person detection off-road using object detection technology, data augmentation is necessary. This paper proposes a data augmentation technique called Automated Region of Interest Copy-Paste (ARCP) to address the issue of data scarcity. The technique involves copying real fallen person objects obtained from public source datasets and then pasting the objects onto a background off-road dataset. Segmentation annotations for these objects are generated using YOLOv8x-seg and Grounded-Segment-Anything, respectively. The proposed algorithm is then applied to automatically produce augmented data based on the generated segmentation annotations. The technique encompasses segmentation annotation generation, Intersection over Union-based segment setting, and Region of Interest configuration. When the ARCP technique is applied, significant improvements in detection accuracy are observed for two state-of-the-art object detectors: anchor-based YOLOv7x and anchor-free YOLOv8x, showing an increase of 17.8% (from 77.8% to 95.6%) and 12.4% (from 83.8% to 96.2%), respectively. This suggests high applicability for addressing the challenges of limited datasets in off-road environments and is expected to have a significant impact on the advancement of object detection technology in the agricultural industry.

Read full abstract

Pedestrian Dataset Research Articles

Articles published on Pedestrian Dataset

Video Anomaly Detection Based on Global–Local Convolutional Autoencoder

YOLO-ESL: An Enhanced Pedestrian Recognition Network Based on YOLO

Ranking-based adaptive query generation for DETRs in crowded pedestrian detection

Towards faster yet accurate video prediction for resource-constrained platforms

Data compensation and feature fusion for sketch based person retrieval

Pseudo-Multispectral Pedestrian Detection with Deep Thermal Feature Guidance

An Improved Pedestrian Detection Model Based on YOLOv8 for Dense Scenes

GPT-4V Takes the Wheel: Promises and Challenges for Pedestrian Behavior Prediction

PEDESTRIAN ALERT SYSTEM USING DEEP LEARNING

Enhanced YOLOX with United Attention Head for Road Detetion When Driving

Pedestrian trajectory prediction method based on automatic driving

Automated Region of Interest-Based Data Augmentation for Fallen Person Detection in Off-Road Autonomous Agricultural Vehicles.

A spatiotemporal motion prediction network based on multi-level feature disentanglement

Mm-CasGAN: A cascaded adversarial neural framework for mmWave radar point cloud enhancement

CrossFormer: Cross-guided attention for multi-modal object detection

Improving RGB-infrared object detection with cascade alignment-guided transformer

Pedestrian detection algorithm integrating large kernel attention and YOLOV5 lightweight model.

A comprehensive review of pedestrian re-identification based on deep learning

Contour Information-Guided Multi-Scale Feature Detection Method for Visible-Infrared Pedestrian Detection.

NIRPed: A Novel Benchmark for Nighttime Pedestrian and Its Distance Joint Detection

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Pedestrian Dataset Research Articles

Articles published on Pedestrian Dataset

Video Anomaly Detection Based on Global–Local Convolutional Autoencoder

YOLO-ESL: An Enhanced Pedestrian Recognition Network Based on YOLO

Ranking-based adaptive query generation for DETRs in crowded pedestrian detection

Towards faster yet accurate video prediction for resource-constrained platforms

Data compensation and feature fusion for sketch based person retrieval

Pseudo-Multispectral Pedestrian Detection with Deep Thermal Feature Guidance

An Improved Pedestrian Detection Model Based on YOLOv8 for Dense Scenes

GPT-4V Takes the Wheel: Promises and Challenges for Pedestrian Behavior Prediction

PEDESTRIAN ALERT SYSTEM USING DEEP LEARNING

Enhanced YOLOX with United Attention Head for Road Detetion When Driving

Pedestrian trajectory prediction method based on automatic driving

Automated Region of Interest-Based Data Augmentation for Fallen Person Detection in Off-Road Autonomous Agricultural Vehicles.

A spatiotemporal motion prediction network based on multi-level feature disentanglement

Mm-CasGAN: A cascaded adversarial neural framework for mmWave radar point cloud enhancement

CrossFormer: Cross-guided attention for multi-modal object detection

Improving RGB-infrared object detection with cascade alignment-guided transformer

Pedestrian detection algorithm integrating large kernel attention and YOLOV5 lightweight model.

A comprehensive review of pedestrian re-identification based on deep learning

Contour Information-Guided Multi-Scale Feature Detection Method for Visible-Infrared Pedestrian Detection.

NIRPed: A Novel Benchmark for Nighttime Pedestrian and Its Distance Joint Detection