Abstract

Real-time load information in public transport is highly important for both passengers and service providers. Neural algorithms have shown high performance on various object counting tasks and play a continually growing methodological role in the development of automated passenger counting systems. However, the publication of public-space video footage is often restricted by legal and ethical considerations that protect the passengers' privacy. This work proposes an end-to-end Long Short-Term Memory network with a problem-adapted cost function that learns to count boarding and alighting passengers from a publicly available, comprehensive dataset of approx. 13,000 manually annotated low-resolution 3D LiDAR video recordings (depth information only) from the doorways of a regional train. These depth recordings do not allow the identification of individuals. For each door opening phase, the trained models predict the correct passenger count (ranging from 0 to 67) in approx. 96% of cases for boarding and alighting, respectively. Repeated training with different training and validation sets confirms that this result does not depend on a specific test set.
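To make the headline figure concrete, the following minimal sketch shows one way a per-door-opening accuracy could be computed. It assumes the reported 96% is the share of door opening phases whose predicted count exactly matches the manual label, evaluated separately for boarding and alighting. The function name `exact_match_accuracy` and the sample counts are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

# One door opening phase: (boarding count, alighting count).
Count = Tuple[int, int]


def exact_match_accuracy(
    predicted: Sequence[Count], labeled: Sequence[Count]
) -> Tuple[float, float]:
    """Return (boarding, alighting) exact-match accuracy over door openings.

    Assumption: a door opening counts as correct for one direction only if
    the predicted count equals the labeled count exactly.
    """
    if len(predicted) != len(labeled):
        raise ValueError("predicted and labeled must have the same length")
    if not labeled:
        raise ValueError("at least one door opening phase is required")

    n = len(labeled)
    boarding_hits = sum(p[0] == t[0] for p, t in zip(predicted, labeled))
    alighting_hits = sum(p[1] == t[1] for p, t in zip(predicted, labeled))
    return boarding_hits / n, alighting_hits / n


# Made-up example data for four door opening phases (illustration only).
predicted = [(0, 2), (3, 1), (5, 0), (1, 1)]
labeled = [(0, 2), (3, 2), (5, 0), (1, 1)]

boarding_acc, alighting_acc = exact_match_accuracy(predicted, labeled)
print(boarding_acc, alighting_acc)  # 1.0 0.75
```

Under this reading, an error of a single passenger in a door opening with many passengers counts as a miss, which makes the metric stricter than an average counting error.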

Highlights

  • The day-to-day operational management of transport systems relies on large networks of sensors, actuators, and software to provide passengers with safe, reliable, and affordable means of transportation

  • We present a series of experiments that demonstrate the high passenger counting accuracy of the Neural Automated Passenger Counting system (NAPC) on the previously introduced dataset

  • We introduced a real-time Neural Automated Passenger Counting system (NAPC), which is based on an end-to-end Long Short-Term Memory (LSTM) recurrent neural network


Summary

INTRODUCTION

The day-to-day operational management of transport systems relies on large networks of sensors, actuators, and software to provide passengers with safe, reliable, and affordable means of transportation.

In related work, experiments were conducted on a bus monitoring video with seven segments of differing characteristics, such as dark or strong outside light, crowding during boarding and alighting, and passengers carrying babies or accompanied by children. That APC system counted all 28 boarding passengers and 79 of 81 alighting passengers, resulting in an accuracy of 98%.

Sun et al. introduced a method for generating depth video streams from RGB-D videos recorded by a camera mounted above the door area of three different buses. For the generated depth video samples, they propose a method for counting boarding and alighting passengers that combines a two-step head detection (generating and then refining head proposals) with a tracking algorithm [33].

Low-resolution depth images obtained by RGB-D sensors are used for privacy-preserving human pose estimation in [43] and for head detection in the task of counting boarding and alighting passengers [33]. Top-view depth images from a video surveillance system are employed for detecting people [7], as well as people committing attacks and intrusions [8].
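The 98% figure quoted for the bus video experiment can be checked with simple arithmetic. The sketch below assumes the accuracy was pooled over both directions, that is, correctly counted passengers divided by all passengers. The summary does not state how it was computed, and the helper name `pooled_accuracy` is hypothetical.

```python
def pooled_accuracy(counted: list[int], actual: list[int]) -> float:
    """Share of passengers counted, pooled over all directions.

    Assumption: accuracy = sum(counted) / sum(actual).
    """
    total_actual = sum(actual)
    if total_actual == 0:
        raise ValueError("actual passenger total must be positive")
    return sum(counted) / total_actual


# Figures quoted in the text: 28 of 28 boarding, 79 of 81 alighting.
accuracy = pooled_accuracy(counted=[28, 79], actual=[28, 81])
print(f"{accuracy:.1%}")  # 98.2%
```

Under this assumption, 107 of 109 passengers are counted, which rounds to the reported 98%.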

Contributions
THE Berlin-APC DATASET
The Network Architecture
Data Augmentation
Simple Loss
Refined Loss
Learning Procedure
RESULTS
Classification Performance
Regression Performance
Counting Metrics
Resolution Tradeoff
Hyperparameter Validation
Minimal Required Train Set Size
CONCLUSION
