Abstract

This work investigates how to detect emergency vehicles such as ambulances, fire engines, and police cars from their siren sounds. Because drivers may sometimes be unaware of siren warnings from emergency vehicles, especially when in-vehicle audio systems are in use, we propose an automatic detection system that determines whether siren sounds from emergency vehicles are present nearby and alerts other drivers to pay attention. A convolutional neural network (CNN)-based ensemble model (SirenNet) with two network streams is designed to classify traffic soundscape sounds into siren sounds, vehicle horns, and noise. The first stream (WaveNet) directly processes the raw waveform, and the second (MLNet) works with a combined feature formed from Mel-frequency cepstral coefficients (MFCC) and the log-mel spectrogram. Our experiments on a diverse dataset show that the raw data can complement the MFCC and log-mel features, achieving an accuracy of 98.24% in siren sound detection. In addition, the proposed system works well with variable input lengths: even for short samples of 0.25 seconds, it still achieves an accuracy of 96.89%. The proposed system could be helpful not only for drivers but also for autopilot systems.
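
The combined MFCC and log-mel input described for MLNet can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the sample rate, frame and hop sizes, number of mel bands and cepstral coefficients, and the choice to concatenate the two features along the feature axis are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    # n_mels + 2 band edges, equally spaced on the mel scale
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    freqs = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i:i + 3]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def combined_feature(wave, sr=16000, n_fft=512, hop=256, n_mels=40, n_mfcc=20):
    """Return MFCC and log-mel stacked along the feature axis.

    All default values are illustrative assumptions, not the paper's settings.
    Output shape: (n_mfcc + n_mels, n_frames).
    """
    # Slice the waveform into overlapping Hann-windowed frames
    n_frames = 1 + (len(wave) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(n_fft)
    # Power spectrum, then mel-band energies on a log scale
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
    # MFCC = DCT-II (unnormalized) of the log-mel energies along the mel axis
    n = np.arange(n_mels)
    basis = np.cos(np.pi / n_mels * (n[None, :] + 0.5) * np.arange(n_mfcc)[:, None])
    mfcc = log_mel @ basis.T
    # Assumed combination: concatenate along the feature axis
    return np.concatenate([mfcc.T, log_mel.T], axis=0)

# Usage: a 0.25 s synthetic sweep standing in for a short siren clip
sr = 16000
t = np.arange(int(0.25 * sr)) / sr
sweep = np.sin(2 * np.pi * (600.0 + 1600.0 * t) * t)
feature = combined_feature(sweep, sr=sr)
print(feature.shape)
```

Because the number of frames depends on the clip length, the time axis of the feature grows or shrinks with the input, which is one way a 2D-CNN front end can accept clips of different durations.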

Highlights

  • A siren is a special signal sounded by alarm systems or emergency service vehicles such as fire trucks, police cars, and ambulances

  • (1) To the best of our knowledge, this work is the first to collect and study an extensive siren sound dataset captured in many countries; the large dataset is collected in real-life environments that include different noise levels, recording distances, and the Doppler effect

  • (2) We propose a 2-dimensional convolutional neural network (2D-CNN) model, MLNet, for emergency vehicle detection (EVD) based on the combination of Mel-frequency cepstral coefficients (MFCC) and log-mel spectrogram features. Our experimental results indicate that MLNet yields higher accuracy than related works, which shows that the aggregated features are beneficial for acoustic-based EVD

  • (3) We further develop an end-to-end 1-dimensional convolutional neural network (1D-CNN) model, WaveNet, which automatically learns useful classification features from the raw waveform; our experimental results show that this model obtains promising accuracy

  • (4) We propose an ensemble architecture of MLNet and WaveNet to boost detection accuracy and to demonstrate the complementary relationship between raw features and handcrafted features in acoustic-based EVD

  • (5) Last but not least, the success of this work provides a good foundation for the applications listed above
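
As a rough illustration of how two streams such as WaveNet and MLNet might be combined, the sketch below averages their class probabilities. This is an assumption made for illustration only: the highlights do not state how the ensemble fuses the two streams, and the function names, the `wave_weight` parameter, and the example logits are hypothetical.

```python
import numpy as np

# The three classes named in the abstract
CLASSES = ("siren", "horn", "noise")

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(wave_logits, ml_logits, wave_weight=0.5):
    """Weighted average of the two streams' class probabilities.

    wave_logits: raw-waveform stream outputs, shape (batch, 3)
    ml_logits:   MFCC + log-mel stream outputs, shape (batch, 3)
    wave_weight: hypothetical mixing weight, not taken from the paper
    """
    probs = wave_weight * softmax(wave_logits) + (1.0 - wave_weight) * softmax(ml_logits)
    labels = [CLASSES[i] for i in probs.argmax(axis=-1)]
    return probs, labels

# Usage with made-up logits for one clip: the streams disagree,
# and the more confident one decides the averaged prediction.
wave_logits = np.array([[2.0, 1.0, 0.0]])
ml_logits = np.array([[0.0, 3.0, 0.0]])
probs, labels = ensemble_predict(wave_logits, ml_logits)
print(labels)
```

Averaging probabilities is only one simple fusion option; concatenating the streams' intermediate features before a shared classifier is another common design.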


Summary

Introduction

A siren is a special signal sounded by alarm systems or emergency service vehicles such as fire trucks, police cars, and ambulances. Drivers of private cars may sometimes fail to hear nearby sirens because of interference from in-car audio, the soundproofing of modern cars, or their own distraction. This problem could delay the provision of emergency services or even cause traffic accidents because of inadequate communication and cooperation. This study proposes an acoustic-based method to detect the presence of emergency vehicles on the road. At this stage, we focus on detecting siren sounds from standard emergency vehicles, including ambulances, fire engines, and police cars. Because each country may have its own regulations on the types and frequency bands of siren sounds, we aim to develop an

