Abstract

The development of robust acoustic traffic monitoring (ATM) algorithms based on machine learning faces several challenges. The biggest is collecting and annotating large, high-quality datasets for algorithm training and evaluation. Such a dataset must reflect a broad variety of vehicle sounds, since the acoustic patterns that vehicles emit depend on many factors, such as engine noise at different speeds and road conditions. The characteristics of the microphones used also strongly influence the data. If microphones with different directionality and frequency responses are used during model development and final deployment, the resulting data mismatch can degrade the performance of machine learning algorithms. In this paper, we investigate the influence of mismatched recording locations and microphone characteristics on the proposed ATM system. To evaluate these effects, we implement state-of-the-art convolutional neural networks to detect passing vehicles, classify their type, and estimate their speed and direction of movement. The evaluated models perform well on low- and high-quality recordings at different locations when the same recording device is used for training and testing. However, the results indicate that microphone mismatch causes several issues that need to be carefully addressed.
