Abstract

Emergency vehicles such as ambulances, fire engines, and police cruisers play a vital role in society: by responding quickly, they help prevent loss of life and maintain order. In cities, identifying and classifying vehicle sounds is important for recognizing emergency vehicles and clearing traffic for them effectively. Convolutional Neural Networks (CNNs) play an important role in accurately detecting such vehicles during an emergency. The aim of this paper is to develop a suitable model and algorithms for data augmentation, feature extraction, and classification. The proposed TB-MFCC multifuse feature comprises data augmentation and feature extraction. First, in the proposed signal augmentation, noise injection, stretching, shifting, and pitching are applied separately to each audio signal, which increases the number of instances in the dataset and reduces overfitting in the network. Second, Triangular Bluestein Mel Frequency Cepstral Coefficients (TB-MFCC) are proposed and fused with Zero Crossing Rate (ZCR), Mel Frequency Cepstral Coefficients (MFCC), Root Mean Square (RMS), Chroma, and Tempogram features, which increases the accuracy and reduces the Mean Squared Error (MSE) of the model during classification. Finally, the proposed Multi-stacked Convolutional Neural Network (MCNN) with Attention-based Bidirectional Long Short-Term Memory (A-BiLSTM) better captures the nonlinear relationships among the features. The proposed Pooled Multifuse Feature Augmentation (PMFA) with MCNN and A-BiLSTM increases the accuracy (98.66 %), reduces the False Positive Rate (FPR) by 1.01 %, and reduces the loss (0 %). Thus, the model predicts the sound without overfitting, underfitting, or vanishing gradient problems.
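
As a rough illustration of the augmentation and feature steps described above, the following minimal sketch applies two of the four named augmentations (noise injection and shifting) to a signal and computes two of the fused features (ZCR and RMS). It is not the paper's implementation: the function names, the noise level, the circular shift, and the toy sine-wave input are all assumptions chosen for illustration. Stretching and pitching normally need a phase-vocoder style method from an audio library and are not shown here.

```python
import math
import random


def inject_noise(signal, noise_level=0.005, seed=0):
    """Add zero-mean Gaussian noise scaled by the signal's peak amplitude.

    noise_level and seed are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    peak = max(abs(x) for x in signal)
    return [x + noise_level * peak * rng.gauss(0.0, 1.0) for x in signal]


def shift(signal, offset):
    """Circularly shift the signal by `offset` samples (positive delays it)."""
    offset %= len(signal)
    if offset == 0:
        return list(signal)
    return signal[-offset:] + signal[:-offset]


def augment(signal, offset=100):
    """Return the original signal plus one noisy and one shifted copy.

    Each augmentation is applied separately, so one input yields three instances.
    """
    return [list(signal), inject_noise(signal), shift(signal, offset)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms(frame):
    """Root mean square energy of a frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def fused_features(frame):
    """Concatenate per-frame features into one vector (here only ZCR and RMS)."""
    return [zero_crossing_rate(frame), rms(frame)]


# Toy input: a 440 Hz sine sampled at 8 kHz (an assumption, not siren audio).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1024)]
dataset = [fused_features(variant) for variant in augment(signal)]
```

Each input here yields three instances, each reduced to a two-element feature vector. A fuller pipeline would add the remaining augmentations and the MFCC, Chroma, and Tempogram features before passing the vectors to a classifier.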
