Abstract

A central problem in Acoustic Scene Classification (ASC) is finding an effective representation of an acoustic scene. This study uses Linear Prediction Cepstral Coefficients (LPCC) and Spectral Centroid Magnitude Cepstral Coefficients (SCMC) features, along with log-Mel band energies, to represent an acoustic scene, and Deep Neural Networks (DNNs) to model the classification task. LPCCs capture changes in the auditory spectrum over time, SCMCs capture the weighted average magnitude within each subband of a given acoustic scene, and log-Mel band energies capture the spectral envelope of each audio frame. The DNN architecture performs classification at the audio-track level. We experimented on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 development dataset and the DCASE 2017 dataset, carrying out experiments with individual feature sets and also performing decision-level fusion of DNN scores to improve performance.
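The decision-level score fusion mentioned above can be sketched as combining the per-class posterior scores produced by DNNs trained on the different feature sets. The sketch below is a minimal illustration, not the paper's implementation; the function name and the optional fusion weights are assumptions:

```python
import numpy as np

def fuse_scores(score_list, weights=None):
    """Fuse per-class posterior scores from several classifiers.

    score_list: list of arrays, each of shape (n_tracks, n_classes),
    e.g. softmax outputs of DNNs trained on LPCC, SCMC and log-Mel
    features. weights: optional per-model fusion weights (illustrative).
    Returns the predicted class index for each track.
    """
    scores = np.stack(score_list)          # (n_models, n_tracks, n_classes)
    if weights is None:
        weights = np.ones(len(score_list))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()      # normalize to a weighted average
    fused = np.tensordot(weights, scores, axes=1)  # (n_tracks, n_classes)
    return fused.argmax(axis=1)            # highest fused score per track

# Example: two models scoring 2 tracks over 3 scene classes
m1 = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
m2 = np.array([[0.4, 0.5, 0.1], [0.1, 0.2, 0.7]])
print(fuse_scores([m1, m2]))  # -> [0 2]
```

Averaging posteriors (rather than, say, majority voting over hard labels) keeps information about each model's confidence, which is why it is a common choice for decision-level fusion.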
