Abstract
In this study, we propose a polyphonic sound event detection method based on a hybrid system of a Convolutional Bidirectional Long Short-Term Memory Recurrent Neural Network and a Hidden Markov Model (CBLSTM-HMM). Inspired by the state-of-the-art approach to integrating neural networks with HMMs in speech recognition, the proposed method uses the CBLSTM to estimate the HMM state output probabilities, making it possible to model sequential data while handling variations in event duration. The proposed hybrid system can detect the segment of each sound event without post-processing, such as smoothing the detection results over multiple frames, which frame-wise detection methods usually require. Moreover, it extends easily to multi-label classification, which enables polyphonic sound event detection. We conduct experimental evaluations using the DCASE 2016 Task 2 dataset to compare the performance of the proposed method with that of conventional methods, such as non-...
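The decoding idea described above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: each sound event gets its own two-state HMM (0 = inactive, 1 = active), the transition probabilities are fixed by hand, and the per-frame posteriors are toy numbers standing in for network outputs. The abstract does not specify the HMM topology or any of these values. As in hybrid speech recognition systems, the posteriors are divided by the state priors to obtain scaled likelihoods, and Viterbi decoding then yields the event segment directly. The function names `viterbi`, `decode_event`, and `decode_polyphonic` are hypothetical.

```python
import numpy as np

def viterbi(log_obs, log_trans, log_init):
    """Most likely state path. log_obs: (T, S), log_trans: (S, S), log_init: (S,)."""
    T, S = log_obs.shape
    delta = log_init + log_obs[0]
    backptr = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans      # scores[i, j]: move from state i to j
        backptr[t] = np.argmax(scores, axis=0)   # best predecessor of each state j
        delta = scores[backptr[t], np.arange(S)] + log_obs[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 1, 0, -1):                # backtrace
        path[t - 1] = backptr[t, path[t]]
    return path

def decode_event(active_posterior, stay=0.9, prior_active=0.5, eps=1e-10):
    """Decode one event with an assumed two-state HMM (0 = inactive, 1 = active)."""
    p = np.asarray(active_posterior, dtype=float)
    posteriors = np.stack([1.0 - p, p], axis=1)
    priors = np.array([1.0 - prior_active, prior_active])
    # Scaled likelihood: p(x | s) is proportional to p(s | x) / p(s).
    log_obs = np.log(posteriors + eps) - np.log(priors)
    log_trans = np.log(np.array([[stay, 1.0 - stay],
                                 [1.0 - stay, stay]]))
    return viterbi(log_obs, log_trans, np.log(priors))

def decode_polyphonic(posteriors):
    """Multi-label case: decode each event class (column) independently."""
    posteriors = np.asarray(posteriors, dtype=float)
    return np.stack([decode_event(posteriors[:, k])
                     for k in range(posteriors.shape[1])], axis=1)

# Toy posteriors for one event; frame 4 is a one-frame dropout.
post = np.array([0.1, 0.1, 0.9, 0.9, 0.2, 0.9, 0.9, 0.1, 0.1, 0.1])
print(decode_event(post).tolist())  # [0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
```

In this toy case the dropout at frame 4 is absorbed by the transition model, so a single segment spanning frames 2 to 6 is returned without a separate smoothing step. Decoding each event class with its own HMM is one simple way to allow overlapping events.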