Abstract

Sound event detection (SED) has been widely applied in areas such as smart homes, video surveillance, and environmental monitoring. SED models based on neural networks (NNs) have attracted considerable attention because of their high detection accuracy. However, existing NN-based SED models have high computational complexity, both in the number of parameters and in the number of multiply-accumulate (MAC) operations. This leads to long processing times, high power consumption, and large memory requirements, which makes these models unsuitable for Internet of Things (IoT) devices with constrained power and resources. To address this issue, a low-complexity SED model (named LCSED) with a hybrid convolution scheme and a lightweight dual-attention scheme is proposed to reduce the number of parameters and MAC operations while maintaining high detection accuracy. The proposed LCSED model is evaluated on the public DCASE2017 task4 dataset. Compared with several state-of-the-art methods, it significantly reduces computational complexity (by up to 48.8 times for parameters and 2.50 times for MAC operations) while maintaining high detection accuracy. The proposed LCSED model is therefore suitable for sound event detection on power- and resource-constrained IoT devices.
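The two complexity measures named above can be made concrete with a small calculation. The sketch below counts parameters and MAC operations for a single 2D convolution layer and computes reduction factors of the kind the abstract reports. It is illustrative only: the abstract does not describe the hybrid convolution scheme, so the depthwise separable convolution used here as the cheaper alternative is an assumption for the sake of comparison, not the LCSED design. The helper `conv2d_cost` and the layer sizes are hypothetical.

```python
def conv2d_cost(c_in, c_out, kernel, out_h, out_w, groups=1):
    """Return (parameters, MACs) for one 2D convolution layer, ignoring bias.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Each output channel has (c_in / groups) * kernel * kernel weights.
    params = (c_in // groups) * c_out * kernel * kernel
    # Every weight is used once per output position.
    macs = params * out_h * out_w
    return params, macs


# Assumed example layer: 64 -> 128 channels, 3x3 kernel, 32x32 output map.
std_params, std_macs = conv2d_cost(64, 128, 3, 32, 32)

# Assumed cheaper alternative: depthwise 3x3 followed by pointwise 1x1.
dw_params, dw_macs = conv2d_cost(64, 64, 3, 32, 32, groups=64)
pw_params, pw_macs = conv2d_cost(64, 128, 1, 32, 32)
sep_params = dw_params + pw_params
sep_macs = dw_macs + pw_macs

# Reduction factors, expressed as "times smaller".
param_reduction = std_params / sep_params
mac_reduction = std_macs / sep_macs

print(f"standard:  {std_params} params, {std_macs} MACs")
print(f"separable: {sep_params} params, {sep_macs} MACs")
print(f"reduction: {param_reduction:.2f}x params, {mac_reduction:.2f}x MACs")
```

For a single layer the two reduction factors coincide, because both counts scale with the same weight tensor. Across a whole model they can differ, as they do in the figures reported above, since layers operate on feature maps of different sizes.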
