Abstract

Environmental sound signals are multi-source, heterogeneous, and time-varying. Many systems have been proposed to process such signals for event detection in ambient assisted living applications. Typically, these systems use feature extraction, selection, and classification. However, despite major advances, several important questions remain unanswered, especially in real-world settings. This paper contributes to the body of knowledge in the field by addressing the following problems for ambient sounds recorded in various real-world kitchen environments: (1) which features and which classifiers are most suitable in the presence of background noise? (2) what is the effect of signal duration on recognition accuracy? (3) how do the signal-to-noise ratio and the distance between the microphone and the audio source affect the recognition accuracy in an environment in which the system was not trained? We show that for systems that use traditional classifiers, it is beneficial to combine gammatone frequency cepstral coefficients and discrete wavelet transform coefficients and to use a gradient boosting classifier. For systems based on deep learning, we consider 1D and 2D Convolutional Neural Networks (CNNs) using mel-spectrogram energies and mel-spectrogram images as inputs, respectively, and show that the 2D CNN outperforms the 1D CNN. We obtained competitive classification results for two such systems. The first one, which uses a gradient boosting classifier, achieved an F1-score of 90.2% and a recognition accuracy of 91.7%. The second one, which uses a 2D CNN with mel-spectrogram images, achieved an F1-score of 92.7% and a recognition accuracy of 96%.

Highlights

  • Smart home-based ambient assisted living Information and Communications Technology (ICT) solutions can allow the elderly to remain in their own homes for longer and live independently [1]

  • For all of our experiments and comparisons, we applied the same split between the number of training and testing samples, as is common in the literature [49]

  • Feature selection and classification, and the effects of signal duration and of SNR at various distances on recognition accuracy in a noisy kitchen environment

Summary

Introduction

Smart home-based ambient assisted living Information and Communications Technology (ICT) solutions can allow the elderly to remain in their own homes for longer and live independently [1]. Research on ICT solutions for ambient assisted living has intensified considerably over the last decades, due to the emergence of affordable, powerful sensors and progress in artificial intelligence [2, 3, 4]. One common approach to automated human activity recognition (HAR) uses portable sensors such as accelerometers and gyroscopes [7, 8]. These sensors require the cooperation of the subject, may restrict body movement, and are energy constrained [9, 10]. Another approach relies on computer vision [11, 12]. Features are extracted from the environmental sounds and classifiers are used to recognize the corresponding human activity [13, 14, 15].
