Abstract
The amount of time an infant cries each day helps medical staff evaluate the infant's health condition. Extracting this information requires a cry detection algorithm that can operate in acoustically challenging environments, since multiple noise sources, such as interfering cries, medical equipment, and people, may be present. This paper proposes an algorithm for detecting infant cries in such environments. The proposed solution is a multi-stage detection algorithm. The first stage consists of an eight-channel filter-and-sum beamformer followed by an Optimally Modified Log-Spectral Amplitude (OMLSA) estimator post-filter that reduces the effect of interference. The second stage is a Deep Neural Network (DNN) based cry detector that takes audio Log-Mel features as input. A synthetic dataset mimicking a real neonatal hospital scenario has been created for training the network and evaluating its performance. Additionally, a dataset of cries acquired in a real neonatology department has been used to assess performance in a real scenario. The algorithm has been compared to a popular voice activity detection approach based on Long-Term Spectral Divergence, and the results show that the proposed solution achieves superior detection performance on both synthetic and real data.
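To make the first stage more concrete, the following is a minimal sketch of a filter-and-sum beamformer in which each microphone channel is passed through its own FIR filter and the results are summed. It is an illustration under stated assumptions, not the implementation used in this work: the function names `filter_and_sum` and `delay_filters`, the integer-sample steering delays, and the uniform 1/M weights are hypothetical choices that reduce the beamformer to the delay-and-sum special case. The OMLSA post-filter and the DNN detector are not shown.

```python
import numpy as np

def filter_and_sum(channels, filters):
    """Filter each channel with its own FIR taps and sum the results.

    channels: array of shape (M, N), one row per microphone.
    filters:  array of shape (M, L), one row of FIR taps per microphone.
    Returns an array of shape (N,).
    """
    channels = np.asarray(channels, dtype=float)
    filters = np.asarray(filters, dtype=float)
    n = channels.shape[1]
    out = np.zeros(n)
    for x, h in zip(channels, filters):
        # Full convolution, truncated to the input length.
        out += np.convolve(x, h)[:n]
    return out

def delay_filters(delays, length):
    """Assumed filter design: pure integer delays with uniform 1/M weights."""
    m = len(delays)
    h = np.zeros((m, length))
    for i, d in enumerate(delays):
        h[i, d] = 1.0 / m
    return h

# Hypothetical eight-microphone example: the same source reaches each
# microphone with a different integer delay (in samples).
rng = np.random.default_rng(0)
source = rng.standard_normal(256)
arrival = [0, 1, 2, 3, 4, 5, 6, 7]
mics = np.stack(
    [np.concatenate([np.zeros(a), source[:256 - a]]) for a in arrival]
)

# Steering delays that time-align all channels to the latest arrival.
steer = [max(arrival) - a for a in arrival]
output = filter_and_sum(mics, delay_filters(steer, length=8))
```

With these steering delays the eight channels add coherently, so `output` reproduces the source delayed by the largest arrival delay. Uncorrelated noise on the individual channels would instead be attenuated by the averaging, which is the motivation for placing a beamformer ahead of the detector.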