Abstract

We analyse pretrained and non-pretrained deep neural models for detecting 10-second bowel sound (BS) audio segments in continuous audio data streams. The models include MobileNet, EfficientNet, and Distilled Transformer architectures. For transfer learning, models were first trained on AudioSet and then transferred to and evaluated on 84 hours of labelled audio data from eighteen healthy participants. The evaluation data was recorded in a semi-naturalistic daytime setting, including movement and background noise, using a smart shirt with embedded microphones. The collected dataset was annotated for individual BS events by two independent raters with substantial agreement (Cohen's kappa κ = 0.74). Leave-One-Participant-Out cross-validation for detecting 10-second BS audio segments, i.e. segment-based BS spotting, yielded best F1 scores of 73% with transfer learning and 67% without. The best model for segment-based BS spotting was EfficientNet-B2 with an attention module. Our results show that pretraining can improve the F1 score by up to 26%, in particular by increasing robustness against background noise. Our segment-based BS spotting approach reduces the amount of audio data to be reviewed by experts from 84 h to 11 h, i.e. by ∼87%.
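The evaluation protocol and the reported review-time reduction can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `lopo_splits`, `f1_score`, and `review_reduction` are hypothetical, segment labels are assumed to be binary (1 = segment contains BS), and the participant labels in the usage lines are made up for illustration.

```python
def lopo_splits(participant_ids):
    """Yield (held_out, train_idx, test_idx) once per unique participant.

    participant_ids[i] is the participant that segment i belongs to.
    All segments of the held-out participant form the test fold.
    """
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx


def f1_score(y_true, y_pred):
    """Segment-based F1 for binary labels (1 = BS segment, 0 = no BS)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def review_reduction(total_hours, flagged_hours):
    """Fraction of audio that experts no longer need to review."""
    return 1.0 - flagged_hours / total_hours


# Illustrative usage with made-up participant labels (not study data).
folds = list(lopo_splits(["p1", "p1", "p2", "p3"]))
reduction = review_reduction(84, 11)  # hours taken from the abstract
```

With the figures from the abstract, `review_reduction(84, 11)` evaluates to roughly 0.87, which matches the reported reduction of about 87%.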
