Abstract

This work explores the effectiveness of transfer learning for detecting physical load intensity from speech and residual signals. Physical workout affects the breathing pattern, which in turn perturbs the speech signal. To analyze the source, the residual signal is obtained by inverse filtering the speech using linear prediction. The pre-trained OpenL3 model is used to extract acoustic embeddings from both signals. Binary classification with shallow, fully connected networks yields F1-scores of 83.20% and 79.42% for the speech and residual embeddings, respectively, both of which are better than the baseline performances. Deep features for the speech and residual signals are then extracted from the fully connected networks. The best binary performance, an F1-score of 84.30%, is obtained by a feature-level combination of the speech-based and residual-based deep features. The results imply that the speech and residual embeddings obtained by transfer learning are able to detect the subtle changes in speech under physical workout scenarios.
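
The residual computation mentioned above can be illustrated with a minimal sketch. It assumes the autocorrelation method of linear prediction applied to a single frame; the LP order of 12, the absence of windowing and frame overlap, and the function names `lp_coefficients` and `lp_residual` are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Estimate LP coefficients a_1..a_p with the autocorrelation method.

    The predictor is s_hat[n] = sum_k a_k * s[n - k].
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation values for lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    if r[0] == 0.0:
        # Silent frame: no meaningful predictor, return zeros.
        return np.zeros(order)
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small diagonal loading for numerical stability (an assumption of this sketch).
    R += 1e-9 * r[0] * np.eye(order)
    return np.linalg.solve(R, r[1:])


def lp_residual(frame, order=12):
    """Inverse-filter a frame with A(z) = 1 - sum_k a_k z^-k to get the residual."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    inverse_filter = np.concatenate(([1.0], -a))
    # Truncate the convolution so the residual has the same length as the frame.
    return np.convolve(frame, inverse_filter)[: len(frame)]
```

In practice the speech would typically be split into short frames before calling `lp_residual`, and the resulting residual would then be passed to the embedding model in the same way as the speech signal; those steps are omitted here.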
