Regular monitoring of respiratory quality of life (RQoL) is essential in respiratory healthcare, facilitating prompt diagnosis and tailored treatment for chronic respiratory diseases. Voice alterations resulting from respiratory conditions create unique audio signatures that can potentially be utilized for disease screening or monitoring. Analyzing data from 1908 participants from the Colive Voice study, which collects standardized voice recordings alongside comprehensive demographic, epidemiological, and patient-reported outcome data, we evaluated various strategies to estimate RQoL from voice, including handcrafted acoustic features, standard acoustic feature sets, and advanced deep audio embeddings derived from pretrained convolutional neural networks. We compared models using clinical features alone, voice features alone, and a combination of both. The multimodal model combining clinical and voice features demonstrated the best performance, achieving an accuracy of 70.8% and an area under the receiver operating characteristic curve (AUROC) of 0.77; an improvement of over 5% in terms of accuracy and 7% in terms of AUROC compared to model utilizing voice features alone. Incorporating vocal biomarkers significantly enhanced the predictive capacity of clinical variables across all acoustic feature types, with a net classification improvement (NRI) of up to 0.19. Our digital voice-based biomarker is capable of accurately predicting RQoL, either as an alternative to or in conjunction with clinical measures, and could be used to facilitate rapid screening and remote monitoring of respiratory health status.
Read full abstract