Automatic speech recognition (ASR) and natural language processing (NLP) play key roles in advancing human–technology interaction, particularly in healthcare communication. This study aims to enhance French-language online mental health platforms by adapting the QuartzNet 15 × 5 ASR model, selected for its robust performance across a variety of French accents, as demonstrated on the Mozilla Common Voice dataset. The adaptation involved tailoring the ASR model to accommodate various French dialects and idiomatic expressions and integrating it with an NLP system to refine user interactions. The adapted QuartzNet 15 × 5 model achieved a baseline word error rate (WER) of 14%, and the accompanying NLP system achieved weighted-average precision, recall, and F1-score of 64.24%, 63.64%, and 62.75%, respectively. Notably, critical functionalities such as ‘Prendre Rdv’ (schedule appointment) achieved precision, recall, and F1-scores above 90%. These results substantially improve the handling of user interactions on French-language digital therapy platforms and indicate that continued adaptation and enhancement of these technologies benefit digital mental health interventions, with a focus on linguistic accuracy and user satisfaction.
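For readers unfamiliar with the WER figure quoted above, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR output, divided by the number of reference words. The abstract does not describe the evaluation tooling, so the function name `word_error_rate`, the whitespace tokenization, and the French sample sentences are illustrative assumptions and not the study's actual pipeline or data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn ref[:i] into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: two reference words ("un", "matin") are dropped.
reference = "je voudrais prendre un rendez-vous demain matin"
hypothesis = "je voudrais prendre rendez-vous demain"
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")
```

Under this definition, a WER of 14% corresponds to roughly 14 word-level errors per 100 reference words. Note that WER can exceed 100% when the hypothesis contains many insertions.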