Abstract
To improve the accuracy of classifying doctors’ tone of voice in online medical service scenarios, we propose a model that integrates a one-dimensional convolutional neural network (1DCNN) with a bidirectional long short-term memory network (BiLSTM). First, the significant tone types in online medical service scenarios were identified through a questionnaire survey. Second, 68 time-domain and frequency-domain features of doctors’ tone were extracted using Librosa and used as the initial input to the model. The 1DCNN branch extracts local features in the time and frequency domains, the BiLSTM branch captures the global sequential features of the audio, and the outputs of the two branches are combined by feature-level fusion to improve tone classification. In experiments on online medical service scenarios, the model achieved an average recognition rate of 84.4% and an F1 score of 84.4%, significantly outperforming the other models compared and helping to improve the efficiency of doctor-patient communication. A series of ablation experiments was also conducted to validate the effectiveness of the 1DCNN and BiLSTM modules and of the parameter settings.
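The two-branch design with feature-level fusion can be illustrated with a minimal NumPy sketch of the forward pass. This is not the authors' implementation. The abstract only fixes the 68 input features, so everything else here is an assumption made for illustration: the input is treated as a sequence of frames with 68 features each, and the filter count, kernel size, hidden size, number of tone classes (4), pooling choice, and random untrained weights are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv1d_branch(x, kernels, bias):
    """Local features: 1D convolution over time, ReLU, global max pooling.
    x: (T, F) frames x features; kernels: (C, K, F); returns (C,)."""
    T, _ = x.shape
    C, K, _ = kernels.shape
    out = np.empty((T - K + 1, C))
    for t in range(T - K + 1):
        window = x[t:t + K]  # (K, F) slice of consecutive frames
        out[t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0).max(axis=0)

def lstm_last_state(x, W, U, b):
    """Run one LSTM direction over x and return the final hidden state (H,).
    W: (4H, F), U: (4H, H), b: (4H,), gates stacked as input, forget, cell, output."""
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for x_t in x:
        z = W @ x_t + U @ h + b
        i, f, g, o = z[:H], z[H:2 * H], z[2 * H:3 * H], z[3 * H:]
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def bilstm_branch(x, fwd, bwd):
    """Global sequential features: forward pass plus a pass over the reversed sequence."""
    return np.concatenate([lstm_last_state(x, *fwd), lstm_last_state(x[::-1], *bwd)])

def init_params(n_feat=68, n_filters=16, kernel=3, hidden=32, n_classes=4):
    # All sizes except n_feat=68 are hypothetical; weights are random, not trained.
    def lstm_weights():
        return (rng.normal(0, 0.1, (4 * hidden, n_feat)),
                rng.normal(0, 0.1, (4 * hidden, hidden)),
                np.zeros(4 * hidden))
    return {
        "cnn": (rng.normal(0, 0.1, (n_filters, kernel, n_feat)), np.zeros(n_filters)),
        "fwd": lstm_weights(),
        "bwd": lstm_weights(),
        "W_out": rng.normal(0, 0.1, (n_classes, n_filters + 2 * hidden)),
        "b_out": np.zeros(n_classes),
    }

def classify(x, params):
    local = conv1d_branch(x, *params["cnn"])
    sequential = bilstm_branch(x, params["fwd"], params["bwd"])
    fused = np.concatenate([local, sequential])  # feature-level fusion
    logits = params["W_out"] @ fused + params["b_out"]
    e = np.exp(logits - logits.max())            # numerically stable softmax
    return e / e.sum()

params = init_params()
x = rng.normal(size=(50, 68))  # stand-in for 50 frames of 68 extracted features
probs = classify(x, params)
print(probs.shape, probs.sum())
```

Concatenating the two branch outputs before a single classification layer is one common way to realize feature-level fusion; a trained model would learn these weights with a deep learning framework rather than use random values.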