Abstract
Ambient clinical documentation technology uses automatic speech recognition (ASR) and natural language processing (NLP) to turn patient-clinician conversations into clinical documentation. It is a promising approach to reducing clinician burden and improving documentation quality. However, the performance of current-generation ASR remains inadequately validated. In this study, we investigated the impact of non-lexical conversational sounds (NLCS) on ASR performance. NLCS, such as "Mm-hm" and "Uh-uh," are commonly used to convey important information in clinical conversations; for example, a patient may answer "Mm-hm" (meaning "yes") to the clinician's question "Are you allergic to antibiotics?" We evaluated 2 contemporary ASR engines, Google Speech-to-Text Clinical Conversation ("Google ASR") and Amazon Transcribe Medical ("Amazon ASR"), both of which have language models specifically tailored to clinical conversations. The empirical data came from 36 primary care encounters. We conducted a series of quantitative and qualitative analyses to examine the word error rate (WER) and the potential impact of misrecognized NLCS on the quality of clinical documentation. Of the 135,647 spoken words in the evaluation data, 3,284 (2.4%) were NLCS. Among these NLCS, 76 (0.06% of total words, 2.3% of all NLCS) were used to convey clinically relevant information. The overall WER across all spoken words was 11.8% for Google ASR and 12.8% for Amazon ASR. However, both ASR engines performed poorly in recognizing NLCS: the WERs across frequently used NLCS were 40.8% (Google) and 57.2% (Amazon), and among the NLCS that conveyed clinically relevant information, 94.7% and 98.7%, respectively. Current ASR solutions are not capable of properly recognizing NLCS, particularly those that convey clinically relevant information. Although the volume of NLCS in our evaluation data was very small (2.4% of all spoken words, and 0.06% for NLCS that conveyed clinically relevant information), incorrect recognition of them could result in inaccuracies in clinical documentation and introduce new patient safety risks.
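To make the WER figures above concrete, the following is a minimal sketch of the standard word error rate definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. It is an illustration only. The abstract does not describe the scoring procedure used in the study, so the function name `word_error_rate`, the whitespace tokenization, the lowercasing, and the example utterance are all assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: tokens are lowercased and split on whitespace. A real
    evaluation would likely apply more careful text normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical exchange built from the abstract's example: the patient's
# "Mm-hm" is dropped entirely by the (imagined) ASR output.
reference = "are you allergic to antibiotics mm-hm"
hypothesis = "are you allergic to antibiotics"

print(f"{word_error_rate(reference, hypothesis):.1%}")  # 16.7%
print(f"{word_error_rate('mm-hm', ''):.1%}")            # 100.0%
```

In this toy case a single dropped NLCS costs only one word in six overall, yet the error rate on the NLCS itself is 100%. This is the same pattern the abstract reports at scale: a modest overall WER can coexist with near-total failure on the few tokens that carry the patient's answer.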
More From: Journal of the American Medical Informatics Association