Abstract

Background: Big data has the potential to revolutionize echocardiography by enabling novel research and rigorous, scalable quality improvement. Text reports are a key part of such analyses. Currently, echocardiogram reports include both structured and free text and vary across institutions, hampering attempts to mine the text for useful insights. Natural language processing (NLP) can help, and includes both non-deep-learning and deep-learning (e.g., large language model, or LLM) techniques. Challenges to date in using echo text with LLMs include small dataset size, domain-specific language, and a high need for accuracy and clinical meaning in model results.
Hypothesis: We tested whether we could map echocardiography text to a structured ontology using NLP.
Methods: We developed a three-tier ontology covering the anatomic structures, functional elements, and descriptive characteristics in an adult transthoracic echocardiogram, using 919 sentences from UCSF’s structured echocardiogram report text. We tested LLM fine-tuning as well as non-LLM techniques for mapping echocardiography sentences to this ontology. Two hundred twenty-eight UCSF sentences served as an internal test set. Additional test datasets included free text from UCSF reports, structured text sentences from two other hospitals, and sentences from reports representing 17 additional hospitals.
Results: Although all institutions adhered to clinical guidelines for reporting, they differed notably in which structural and functional information was included in structured reporting. A non-LLM hierarchical model performed best in mapping sentences to the ontology, with internal test accuracy of 96% for the first level of the ontology, 91% for the second level, and 77% for the third level. This model, Echomap, retained good performance across diverse datasets and was able to extrapolate to ontological terms not initially included in training.
Conclusions: We show that non-LLM NLP methods can achieve good performance and may be especially useful for small, specialized text datasets where clinical meaning is important. These results highlight the utility of a high-resolution, standardized cardiac ontology to harmonize reports across institutions.
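To make the per-level accuracy figures above concrete, the following is a minimal sketch of how accuracy could be scored separately at each tier of a three-tier ontology. It is an illustration only, not the authors' evaluation code: the function name `per_level_accuracy`, the example labels, and the choice to score each level independently of the others (rather than requiring the parent levels to be correct as well) are all assumptions.

```python
from typing import List, Tuple

# A label is a (structure, function, descriptor) path through a
# three-tier ontology. The tier names follow the abstract; the values
# used below are hypothetical.
Label = Tuple[str, str, str]


def per_level_accuracy(gold: List[Label], pred: List[Label]) -> List[float]:
    """Return the fraction of exact matches at each ontology level.

    Assumption: each level is scored on its own, regardless of whether
    the levels above it were predicted correctly.
    """
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    n_levels = len(gold[0])
    return [
        sum(g[level] == p[level] for g, p in zip(gold, pred)) / len(gold)
        for level in range(n_levels)
    ]


# Hypothetical gold and predicted labels, invented purely for illustration.
gold = [
    ("left ventricle", "systolic function", "normal"),
    ("mitral valve", "regurgitation", "mild"),
    ("aortic valve", "stenosis", "severe"),
    ("left atrium", "size", "dilated"),
]
pred = [
    ("left ventricle", "systolic function", "normal"),
    ("mitral valve", "regurgitation", "moderate"),
    ("aortic valve", "regurgitation", "severe"),
    ("left atrium", "size", "dilated"),
]

accuracies = per_level_accuracy(gold, pred)
for level, acc in enumerate(accuracies, start=1):
    print(f"level {level}: {acc:.0%}")
```

On this toy data the first level is matched for all four sentences, while the second and third levels each miss one. Deeper levels have more possible labels and so more ways to be wrong, which may be one reason accuracy tends to fall at finer levels of an ontology.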
