Zero-shot Learning Research Articles

Background: The precise assessment of the aortic valve via echocardiography is critical for early detection and management of aortic valve diseases. Previous studies have examined machine learning models that estimate individual measurements and the severity of aortic stenosis (AS) from echocardiographic images. These image processing algorithms, while precise in their narrow focus, fall short of mirroring the holistic and interconnected clinical judgment that human echocardiographers apply when producing a qualitative report. Large language models (LLMs), particularly image-to-text multimodal LLMs, are a fundamental advance in deep learning with implications for a host of applications in medical imaging. They promise to capture not just the discrete data points typical of traditional machine learning, but also the complex contextual interrelations in clinical diagnosis.

Methods: In this study, a large-scale heterogeneous database of echocardiographic images containing over 90,681 studies with textual descriptors of the aortic valve was used to train a single image-to-text multimodal LLM called ValveVision AI. The ground truth textual summaries were drafted by level III echocardiographers in a clinical setting between 2015 and 2020. BLEU and ROUGE scores were calculated. The model was retrospectively assessed on a holdout dataset. Reviewing physicians compared each generated summary to the ground truth and either accepted or rejected it. Receiver Operating Characteristics (ROC) for distinct pathologies were also assessed (Figure I).

Results: ValveVision AI achieved a BLEU score of 0.45 and a ROUGE score of 0.49. In classifying moderate/severe versus none/mild AS under the validation protocol described above, the model achieved a specificity of 91.98% and a sensitivity of 83.89%, along with more precise qualitative description. Qualitatively, the model exhibited zero-shot learning capability in certain instances; however, this result remains an area of exploration.

Conclusion: To our knowledge, this study represents the first attempt at an image-representation to text-tokenizer deep learning model architecture to mimic the thought and subtlety of echocardiographic qualitative analysis of the aortic valve. The results suggest that this multimodal LLM has sufficient accuracy to create a preliminary textual summary of the aortic valve that, if paired with a point-of-care ultrasound (POCUS) device in a primary care setting, may facilitate case triage, increase efficiency, and determine a more precise care pathway for patients.
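The sensitivity and specificity reported above follow from the counts of a binary confusion matrix. As a minimal sketch, assuming reference labels and model outputs are coded as 1 (moderate/severe AS) and 0 (none/mild AS), the two metrics could be computed as follows. The function name and the toy labels are hypothetical illustrations and do not come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels coded as 0/1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return sensitivity, specificity


# Toy data (hypothetical): 1 = moderate/severe AS, 0 = none/mild AS.
reference = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(reference, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity here measures how many moderate/severe cases are flagged, while specificity measures how many none/mild cases are correctly left unflagged; a triage setting would typically weigh the two differently.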

Objective: Recent advances in large language models (LLMs) offer opportunities to automate health coaching. With their zero-shot learning ability, LLMs could revolutionize health coaching by providing better accessibility, scalability, and customization. The aim of this study is to compare the quality of responses to clients' sleep-related questions provided by health coaches and by an LLM.

Design, setting, and participants: From a de-identified dataset of coaching conversations from a pilot randomized controlled trial, we extracted 100 question-answer pairs comprising client questions and the corresponding health coach responses. These questions were entered into a retrieval-augmented generation (RAG)-enabled open-source LLM (LLaMa-2-7b-chat) to generate LLM responses. Of the 100 question-answer pairs, 90 were selected and assigned to three groups of evaluators: experts, lay users, and GPT-4. Each group conducted two evaluation tasks: (Task 1) a single-response quality assessment spanning five criteria (accuracy, readability, helpfulness, empathy, and likelihood of harm) rated on a five-point Likert scale, and (Task 2) a pairwise comparison to choose the superior response in each pair. A suite of inferential statistical methods, including paired and independent-sample t-tests, Pearson correlation, and chi-square tests, was used to address the study objective. Recognizing potential biases in human judgment, the remaining 10 question-answer pairs were used to assess inter-evaluator reliability among the human evaluators, quantified using the intraclass correlation coefficient and percentage agreement.

Results: After exclusion of incomplete data, the analysis included 178 single-response evaluations (Task 1) and 83 pairwise comparisons (Task 2). Expert and GPT-4 assessments revealed no discernible differences between health coach and LLM responses across the five metrics. In contrast, lay users deemed LLM responses significantly more helpful than those of human coaches (p < 0.05). LLM responses were preferred in the majority (62.25%, n = 155) of the aggregate 249 assessments, with all three evaluator groups favoring LLM over health coach responses. While GPT-4 rated both health coach and LLM responses significantly higher than experts did in terms of readability, helpfulness, and empathy, its ratings on accuracy and likelihood of harm aligned with those of experts. Response length correlated positively with accuracy and empathy scores but negatively with readability across all evaluator groups. Expert and lay-user evaluators demonstrated moderate to high inter-evaluator reliability.

Conclusion: Our study showed encouraging findings, demonstrating that a RAG-enabled LLM performs comparably to health coaches in the domain tested. As an initial step towards more sophisticated, adaptive, round-the-clock automated health coaching systems, our findings call for more extensive evaluation, which could support further model development and, in the future, potential clinical implementation.
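Two of the simpler quantities in this abstract, the preference share and the percentage agreement between evaluators, can be reproduced with basic arithmetic. The sketch below assumes that ratings are stored as equal-length lists of Likert scores; the function names and the toy ratings are hypothetical, and only the counts 155 and 249 are taken from the abstract.

```python
from typing import Sequence


def preference_share(preferred: int, total: int) -> float:
    """Percentage of assessments in which one response type was preferred."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * preferred / total


def percent_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Percentage of items on which two raters gave exactly the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * matches / len(rater_a)


# Counts reported in the abstract: LLM responses preferred in 155 of 249 assessments.
print(f"{preference_share(155, 249):.2f}%")

# Toy five-point Likert ratings from two hypothetical evaluators.
expert_1 = [5, 4, 3, 2]
expert_2 = [5, 4, 2, 2]
print(f"{percent_agreement(expert_1, expert_2):.1f}%")
```

Exact-match agreement is the strictest variant; it ignores near-misses on an ordinal scale, which is one reason studies of this kind usually report it alongside a correlation-based reliability coefficient.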

Articles published on Zero-shot Learning

Path Planning for Robots Combined with Zero-Shot and Hierarchical Reinforcement Learning in Novel Environments

Knowledge Guided Transformer Network for Compositional Zero-Shot Learning

Abstract Su801: Use of Large Language Models to Optimize Clinical Text Analysis for In-Hospital Cardiac Arrest Identification

Advanced Imaging Integration: Multi-Modal Raman Light Sheet Microscopy Combined with Zero-Shot Learning for Denoising and Super-Resolution.

A Cross-Lingual Media Profiling Model for Detecting Factuality and Political Bias

ValveVision AI - a multimodal language model for qualitative reporting of the aortic valve in echocardiography

Can a zero-shot learning Large Language Model code complex interview data?

A comprehensive review of deep learning for medical image segmentation

Advancing health coaching: A comparative study of large language model and health coaches

Instance-wise multi-view visual fusion for zero-shot learning

PMGNet: Disentanglement and entanglement benefit mutually for compositional zero-shot learning

Zero-shot learning with visual-semantic mutual reinforcement for image recognition

Comparative Study of GPT-4Vision and Convolutional Neural Networks in Histopathological Image Analysis

A Suite of Foundation Models Captures the Contextual Interplay Between Codons.

Promoting Machine Abilities of Discovering and Utilizing Knowledge in a Unified Zero-shot Learning Paradigm

Zero-Shot Learning for Accurate Project Duration Prediction in Crowdsourcing Software Development

Open-Pose 3D zero-shot learning: Benchmark and challenges

Digital Fingerprinting of Complex Liquids Using a Reconfigurable Multi-Sensor System with Foundation Models.

ChatGPT-4 extraction of heart failure symptoms and signs from electronic health records

Adaptive indefinite kernels in hyperbolic spaces
