Abstract

Interest in Large Language Models (LLMs) has grown rapidly in recent years, fueling discussion of their potential to improve the efficiency and effectiveness of clinical, educational, and research work in medicine. In this study, we aimed to test the performance of the 7B-parameter model from MEDITRON, a recently released suite of open-source LLMs adapted to the medical domain, on a small sample of human nutrition queries. To facilitate our assessment, we curated a specialized dataset comprising a diverse range of human nutrition-related queries, manually collected from various medical databases and nutrition exams. The dataset was adapted to the scope of our analysis to encompass variations in language complexity, query types, and semantic nuances commonly encountered in real-world settings. Additionally, to ensure a standardized and clinically relevant context for our evaluations, we engineered a specialized prompt designed to mimic interactions with a highly esteemed physician specializing in nutrition, food science, and diet-related disorders. The prompt guided the generation of responses to nutrition-related queries in a structured format, enabling MEDITRON to answer consistently with the latest advancements in medical research. Our preliminary findings revealed promising capabilities of MEDITRON in understanding medical language and providing contextually appropriate responses to several human nutrition-related questions. Through this study, we add valuable insights to the ongoing discussion around the deployment of LLMs in public health, highlighting their potential to improve access to essential human nutrition literacy.

Key messages
• LLM-based tools hold the potential to improve healthcare in clinical, research, and educational applications.
• Further evidence is needed to support their integration into real-world settings.
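
For readers who wish to reproduce a comparable setup, the sketch below shows one way such an evaluation could be run. It is a minimal illustration only, assuming the publicly released epfl-llm/meditron-7b checkpoint on Hugging Face and the standard transformers generation API; the physician-persona system prompt and the example question are hypothetical stand-ins, as the abstract does not reproduce the actual prompt wording or the dataset items.

# Illustrative sketch (not the authors' code): querying a MEDITRON
# checkpoint with a physician-persona prompt, using the Hugging Face
# transformers library and the public epfl-llm/meditron-7b weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "epfl-llm/meditron-7b"  # public MEDITRON 7B checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

# Hypothetical persona prompt; the study's actual wording is not given
# in the abstract.
system_prompt = (
    "You are a highly esteemed physician specializing in nutrition, "
    "food science, and diet-related disorders. Answer concisely and "
    "consistently with current medical research."
)
question = "What is the recommended daily intake of dietary fiber for adults?"
prompt = f"{system_prompt}\n\nQuestion: {question}\nAnswer:"

# Tokenize, generate, and decode only the newly generated tokens.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
answer = tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)

Greedy decoding (do_sample=False) is shown here purely for reproducibility of the sketch; the abstract does not specify the decoding settings actually used in the study.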