This study introduces a novel AI-driven approach to support elderly patients in Thailand with medication management, focusing on accurate drug label interpretation. Two model architectures were explored: a Two-Stage Optical Character Recognition (OCR) and Large Language Model (LLM) pipeline combining EasyOCR with Qwen2-72b-instruct and a Uni-Stage Visual Question Answering (VQA) model using Qwen2-72b-VL. Both models operated in a zero-shot capacity, utilizing Retrieval-Augmented Generation (RAG) with DrugBank references to ensure contextual relevance and accuracy. Performance was evaluated on a dataset of 100 diverse prescription labels from Thai healthcare facilities, using RAG Assessment (RAGAs) metrics to assess Context Recall, Factual Correctness, Faithfulness, and Semantic Similarity. The Two-Stage model achieved high accuracy (94%) and strong RAGAs scores, particularly in Context Recall (0.88) and Semantic Similarity (0.91), making it well-suited for complex medication instructions. In contrast, the Uni-Stage model delivered faster response times, making it practical for high-volume environments such as pharmacies. This study demonstrates the potential of zero-shot AI models in addressing medication management challenges for the elderly by providing clear, accurate, and contextually relevant label interpretations. The findings underscore the adaptability of AI in healthcare, balancing accuracy and efficiency to meet various real-world needs.
Read full abstract