Abstract

Objective: Timely access to essential patient information is critical for informed, data-driven decision-making in clinical nursing and for improving the efficiency and quality of care. Despite advances in electronic health records, many medical reports remain in paper form, which makes their data difficult to analyze and reuse. This study aimed to develop ChatSchema, a pipeline based on Large Multimodal Models (LMMs) for extracting structured information from paper-based medical reports, and to evaluate its effectiveness in a pilot setting.

Method: ChatSchema is a two-stage approach for extracting and structuring data from paper-based medical reports. The classification stage applies Optical Character Recognition (OCR) to photographs of the reports, followed by pre-correction, desensitization, and prompt engineering to categorize each report by type. The extraction stage transforms the OCR-converted text into a structured format using a predefined schema, with LMMs standardizing fields and converting data types. A dataset of 100 annotated medical reports was collected from Peking University First Hospital, and ChatSchema was evaluated in terms of precision, recall, F1-score, and overall accuracy. To validate its effectiveness, we compared ChatSchema against a baseline method that provided only the schema and task instructions. As a sensitivity analysis, ChatSchema was built on two underlying LMMs, GPT-4o and Gemini 1.5 Pro, and the results were compared.

Results: A ground-truth dataset comprising 2,945 test item-result pairs was extracted from the 100 annotated medical reports. ChatSchema correctly extracted and structured data from paper-based medical reports, with high accuracy across data types, unit standardization, field mapping, and both underlying LMMs. Overall, ChatSchema achieved an F1-score of 95.8%, an accuracy of 97.2%, a precision of 95.8%, and a recall of 95.8%. Using the GPT-4o model, ChatSchema surpassed the baseline by 12.9% in accuracy and by 12.3% in F1-score.

Conclusion: Our findings demonstrate that ChatSchema is highly effective for extracting and structuring data from paper-based medical reports. By providing accurate, structured information extraction, ChatSchema has the potential to enhance patient care and support clinical decision-making, and it shows significant promise for broader application in nursing and healthcare settings.
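
To make the reported metrics concrete, the sketch below shows one way precision, recall, and F1-score could be computed over test item-result pairs. It is an illustration only: the abstract does not state the matching rule, so exact string equality per test item is an assumption, and the function name `score_pairs` and the sample values are hypothetical rather than taken from the study. Accuracy is omitted because its definition is not given here.

```python
from typing import Dict, Tuple

# Hypothetical representation: test item name -> result value as normalized text.
Pairs = Dict[str, str]


def score_pairs(predicted: Pairs, truth: Pairs) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for extracted item-result pairs.

    Assumption: a predicted pair is correct only if the same test item
    exists in the ground truth with an identical result value.
    """
    true_pos = sum(
        1 for item, value in predicted.items() if truth.get(item) == value
    )
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example values, not data from the study.
truth = {"WBC": "6.1", "RBC": "4.5", "HGB": "138", "PLT": "250"}
predicted = {"WBC": "6.1", "RBC": "4.5", "HGB": "183", "GLU": "5.2"}

precision, recall, f1 = score_pairs(predicted, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case, two of the four predicted pairs match the ground truth, and two of the four ground-truth pairs are recovered. A wrong value (HGB) lowers both precision and recall, while a spurious item (GLU) lowers precision and a missed item (PLT) lowers recall.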