This research aimed to evaluate the effectiveness of ChatGPT in accurately diagnosing hepatobiliary tumors using histopathologic images. The study compared the diagnostic accuracies of the GPT-4 model, providing the same set of images and 2 different input prompts. The first prompt, the morphologic approach, was designed to mimic pathologists' approach to analyzing tissue morphology. In contrast, the second prompt functioned without incorporating this morphologic analysis feature. Diagnostic accuracy and consistency were analyzed. A total of 120 photomicrographs, composed of 60 images of each hepatobiliary tumor and nonneoplastic liver tissue, were used. The findings revealed that the morphologic approach significantly enhanced the diagnostic accuracy and consistency of the artificial intelligence (AI). This version was particularly more accurate in identifying hepatocellular carcinoma (mean accuracy: 62.0% vs 27.3%), bile duct adenoma (10.7% vs 3.3%), and cholangiocarcinoma (68.7% vs 16.0%), as well as in distinguishing nonneoplastic liver tissues (77.3% vs 37.5%) (Ps ≤ .01). It also demonstrated higher diagnostic consistency than the other model without a morphologic analysis (κ: 0.46 vs 0.27). This research emphasizes the importance of incorporating pathologists' diagnostic approaches into AI to enhance accuracy and consistency in medical diagnostics. It mainly showcases the AI's histopathologic promise when replicating expert diagnostic processes.
Read full abstract