The Montreal classification has been widely used in Crohn's disease since 2005 to categorize patients by the age of onset (A), disease location (L), behavior (B), and upper gastrointestinal tract and perianal involvement. With evolving management paradigms in Crohn's disease, we aimed to assess the performance of gastroenterologists in applying the Montreal classification. An online survey was conducted among participants at an international educational conference on inflammatory bowel diseases. Participants classified 20 theoretical Crohn's disease cases using the Montreal classification. Agreement rates with the inflammatory bowel diseases board (three expert gastroenterologists whose consensus rating was considered the gold standard) were calculated separately for gastroenterology specialists and for fellows/young specialists (≤2 years of clinical experience). A majority vote of <75% among participants was considered a notable disagreement. The same cases were classified using three large language models (LLMs), ChatGPT-4, Claude-3, and Gemini-1.5, and assessed for agreement with the board and with the gastroenterologists. Fleiss' kappa (κ) was used to assess within-group agreement. Thirty-eight participants from five countries completed the survey. For the Montreal classification as a whole, specialists (21/38 [55%] of participants) had a higher rate of agreement with the board than fellows/young specialists (17/38 [45%] of participants) (58% vs. 49%, p=0.012) and than LLMs (58% vs. 18%, p<0.001). Disease behavior classification was the most challenging, with 76% agreement with the inflammatory bowel diseases board among specialists and fellows/young specialists and 48% among LLMs. For disease behavior, within-group agreement was moderate (specialists: κ=0.522, fellows/young specialists: κ=0.532, LLMs: κ=0.577; p<0.001 for all). Notable points of disagreement included defining disease behavior in the presence of obstructive symptoms, assessing disease extent via video capsule endoscopy, and evaluating treatment-related reversibility of the disease phenotype. There is significant inter-rater disagreement in applying the Montreal classification in Crohn's disease, particularly for disease behavior. Improved education or revisions to the phenotype criteria may be needed to enhance consensus on the Montreal classification.
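For readers unfamiliar with the within-group agreement statistic, the following is a minimal sketch of how Fleiss' kappa can be computed from categorical ratings. It is not the authors' analysis code: the function name `fleiss_kappa`, the input layout (one list of labels per case, one label per rater), and the toy behavior labels are assumptions made for illustration only, and the toy data are not study data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of one label per rater.

    Assumes every case was rated by the same number of raters (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    category_totals = Counter()
    per_case_agreement = []
    for case in ratings:
        counts = Counter(case)
        category_totals.update(counts)
        # Share of rater pairs that agree on this case.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        per_case_agreement.append(agreeing_pairs / (n_raters * (n_raters - 1)))

    # Mean observed agreement across cases.
    observed = sum(per_case_agreement) / n_cases

    # Agreement expected by chance from the overall category proportions.
    total_ratings = n_cases * n_raters
    expected = sum((c / total_ratings) ** 2 for c in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")

    return (observed - expected) / (1 - expected)


# Hypothetical toy data: four cases, three raters, Montreal behavior labels.
toy_behavior_ratings = [
    ["B1", "B1", "B1"],
    ["B2", "B2", "B1"],
    ["B3", "B3", "B3"],
    ["B2", "B1", "B2"],
]
kappa = fleiss_kappa(toy_behavior_ratings)
print(f"Fleiss' kappa on toy data: {kappa:.3f}")
```

Values near 1 indicate agreement well beyond chance, and values near 0 indicate agreement no better than chance. The sketch yields only the point estimate; the p-values reported above would require an additional significance test.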