Abstract

Background: Digital decision support tools such as Watson offer a promising opportunity to recommend the best cancer treatment to doctors around the globe. Although studies of concordance between Watson's recommendations and routine daily practice or expert opinion have been reported, scientific papers demonstrating how Watson affects medical decision-making are scarce. The first step of external validation and localization is content evaluation against national guidelines. Therefore, we present a feasibility study of the content evaluation of Watson for Oncology.

Methods: We developed synthetic patient cases that exhaustively test the adjuvant treatment of stage I to stage III colon cancer based on relevant patient, clinical and tumor characteristics, and entered them into Watson. We used cross tabulations and a novel scoring system to compare Dutch and NCCN guideline recommendations with Watson's treatment advice. Our scoring system ranges from +12 to -12 and distinguishes minor disagreement (a possible treatment was not mentioned) from serious error (a contraindicated treatment was recommended). Treatment options that were contraindicated according to the national guidelines were labeled with a "red flag" if Watson recommended them and an "orange flag" if Watson considered them.

Results: In total, we developed 190 synthetic patient cases (stage I: n = 8; stage II: n = 110; and stage III: n = 72). Overall concordance scores per case ranged from a minimum of -4 (n = 6) to a maximum of +12 (n = 17) for Watson versus the Dutch guidelines, and from -4 (n = 9) to +12 (n = 24) for Watson versus the NCCN guidelines. For the comparison of Watson with the Dutch guidelines, 69 cases (36%) were labeled with red flags, 96 cases (51%) with orange flags and 25 cases (13%) with no flags. For the comparison of Watson with the NCCN guidelines, no red or orange flags were identified.

Conclusions: Evaluation of the content of Watson for Oncology against national guidelines is feasible. Overall concordance scores varied considerably between synthetic patient cases. Non-concordance is partially attributable to guideline differences between the United States and The Netherlands. This implies that further adjustments and localization are required before implementation of Watson outside the United States.
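The flagging rule described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `flag_case`, the `Flag` enum, the representation of treatments as plain strings, and the rule that a red flag takes precedence over an orange flag within a single case are all assumptions made for illustration. The abstract does not specify how the +12 to -12 score is computed, so scoring is deliberately left out.

```python
from enum import Enum


class Flag(Enum):
    NONE = "none"
    ORANGE = "orange"
    RED = "red"


def flag_case(recommended, considered, contraindicated):
    """Label one synthetic case (hypothetical helper, not from the study).

    recommended:     treatments the tool recommended for the case
    considered:      treatments the tool listed for consideration
    contraindicated: treatments the national guideline contraindicates
    """
    contraindicated = set(contraindicated)
    # Red flag: a contraindicated treatment was recommended.
    if contraindicated & set(recommended):
        return Flag.RED
    # Orange flag: a contraindicated treatment was only considered.
    # Assumption: red takes precedence when both conditions hold.
    if contraindicated & set(considered):
        return Flag.ORANGE
    return Flag.NONE


# Placeholder treatment names; they carry no clinical meaning.
print(flag_case(["treatment_a"], ["treatment_b"], ["treatment_a"]))
print(flag_case(["treatment_a"], ["treatment_b"], ["treatment_b"]))
print(flag_case(["treatment_a"], ["treatment_b"], []))
```

Applying such a function to every synthetic case and counting the resulting labels would give a per-guideline tally of the kind reported in the Results.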
