Cancer diagnosis comes as a shock to many patients, and many of them feel unprepared to handle the complexity of the life-changing event, understand technicalities of the diagnostic reports, and fully engage with the clinical team regarding the personalized clinical decision-making. We develop Oncointerpreter.ai an interactive resource to offer personalized summarization of clinical cancer genomic and pathological data, and frame questions or address queries about therapeutic opportunities in near-real time via a graphical interface. It is built on the Mistral-7B and Llama-2 7B large language models trained on a local database trained using a large, curated corpus. We showcase its utility with case studies, where Oncointerpreter.ai extracted key clinical and molecular attributes from deidentified pathology and clinical genomics reports, summarized their contextual significance and answered queries on pertinent treatment options. Oncointerpreter also provided personalized summary of currently active clinical trials that match the patients' disease status, their selection criteria, and geographic locations. Benchmarking and comparative assessment indicated that the model responses were generally consistent, and hallucination, ie, factually incorrect or nonsensical response was rare; treatment- and outcome related queries led to context-aware responses, and response time correlated with verbosity. The choice of model and domain-specific training also affected the response quality. Oncointerpreter.ai can aid the existing clinical care with interactive, individualized summarization of diagnostics data to promote informed dialogs with the patients with new cancer diagnoses. https://github.com/Siris2314/Oncointerpreter.
Read full abstract