Abstract

Introduction: To facilitate research on personalized cancer medicine, we developed the DeepPhe system for deep phenotype extraction of computable longitudinal summaries from electronic medical records (EMR). DeepPhe provides an advanced natural language processing (NLP) pipeline that extracts cancer details from clinical notes, stores and exports the resulting details, and includes a visualization tool with interactive graphical capabilities at both the cohort and individual patient levels. Our initial software release included training and evaluation of breast cancer specific attributes (Savova et al. Cancer Research 2017). As part of our iterative process of testing and improving DeepPhe, we have recently expanded to ovarian cancer. Methods: We used DeepPhe to extract ovarian cancer characteristics from EMR, including tumor location(s), laterality, temporality, and clinical and pathologic components of stage of disease at diagnosis. Tool performance was quantified by F1 scores, defined as the harmonic mean of precision and recall, with F1=1.00 indicating perfect performance, measured against the gold standard of information from manual data abstraction. Results: In a test set of 26 primary epithelial ovarian cancer cases with 1,675 annotated notes from UPMC, the overall F1 was 0.9327. F1 scores for the attributes of tumor location, laterality, and temporality were 0.8621, 0.5000, and 1.000, respectively. Performance was perfect for all clinical stage of disease components (T, N, and M), while the corresponding pathologic components scored 0.7879, 0.7692, and 0.9200, respectively. Evaluation of 16 additional UPMC cases is currently underway, and ovarian cancer specific attributes are being expanded to include degree of surgical cytoreduction (debulking) and presence of residual disease. Generalizability will be further assessed by evaluating DeepPhe’s performance on EMR for 427 ovarian cancer cases from the Vanderbilt University Medical Center.
Conclusions: Our results demonstrate that DeepPhe can be adapted to additional cancer types while retaining good-to-excellent performance characteristics. While some concepts, such as laterality, remain challenging to extract, current results are considered adequate for semi-automated abstraction approaches. Further refinement and expansion to additional clinical and tumor characteristics, including genomics, tumor biomarkers, treatments, and patient outcomes, are currently underway. Open-source and freely available for research use, DeepPhe tools (https://github.com/DeepPhe/DeepPhe-Release) can facilitate large-scale extraction of detailed information from EMR systems for research on cancer.

Citation Format: Alicia Beeghly-Fadiel, Jeremy L. Warner, Sean Finan, James Masanz, Harry Hochheiser, Guergana Savova. Deep phenotype extraction to facilitate cancer research: Extending DeepPhe to ovarian cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 5114.
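Note on the metric: the F1 score used in the Methods is the harmonic mean of precision and recall. The sketch below shows one way to compute it from counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical and chosen for illustration only; they are not taken from the DeepPhe evaluation code or its results, and the choice to return 0.0 when a denominator is zero is an assumption of this sketch.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from raw match counts.

    tp: extracted values that match the manually abstracted gold standard
    fp: extracted values with no matching gold-standard value
    fn: gold-standard values that were not extracted
    Undefined ratios (zero denominator) are reported as 0.0 by convention here.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
    print(f"precision={p:.4f} recall={r:.4f} F1={f1:.4f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, an attribute scores a high F1 only when both precision and recall are high.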
