Abstract

Background: The identification of medical entities and relations from electronic medical records is a fundamental research issue for medical informatics. However, extracting valuable knowledge from these records is challenging due to their high complexity. The accurate identification of entities and relations is still an open research problem in medical information extraction.

Methods: A pattern-based method for extracting certain tumor-related entities and attributes from Chinese unstructured diagnostic imaging text is proposed. The method consists of three steps. First, an algorithm based on keyword matching is designed to obtain the primary sites of tumors. Then a set of regular expressions is applied to identify primary tumor size information. Finally, a set of rules is defined to acquire the metastatic sites of tumors.

Results: Our method achieves a recall of 0.697, a precision of 0.825 and an F1 score of 0.755 using an overall weighted metric. For the three extraction tasks, the F1 scores are 0.784, 0.822 and 0.740.

Conclusions: The method proves to be stable and robust with different amounts of testing data. It achieves a comparatively high performance in the CHIP 2018 open challenge, demonstrating its effectiveness in extracting tumor-related entities from Chinese diagnostic imaging text.

Highlights

  • Biomedical named entity recognition (NER) is a critical task for extracting patient information from medical diagnosis to support medical research and treatment decision making

  • Based on a standard dataset obtained from the CHIP 2018 open challenge, this medical named entity recognition research targets three sub-tasks: (1) identification of primary tumor sites, (2) extraction of primary tumor sizes, and (3) recognition of metastatic tumor sites

  • We propose a pattern-based method to extract primary tumor sites, the sizes of primary tumors and metastatic sites from diagnostic imaging text
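
The three sub-tasks can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword list `SITE_KEYWORDS`, the cue list `METASTASIS_CUES`, the pattern `SIZE_PATTERN`, the function names and the sample report are all hypothetical, chosen only to show how keyword matching, regular expressions and simple rules could fit together.

```python
import re

# Hypothetical lexicons; the resources used in the paper are not given here.
SITE_KEYWORDS = ["肝", "肺", "胃", "乳腺", "肾"]  # liver, lung, stomach, breast, kidney
METASTASIS_CUES = ["转移"]                        # "metastasis"

# Assumed size pattern: two numbers joined by x, × or *, with a trailing unit.
SIZE_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:cm|mm)?\s*[x×*]\s*(\d+(?:\.\d+)?)\s*(cm|mm)",
    re.IGNORECASE,
)


def split_sentences(text):
    """Split a report on Chinese full stops and semicolons."""
    return re.split(r"[。；;]", text)


def has_metastasis_cue(sentence):
    return any(cue in sentence for cue in METASTASIS_CUES)


def find_primary_site(text):
    """Step 1 (keyword matching): earliest site keyword in the first
    sentence that has a site keyword and no metastasis cue."""
    for sentence in split_sentences(text):
        if has_metastasis_cue(sentence):
            continue
        hits = [(sentence.find(k), k) for k in SITE_KEYWORDS if k in sentence]
        if hits:
            return min(hits)[1]
    return None


def find_sizes(text):
    """Step 2 (regular expressions): list of (length, width, unit) tuples."""
    return [
        (float(m.group(1)), float(m.group(2)), m.group(3).lower())
        for m in SIZE_PATTERN.finditer(text)
    ]


def find_metastatic_sites(text):
    """Step 3 (rules): site keywords in sentences that contain a cue."""
    sites = []
    for sentence in split_sentences(text):
        if not has_metastasis_cue(sentence):
            continue
        for keyword in SITE_KEYWORDS:
            if keyword in sentence and keyword not in sites:
                sites.append(keyword)
    return sites


# Invented example report, for illustration only.
report = "肝右叶见一肿块，大小约3.5cm×2.8cm，考虑肝癌。双肺多发结节，考虑转移。"
print(find_primary_site(report))
print(find_sizes(report))
print(find_metastatic_sites(report))
```

Splitting on sentence boundaries rather than commas keeps a site mention and its metastasis cue in the same unit. This is a simplification that a real rule set would need to refine for negation and for longer-range references.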


