Abstract
Background: The identification of medical entities and relations from electronic medical records is a fundamental research issue for medical informatics. However, the task of extracting valuable knowledge from these records is challenging due to its high complexity. The accurate identification of entities and relations is still an open research problem in medical information extraction.

Methods: A pattern-based method for extracting certain tumor-related entities and attributes from Chinese unstructured diagnostic imaging text is proposed. The method consists of three steps. First, an algorithm based on keyword matching is designed to obtain the primary sites of tumors. Then, a set of regular expressions is applied to identify primary tumor size information. Finally, a set of rules is defined to acquire the metastatic sites of tumors.

Results: Our method achieves a recall of 0.697, a precision of 0.825 and an F1 score of 0.755 using an overall weighted metric. For the individual extraction tasks, the F1 scores are 0.784, 0.822 and 0.740.

Conclusions: The method proves to be stable and robust with different amounts of testing data. It achieves a comparatively high performance in the CHIP 2018 open challenge, demonstrating its effectiveness in extracting tumor-related entities from Chinese diagnostic imaging text.
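As a quick sanity check on the reported overall scores, the snippet below computes F1 as the plain harmonic mean of precision and recall. This is only an assumption about how the overall figure relates to the other two: the abstract says an overall weighted metric was used, and the exact weighting is not given here. The function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard, unweighted F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall precision and recall as reported in the abstract.
overall_f1 = f1_score(precision=0.825, recall=0.697)
print(f"overall F1 = {overall_f1:.4f}")
```

The result lands close to the reported 0.755; a difference in the third decimal place is expected because the published precision and recall are themselves rounded.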
Highlights
Biomedical named entity recognition (NER) is a critical task for extracting patient information from medical diagnosis to support medical research and treatment decision making
Based on a standard dataset obtained from the CHIP 2018 open challenge, this medical named entity recognition research targets three sub-tasks: (1) identification of primary tumor sites, (2) extraction of primary tumor sizes, and (3) recognition of metastatic tumor sites
We propose a pattern-based method to extract primary tumor sites, primary tumor sizes and metastatic sites from diagnostic imaging text