Abstract
As electronic medical records become widespread in the medical field, increasing attention is being paid to how to use these data efficiently. In this paper, the BiLSTM-CRF model is applied to Chinese electronic medical records to recognize the named entities they contain. To suit the characteristics of Chinese electronic medical records, the method proceeds in four steps. First, the one-hot vector of each word is obtained, sentence by sentence. Second, each one-hot vector is mapped to a low-dimensional dense word vector. Third, the word vectors are used as the input of the BiLSTM layer, which automatically extracts sentence features. Finally, the CRF layer performs sequence-level labeling of each sentence. In addition, a drug dictionary and post-correction rules are used to correct segmentation errors at entity boundaries and thereby improve the recognition accuracy of named entities. On the given test data set, the method achieves an F1 score of 87.68%.
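The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a toy tag set (`O`, `B-DRUG`, `I-DRUG`), hand-written emission scores standing in for the BiLSTM output, and a hypothetical dictionary rule (`apply_drug_dictionary`) that overwrites the tags of any span matching a dictionary entry. The abstract does not specify the actual tag scheme, network parameters, or correction rules. All function names and values below are illustrative assumptions.

```python
from typing import List, Sequence

# Assumed toy tag set; the paper's real tag scheme is not given in the abstract.
TAGS = ["O", "B-DRUG", "I-DRUG"]


def one_hot(index: int, size: int) -> List[float]:
    """Return a one-hot vector of length `size` with a 1 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def embed(one_hot_vec: Sequence[float],
          embedding_matrix: Sequence[Sequence[float]]) -> List[float]:
    """Map a one-hot vector to a dense vector.

    Multiplying a one-hot row vector by a (vocab x dim) matrix
    selects the matrix row of the active index.
    """
    dim = len(embedding_matrix[0])
    return [
        sum(one_hot_vec[i] * embedding_matrix[i][j]
            for i in range(len(one_hot_vec)))
        for j in range(dim)
    ]


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Sequence-level decoding as done by a CRF layer at inference time.

    emissions[t][k]   : score of tag k at position t (stand-in for BiLSTM output)
    transitions[i][j] : score of moving from tag i to tag j
    Returns the highest-scoring tag index sequence.
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    best = max(range(n_tags), key=lambda k: score[k])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


def apply_drug_dictionary(chars: str, tags: Sequence[str],
                          drugs: Sequence[str]) -> List[str]:
    """Hypothetical post-correction rule: retag every dictionary match.

    Shorter names are applied first so that longer matches overwrite them.
    """
    fixed = list(tags)
    for name in sorted(drugs, key=len):
        start = chars.find(name)
        while start != -1:
            end = start + len(name)
            fixed[start] = "B-DRUG"
            for k in range(start + 1, end):
                fixed[k] = "I-DRUG"
            start = chars.find(name, end)
    return fixed
```

With a transition matrix that heavily penalizes `O` followed by `I-DRUG`, `viterbi` prefers a valid `B-DRUG`, `I-DRUG` sequence even when the per-position best tags would be `O`, `I-DRUG`. This is the benefit of sequence-level labeling over independent per-token classification. The dictionary step then repairs a span whose right boundary was cut short, for example when only the first three characters of 阿司匹林 were tagged.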