Abstract

A complete inpatient electronic medical record contains a large amount of information. In recent years, numerous researchers have studied word segmentation of medical texts. Because medical records are written as free text, structuring them by recognizing medical named entities is an important part of intelligent medical record analysis. At present, named entity recognition methods fall mainly into two groups: dictionary- and rule-based methods, and machine learning-based methods. The machine learning-based approach treats named entity recognition as a sequence labeling problem and relies mainly on context information. The features commonly used in feature construction are contextual features and dictionary features. We used BERT+LSTM+CRF to train the named entity recognition model, with the open-source CRF++ as the supporting tool. We trained the LSTM+CRF model using the original word segmentation results and the context information as features, and carried out 5-fold cross-validation. The results showed that the overall micro-averaged F1 score (MICRO-F) of named entity recognition reached 0.92, indicating that the model can accurately perform medical named entity recognition.

Keywords: Word segmentation; Named entity recognition; Natural language processing
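To make the reported metric concrete, the following is a minimal sketch of how a micro-averaged F1 score can be computed for a sequence labeling task. It assumes BIO-style tags and exact-match entity spans, neither of which is stated in the abstract; the entity types `DIS`, `SYM`, and `DRUG` and the function names are hypothetical and chosen only for illustration. It is not the authors' evaluation code.

```python
from typing import List, Set, Tuple

# (sentence index, start token, end token exclusive, entity type)
Span = Tuple[int, int, int, str]


def extract_spans(tags: List[str], sent_id: int = 0) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence."""
    spans: Set[Span] = set()
    start, etype = None, None
    # The trailing "O" is a sentinel that closes an entity ending the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == etype
        if start is not None and not continues:
            spans.add((sent_id, start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        # Assumption: an "I-" tag with no open entity is ignored.
    return spans


def micro_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1: pool span counts over all sentences and types."""
    tp = n_gold = n_pred = 0
    for sid, (g, p) in enumerate(zip(gold, pred)):
        gold_spans = extract_spans(g, sid)
        pred_spans = extract_spans(p, sid)
        tp += len(gold_spans & pred_spans)  # exact boundary and type match
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision = tp / n_pred
    recall = tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: two sentences, three gold entities.
gold = [["B-DIS", "I-DIS", "O", "B-SYM"], ["O", "B-DRUG", "O"]]
pred = [["B-DIS", "I-DIS", "O", "O"], ["O", "B-DRUG", "I-DRUG"]]
score = micro_f1(gold, pred)
```

In this toy case only one of the two predicted spans matches a gold span exactly, so precision is 1/2, recall is 1/3, and the score is about 0.4. Under 5-fold cross-validation, a score of this kind would typically be computed on each held-out fold and then averaged or pooled; the abstract does not say which.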
