Abstract
Rheumatoid arthritis (RA) is an immune-system disease with a high rate of disability, and the clinical notes in electronic medical records contain a large amount of valuable diagnostic and treatment information. Artificial intelligence methods can mine this information effectively. This study aimed to develop an effective method to identify and classify medical entities in clinical notes relating to RA, and to make the entity recognition results available for subsequent studies. We introduced the Bidirectional Encoder Representations from Transformers (BERT) pre-trained model to enhance the semantic representation of word vectors. The generated word vectors were then fed into a model composed of a conventional bidirectional long short-term memory (BiLSTM) neural network and a conditional random field (CRF), which performed named entity recognition on the clinical notes. BERT takes the combination of token embeddings, segment embeddings, and position embeddings as its input and is fine-tuned during training. Compared with using traditional Word2vec word vectors as model input, using word vectors obtained from the pre-trained BERT model significantly improved performance. The best F1-score on the named entity recognition task, after training on a large corpus of RA clinical notes, was 0.936. This paper confirms the effectiveness of an advanced artificial intelligence method for named entity recognition on a large corpus of clinical notes; the application is promising in the medical setting. Moreover, the entities extracted in this study provide basic data for subsequent tasks, including relation extraction, medical knowledge graph construction, and disease reasoning.
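The way BERT combines token, segment, and position embeddings can be illustrated with a minimal sketch. In BERT the combination is an element-wise sum of the three vectors for each input position. The code below is not the implementation used in this study: the table sizes, the random initialisation, the example token ids, and the names `make_table` and `bert_input_embeddings` are assumptions chosen for illustration, and the layer normalisation and dropout that BERT applies after the sum are omitted.

```python
import random


def make_table(num_rows, dim, seed):
    """Build a toy embedding table (hypothetical, randomly initialised)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def bert_input_embeddings(token_ids, segment_ids,
                          token_table, segment_table, position_table):
    """Return one input vector per position: token + segment + position."""
    if len(token_ids) != len(segment_ids):
        raise ValueError("token_ids and segment_ids must have the same length")
    if len(token_ids) > len(position_table):
        raise ValueError("sequence is longer than the position table")
    vectors = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        # Element-wise sum of the three embeddings for this position.
        vectors.append([
            t + s + p
            for t, s, p in zip(token_table[tok],
                               segment_table[seg],
                               position_table[pos])
        ])
    return vectors


DIM = 4                                        # toy hidden size (assumption)
token_table = make_table(10, DIM, seed=0)      # toy vocabulary of 10 ids
segment_table = make_table(2, DIM, seed=1)     # segment A / segment B
position_table = make_table(8, DIM, seed=2)    # maximum length of 8

token_ids = [1, 5, 7, 2]       # hypothetical ids for a short clinical phrase
segment_ids = [0, 0, 0, 0]     # a single sentence, so every token is segment 0

embeddings = bert_input_embeddings(token_ids, segment_ids,
                                   token_table, segment_table, position_table)
print(len(embeddings), len(embeddings[0]))
```

In a pipeline such as the one described above, vectors of this kind pass through the BERT encoder layers, and the resulting contextual representations are what the BiLSTM and CRF consume.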