Abstract

Background: The modernization of traditional Chinese medicine (TCM) demands systematic data mining of medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be addressed by using natural language processing algorithms to construct a high-quality TCM symptom normalization model that normalizes TCM synonymous symptoms to unified literal expressions.

Methods: Four types of TCM symptom normalization models based on natural language processing were constructed to identify a high-quality one: (1) a text sequence generation model based on a bidirectional long short-term memory (Bi-LSTM) neural network with an encoder-decoder structure; (2) a text classification model based on a Bi-LSTM neural network and a sigmoid function; (3) a text sequence generation model based on bidirectional encoder representations from transformers (BERT) with the sequence-to-sequence training method of the unified language model (BERT-UniLM); and (4) a text classification model based on BERT and a sigmoid function (BERT-Classification). The performance of the models was compared using four metrics: accuracy, recall, precision, and F1-score.

Results: The BERT-Classification model outperformed the models based on Bi-LSTM and BERT-UniLM on all four metrics.

Conclusions: The BERT-Classification model has superior performance in normalizing expressions of TCM synonymous symptoms.
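To make the classification formulation in the Methods concrete, the sketch below shows one way a BERT-Classification-style normalizer could be assembled in PyTorch with the Hugging Face transformers library. It is a minimal, illustrative sketch rather than the authors' implementation: the bert-base-chinese checkpoint, the toy list of normalized symptom terms, and the argmax decoding over sigmoid scores are assumptions made for demonstration only.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

# Illustrative candidate normalized symptom terms (hypothetical label set, not from the paper).
NORMALIZED_TERMS = ["发热", "头痛", "失眠"]

class BertSymptomClassifier(nn.Module):
    """BERT encoder with a sigmoid output layer, one unit per candidate normalized term."""
    def __init__(self, pretrained="bert-base-chinese", num_labels=len(NORMALIZED_TERMS)):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        # The pooled [CLS] vector serves as the representation of the raw symptom expression.
        pooled = self.bert(input_ids=input_ids, attention_mask=attention_mask).pooler_output
        # Sigmoid scores: a probability that each candidate term is the normalized form.
        return torch.sigmoid(self.classifier(pooled))

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertSymptomClassifier()
model.eval()

# Normalize one raw symptom expression by picking the highest-scoring candidate term.
batch = tokenizer(["身热"], return_tensors="pt", padding=True)
with torch.no_grad():
    scores = model(batch["input_ids"], batch["attention_mask"])
predicted = NORMALIZED_TERMS[int(scores.argmax(dim=-1))]

In practice such a model would be fine-tuned with a binary cross-entropy loss over the sigmoid outputs and then evaluated with accuracy, precision, recall, and F1-score, as in the comparison described above.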

Highlights

  • Traditional Chinese medicine (TCM) symptoms are recorded by TCM practitioners who sometimes use different words when recording the same symptoms, as a consequence of their diverse experience and educational background. These variations in words lead to the phenomenon known as “one symptom with different literal expressions,” which is prevalent in TCM medical records.

  • Comparing the bidirectional encoder representations from transformers (BERT)-unified language model (UniLM) model with the BERT-Classification model, the BERT-Classification model had more advantages. That is, the BERT-Classification model was the best model for normalizing expressions of TCM synonymous symptoms in this study, on both the HFDS and the total data set (TDS) test sets.

  • The normalization of expressions of TCM synonymous symptoms plays an important role in the collation of medical records, statistical mining, the construction of TCM knowledge databases, and the construction of TCM medical assistant decision-making systems [9]. The application of natural language processing (NLP) technology improves the efficiency of normalization processing.

Introduction

Traditional Chinese medicine (TCM) symptoms are recorded by TCM practitioners who sometimes use different words when recording the same symptoms, as a consequence of their diverse experience and educational background. These variations in words lead to the phenomenon known as “one symptom with different literal expressions,” which is prevalent in TCM medical records. Wang et al [1] reported that approximately 80% of TCM symptoms were recorded with multiple expressions. Although the literal expressions of these symptoms are different, they have the same meaning, and their use does not affect understanding. Thus, the use of these alternative expressions does not affect the diagnosis of pathogenesis. TCM symptoms that have the same meaning but different literal descriptions are known as TCM synonymous symptoms. The abundance of synonymous symptoms in TCM
