Abstract

Traditional Chinese medicine (TCM) is a clinical medicine. The large volume of clinical data generated in daily practice, which follows TCM theories and principles, is the core source of empirical knowledge for TCM research. Inducing common knowledge and regularities from large-scale clinical data is a vital task for both theoretical and clinical TCM research. Topic models have recently shown much success in text analysis and information retrieval by extracting latent topics from text collections. In this paper, we propose a hierarchical symptom-herb topic model (HSHT) to automatically extract hierarchical latent topic structures, comprising both symptoms and their corresponding herbs, from TCM clinical data. The HSHT model extends the hierarchical latent Dirichlet allocation model (hLDA) and the Link latent Dirichlet allocation model (LinkLDA). We apply the proposed HSHT model to extract the hierarchical structure of symptoms and their corresponding herbs from clinical data on type 2 diabetes mellitus (T2DM). We obtain one meaningful super-topic containing common symptoms and commonly used herbs, and several meaningful subtopics that denote T2DM complications, each with its corresponding symptoms and commonly used herbs. The results indicate some important medical groups corresponding to the accompanying diseases of T2DM inpatients. The results also show that TCM diagnosis and treatment sub-categories, as well as personalized therapies for T2DM, do exist. Furthermore, they suggest that the HSHT model is useful for establishing TCM clinical guidelines based on TCM clinical data.
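To make the symptom-herb coupling concrete, the following is a minimal sketch of a LinkLDA-style generative process, in which the symptoms and herbs of one clinical record are drawn from the same per-record topic proportions. It is an illustration under stated assumptions, not the HSHT model itself: it uses a flat set of topics and omits the hLDA tree over topics entirely. The function names, the two toy topics, the vocabulary, and the placeholder herb names are all hypothetical and do not come from the paper's data.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_record(theta, symptom_topics, herb_topics, n_symptoms, n_herbs, rng):
    """Generate one record whose symptoms and herbs share the topic mixture theta."""
    topic_ids = list(range(len(theta)))

    def draw(topic_item_dists, n):
        items = []
        for _ in range(n):
            # Pick a topic for this token from the shared proportions.
            z = rng.choices(topic_ids, weights=theta)[0]
            # Pick an item from that topic's distribution over its vocabulary.
            vocab, probs = zip(*topic_item_dists[z].items())
            items.append(rng.choices(vocab, weights=probs)[0])
        return items

    return draw(symptom_topics, n_symptoms), draw(herb_topics, n_herbs)


# Toy, made-up topic distributions (assumptions for illustration only).
symptom_topics = [
    {"thirst": 0.6, "fatigue": 0.4},
    {"numb limbs": 0.7, "blurred vision": 0.3},
]
herb_topics = [
    {"herb_A": 0.5, "herb_B": 0.5},
    {"herb_C": 0.8, "herb_D": 0.2},
]

rng = random.Random(0)
theta = sample_dirichlet([0.5, 0.5], rng)  # per-record topic proportions
symptoms, herbs = generate_record(theta, symptom_topics, herb_topics, 4, 3, rng)
print(theta, symptoms, herbs)
```

Because both item types are drawn from the same `theta`, a record dominated by one topic tends to show that topic's symptoms together with that topic's herbs, which is the kind of co-occurrence a symptom-herb topic captures. Inference, which recovers topics from observed records, is not shown here.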
