Abstract

Chinese herbal formulae embody thousands of years of experience in treating disease with traditional Chinese medicine (TCM). The efficacy of a formula is not a simple sum of the efficacies of its herbs; it arises from complex, nonlinear relationships among the herbs, which makes formula efficacy analysis challenging. In this study, we propose HPE-GCN, a model that combines graph convolutional networks (GCN) with TCM-defined herbal properties (TCM-HPs) to predict formula efficacy. To process the unstructured natural language of formula texts, we also propose the Formula-Herb dependence degree (FHDD), a weighting method based on herb frequency and the number of herbs in a formula, which assesses how strongly a formula depends on each of its herbs. As the dataset, we collected 214 classic tonic formulae from ancient TCM books such as Synopsis of the Golden Chamber, Jingyue's Complete Works and the Golden Mirror of Medicine. In multi-class classification of tonic formulae, HPE-GCN outperformed classic machine learning models, including support vector machine, naive Bayes, logistic regression, gradient boosting decision tree, and K-nearest neighbors. On the test set, the Macro-Precision, Macro-Recall and Macro-F1 of HPE-GCN were 87.70%, 84.08% and 83.51%, respectively, which are 7.27%, 7.41% and 7.30% higher than the second-best results among the compared models. GCN learns low-dimensional feature representations of herbs and formulae, which makes it an effective analysis tool for TCM research. By integrating TCM-HPs, HPE-GCN fits the complex nonlinear mapping between TCM-HPs and formula efficacy, and offers new ideas for related research.
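
For readers unfamiliar with the macro-averaged metrics reported above, the following is a minimal sketch of how Macro-Precision, Macro-Recall and Macro-F1 are commonly computed for a multi-class classifier. It is not the authors' evaluation code. The function name `macro_scores` and the toy class labels are hypothetical, and the sketch assumes that Macro-F1 is the unweighted mean of per-class F1 scores. That assumption is consistent with the reported Macro-F1 (83.51%) lying below both Macro-Precision and Macro-Recall, which a harmonic mean of the two macro values could not do.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro_precision, macro_recall, macro_f1) for label sequences.

    Each class contributes equally, regardless of how many samples it has.
    Macro-F1 is taken here as the mean of per-class F1 (an assumption).
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        r = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy example with invented efficacy labels (not taken from the dataset).
y_true = ["qi", "qi", "blood", "yin"]
y_pred = ["qi", "blood", "blood", "yin"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(f"{macro_p:.4f} {macro_r:.4f} {macro_f1:.4f}")
```

In this toy case the macro precision and recall are both 5/6 while the macro F1 is 7/9, which shows how the mean of per-class F1 can fall below both of the other two metrics.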
