The growing use of artificial intelligence-based knowledge reasoning has made auxiliary diagnosis more accurate and efficient. However, most disease prediction methods concentrate on the symptoms themselves and discard both the inherent properties of symptoms and the relationships among them. This paper proposes a feature aggregation-based intelligent diagnosis model built on a heterogeneous graph convolutional network (GCN), termed HeteroGCN. The model focuses on symptoms’ inherent properties and on the multiple hidden relationships among symptoms and properties. By aggregating node features, it performs symptom-based knowledge reasoning for disease-type prediction. Taking chronic obstructive pulmonary disease (COPD) as an example, diagnosis-related information is extracted from electronic medical records (EMRs) and standardized. The model then takes the symptoms and their properties as nodes, and the relationships among these nodes as edges, to construct a heterogeneous graph. The adjacency matrix and the feature matrix are fused and used as the model input, and node representations (embeddings) are generated by aggregating information from neighboring nodes. Finally, specific disease types (syndromes) are predicted from the generated symptom node embeddings. In the model comparison and the parameter sensitivity test, HeteroGCN performs best among the compared models on disease-type prediction. This paper thus provides a feature aggregation-based, multi-relational knowledge reasoning approach for disease-type (syndrome) prediction, which can help improve disease diagnosis.
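The neighbor aggregation step described above can be illustrated with a minimal sketch. This is not the HeteroGCN implementation: the node names, relation types, feature vectors, and scalar per-relation weights below are hypothetical stand-ins (a trained model would learn weight matrices), and the update rule is a generic relation-aware mean aggregation with a self-loop and ReLU.

```python
from collections import defaultdict


def aggregate(features, edges, weights):
    """One round of relation-aware mean aggregation over a heterogeneous graph.

    features: dict mapping node name -> feature vector (list of floats)
    edges:    list of (source, relation, target) triples, treated as undirected
    weights:  dict mapping relation -> scalar weight (hypothetical stand-in
              for a learned per-relation weight matrix)
    """
    dim = len(next(iter(features.values())))

    # Group each node's neighbors by relation type.
    neighbors = defaultdict(lambda: defaultdict(list))
    for src, rel, dst in edges:
        neighbors[src][rel].append(dst)
        neighbors[dst][rel].append(src)

    updated = {}
    for node, own in features.items():
        total = list(own)  # self-loop: start from the node's own features
        for rel, nbrs in neighbors[node].items():
            for k in range(dim):
                # Mean of the neighbors' k-th feature under this relation.
                mean_k = sum(features[n][k] for n in nbrs) / len(nbrs)
                total[k] += weights[rel] * mean_k
        updated[node] = [max(0.0, v) for v in total]  # ReLU activation
    return updated


# Hypothetical toy graph: one symptom node linked to two property nodes
# and one co-occurring symptom node.
features = {
    "cough": [1.0, 0.0],
    "severe": [0.0, 1.0],
    "night": [0.0, 2.0],
    "wheeze": [1.0, 1.0],
}
edges = [
    ("cough", "has_severity", "severe"),
    ("cough", "has_timing", "night"),
    ("cough", "co_occurs", "wheeze"),
]
weights = {"has_severity": 0.5, "has_timing": 0.5, "co_occurs": 1.0}

embeddings = aggregate(features, edges, weights)
print(embeddings["cough"])  # [2.0, 2.5]
```

In this sketch the symptom node's embedding mixes in its properties and its co-occurring symptom, each scaled by relation type. In a setup like the one described, such symptom embeddings would then be passed to a classifier that predicts the disease type.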