Abstract

Disease diagnosis, which depends mainly on a doctor’s medical knowledge and clinical experience, can be treated as a medical text classification task. We observe that existing data-driven methods suffer from distribution bias: a small number of common diseases appear very frequently, while most diseases are rare in the real world, which leads to an imbalanced data distribution in the disease diagnosis task. To address this problem, we propose a new learning framework, the Typical sample-Driven Graph Neural Network (TD-GNN), for disease knowledge representation and classification. Unlike previous methods, our framework concretizes each disease (label) and learns it from several representative samples of that disease rather than from the full imbalanced data. In addition, a contrastive learning strategy is used to enhance the learning of features that distinguish different diseases. We construct a real-world dataset covering 350 common diseases to evaluate the proposed method. The experimental results demonstrate that TD-GNN significantly outperforms state-of-the-art baselines, especially for the majority of diseases for which only a few samples can be collected from the real world. Additionally, our method can provide a sample-based interpretation of its disease predictions.
