Abstract

To automatically detect diseases from symptoms in free-text data, this paper proposes a methodology for extracting symptom-diagnosis knowledge from online medical text in the Q&A domain: (1) a term frequency-inverse document frequency and PRECISION method is adopted to retrieve symptom words from unstructured text; (2) a variable precision rough set based genetic algorithm is applied to remove redundant symptom words, and a rough set based rule is used to add discriminative symptom words that help distinguish diseases sharing similar symptoms; (3) fuzzy linguistic variables are employed to express the risk level of a disease or the severity level of symptoms, and a knowledge base with a fuzzy belief structure is generated. Using data extracted from a Chinese medical Q&A forum for training and testing, several classical gastrointestinal diseases serve as a case study to evaluate the efficiency of the proposed methodology. Performance is then compared between the proposed methodology and other classifiers, such as the decision tree algorithms ID3 and J45 and the Bayesian network classifier. The comparative results demonstrate that the proposed methodology outperforms the decision tree algorithms and the Bayesian network classifier.
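As an illustration of the term-weighting idea in step (1), the following is a minimal sketch of plain TF-IDF scoring over tokenized posts. It is not the paper's implementation: the function name `tf_idf`, the toy English tokens, and the unsmoothed `log(N / df)` weighting are assumptions made for this example, and the PRECISION component mentioned in the abstract is not modeled here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Hypothetical helper: tf = count / document length,
    idf = log(number of documents / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Toy, made-up posts (not data from the paper), already split into tokens.
docs = [
    ["stomach", "pain", "nausea", "after", "meal"],
    ["stomach", "pain", "heartburn", "acid", "reflux"],
    ["diarrhea", "stomach", "cramps", "fever"],
]
scores = tf_idf(docs)

# Rank the terms of the first post by score, highest first.
ranked = sorted(scores[0], key=scores[0].get, reverse=True)
print(ranked)
```

In this toy corpus, a word that occurs in every post (here "stomach") receives a score of zero, while words confined to a single post rank highest. This is why TF-IDF style weighting can help surface candidate symptom words that separate one disease from another.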
