Abstract

More than 7,000 rare diseases affect over 400 million people, posing significant challenges for medical research and healthcare. Integrating precision medicine with artificial intelligence offers promising solutions. This work introduces a classifier that determines whether research and news articles pertain to rare or non-rare diseases. Our methodology involves extracting 709 rare disease MeSH terms from Mondo and MeSH to improve rare disease categorization. We evaluate the classifier on abstracts from PubMed/MEDLINE and on an expert-annotated news dataset, which includes news articles on four rare neurodevelopmental disorders (NDDs) selected from a total of 16 analyzed; NDDs are considered the largest category of rare diseases. We achieved F1 scores of 85% for abstracts and 71% for news articles, demonstrating robustness across both datasets and highlighting the potential of combining artificial intelligence and ontologies to improve disease classification. Although the results are promising, the lower score on news articles indicates a need for further refinement in managing data heterogeneity. Our classifier improves the identification and categorization of medical information, which is essential for advancing research, enhancing information access, informing policy, and supporting personalized treatments. Future work will focus on expanding disease classification to distinguish between attributes such as infectious and hereditary diseases, addressing data heterogeneity, and incorporating multilingual capabilities.
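
To make the evaluation metric concrete, the following is a minimal sketch of how an F1 score can be computed for a rare versus non-rare labeling task. It is not the classifier described above: the term-matching baseline, the three-entry `RARE_TERMS` list, the helper names, and the toy documents are all hypothetical stand-ins chosen for illustration, and the actual 709-term list is not reproduced here.

```python
import re

# Hypothetical stand-in for a rare disease term list. These three entries are
# illustrative only and are not taken from the 709 terms described above.
RARE_TERMS = ["rett syndrome", "angelman syndrome", "fragile x syndrome"]


def mentions_rare_disease(text, terms=RARE_TERMS):
    """Return True if any term occurs in the text as a whole phrase."""
    lowered = text.lower()
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", lowered) for term in terms
    )


def f1_score(y_true, y_pred):
    """Compute F1 for the positive (rare) class from boolean labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # rare, predicted rare
    fp = sum(1 for t, p in pairs if not t and p)    # non-rare, predicted rare
    fn = sum(1 for t, p in pairs if t and not p)    # rare, predicted non-rare
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labeled documents (invented for this sketch): (text, is_rare).
DOCS = [
    ("Gene therapy trial for Rett syndrome shows early promise.", True),
    ("New guidelines for managing seasonal influenza.", False),
    ("Families describe living with Angelman syndrome.", True),
    ("A rare neurodevelopmental disorder with no listed name.", True),
]

labels = [is_rare for _, is_rare in DOCS]
predictions = [mentions_rare_disease(text) for text, _ in DOCS]
score = f1_score(labels, predictions)
```

In this toy set the last document is a rare disease article that names no listed term, so the baseline misses it. That lowers recall while precision stays perfect, which is one simple way that heterogeneous wording, as in news text, can reduce F1.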

