Abstract

The volume of healthcare information available on social media has grown rapidly in recent years. Cures and treatments suggested by people without medical expertise can cause more harm than expected, so assuring the credibility of this information is a major challenge. This study aims to categorize the credibility of online health information into multiple classes. It proposes a model named Text Analysis of Web-based Health Information (TA-WHI), based on an algorithm designed for this purpose, that sorts health-related social media feeds into five categories: sufficient, fabricated, meaningful, advertisement, and misleading. The authors created their own labeled dataset for the model. For data cleaning, they designed a dictionary named MeDF containing nouns, adverbs, adjectives, negative words, positive words, and medical terms. The data is then ranked and classified into the five classes using polarity and a conditional procedure. The authors evaluate the model with the deep-learning classifiers CNN and LSTM and with the gradient-boosting classifier CatBoost. The model attains an accuracy of 98% with CatBoost.
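
The abstract does not give the actual rules, thresholds, or MeDF entries, so the following is only a minimal sketch of what a lexicon-based polarity score followed by a conditional cascade might look like. Every word list, the rule order, and the function names `polarity` and `classify` are hypothetical placeholders, not the authors' algorithm.

```python
from typing import List

# Hypothetical mini-lexicons standing in for MeDF; the real dictionary's
# contents are not given in the abstract.
POSITIVE = {"effective", "safe", "proven", "recommended"}
NEGATIVE = {"dangerous", "toxic", "fake", "harmful"}
MEDICAL = {"dose", "vaccine", "insulin", "symptom", "treatment", "clinical"}
PROMO = {"buy", "discount", "order", "offer"}
HYPE = {"miracle", "instantly", "secret", "guaranteed"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip(".,!?;:").lower() for t in text.split())
    return [t for t in tokens if t]


def polarity(tokens: List[str]) -> float:
    """Return (pos - neg) / (pos + neg), or 0.0 if no sentiment words occur."""
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def classify(text: str) -> str:
    """Assign one of the five categories using assumed, illustrative rules."""
    tokens = tokenize(text)
    medical_hits = sum(t in MEDICAL for t in tokens)
    if any(t in PROMO for t in tokens):
        return "advertisement"
    if any(t in HYPE for t in tokens):
        return "fabricated"
    # Assumed rule: several medical terms with non-negative polarity.
    if medical_hits >= 2 and polarity(tokens) >= 0:
        return "sufficient"
    if medical_hits >= 1:
        return "meaningful"
    return "misleading"


if __name__ == "__main__":
    for post in [
        "Buy this vaccine now, huge discount!",
        "The clinical treatment is safe and effective.",
    ]:
        print(post, "->", classify(post))
```

In a pipeline like the one the abstract describes, labels produced this way would serve as training targets for the CNN, LSTM, and CatBoost classifiers; the fixed rule order here is a design assumption made only so that each post receives exactly one label.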
