Abstract

The development of intelligent systems that use social media data for decision-making processes in numerous domains such as politics, business, marketing, and finance, has been made possible by the popularity of social media platforms. However, the utilization of textual data from social media in the healthcare management industry is still somewhat limited when it is compared to other industries. Investigating how current machine learning and natural language processing technologies can be used in the healthcare industry to gauge public sentiment is an important study. Earlier works on healthcare sentiment analysis have utilized traditional word embedding models trained on the general and medical corpus. However, integration of medical knowledge to pre-trained word embedding models has not been considered yet. Word embedding models trained on the general corpus led to the problem of lacking medical knowledge and the models trained on the small size of the medical corpus have limitations in capturing semantic and syntactic properties. This research proposes a new word embedding model named Word Embedding Integrated with Medical Knowledge Vector (WE-iMKVec). The proposed model integrates sentiment lexicons and medical knowledgebases into the pre-trained word embedding to enrich the properties of word embedding. A new medical-aware sentiment polarity score is proposed for the utilization in learning neural-network sentiment and these vectors incorporate with the original pre-trained word vectors. The resulting vectors are enriched with lexicon vectors and the medical knowledge vectors: Adverse Drug Reaction (ADR) vector and Unified Medical Language System (UMLS) vector are used to build the proposed WE-iMKVec model. WE-iMKVec is validated on the five different social media healthcare review datasets and the empirical results showed its superiority over traditional word embedding models in medical sentiment analysis. The highest improvement can be found in the patients.info medical condition dataset where the proposed model outperforms three conventional word2vec models (Google-News, PubMed-PMC, and Drug Reviews) by 12.7 %, 31.4 %, and 25.4 % respectively in terms of F1 score.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.