Abstract

Medical concept normalization (MCN), i.e., the mapping of colloquial medical phrases to standard concepts, is an essential step in the analysis of medical social media text. The main drawback of the existing state-of-the-art approach (Kalyan and Sangeetha, 2020b) is that it learns target concept vector representations from scratch, which requires a larger number of training instances. Our model is based on RoBERTa and target concept embeddings. In our model, we integrate a) target concept information, in the form of target concept vectors generated by encoding target concept descriptions using SRoBERTa, a state-of-the-art RoBERTa-based sentence embedding model, and b) domain lexicon knowledge, by enriching the target concept vectors with synonym relationship knowledge using the retrofitting algorithm. This is the first attempt in MCN to exploit both target concept information and domain lexicon knowledge in the form of retrofitted target concept vectors. Our model outperforms all existing models, with an accuracy improvement of up to 1.36% on three standard datasets. Further, when trained only on mapping lexicon synonyms, our model achieves an improvement in accuracy of up to 4.87%.
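The retrofitting step mentioned above can be illustrated with a minimal sketch. It assumes the commonly used iterative retrofitting update, in which each vector is pulled toward the average of its synonym neighbours while staying close to its original value. The function name `retrofit`, the `alpha` weight, the uniform neighbour weights, and the toy 2-D vectors are illustrative assumptions, not the authors' implementation; in the described model the input vectors would come from encoding concept descriptions with SRoBERTa.

```python
import numpy as np


def retrofit(vectors, synonyms, iterations=10, alpha=1.0):
    """Pull each vector toward its synonym neighbours (illustrative sketch).

    vectors:  dict mapping a name to a 1-D array (original embeddings)
    synonyms: dict mapping a name to a list of neighbour names
    alpha:    assumed weight that keeps a vector near its original value
    """
    original = {k: np.asarray(v, dtype=float) for k, v in vectors.items()}
    updated = {k: v.copy() for k, v in original.items()}

    for _ in range(iterations):
        for name, neighbours in synonyms.items():
            if name not in updated:
                continue
            # Keep only neighbours that actually have a vector.
            known = [n for n in neighbours if n in updated and n != name]
            if not known:
                continue
            # Uniform neighbour weights (1 / degree), so they sum to 1.
            beta = 1.0 / len(known)
            neighbour_sum = sum(beta * updated[n] for n in known)
            # Weighted average of the original vector and the neighbours.
            updated[name] = (alpha * original[name] + neighbour_sum) / (alpha + 1.0)

    return updated


if __name__ == "__main__":
    # Toy vectors standing in for encoded concept descriptions (hypothetical).
    toy_vectors = {
        "insomnia": np.array([1.0, 0.0]),
        "sleeplessness": np.array([0.0, 1.0]),
        "headache": np.array([1.0, 1.0]),
    }
    toy_synonyms = {
        "insomnia": ["sleeplessness"],
        "sleeplessness": ["insomnia"],
    }
    result = retrofit(toy_vectors, toy_synonyms)
    before = np.linalg.norm(toy_vectors["insomnia"] - toy_vectors["sleeplessness"])
    after = np.linalg.norm(result["insomnia"] - result["sleeplessness"])
    print("distance before:", before, "distance after:", after)
```

In this sketch, entries with no synonym links (here, the hypothetical "headache") are left unchanged, while linked entries move closer together. A larger `alpha` keeps vectors nearer to their original encodings.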

Highlights

  • Medical concept normalization (MCN) involves learning a model that assigns a medical concept from a standard lexicon to a given health-related mention

  • We deal with medical concept normalization in noisy user-generated texts such as tweets and online discussion forum posts

  • Because social media text is highly noisy, with irregular grammar and colloquial words, medical concept normalization in social media text is more challenging


Summary

Introduction

Medical concept normalization (MCN) involves learning a model that assigns a medical concept from a standard lexicon to a given health-related mention. We deal with medical concept normalization in noisy user-generated texts such as tweets and online discussion forum posts. With the rising popularity of social media platforms, the general public is using these platforms to share information. On Twitter, people share their health experiences, and on websites like AskAPatient.com, the public posts reviews of the drugs they consume. This valuable health information available on social media platforms can be exploited in applications such as pharmacovigilance and public health monitoring (Kalyan and Sangeetha, 2020c). Because social media text is highly noisy, with irregular grammar and colloquial words, medical concept normalization in social media text is more challenging.

