Abstract

Medical concept normalization maps health-related mentions in free-form text to standard concepts in a clinical knowledge base. It goes well beyond simple string matching and requires a deep semantic understanding of concept mentions. Recent research approaches concept normalization as either text classification or text similarity. The main drawbacks of existing approaches are that a) text classification approaches ignore valuable target-concept information when learning the input concept mention representation, and b) text similarity approaches must generate target concept embeddings separately, which is time- and resource-consuming. Our proposed model overcomes these drawbacks by jointly learning the representations of the input concept mention and the target concepts. First, we learn the input concept mention representation using RoBERTa. Second, we compute the cosine similarity between the embedding of the input concept mention and the embeddings of all target concepts. Here, the target concept embeddings are randomly initialized and then updated during training. Finally, the target concept with the maximum cosine similarity is assigned to the input concept mention. Our model surpasses all existing methods across three standard datasets, improving accuracy by up to 2.31%.
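As a concrete illustration, the following PyTorch sketch implements the three steps described above: encode the mention with RoBERTa, score it against a trainable target-concept embedding matrix by cosine similarity, and predict the highest-scoring concept. The class name, pooling choice, and sizes are illustrative assumptions, not the authors' released code.

import torch
import torch.nn.functional as F
from transformers import RobertaModel, RobertaTokenizer

class ConceptNormalizer(torch.nn.Module):
    def __init__(self, num_concepts, hidden_size=768):
        super().__init__()
        # Mention encoder: pretrained RoBERTa.
        self.encoder = RobertaModel.from_pretrained("roberta-base")
        # Target concept embeddings: randomly initialized, trained jointly.
        self.concept_embeddings = torch.nn.Embedding(num_concepts, hidden_size)

    def forward(self, input_ids, attention_mask):
        # Pool the mention representation from the first (<s>) token;
        # this pooling choice is an assumption for the sketch.
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        mention = out.last_hidden_state[:, 0]                # (batch, hidden)
        # Cosine similarity between the mention and every target concept.
        scores = F.cosine_similarity(
            mention.unsqueeze(1),                            # (batch, 1, hidden)
            self.concept_embeddings.weight.unsqueeze(0),     # (1, concepts, hidden)
            dim=-1,
        )
        return scores                                        # (batch, concepts)

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = ConceptNormalizer(num_concepts=1000)  # 1000 is an arbitrary example size
batch = tokenizer(["head keeps spinning"], return_tensors="pt")
scores = model(batch["input_ids"], batch["attention_mask"])
prediction = scores.argmax(dim=-1)  # concept with the maximum cosine similarity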

Highlights

  • Internet users turn to social media to voice their views and opinions

  • Medical concept normalization aims to discover standard medical concepts in free-form text

  • Health-related mentions are mapped to standard concepts in a clinical knowledge base

Summary

Background

Internet users turn to social media to voice their views and opinions. Medical social media is the part of social media whose focus is limited to health and related issues (Pattisapu et al., 2017). The drawback of the text similarity approach of Pattisapu et al. (2020) is the need to generate target concept embeddings separately using graph embedding methods. This is time- and resource-consuming when different vocabularies are used for mapping in different datasets (e.g., SNOMED-CT is used in the CADEC (Karimi et al., 2015) and PsyTAR (Zolnoori et al., 2019) datasets, while MedDRA (Mozzicato, 2009) is used in SMM4H2017 (Sarker et al., 2018)). By learning the representations of the target concepts along with the input concept mention, our model a) exploits target-concept information, unlike existing text classification approaches (Tutubalina et al., 2018; Miftahutdinov and Tutubalina, 2019; Kalyan and Sangeetha, 2020a), and b) eliminates the time- and resource-consuming process of separately generating target concept embeddings, unlike the existing text similarity approach (Pattisapu et al., 2020). Our model achieves the best results across three standard datasets, surpassing all existing methods with an accuracy improvement of up to 2.31%.
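To make the contrast concrete, a minimal training step for the ConceptNormalizer sketch above might look as follows. Because the concept embeddings are ordinary trainable parameters, gradients reach both RoBERTa and the concept embedding table in one backward pass, so no separate graph-embedding stage is needed. The loss choice (cross-entropy over cosine scores) and optimizer settings are assumptions, not taken from the paper.

import torch
import torch.nn.functional as F

# `model` is the ConceptNormalizer instance defined in the earlier sketch.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def train_step(input_ids, attention_mask, gold_concept_ids):
    scores = model(input_ids, attention_mask)        # (batch, concepts)
    loss = F.cross_entropy(scores, gold_concept_ids)
    optimizer.zero_grad()
    loss.backward()   # gradients flow into RoBERTa *and* the concept embeddings
    optimizer.step()
    return loss.item()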

Model Description
Evaluation Metric
Implementation Details
Datasets
Results
Merit Analysis
Demerit Analysis
Conclusion
