Abstract

Applying Natural Language Processing (NLP) tasks to social media corpora is attractive but challenging, because social media users often communicate in casual language that contains out-of-vocabulary (OOV) words and internet abbreviations (slang). The performance of NLP tasks on social media text therefore needs to be improved. We focus on a fundamental NLP task, Named Entity Recognition (NER), which assigns each entity a label (person, location, organization, etc.), applied to Twitter text. We improve NER by converting non-standard entities to their canonical form, a task known as Named Entity Normalization (NEN). In this paper, we propose a novel weakly supervised joint approach to named entity recognition and normalization for noisy text. We jointly conduct weakly supervised NER and normalization of both single-token OOV words and multi-token slang in order to recognize named entities of any type and restore them to their canonical form. This approach can give better results than existing state-of-the-art NER systems, NEN systems, and pipeline approaches.
