Abstract
This paper investigates multimodal named entity recognition (NER) on Twitter data. NER is an important task in natural language processing and has been studied extensively in recent decades. NER on tweets is particularly challenging because tweets 1) are limited in length, 2) contain noisy text, and 3) contain hashtags. Moreover, tweets are often associated with images and hyperlinks. Existing work on tweet NER mostly concentrates on multimodal deep-learning models and neglects both hand-crafted features and hyperlinks. This paper investigates incorporating hand-crafted features extracted from other modalities, namely images and hyperlinks, when extracting named entities from tweet text. A large set of hand-crafted features is extracted from these modalities and combined with the features learned by a hybrid deep neural model, a bi-directional LSTM and a CNN followed by a conditional random field. Several variants of these models, paired with different hand-crafted feature sets, are designed. Extensive experiments on a multimodal Twitter dataset (containing text, images, and URLs) show that character-level hand-crafted features significantly improve the performance of the systems. Results of the proposed models are also reported on a standard NER dataset, CoNLL 2003.
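As a rough illustration of the feature combination described in the abstract, the sketch below computes a few character-level hand-crafted features for a token and concatenates them with a learned token vector. The specific features and the names `char_features` and `combine` are assumptions made for illustration, not the paper's actual feature set; the `learned` vector merely stands in for the output of the BiLSTM-CNN encoder, and the conditional random field layer is omitted.

```python
from typing import List


def char_features(token: str) -> List[float]:
    """Hypothetical character-level hand-crafted features for one token."""
    return [
        float(token[:1].isupper()),               # first character is uppercase
        float(token.isupper()),                   # all cased characters are uppercase
        float(any(c.isdigit() for c in token)),   # contains at least one digit
        float(token.startswith("#")),             # looks like a hashtag
        float(token.startswith("@")),             # looks like a user mention
        float(token.lower().startswith(("http://", "https://"))),  # looks like a hyperlink
    ]


def combine(learned: List[float], token: str) -> List[float]:
    """Concatenate a learned token vector with the hand-crafted features.

    `learned` is a placeholder for whatever the neural encoder produces.
    """
    return learned + char_features(token)
```

In a full model of this kind, the concatenated vector for each token would typically be passed to the sequence labelling layer in place of the purely learned representation.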