Abstract

We study the task of entity linking for Vietnamese tweets, which aims to detect entity mentions and link them to the corresponding entries in a given knowledge base. Unlike authored news or textual web content, tweets are noisy, irregular, and short, which makes entity linking in tweets much more challenging. We propose an approach to building an end-to-end entity linking system for Vietnamese tweets. The system consists of two stages: the first detects mentions, and the second performs entity disambiguation. We create a dataset of 524 Vietnamese tweets with 1,061 mentions and evaluate the system on this dataset, where it achieves an F1-score of 69.2%. To show that our system is language-independent, we also evaluate it on a public dataset of 562 English tweets. The experimental results show that our system achieves an F1-score of 54.5% and outperforms the state-of-the-art end-to-end entity linking methods for tweets. To the best of our knowledge, this is the first attempt to build an end-to-end entity linking system for Vietnamese tweets, and the system achieves very encouraging performance.
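As an illustration of how an F1-score for end-to-end entity linking can be computed, the following is a minimal sketch, not the evaluation script used for the reported numbers. The abstract does not state the matching criterion, so the sketch assumes that a predicted link counts as correct only when both the mention span and the entity ID match a gold annotation exactly. The `Link` tuple layout, the function name `micro_prf`, and the toy tweet IDs, offsets, and entity IDs are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical representation of one linked mention:
# (tweet_id, start_offset, end_offset, entity_id)
Link = Tuple[str, int, int, str]


def micro_prf(gold: Set[Link], predicted: Set[Link]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over (span, entity) pairs.

    Assumption: a prediction is correct only if the span boundaries and the
    entity ID both match a gold annotation exactly.
    """
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for illustration only.
gold = {
    ("t1", 0, 6, "Hà_Nội"),
    ("t1", 10, 17, "Việt_Nam"),
    ("t2", 3, 8, "Apple_Inc."),
}
predicted = {
    ("t1", 0, 6, "Hà_Nội"),              # correct span and entity
    ("t1", 10, 17, "Vietnam_Airlines"),  # correct span, wrong entity
    ("t2", 3, 8, "Apple_Inc."),          # correct span and entity
}

precision, recall, f1 = micro_prf(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under this criterion an error in either stage costs the system: a missed or mis-bounded mention from the detection stage and a wrong entity from the disambiguation stage both lower the score, which is why an end-to-end F1 is typically lower than a disambiguation-only score.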
