Abstract

Social events are among the most popular topics in social media. Automatically identifying planned social events and extracting structured information, such as event title, date, and location, would enable more effective indexing, display, and search of social events. However, the informal and noisy language used in social media can degrade the quality of event extraction, resulting in broken titles and incorrect or missing attributes, which makes the resulting event databases unsuitable for realistic applications. Previous work has mostly focused on event identification and categorization on Twitter. Yet event title extraction, arguably one of the most useful and difficult tasks in this domain, has never been investigated. In this paper, we address the task of identifying and extracting structured information (titles, dates, locations) for planned social events, and introduce SEEFT, a social event extraction system that uses social media content to discover events. To extract the event title and other attributes, SEEFT fuses the original social media content with the content of other tweets and webpages. Experiments over multiple popular event types and more than a thousand event instances show that SEEFT significantly outperforms the previous state-of-the-art system in event identification. Moreover, by fusing information from multiple sources, SEEFT is able to extract event titles with high accuracy, providing the foundation for practical applications such as event discovery, search, and recommendation.
