Abstract

The wide adoption of social networking and microblogging platforms by users across the globe has provided a rich source of unstructured information for understanding users’ behaviors, interests and opinions at both micro and macro levels. An active area in this space is the detection of important real-world events from user-generated social content. Work in this area identifies instances of events that impact a large number of users. However, a more nuanced form of event, known as a life event, is also of high importance. In contrast to real-world events, a life event does not impact a large number of users and is limited to at most a few people. Life events, such as marriage, travel, and career change, among others, are more difficult to detect for several reasons: i) they are specific to a given user and do not have a wider-reaching reflection; ii) they are often not reported directly and need to be inferred from the content posted by individual users; and iii) many users do not report their life events on social platforms, making the problem highly class-imbalanced. In this paper, we propose a semantic approach based on word embedding techniques to model life events. We then use word mover’s distance to measure the similarity of a given tweet to different types of life events; these similarity scores are used as input features for a multi-class classifier. Furthermore, we show that when the sequence of tweets that appeared before and after a given tweet of interest is also considered (temporal stacking), the performance of the life event detection task improves significantly.
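
The temporal stacking idea can be illustrated with a minimal sketch. It assumes each tweet in a user's timeline has already been converted to a vector of similarity scores, one per life event type (for example, derived from word mover’s distance). The function name `stack_temporal_features`, the window size, the zero padding at the timeline boundaries, and the sample scores are all illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def stack_temporal_features(
    features: Sequence[Sequence[float]],
    index: int,
    window: int = 1,
    pad_value: float = 0.0,
) -> List[float]:
    """Concatenate the feature vectors of the tweets around `index`.

    `features` holds one vector per tweet, in chronological order.
    Positions that fall outside the timeline are filled with `pad_value`
    (an assumption made here so every stacked vector has the same length).
    """
    if not features:
        raise ValueError("features must contain at least one tweet vector")
    if not 0 <= index < len(features):
        raise IndexError("index is outside the timeline")

    dim = len(features[0])
    stacked: List[float] = []
    for pos in range(index - window, index + window + 1):
        if 0 <= pos < len(features):
            stacked.extend(features[pos])
        else:
            stacked.extend([pad_value] * dim)
    return stacked


# Hypothetical similarity scores of three consecutive tweets
# to three life event types (values are made up for illustration).
timeline = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.9],
]

first = stack_temporal_features(timeline, index=0, window=1)
middle = stack_temporal_features(timeline, index=1, window=1)
```

Under these assumptions, each stacked vector has `(2 * window + 1) * dim` entries and could be passed to any multi-class classifier in place of the single-tweet vector.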
