Abstract

The growing popularity of social media produces a huge volume of social data, including Tweets. These collections are potentially useful, but the extent of meaningful data they contain has not been sufficiently researched, especially for South Korean Twitter data. South Korean Twitter data has generally been studied as a source of political media. However, previous research has not adequately covered what kinds of trends Twitter represents in terms of major topic categories such as politics, economics, or sports. In this paper, we present a cross-media approach that characterizes South Korean Tweets by inferring their topic category distribution through short-text categorization. We select newspapers as the cross-media source, examine how major newspapers categorize their news articles, and train our classifier on the features of each topic category. In addition, to graft news topics onto South Korean Tweets, we propose a word clustering and filtering approach that excludes words that provide no semantic content for the topic categories. Using the proposed procedures, we analyze South Korean Tweets to determine the primary topic categories that Twitter users focus on. We observe distinctive behaviors of South Korean Twitter users across parameters such as date, time slot, and day of the week. Because our research includes a macroscopic analysis of Twitter data using a cross-media strategy, it can provide useful resources for the analysis of other social media as well.
