Abstract
The continuous growth in data sizes in recent years has created many data processing challenges and has compelled data users to find automatic means of searching databases for vital information. Retrieving information from ‘Big Data’, as it is often called, can be likened to finding ‘a needle in a haystack’. While big data poses several computational challenges, it also serves as a gateway to the technological preparedness that makes the world a global village. Social media sites, of which Twitter is one, are known to be big data collectors as well as open sources for information retrieval. Easy access to social media sites and advances in technology such as computers and smart devices have made it convenient for different entities to store enormous amounts of data in real time. Twitter is widely regarded as one of the most powerful and popular microblogging tools in social media. It offers its users the opportunity to post and receive information from the network instantaneously. Traditional news media follow activity on the Twitter network in order to retrieve interesting tweets that can be used to enhance their news reports and news updates. Twitter users prefix keywords in their tweets with the hashtag symbol (#) to describe the content of the tweets and to enhance their readability. This chapter uses the Apriori method for Association Rule Mining (ARM) together with a novel methodology termed Rule Type Identification-Mapping (RTI-Mapping), which is inherited from Transaction-based Rule Change Mining (TRCM) (Adedoyin-Olowe et al., 2013) and Transaction-based Rule Change Mining-Rule Type Identification (TRCM-RTI) (Gomes et al., 2013), to map Association Rules (ARs) detected in tweets’ hashtags to the evolving news reports and news updates of traditional news agents in real life.
TRCM uses ARM to analyse tweets on the same topic over consecutive periods t and t + 1, using Rule Matching (RM) to detect changes in ARs such as emerging, unexpected, new and dead rules. This is done by setting a user-defined Rule Matching Threshold (RMT) and matching rules in tweets at time t with those in tweets at t + 1 in order to ascertain which rules fall into which pattern. TRCM-RTI is a methodology built on TRCM; it identifies the rule types of evolving ARs present in tweets’ hashtags at different time periods. This chapter adopts RTI-Mapping from the methodologies in (Adedoyin-Olowe et al., 2013) and (Gomes et al., 2013) to map ARs to the evolving online news of top traditional news agents in order to detect and track news and news updates of evolving events. This is an initial experiment in mapping ARs to evolving news. The mapping is done manually at this stage, and the methodology is validated using four events and news topics as case studies. The experiments show substantial results on the selected news topics.
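The rule-matching idea described above can be illustrated with a minimal Python sketch. It is not the TRCM implementation from the cited papers. The similarity measure (Jaccard overlap of hashtag sets), the exact conditions attached to each label, the function names `similarity` and `classify_rules`, and the example hashtags are all assumptions made for illustration. A rule is represented as a pair of hashtag sets (left-hand side, right-hand side), and the RMT is a value between 0 and 1.

```python
from typing import Dict, FrozenSet, List, Tuple

# A rule is (left-hand side, right-hand side), each a set of hashtags.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def similarity(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard overlap of two hashtag sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def classify_rules(rules_t: List[Rule], rules_t1: List[Rule],
                   rmt: float) -> Dict[Rule, str]:
    """Label rules by matching those at time t against those at t + 1.

    Assumed labelling, for illustration only:
      emerging   - both sides of a t + 1 rule match some rule at t
      unexpected - only one side matches some rule at t
      new        - neither side matches any rule at t
      dead       - a rule at t that no rule at t + 1 matches on either side
    """
    labels: Dict[Rule, str] = {}
    matched_old = set()
    for new in rules_t1:
        label = "new"
        for old in rules_t:
            lhs_match = similarity(old[0], new[0]) >= rmt
            rhs_match = similarity(old[1], new[1]) >= rmt
            if lhs_match or rhs_match:
                matched_old.add(old)
            if lhs_match and rhs_match:
                label = "emerging"
            elif (lhs_match or rhs_match) and label == "new":
                label = "unexpected"
        labels[new] = label
    for old in rules_t:
        if old not in matched_old:
            labels.setdefault(old, "dead")
    return labels


# Invented example rules; the hashtags are not taken from the chapter's data.
fs = frozenset
rules_at_t = [(fs({"#flood"}), fs({"#rescue"})),
              (fs({"#strike"}), fs({"#transport"}))]
rules_at_t1 = [(fs({"#flood"}), fs({"#rescue"})),
               (fs({"#flood"}), fs({"#insurance"})),
               (fs({"#festival"}), fs({"#music"}))]

for rule, label in classify_rules(rules_at_t, rules_at_t1, rmt=0.5).items():
    print(sorted(rule[0]), "=>", sorted(rule[1]), ":", label)
```

Raising the RMT makes matching stricter, so more rules at t + 1 are labelled new and more rules at t are labelled dead. Lowering it has the opposite effect.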