Abstract

Social media is an interactive personal tool through which individuals express their awareness and views. This project focuses on one such microblogging platform, Twitter. Trends can be defined as the topics mentioned most frequently throughout the stream of user activity. Mining Twitter data to identify trending topics provides an overview of the topics and issues that are currently popular within the online community. The most effective and suitable methodology should therefore be applied to identify short-term, high-intensity discussion topics. Trigrams or higher-order n-grams are used to determine the trending topic. The Twitter Streaming API is used to collect data from Twitter accounts using API keys, and the formatted tweets are stored in a NoSQL database. Subsequent steps include data cleansing followed by stemming. The processed data is then subjected to trend prediction algorithms such as DBSCAN, frequent pattern mining, trees (fuzzy, inductive, decision), and soft frequent pattern mining, and to empirical statistics such as the frequency metric, TF-IDF, normalized term frequency, and entropy, based on the key parameters, to identify the most trending event within a period of time. Thus, the trending topics can be detected with a reasonably close approximation to the expected outcome. This approach can be used to detect and predict events for early warning systems or prediction tools, and in artificially intelligent services such as web search or recognition systems.
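As a rough illustration of the trigram-based statistics mentioned in the abstract, the following minimal Python sketch cleans a handful of tweets, extracts trigrams, and computes the raw frequency, the normalized term frequency, and the Shannon entropy of the trigram distribution. It is a sketch under stated assumptions, not the project's implementation: the function names (`clean`, `trigrams`, `trigram_stats`) and the sample tweets are invented for illustration, stemming and collection through the Streaming API are omitted, and the cleansing rules are only one plausible choice.

```python
import math
import re
from collections import Counter


def clean(tweet):
    """Lowercase a tweet, strip URLs, @mentions and punctuation, return tokens.

    Hypothetical cleansing rules; the abstract does not specify them.
    """
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet)  # drop URLs and mentions
    tweet = re.sub(r"[^a-z0-9\s]", " ", tweet)        # drop punctuation, '#', etc.
    return tweet.split()


def trigrams(tokens):
    """Return every run of three consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def trigram_stats(tweets):
    """Compute raw counts, normalized frequencies, and entropy of trigrams."""
    counts = Counter()
    for tweet in tweets:
        counts.update(trigrams(clean(tweet)))
    total = sum(counts.values())
    # Normalized term frequency: each count divided by the total trigram count.
    normalized = {g: c / total for g, c in counts.items()} if total else {}
    # Shannon entropy (in bits) of the trigram distribution.
    entropy = -sum(p * math.log2(p) for p in normalized.values())
    return counts, normalized, entropy


# Invented sample data, standing in for tweets read from the database.
sample_tweets = [
    "Heavy rain floods city centre, stay safe! http://example.com/a",
    "@news heavy rain floods city roads tonight",
    "Heavy rain floods city again #weather",
    "Watching the match with friends",
]

counts, normalized, entropy = trigram_stats(sample_tweets)
top_trigram, top_count = counts.most_common(1)[0]
print(top_trigram, top_count, round(entropy, 3))
```

In this sketch, a trigram that recurs across many tweets in a short window (here, the rain-related ones) rises to the top of the frequency ranking, which is the basic signal the frequency-based metrics rely on. A fuller pipeline would presumably compare windows over time, for example with TF-IDF, to separate sudden bursts from phrases that are always common.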
