Abstract

Social networks are real-time platforms built on conversations and interactions among users. This phenomenon of the new information era produces a huge amount of data in different forms and modalities, such as text, images, videos, and voice. Data with these characteristics are known as big data with 5-V properties and are in some cases referred to as social big data. To extract useful information from such data, many researchers have addressed different aspects of it for different modalities. For text, NLP researchers have conducted many studies to extract valuable information such as topics. Many works on different social media platforms, such as Twitter, have approached the problem of finding important topics from different angles and used the results to propose solutions for diverse use cases. The importance of Twitter in this scope lies in its content and the behavior of its users. For example, it is known as a first-hand news reporting medium, serving as a news reporting and informing platform even for political influencers and for reports of catastrophic events. In this review article, we cover more than 50 research articles on topic detection from Twitter. We also address deep learning-based methods.

Highlights

  • Topic detection and tracking (TDT) is a set of techniques and methods for detecting news- or document-related topics that best fit their relevant intellectual material, and for tracking these events or detected topics through dedicated media

  • Topic detection and tracking has been widely applied to documents, offline corpora, and newswire, including a pilot study that ran from 1996 to 1997 and was sponsored by the Defense Advanced Research Projects Agency (DARPA) [3]

  • In the case of Twitter, data exchange metrics estimate that 7,454 tweets are sent per second, which is about 644,025,600 tweets per day [4]. For 2013, Twitter officials reported this metric to be more than 500,000,000 per day [5]. The importance of this large amount of data, which covers a wide variety of topics that users tend to talk about, came to light when researchers revealed that users are most likely to talk about real-world events on social media networks
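The per-day figure in the last highlight follows directly from the per-second rate. As a quick check, a minimal Python sketch (the function name `tweets_per_day` is our own illustration, not something taken from the article) converts the quoted rate of 7,454 tweets per second into a daily count:

```python
# Seconds in one day: 24 hours * 60 minutes * 60 seconds.
SECONDS_PER_DAY = 24 * 60 * 60


def tweets_per_day(tweets_per_second: int) -> int:
    """Scale a constant per-second tweet rate to a full day."""
    return tweets_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    # 7,454 tweets per second is the rate quoted from [4].
    print(tweets_per_day(7454))  # 644025600
```

The calculation assumes the rate is constant over the whole day, so the result is an estimate of the same kind as the quoted figure, not a measured daily total.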


Summary

Introduction

Topic detection and tracking (TDT) is a set of techniques and methods for detecting news- or document-related topics that best fit their relevant intellectual material, and for tracking these events or detected topics through dedicated media. For 2013, Twitter officials reported more than 500,000,000 tweets per day [5]. The importance of this large amount of data, which covers a wide variety of topics that users tend to talk about, came to light when researchers revealed that users are most likely to talk about real-world events on social media networks. Detecting a real-world event in data with large volume and velocity requires more research than finding an event in selected and filtered datasets [6]. Another problem with this medium is the noisiness of posted tweets. These tweets, unlike news articles and intellectual documents, are not well written and contain misspellings, grammatical errors, and even words or expressions like “yaaaaaay” that are not literary. These problems make the TDT task much harder.
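To make the noise problem concrete, the following is a minimal Python sketch of one common kind of normalization, collapsing elongated character runs such as the one in “yaaaaaay”. It is an illustration under our own assumptions, not a method taken from the reviewed articles; the function name `squeeze_elongation` and the choice to keep two repeated characters are hypothetical.

```python
import re

# Matches any character followed by two or more repeats of itself,
# i.e. a run of three or more identical characters.
_REPEAT = re.compile(r"(.)\1{2,}")


def squeeze_elongation(text: str, keep: int = 2) -> str:
    """Shorten every run of 3+ identical characters to `keep` characters."""
    return _REPEAT.sub(lambda m: m.group(1) * keep, text)


if __name__ == "__main__":
    print(squeeze_elongation("yaaaaaay"))  # yaay
```

Keeping two characters instead of one preserves legitimate double letters (for example "cool"), at the cost of leaving forms such as "yaay" that a later spelling-correction step would still need to resolve.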

Twitter
Categorization of Methods
Preprocessing
Event Detection in Twitter
Data and Evaluation Issues
Findings
Conclusion