Abstract

Tweets exchanged over the Internet are an important source of information even if their characteristics make them difficult to analyze (e.g., a maximum of 140 characters; noisy data). In this paper, we address the problem of extracting relevant topics through tweets coming from different communities. More precisely we are interested to address the following question: which are the most relevant terms given a community. To answer this question we define and evaluate new variants of the traditional TF-IDF. Furthermore we also show that our measures are well suited to recommend a community affiliation to a new user. Experiments have been conducted on tweets collected during French Presidential and Legislative elections in 2012. The results underline the quality and the usefulness of our proposal.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call