Abstract

Online social networks such as Twitter have emerged as an important mechanism for individuals to share information and post user-generated content. However, filtering interesting content from the large volume of messages received through Twitter places a significant cognitive burden on users. Motivated by this problem, we develop a new automated mechanism to detect personalised interestingness, and investigate this for Twitter. Instead of undertaking semantic content analysis and matching of tweets, our approach considers the human response to content, in terms of whether the content is sufficiently stimulating to be repeatedly chosen by users for forwarding (retweeting). This approach involves machine learning against features that are relevant to a particular user and their network, to obtain an expected level of retweeting for a user and a tweet. Tweets observed to be above this expected level are classified as interesting. We implement the approach in Twitter and evaluate it using comparative human tweet assessment in two forms: through aggregated assessment using Mechanical Turk, and through a web-based experiment for Twitter users. The results provide confidence that the approach is effective in identifying the more interesting tweets from a user’s timeline. This has important implications for the reduction of cognitive burden: the results show that timelines can be considerably shortened while maintaining a high degree of confidence that the more interesting tweets will be retained. In conclusion, we discuss how the technique could be applied to mitigate possible filter bubble effects.
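The classification rule described in the abstract (a tweet is interesting when its observed retweet count exceeds the level expected for that user and tweet) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Tweet` record, the `expected_retweets` field (standing in for the output of the learned per-user model, which is not reproduced), the ratio-based `interestingness` score, and the sample values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tweet:
    text: str
    observed_retweets: int
    # Hypothetical: stands in for the expected retweet level that a
    # learned model would produce for this user and tweet.
    expected_retweets: float


def interestingness(tweet: Tweet) -> float:
    """Assumed score: observed retweets relative to the expected level."""
    # Guard against division by zero when nothing was expected.
    return tweet.observed_retweets / max(tweet.expected_retweets, 1e-9)


def filter_timeline(timeline: List[Tweet], threshold: float = 1.0) -> List[Tweet]:
    """Keep only tweets retweeted more than expected (score above threshold)."""
    return [t for t in timeline if interestingness(t) > threshold]


# Illustrative values only, not data from the paper.
timeline = [
    Tweet("a", observed_retweets=10, expected_retweets=4.0),
    Tweet("b", observed_retweets=2, expected_retweets=5.0),
    Tweet("c", observed_retweets=3, expected_retweets=3.0),
]
shortened = filter_timeline(timeline)
```

In this sketch only tweet "a" survives the filter, since "c" merely meets its expected level rather than exceeding it. Raising `threshold` would shorten the timeline further, at the risk of discarding tweets a reader would still find interesting.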

Highlights

  • Microblogging services, with Twitter as a prime example, have facilitated a massive interconnection of the world over the past few years [1]

  • The remainder of the paper is structured as follows: in Section 2 we identify the key related work; Section 3 introduces the measure of interestingness, as a general metric to capture the notion of interest, beyond expectation, to a significant sub-group of Twitter users; Section 3.2 describes the application of machine learning techniques to characterise retweet behaviour based on selected features; Section 4 involves validating the interestingness metric as a technique and benchmarking it against human selection of interesting content

  • In this paper we have introduced a method for scoring tweet interestingness using non-semantic methods, and we have demonstrated its ability to infer interesting tweets from the volume and noise within a user’s timeline


Summary

Introduction

Microblogging services, with Twitter as a prime example, have facilitated a massive interconnection of the world over the past few years [1]. The networked nature of Twitter means that message forwarding (or retweeting), when considered in the context of the agents and the network structure, holds potentially valuable accumulated perceptions about the quality, relevance and interest of the content. This represents an implicit form of crowdsourcing [11]. The remainder of the paper is structured as follows: in Section 2 we identify the key related work; Section 3 introduces the measure of interestingness, as a general metric to capture the notion of interest, beyond expectation, to a significant sub-group of Twitter users; Section 3.2 describes the application of machine learning techniques to characterise retweet behaviour based on selected features; Section 4 validates the interestingness metric as a technique and benchmarks it against human selection of interesting content.

Related work
Inferring interestingness
Predicting the expected retweet count
Categorisation of retweet counts for machine learning
Experimentation and validation
Collective assessment of interestingness
Individual assessment of interestingness
Findings
Conclusions