Abstract

Social media data are increasingly perceived as alternative sources to public attitude surveys because of the volume of available data that are time-stamped and (sometimes) precisely located. Such data can be mined to provide planners, marketers and researchers with useful information about activities and opinions across time and space. However, in their raw form, textual data are still difficult to analyse coherently and Twitter streams pose particular interpretive challenges because they are restricted to just 140 characters. This paper explores the use of an unsupervised learning algorithm to classify geo-tagged Tweets from Inner London recorded during typical weekdays throughout 2013 into a small number of groups, following extensive text cleaning techniques. Our classification identifies 20 distinctive and interpretive topic groupings, which represent key types of Tweets, from describing activities or informal conversations between users, to the use of check-in applets. Our motivation is to use the classification to demonstrate how the nature of the content posted on Twitter varies according to the characteristics of places and users. Topics and attitudes expressed through Tweets are found to vary substantially across Inner London, and by time of day. Some observed variations in behaviour on Twitter can be attributed to the inferred demographic and socio-economic characteristics of users, but place and local activities can also exert a considerable influence. Overall, the classification was found to provide a valuable framework for investigating the content and coverage of Twitter usage across Inner London.

Highlights

  • With 500 million Tweets transmitted each day by over 300 million users worldwide (Twitter, 2015), Twitter could potentially provide a valuable source of social data

  • The research has demonstrated that, although unregulated and unconventional for quantitative analysis, Twitter data can be harvested into a simple classification that can be useful to planners, marketers and researchers

  • The findings revealed distinctive traits of Tweets across space and time, and between Tweeters themselves. They identified the influence of land use and activity on the content of Tweets; although unsurprising in many instances, these influences have not previously been explored on the extensive scale enabled by the methodological approaches presented


Summary

Introduction

With 500 million Tweets transmitted each day by over 300 million users worldwide (Twitter, 2015), Twitter could potentially provide a valuable source of social data. It is probable that the nature of posts on Twitter varies systematically according to location and time of day, because of the nature of popular activities and the loci of activities of individuals who have different social characteristics. Because of their ready availability, Twitter data have received much attention from the academic community. Whilst Twitter data have previously been used to identify unusual events in space and time (Chae et al., 2012), and as a tool to model sentiment across space (Quercia, Ellis, Capra, & Crowcroft, 2012), there remains a dearth of research at intra-urban scales of analysis. One notable example is Andrienko et al.'s (2013) study of Seattle, but this falls short of linking spatial variation in content to user characteristics.

