Advances in technology have enabled researchers to gather large-scale mobility information cost-effectively. With millions of active users, location-based social media (LBSM) platforms such as Facebook, Twitter, Instagram and Flickr have become potential big data sources for measuring individual behaviour. Although such passive data collection techniques generally do not provide individual-level information, the sheer volume of the data facilitates a better understanding of aggregate patterns. Moreover, conducting conventional transportation surveys during the initial waves of the COVID-19 pandemic was nearly impossible due to social-distancing and lockdown rules. In this context, the present research showcases a method for extracting mobility traces and identifying the travel patterns of visitors and residents in Delhi from geo-labelled posts on Twitter. First, a heuristic classification strategy was developed, based on a few spatiotemporal assumptions, to differentiate visitors from residents using user coordinates. In addition, three supervised machine learning (ML) techniques, namely support vector machine (SVM), k-nearest neighbours (kNN) and decision tree, were used to classify users based on their historical coordinates. The spatial variation of their destination preferences was then studied using K-Means, DBSCAN and Mean-Shift clustering techniques, of which K-Means performed best. Lastly, travel patterns from tweets posted before and during the COVID-19 pandemic were compared using the respective clusters. We observed that the performance of the proposed heuristic classifier is comparable with that of the supervised ML techniques used for classification. Furthermore, the results indicate that the proposed model can successfully identify the cluster coordinates of visitors' spots as well as the locations of residents.
During the pandemic, the mean distance travelled by users was significantly lower, and the number of long-distance trips also decreased. In addition, tweets during COVID-19 were posted from very few unique tourist spots, which suggests a lower tendency to travel for tourism. The proposed classification and clustering methods will be crucial for obtaining individual travel patterns from LBSM data.
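As an illustration of how a mean travel distance might be derived from geo-labelled posts, the following minimal sketch computes great-circle (haversine) distances between a user's consecutive post locations and averages them. The abstract does not state how distance was measured, so the haversine formula, the function names `haversine_km` and `mean_trip_distance_km`, and the treatment of each pair of consecutive posts as one trip are all assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_trip_distance_km(points):
    """Mean distance between consecutive posts of one user.

    `points` is a chronologically ordered list of (lat, lon) tuples.
    Treating each consecutive pair as a trip is a simplification
    assumed here, not the study's stated definition.
    """
    if len(points) < 2:
        return 0.0  # no movement can be inferred from a single post
    legs = [haversine_km(*p, *q) for p, q in zip(points, points[1:])]
    return sum(legs) / len(legs)
```

Computing this quantity separately for posts made before and during the pandemic would give two values that can be compared, in the spirit of the comparison described above.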