Abstract

CDR (Call Detail Record) data are one type of mobile phone data, collected by operators each time a user initiates or receives a phone call or sends or receives an SMS. CDR data are a rich geo-referenced source of information on user behaviour. In this work, we analyse CDR data for the city of Milan that originate from the Telecom Italia Big Data Challenge. A set of graphs is generated from the aggregated CDR data, where each node represents the centroid of an RBS (Radio Base Station) polygon, and each edge represents the aggregated telecom traffic between two RBSs. To explore the community structure, we apply a modularity-based algorithm. The community structure is highly dynamic between days, varying in the number, size and spatial distribution of communities. One general rule observed is that communities formed over the urban core of the city are small and prone to change in their spatial distribution, while communities formed in the suburban areas are larger and more consistent in their spatial distribution. To evaluate the dynamics of change in community structure between days, we introduce several graph-based and spatial community properties that contain a latent footprint of human dynamics. We created land use profiles for each RBS polygon based on the Copernicus Land Monitoring Service Urban Atlas data set to quantify the correlation and predictiveness of human dynamics properties based on land use. The results reveal a strong correlation between some properties and land use, which motivated us to explore this topic further. The proposed methodology has been implemented in Scala on the Apache Spark engine for the most computationally intensive tasks, and in Python, using its rich portfolio of data analytics and machine learning libraries, for the less demanding tasks.
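
To make the idea of a modularity-based method concrete, the sketch below computes the weighted modularity Q of a given partition, the quantity such algorithms try to maximise. It is an illustration only: the abstract does not name the specific algorithm used, and the toy graph, the `edges` mapping and the `community_of` labels are hypothetical stand-ins for the RBS graph.

```python
def modularity(edges, community_of):
    """Weighted modularity Q of a partition of an undirected graph.

    edges: dict mapping an undirected node pair (u, v) to its weight
           (assumption: no self-loops, each pair listed once).
    community_of: dict mapping each node to a community label.
    """
    m = sum(edges.values())  # total edge weight
    if m == 0:
        raise ValueError("graph has no edge weight")
    internal = {}  # weight of edges inside each community
    strength = {}  # summed weighted degree of each community
    for (u, v), w in edges.items():
        cu, cv = community_of[u], community_of[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    # Q = sum over communities of (internal share - expected share).
    return sum(
        internal.get(c, 0.0) / m - (s / (2 * m)) ** 2
        for c, s in strength.items()
    )

# Toy graph: two triangles joined by one bridge edge, unit weights.
toy_edges = {
    ("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 1.0,
    ("d", "e"): 1.0, ("e", "f"): 1.0, ("d", "f"): 1.0,
    ("c", "d"): 1.0,
}
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
```

Splitting the toy graph into its two triangles scores higher than lumping all nodes together, which is the behaviour a modularity-maximising algorithm exploits.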

Highlights

  • Network analysis refers to the tools applied to network-based data for the discovery of useful knowledge

  • Other properties used to evaluate human dynamics that we explored in this study are graph-based properties such as betweenness centrality, weighted degree, PageRank and core number

  • When filtering is applied with a threshold of α = 0.01, the Adjusted Rand Index (ARI) drops slightly to 0.81, which still indicates high similarity between the clusterings, while the number of detected communities increases by 20%, a significant rise compared with the increase observed when filtering with α = 0.05 is applied
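
The ARI mentioned in the last highlight compares two community assignments of the same nodes. The sketch below is a plain-Python version of the standard pair-counting formula, offered as an illustration rather than the authors' code; the label lists are hypothetical and do not reproduce the reported value of 0.81.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must cover the same items")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree on
    # Pairs placed together in both, in A only, and in B only.
    contingency = Counter(zip(labels_a, labels_b))
    together_both = sum(comb(c, 2) for c in contingency.values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (together_both - expected) / (maximum - expected)

# Hypothetical community labels for four nodes, before and after filtering.
before = [0, 0, 1, 2]
after = [0, 0, 1, 1]
ari = adjusted_rand_index(before, after)
```

A value of 1 means the two partitions are identical up to relabelling, values near 0 correspond to chance-level agreement, and negative values indicate less agreement than chance.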

Summary

Introduction

Network analysis refers to the tools applied to network-based data for the discovery of useful knowledge. The research area enjoys widespread use, mainly because numerous, diverse and significant applications require the manipulation and analysis of network-based data, such as social network analysis, searching and mining the Web, and pattern mining in bioinformatics and neuroscience. We use information collected from CDRs to generate a network representing the communication traffic between different parts of the mobile network. This network is represented by a graph G(V, E), where V is the set of nodes (vertices) and E is the set of edges (links). Mobile phone data can be of great value for urban policy making, as they contain valuable information about users’ mobility and activity. The activity detected through mobile phone data changes across day types and times of day [1].
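
As a minimal sketch of how a graph G(V, E) could be assembled from aggregated CDR records, the code below sums traffic per RBS pair and derives each node's weighted degree. The record layout `(source, target, traffic)`, the RBS identifiers and the choice to treat edges as undirected and to skip traffic within a single RBS are all assumptions made for illustration; the source does not specify them.

```python
from collections import defaultdict

# Hypothetical aggregated records: (source RBS, target RBS, traffic volume).
records = [
    ("rbs_1", "rbs_2", 12.0),
    ("rbs_2", "rbs_1", 3.0),
    ("rbs_2", "rbs_3", 5.0),
    ("rbs_1", "rbs_3", 1.5),
]

def build_graph(records):
    """Return (V, E): the node set and a dict of undirected edge weights."""
    nodes = set()
    edges = defaultdict(float)
    for src, dst, traffic in records:
        nodes.update((src, dst))
        if src == dst:
            continue  # assumption: traffic inside one RBS is not an edge
        # Assumption: direction is ignored, so both directions are summed.
        edges[tuple(sorted((src, dst)))] += traffic
    return nodes, dict(edges)

def weighted_degree(nodes, edges):
    """Sum of the weights of the edges incident to each node."""
    degree = {node: 0.0 for node in nodes}
    for (u, v), weight in edges.items():
        degree[u] += weight
        degree[v] += weight
    return degree

V, E = build_graph(records)
strength = weighted_degree(V, E)
```

Here `strength` holds the weighted degree of each RBS, one of the graph-based properties listed in the highlights.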
