Abstract

Mobile network data has been shown to provide a rich source of information in multiple statistical domains such as demography, tourism, and urban planning. However, incorporating this data source into the routine production of official statistics requires considerable effort, because a range of highly entangled issues (access, methodology, IT tools, quality, skills) must be solved beforehand. One-off studies with concrete data sets are not enough for this; a standard statistical production process must be put in place. We propose a concrete process structured into evolvable modules that detach the strongly technological layer underlying this data source from the statistical analysis producing the outputs of interest. This architecture follows the principles of the so-called ESS Reference Methodological Framework for Mobile Network Data. Each module deals with a different aspect of this data source. We apply hidden Markov models for the geolocation of mobile devices, use a Bayesian approach on this model to disambiguate devices belonging to the same individual, compute aggregate numbers of individuals detected by a telecommunication network using probability theory, and hierarchically model the integration of auxiliary information from the telco market and official data to produce final estimates of the number of individuals across different territorial regions in the target population. A first simple illustrative proposal has been applied to synthetic data, providing preliminary software tools and accuracy indicators that monitor the performance of the process. So far, this exercise has been applied to the estimation of present population and origin-destination matrices. We present an illustrative example of the execution of these production modules and compare the results with the simulated ground truth, thus assessing the performance of each production module.
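As a rough illustration of the geolocation step, the sketch below runs the standard forward (filtering) recursion of a hidden Markov model, where the hidden state is the tile a device occupies and the observations are network events. It is a minimal sketch, not the authors' implementation: the three-tile grid, the transition matrix, the emission likelihoods, and the name `forward_filter` are all assumptions made for illustration.

```python
def forward_filter(prior, transition, emission_by_time):
    """Return P(tile at time t | observations up to t) for each t.

    prior: list of initial tile probabilities.
    transition[i][j]: probability of moving from tile i to tile j.
    emission_by_time[t][j]: likelihood of the observation at time t
        given that the device is in tile j.
    """
    n = len(prior)
    belief = list(prior)
    posteriors = []
    for t, emission in enumerate(emission_by_time):
        if t > 0:
            # Prediction step: propagate the belief through the transition model.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Update step: weight by the observation likelihood and normalise.
        unnormalised = [belief[j] * emission[j] for j in range(n)]
        total = sum(unnormalised)
        if total == 0:
            raise ValueError("observation has zero likelihood in every tile")
        belief = [u / total for u in unnormalised]
        posteriors.append(belief)
    return posteriors


# Hypothetical toy setup: three tiles in a row, the device mostly stays put.
prior = [1 / 3, 1 / 3, 1 / 3]
transition = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
# Hypothetical likelihoods of two consecutive network events per tile.
emission_by_time = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
]

posteriors = forward_filter(prior, transition, emission_by_time)
most_likely_tiles = [p.index(max(p)) for p in posteriors]
```

In this toy run the filtered location moves from the first tile to the second as the second event arrives. In a real setting the emission likelihoods would have to come from a radio propagation model for each antenna, which is not sketched here.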

Highlights

  • Mobile network data, i.e. digital data generated in a mobile telecommunication network by the interaction between a mobile station and a base transceiver station [1], constitutes a rich source of information for Social Science in general, and for Official Statistics in particular

  • The need for a process-oriented production system instead of a product-oriented or even domain-oriented system is well known in Official Statistics, where important initiatives have been carried out in the last decade to avoid the so-called stove-pipe models that drive National Statistical Offices (NSOs) to production in silos; these models reduce cost-efficiency to the point of endangering the future feasibility of the production of official statistics [31]

  • We have applied this approach to our simulated data set with N = 500 individuals in the target population, Nnet = 186 individuals detected by the network, and ND = 218 mobile devices
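The counts in the last highlight can be related with simple arithmetic. The sketch below assumes, purely for illustration, that every detected individual carries at most two devices; the variable names are hypothetical and this is not the paper's estimation procedure.

```python
# Figures quoted in the highlight above.
N_TARGET = 500    # N: individuals in the target population
N_NET = 186       # Nnet: individuals detected by the network
N_DEVICES = 218   # ND: mobile devices

# Assumption (for illustration only): each detected individual carries
# one or two devices, so every device beyond the first adds exactly one.
two_device_individuals = N_DEVICES - N_NET
one_device_individuals = N_NET - two_device_individuals

# Share of the target population that the network detects at all.
detection_rate = N_NET / N_TARGET

print(two_device_individuals, one_device_individuals, detection_rate)
```

Under that assumption, 32 individuals carry two devices and 154 carry one, and the network sees about 37% of the target population. This gap is why device disambiguation and auxiliary information are both needed before population counts can be estimated.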


Summary

Introduction

Mobile network data, i.e. digital data generated in a mobile telecommunication network by the interaction between a mobile station (device) and a base transceiver station (antenna, in loose terms) [1], constitutes a rich source of information for Social Science in general, and for Official Statistics in particular. We propose an end-to-end statistical process going from the raw telco data generated in mobile telecommunication networks to the final target population count estimates.

