Abstract
This thesis concerns the use of big data in Official Statistics and aims to contribute new experimental studies to the big data literature. In particular, it evaluates whether and how big data could be used in Official Statistics. Only a few experiments exist on the use of big data for statistical purposes; it is a challenging task, and much experimentation is needed to gather evidence and find solutions. The analysis follows two directions: (1) combining a traditional data source with a big data source to verify the potential of the latter to replicate official results; (2) analyzing a big data source per se and then combining it with an Official Statistics source to identify common patterns. The thesis first reviews the literature on definitions of big data and on experiments, in particular those that combine the new sources with traditional data sources. Three original studies are then presented. The first two concern mobility in the Lombardy region using mobile phone data. Both address the same issue (mobility patterns) but differ in the traditional data source used: the Origin/Destination (O/D) matrix in the first case, an integrated version of the O/D matrix in the second. The objective of these two studies is to place one traditional statistical source and one typical kind of big data within a single interpretative framework, in order to evaluate the informative potential of this approach. In particular, we wanted to check whether the two sources show common patterns, in order to evaluate future uses of the big data source in Official Statistics. The third study presents the pilot carried out during the traineeship I attended at Eurostat, in collaboration with the Task Force Big Data. It concerns the use of Wikipedia, the free online encyclopedia, for Tourism Statistics.
The aim is to evaluate whether Wikipedia page views can serve as a source of information for identifying the factors that drive tourism to an area, and whether tourism flows can be predicted from these data. A final chapter offers conclusions and remarks on the future use of big data in Official Statistics. Two of the studies (the first on mobility patterns and the one on Wikipedia) have been or are being published in shorter, revised versions. The three experiments show some potential for the use of big data in Official Statistics. However, more in-depth analysis is needed: many more experiments and considerations will be necessary before definitive and convincing approaches can be reached.