Using huge amounts of road sensor data for official statistics

Marco J H Puts, 1 Center for Big Data Statistics, Statistics Netherlands, Heerlen, The Netherlands, Piet J H Daas, Martijn Tennekes, 3 Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands, Chris De Blois, 2 Department of Traffic and Transport Statistics, Statistics Netherlands, The Netherlands

doi:10.3934/math.2019.1.12

Open Access

Abstract

On the Dutch road network, about 60,000 road sensors are installed, of which 20,000 are on the Dutch highways. Both vehicle counts and average speed are collected each minute and stored in the National Traffic Da[…]. [… tr]affic statistics, several methodological challenges needed to be solved. The first was developing a method to check and improve the data quality, as a considerable number of sensors lacked data for many minutes during the day. A cleaning and estimation step was implemented that enabled a precise and accurate estimate of the number of vehicles actually passing the sensors for each minute. The second challenge was monitoring the stream of incoming and outgoing data and controlling this fully automatic statistical process. This required defining quality indicators on the raw and processed sensor data. The third challenge was determining calibration weights based on the geographic locations of the road sensors on the roads. This was needed because road sensors are not uniformly distributed over the road network. As the number of active sensors fluctuates over time, the weights need to be determined periodically. As a result of these steps, accurate numbers could be produced on the traffic intensity during various periods in regions of the Netherlands.
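The idea behind location-based calibration weights can be illustrated with a small Python sketch. It is not the method used in the paper: it assumes a single straight road, lets each sensor represent the stretch of road that is closer to it than to any other sensor, and uses that stretch's share of the road length as the weight. The function names and the example positions are hypothetical.

```python
def segment_weights(positions_km, road_start_km, road_end_km):
    """Weight each sensor by the share of road length nearest to it.

    Assumption for illustration only: one straight road, with sensors
    given as kilometre positions between road_start_km and road_end_km.
    """
    if not positions_km:
        raise ValueError("at least one sensor position is required")
    # Process sensors in order along the road, remembering input order.
    order = sorted(range(len(positions_km)), key=lambda i: positions_km[i])
    pos = [positions_km[i] for i in order]
    # Boundaries between neighbouring sensors lie at the midpoints.
    midpoints = [(a + b) / 2 for a, b in zip(pos, pos[1:])]
    bounds = [road_start_km] + midpoints + [road_end_km]
    total = road_end_km - road_start_km
    weights = [0.0] * len(pos)
    for rank, i in enumerate(order):
        weights[i] = (bounds[rank + 1] - bounds[rank]) / total
    return weights


def weighted_intensity(counts, weights):
    """Weighted mean of per-sensor vehicle counts."""
    return sum(c * w for c, w in zip(counts, weights))


# Two sensors close together and one far away on a 10 km road.
weights = segment_weights([1.0, 2.0, 9.0], 0.0, 10.0)
estimate = weighted_intensity([10, 10, 20], weights)
```

Because the two clustered sensors share a short stretch of road, they receive smaller weights than the isolated sensor, so the estimate is not dominated by the densely instrumented part of the road. When sensors drop out, the weights would have to be recomputed from the positions of the sensors that remain active.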

Highlights

Big data is a very interesting data source for official statistics
Its use brings a lot of challenges on how to create statistics based on such data sources [1]
The core statistical process that will be considered in this paper is the cleaning process of road sensor data

Summary

Introduction

Big data is a very interesting data source for official statistics. Its use brings a lot of challenges on how to create statistics based on such data sources [1]. In some cases the amount is so large that even checking a small fraction of the data is a huge task. In such cases, we can only check the quality and clean big data using a fully automated process. In the Netherlands, minute-based vehicle counts are gathered at 24,000 sites by approximately 60,000 road sensors. Since vehicles pass sensors at different speeds and the sampling frequency is limited to 'only' one sample per minute, one does not find a large correlation between the data of two sensors, even if they are 250 meters apart. This makes it hard to clean the data purely based on comparing the findings of close-by sensors.
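As an illustration of what one fully automated check on minute-based counts could look like, the Python sketch below computes, per sensor, the fraction of minutes in a day for which a valid count was received, and flags sensors that fall below a threshold. This is an assumed, simplified indicator, not the one defined in the paper; the record layout, the function names and the 0.9 threshold are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def completeness_by_sensor(records):
    """Fraction of minutes with a valid count, per sensor, for one day.

    records: iterable of (sensor_id, minute_of_day, vehicle_count),
    where vehicle_count is None when the sensor delivered no data.
    """
    seen = {}
    for sensor_id, minute, count in records:
        # Register the sensor even if this record carries no valid count.
        minutes = seen.setdefault(sensor_id, set())
        if count is not None and count >= 0 and 0 <= minute < MINUTES_PER_DAY:
            minutes.add(minute)
    return {s: len(m) / MINUTES_PER_DAY for s, m in seen.items()}


def flag_incomplete(completeness, threshold=0.9):
    """Return the sensors whose completeness is below the threshold."""
    return sorted(s for s, frac in completeness.items() if frac < threshold)


# Sensor "A" reports every minute; sensor "B" is silent after noon.
records = [("A", m, 5) for m in range(MINUTES_PER_DAY)]
records += [("B", m, 3 if m < 720 else None) for m in range(MINUTES_PER_DAY)]
completeness = completeness_by_sensor(records)
flagged = flag_incomplete(completeness)
```

A check of this kind looks at each sensor in isolation, which matters here because the counts of nearby sensors are only weakly correlated at one-minute resolution.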

Cleaning the road sensor data

Discrepancy between data and signal

Transforming the data into a signal

Monitoring quality

Calibrating the data

Results

Conclusion

Journal: AIMS Mathematics	Publication Date: Dec 19, 2018
Citations: 11	License type: cc-by
