Abstract

Air pollution is a major public health problem, and spatiotemporal analysis is a crucial step toward understanding its complex characteristics. Because the observations come from many sensors at high temporal resolution, this task is a big data challenge. In this study, unsupervised machine learning algorithms were applied to analyze spatiotemporal patterns of air pollution. The analysis used PM10 data collected at 1-h intervals over a period of one year from almost 100 sensors located in Krakow. K-means and SKATER clustering revealed distinct differences between the average and maximum values of pollutant concentrations. The K-means algorithm with Dynamic Time Warping (DTW) was more accurate than the SKATER algorithm in identifying yearly patterns and in clustering rapidly and spatially varying data. Moreover, clustering the data after kriging greatly facilitated the interpretation of the results. These findings highlight the potential of machine learning techniques and big data analysis for identifying hot-spots, cold-spots, and patterns of air pollution, and for informing policy decisions related to urban planning, traffic management, and public health interventions.
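
As a rough illustration of why DTW can suit sensor series whose peaks are shifted in time, the sketch below implements the textbook dynamic-programming DTW distance and compares it with a pointwise distance. It is not the study's implementation, which the abstract does not describe: the function names and the short hourly PM10 series are hypothetical, and the sketch covers only the distance measure, not the K-means clustering step.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance with absolute-difference local cost, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def pointwise_distance(a, b):
    """Sum of absolute differences at matching time steps (no warping)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical hourly PM10 readings, invented for illustration only.
sensor_a = [20, 20, 80, 20, 20, 20]  # peak at hour 2
sensor_b = [20, 20, 20, 80, 20, 20]  # same peak, one hour later
sensor_c = [20, 20, 20, 20, 20, 20]  # no peak at all

if __name__ == "__main__":
    print("pointwise a-b:", pointwise_distance(sensor_a, sensor_b))
    print("pointwise a-c:", pointwise_distance(sensor_a, sensor_c))
    print("DTW a-b:", dtw_distance(sensor_a, sensor_b))
    print("DTW a-c:", dtw_distance(sensor_a, sensor_c))
```

In this toy case the pointwise distance rates the two peaked series as further apart than the peaked and flat series, while DTW aligns the shifted peaks and treats the two peaked series as identical in shape. A clustering method built on DTW would therefore tend to group sensors by the shape of their pollution episodes rather than by exact timing.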
