Abstract

Offering a fine-grained and accurate PM2.5 monitoring service in urban areas is a great challenge because the required facilities are expensive and bulky. Since PM2.5 has a significant scattering effect on visible light, large-scale user-contributed image data collected through mobile crowdsensing bring a new opportunity for understanding urban PM2.5. In this article, we propose a fine-grained PM2.5 estimation method based on random forest that uses data announced by meteorological departments and data collected from smartphone users, without any PM2.5 measurement devices. We design and implement a platform to collect real-world data, including images provided by users. By combining online learning and offline learning, the random-forest-based method performs well in terms of both time complexity and accuracy. We compare our method with two kinds of baselines: subsets of the whole data set and six classical models (such as logistic regression and naive Bayes). Six evaluation indexes (precision, recall, true-positive rate, false-positive rate, F-measure, and receiver operating characteristic curve area) are used in the evaluation. The experimental results show that our method achieves high accuracy (precision: 0.875, recall: 0.872) on PM2.5 estimation and outperforms the other methods.
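As a rough illustration of the evaluation indexes named above, the following sketch computes precision, recall (which is the true-positive rate), false-positive rate, and F-measure for one class from predicted and true labels. The function name `binary_metrics`, the class labels `"high"` and `"low"`, and the toy label lists are assumptions made for this example, not data or code from the article; the ROC curve area is omitted because it requires classifier scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive):
    """Per-class evaluation indexes, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0      # same as true-positive rate
    fpr = fp / (fp + tn) if fp + tn else 0.0         # false-positive rate
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "fpr": fpr, "f_measure": f_measure}


# Hypothetical PM2.5 level labels, for illustration only.
y_true = ["high", "high", "high", "low", "low", "low", "low", "high"]
y_pred = ["high", "high", "low", "low", "low", "high", "low", "high"]
metrics = binary_metrics(y_true, y_pred, positive="high")
print(metrics)
```

For a multi-class PM2.5 level problem, one common choice is to compute these values per class and average them, weighted by class frequency; whether the article averages this way is not stated here.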

Highlights

  • Since atmospheric pollutants may cause respiratory diseases such as lung cancer and severe environmental problems (NASA climate research, http://climate.nasa.gov/causes/), urban air pollution is a critical issue in both developed and developing countries

  • As most of the data are unlabeled and the image data have irregular real-time characteristics, an algorithm based on semi-supervised random forest (SRF) and online random forest (ORF) is proposed to estimate the urban area air quality (PM2.5)

  • The overall results, discussed first, show that MCS-RF performs best on the overall average index


Summary

Introduction

Since atmospheric pollutants may cause respiratory diseases such as lung cancer as well as severe environmental problems (NASA climate research, http://climate.nasa.gov/causes/), urban air pollution is a critical issue in both developed and developing countries. Other studies have tried to provide fine-grained air quality estimation based on a variety of data sources. To fuse irregular real-time data (the photos collected by spontaneous volunteers and motivated participants) with fixed data (the official data published by relevant agencies), two main problems need to be solved: (1) correlation analysis of the data and (2) modeling and inference for heterogeneous data. We apply the MCS-RF model to infer real-time, fine-grained PM2.5 throughout Beijing based on heterogeneous data sets. As most of the data are unlabeled and the image data have irregular real-time characteristics, an algorithm based on semi-supervised random forest (SRF) and online random forest (ORF) is proposed to estimate urban air quality (PM2.5). Over all data sets, the precision reaches 87.5% and the recall 87.2%.
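To make the idea of learning from mostly unlabeled data more concrete, here is a minimal sketch of generic self-training with a bagged ensemble: unlabeled samples that the ensemble labels with high vote agreement are added to the training set, and the ensemble is refit. This is not the article's SRF or ORF algorithm. The toy "forest" of one-split decision stumps, the function names, the 0.9 agreement threshold, and the one-dimensional feature values are all assumptions chosen to keep the example short and dependency-free.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Pick the single feature/threshold split with the best training accuracy."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            correct = (sum(lab == l_lab for lab in left)
                       + sum(lab == r_lab for lab in right))
            if best is None or correct > best[0]:
                best = (correct, f, t, l_lab, r_lab)
    return best[1:]  # (feature index, threshold, left label, right label)


def fit_forest(X, y, rng, n_trees=15):
    """Bagging: each stump is trained on a bootstrap resample of (X, y)."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest


def predict_with_confidence(forest, row):
    """Majority vote and the fraction of stumps that agree with it."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)


def self_train(X_lab, y_lab, X_unlab, threshold=0.9, max_rounds=5, seed=0):
    """Grow the labeled set with confidently pseudo-labeled samples, then refit."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    rng = random.Random(seed)
    forest = fit_forest(X_lab, y_lab, rng)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for row in pool:
            label, conf = predict_with_confidence(forest, row)
            if conf >= threshold:
                X_lab.append(row)
                y_lab.append(label)
                added += 1
            else:
                remaining.append(row)
        pool = remaining
        if added == 0:
            break  # nothing confident enough; stop early
        forest = fit_forest(X_lab, y_lab, rng)
    return forest, pool


# Hypothetical one-feature samples (e.g. an image haze score), for illustration.
X_lab = [[0.1], [0.2], [0.9], [1.0]]
y_lab = ["low", "low", "high", "high"]
X_unlab = [[0.05], [0.15], [0.95], [1.1]]
forest, leftover = self_train(X_lab, y_lab, X_unlab)
print(predict_with_confidence(forest, [0.5]), len(leftover))
```

A real system would replace the stumps with full decision trees and would need an incremental update step to handle the irregular, real-time arrival of photos; neither is attempted in this sketch.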

Related work
Evaluation
Results
Conclusion and future work