Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms

Yin Zhao, Yahya Abu

doi:10.14569/ijacsa.2013.040503

Abstract

Pollutant forecasting is an important problem in the environmental sciences, and data mining offers a way to discover knowledge from large data sets. This paper uses data mining methods to forecast the concentration level of PM2.5, an important air pollutant. Several tree-based classification algorithms are available in data mining, such as CART, C4.5, Random Forest (RF) and C5.0. RF and C5.0 are popular ensemble methods: RF builds on CART with bagging, and C5.0 builds on C4.5 with boosting. This paper builds PM2.5 concentration level predictive models based on RF and C5.0 using R packages. The data set covers the period 2000-2011 in a new town of Hong Kong. The PM2.5 concentration is divided into two levels, with the cut point at 25 µg/m^3 (24-hour mean). Over 100 repetitions of 10-fold cross-validation, the best testing accuracy comes from the RF model, at around 0.845 to 0.854.
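The two-level labeling and the repeated cross-validation scheme described in the abstract can be sketched as follows. This is a minimal illustration in Python rather than the R packages the paper uses. The function names `pm25_level` and `repeated_kfold` are hypothetical, the toy readings are made up, and the treatment of a value exactly equal to 25 µg/m^3 is an assumption, since the abstract does not say which level the boundary value belongs to.

```python
import random

THRESHOLD_UG_M3 = 25.0  # 24-hour mean cut point stated in the abstract


def pm25_level(daily_mean):
    """Map a 24-hour mean PM2.5 value to one of two levels.

    Assumption: values strictly above the threshold count as "high".
    The abstract does not say which side the boundary value falls on.
    """
    return "high" if daily_mean > THRESHOLD_UG_M3 else "low"


def repeated_kfold(n_samples, k=10, repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        # Reshuffle the sample indices at the start of every repetition.
        order = list(range(n_samples))
        rng.shuffle(order)
        for f in range(k):
            test_idx = order[f::k]  # every k-th shuffled index forms one fold
            test_set = set(test_idx)
            train_idx = [i for i in order if i not in test_set]
            yield r, f, train_idx, test_idx


# Toy readings (made-up values, not the Hong Kong data).
readings = [12.0, 25.0, 31.5, 48.2]
print([pm25_level(x) for x in readings])

# 100 repetitions of 10-fold CV produce 1000 train/test splits.
print(sum(1 for _ in repeated_kfold(n_samples=200)))
```

Reshuffling before each repetition gives a different partition every time, so a range of accuracies such as the one reported reflects variation across partitions rather than a single split.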

Highlights

Air pollution has been a major problem for some time
According to 100 repetitions of 10-fold cross-validation, the best testing accuracy comes from the Random Forest (RF) model, at around 0.845 to 0.854
Because the target data come from a new town in Hong Kong where many people live, a stricter air pollution standard is needed in such an area

Summary

INTRODUCTION

Air pollution has been a major problem for some time. Various organic and inorganic pollutants from all aspects of human activity are added to the air daily. One of the most important pollutants is particulate matter. Particulate matter (PM) can be defined as a mixture of fine particles and droplets in the air, characterized by particle size. Because the target data come from a new town in Hong Kong where many people live, a stricter air pollution standard is needed in such an area. We build models for predicting a day's concentration level using two popular tree-based classification algorithms, Random Forest (RF) [4-. RF and C5.0 are ensemble methods based on CART and C4.5 respectively, and each model consists of many basic decision trees.
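The idea of an ensemble made of many basic decision trees can be illustrated with a small bagging sketch. This is not the paper's implementation: it is plain Python rather than R, it uses one-split decision stumps instead of full CART trees, and it omits the random feature selection that Random Forest adds on top of bagging as well as the boosting used by C5.0. The helper names `fit_stump`, `fit_bagged`, and `predict`, and the toy data, are all hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Find the (feature, threshold) split with the fewest training errors."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({row[j] for row in rows}):
            # Try both orientations of the split.
            for above, below in (("high", "low"), ("low", "high")):
                preds = [above if row[j] > t else below for row in rows]
                errors = sum(p != y for p, y in zip(preds, labels))
                if best is None or errors < best[0]:
                    best = (errors, j, t, above, below)
    _, j, t, above, below = best
    return lambda row: above if row[j] > t else below


def fit_bagged(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([rows[i] for i in idx],
                               [labels[i] for i in idx]))
    return trees


def predict(trees, row):
    """Combine the individual trees by majority vote."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]


# Toy data with two made-up numeric features (not the Hong Kong data).
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 9.0], [4.0, 1.0],
        [10.0, 4.0], [11.0, 8.0], [12.0, 2.0], [13.0, 6.0]]
labels = ["low"] * 4 + ["high"] * 4

trees = fit_bagged(rows, labels)
print(predict(trees, [0.5, 5.0]), predict(trees, [20.0, 5.0]))
```

Because every stump sees a different bootstrap sample, individual trees can disagree. The majority vote is what makes the combined prediction more stable than any single tree.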

Methods

METHODOLOGY

DATA PREPARATION

EXPERIMENTS

RESULT

Comparison

Findings

CONCLUSION


Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2013
Citations: 8
License type: cc-by
