Particulate matter (PM), the respirable solid particles and liquid droplets suspended in the air, may have a significant impact on human health, urban infrastructure, and natural and agricultural systems. The adverse effects of PM have raised public concern, especially in heavily polluted areas of the world, making it imperative to develop strategies that keep the concentration of these pollutants below harmful thresholds. Traditional machine learning approaches have been used to forecast PM concentrations. However, the composition of PM in the atmosphere may involve complex chemical processes and be influenced by many meteorological parameters. Thus, the underlying distribution of continuously collected PM data may evolve over time. This phenomenon, known as concept drift, poses an important challenge for traditional machine learning techniques: they have no mechanisms to handle changes in the data distribution at run time, which limits their forecasting capabilities. The overall goal of this work is to evaluate whether incorporating mechanisms to deal with concept drift, together with online sequential learning approaches, can improve the accuracy of PM forecasting. To this end, new mechanisms that enable online dynamic ensembles to handle and retain knowledge from different concepts for longer were proposed and adapted to the EOS and DOER algorithms, resulting in three approaches: EOS-rank, EOS-D and DOER-rank. These ensemble strategies, which are based on Online Sequential Extreme Learning Machines (OS-ELM), were compared with five algorithms from the literature. To evaluate their performance, the experiments used real-world and artificial datasets with known dynamic behaviors, as well as PM concentration datasets from different cities in the State of São Paulo, Brazil.
The results showed that the proposed approaches can handle dynamic environments with different rates of drift and that EOS-rank was able to outperform most approaches from the literature in scenarios with higher rates of drift. The results also indicate that PM data distributions evolve slowly over time; consequently, the proposed mechanisms, which retain information about past concepts and adapt the ensemble slowly, tend to perform better when forecasting PM concentrations.
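
For readers unfamiliar with OS-ELM, the base learner of the ensembles discussed above, the following is a minimal sketch of its commonly described training scheme: a fixed random hidden layer whose output weights are updated chunk by chunk with recursive least squares. It is not the implementation used in this work and includes none of the drift-handling or ensemble mechanisms (EOS-rank, EOS-D, DOER-rank). The class name `OSELM`, the sigmoid activation, the small ridge term, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


class OSELM:
    """Illustrative OS-ELM regressor (hypothetical class, not the authors' code)."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden-layer weights and biases are drawn once and never trained.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        self.ridge = ridge  # assumed regularizer, keeps the initial inverse well defined
        self.P = None       # running inverse of (H^T H + ridge * I)
        self.beta = None    # output weights

    def _hidden(self, X):
        # Sigmoid activation of the random projection.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit_initial(self, X0, y0):
        # Initialization phase: batch regularized least squares on the first chunk.
        H = self._hidden(X0)
        self.P = np.linalg.inv(H.T @ H + self.ridge * np.eye(H.shape[1]))
        self.beta = self.P @ H.T @ y0

    def partial_fit(self, X, y):
        # Sequential phase: recursive least-squares update for one new chunk.
        H = self._hidden(X)
        K = np.eye(H.shape[0]) + H @ self.P @ H.T
        self.P = self.P - self.P @ H.T @ np.linalg.solve(K, H @ self.P)
        self.beta = self.beta + self.P @ H.T @ (y - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Synthetic stream (assumed data, unrelated to the PM datasets of this work).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = np.sin(X.sum(axis=1))

model = OSELM(n_inputs=3, n_hidden=20)
model.fit_initial(X[:40], y[:40])
for start in range(40, 200, 10):
    model.partial_fit(X[start:start + 10], y[start:start + 10])
```

In this sketch every past sample keeps equal weight, so the model has no way to forget outdated concepts. That limitation is the kind of gap that drift-aware ensembles of such learners are meant to address.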