In the present work, an interpretable machine learning model is developed, for the first time, to estimate concentrations of particulate matter with a diameter of 10 µm or less (PM10) over India using Aerosol Optical Depth (AOD) from two satellite sources, INSAT-3D and the Moderate Resolution Imaging Spectroradiometer (MODIS), for a period of 7 years (2014 to 2020). Ground-based AOD data are taken from the Aerosol Robotic Network (AERONET) to validate the satellite-retrieved AOD. Particulate matter (PM) observations are acquired from Central Pollution Control Board (CPCB) stations across India. The analysis is performed on a monthly basis over the study period. The results show that MODIS AOD products correlate well with AERONET AOD, whereas INSAT-3D AOD does not. However, after applying an error envelope and a threshold-based filtering technique, INSAT-3D AOD shows a significant correlation with ground-level AOD, with correlation coefficients of approximately 0.66 for Jaipur and 0.57 for Kanpur, a performance comparable to that of MODIS-derived AOD. Satellite AOD data together with ground PM concentration data are used to train a machine learning model (random forest) to estimate the PM distribution across India for the year 2020. A coefficient of determination (R2) of 0.78 is observed between the estimated and observed PM10 concentrations. The trained model avoids large overestimation and underestimation. However, although the estimated PM10 closely tracks the trend of the observed PM10, a few instances of overestimation persist. This suggests that an expanded training dataset is needed to further improve the model's accuracy. Finally, the machine learning model used for PM10 estimation is found to work best with a calibrated satellite AOD product.
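The error-envelope and threshold filtering step mentioned above can be illustrated with a minimal sketch. The abstract does not state the envelope or the thresholds, so everything here is an assumption: the envelope form ±(0.05 + 0.15 × AOD) is the expected-error band commonly quoted for MODIS retrievals over land, the valid range of 0 to 3 is a placeholder, the function names are hypothetical, and the data pairs are synthetic values made up for illustration.

```python
from math import sqrt


def within_envelope(sat_aod, ground_aod, abs_err=0.05, rel_err=0.15):
    """True if satellite AOD lies inside ground AOD +/- (abs_err + rel_err * ground AOD).

    The default coefficients are an assumption, not values from the paper.
    """
    return abs(sat_aod - ground_aod) <= abs_err + rel_err * ground_aod


def filter_pairs(pairs, max_aod=3.0):
    """Keep collocated (satellite, ground) pairs that pass both checks:
    a physical-range threshold on the satellite value and the error envelope."""
    return [
        (s, g)
        for s, g in pairs
        if 0.0 <= s <= max_aod and within_envelope(s, g)
    ]


def pearson_r(pairs):
    """Pearson correlation coefficient between satellite and ground AOD."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_s = sum(s for s, _ in pairs) / n
    mean_g = sum(g for _, g in pairs) / n
    cov = sum((s - mean_s) * (g - mean_g) for s, g in pairs)
    var_s = sum((s - mean_s) ** 2 for s, _ in pairs)
    var_g = sum((g - mean_g) ** 2 for _, g in pairs)
    return cov / sqrt(var_s * var_g)


# Synthetic (satellite AOD, ground AOD) pairs, invented for this example only.
pairs = [
    (0.30, 0.28),
    (0.55, 0.50),
    (1.40, 0.45),   # far outside the envelope
    (0.80, 0.76),
    (-0.10, 0.20),  # negative retrieval, fails the range threshold
    (0.22, 0.60),   # far outside the envelope
    (1.05, 0.98),
]

kept = filter_pairs(pairs)
r_before = pearson_r(pairs)
r_after = pearson_r(kept)
print(f"kept {len(kept)} of {len(pairs)} pairs")
print(f"r before filtering: {r_before:.2f}, after: {r_after:.2f}")
```

Because the envelope is defined relative to the ground measurement, the correlation of the retained pairs rises by construction; the sketch shows the mechanics of the filter, not an independent validation.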