Abstract

WM2-O-07 Introduction: Missing data can be a major problem in air pollution research, especially in time-series analyses where continuous data sequences may be required or statistically advantageous. Multiple imputation (MI) is now a well-established technique for analyzing missing data in social research, but environmental applications remain very limited. An alternative to MI, the optimal linear estimator, which weights observations by the probability of observing the data, is theoretically simple, involves fewer modeling assumptions, and is efficient, although it is sensitive to the choice of error and weighting models. These 2 methods are evaluated for their robustness in handling missing and uncertain urban air toxics (UAT) data in an application aimed at developing exposure measures for a longitudinal investigation of urgent care utilization for asthma among children.

Methods: UATs, including carbonyls and volatile organic compounds (VOCs), were collected daily at an air monitoring site in Dearborn, Michigan, from April 2001 to April 2002. These data included numerous replicates and interlaboratory analyses. Error models were derived from the replicate measurements and incorporated in the optimal estimators. Besides VOCs and carbonyls, additional variables incorporated into both the MI and optimal estimators included other pollutants (PM2.5, PM10) and local surface meteorologic observations. The performance of the estimators for different missingness patterns was evaluated using the index of agreement (d2), the correlation coefficient (r), the root mean square error (RMSE), and the mean absolute error (MAE).

Results: A total of 69 VOC and carbonyl compounds were measured. Because a large number of compounds consistently fell below detection limits, 20 toxic compounds were selected for this study. Several error models were developed for these compounds; eg, the median absolute relative error was 12% for VOCs and 20% for carbonyls. Preliminary results obtained for benzene with 25% randomly selected missing data indicated that both methods gave similar performance, although the optimal linear estimator was slightly outperformed by MI (d2 = 0.87 vs. 0.84, R2 = 0.62 vs. 0.53, MAE = 0.03 ppbv vs. 0.17 ppbv). A complete analysis will be presented at the conference.

Discussion: To our knowledge, this is the first study to examine interlaboratory comparisons and to use imputation techniques for air toxics. Using the error models and imputation methods, complete data are obtained from which exposure measures are derived for our epidemiologic study linking UAT exposures and asthma. The procedures also help to quantify and reconcile uncertainties in air toxics data.
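
The four performance measures named in the Methods can be computed as in the minimal sketch below. It assumes that d2 refers to Willmott's index of agreement in its squared-difference form and that r is the Pearson correlation coefficient; the abstract does not give the formulas, so both are assumptions. The function name `agreement_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def agreement_metrics(observed, predicted):
    """Compare imputed (predicted) values with held-out observed values.

    Returns MAE, RMSE, Pearson r, and Willmott's index of agreement (d2).
    Assumption: d2 = 1 - sum((P - O)^2) / sum((|P - Obar| + |O - Obar|)^2).
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sq_err = sum(e * e for e in errors)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sq_err / n)

    # Pearson correlation; undefined (NaN) if either series is constant.
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs > 0 and var_pred > 0:
        r = cov / math.sqrt(var_obs * var_pred)
    else:
        r = float("nan")

    # Willmott's index: 1 means perfect agreement, 0 means none.
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(observed, predicted))
    d2 = 1.0 - sq_err / potential if potential > 0 else float("nan")

    return {"mae": mae, "rmse": rmse, "r": r, "d2": d2}


if __name__ == "__main__":
    # Hypothetical concentrations in ppbv, for illustration only.
    held_out = [0.42, 0.55, 0.31, 0.60, 0.48]
    imputed = [0.40, 0.58, 0.35, 0.57, 0.50]
    print(agreement_metrics(held_out, imputed))
```

In a setting like the one described, `observed` would hold the values deliberately withheld (for example, the 25% randomly removed) and `predicted` the corresponding estimates from either method, so the same function can score both estimators on the same held-out set.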
