Abstract

Positive matrix factorization (PMF) techniques have been applied in many environmental studies. The commercial version of the PMF method has a relatively moderate practical limit on the size of the input data matrix, because the computer memory and time it needs increase quadratically with the number of elements in the solution matrices. To extend PMF techniques to large data sets, we explored alternative methods that demand less computer memory and time. One such method, called non-negative matrix factorization (NMF) here, is extremely memory efficient compared with the commercial PMF method. Both the NMF and PMF methods are sensitive to the initialization of the solution matrices. Random initialization usually starts from a large prediction error and requires a number of model runs with different random seeds. A novel chemical mass balance method (ROC) is introduced here to provide a reasonable initialization for the NMF method on large data sets. Both the NMF and ROC methods were validated with an ideal Cross example and with the benchmark example of the commercial PMF method. The NMF-ROC method was further evaluated, in terms of computer time and prediction error, in a preliminary application to a data set of particle-phase polar organic compounds analyzed in samples collected in Central California during the California Regional PM10/PM2.5 Air Quality Study (CRPAQS, 1999–2001). The NMF-ROC method performed better than the NMF, PMF and PMF-ROC methods on the CRPAQS data set. This performance enhancement is expected to be magnified for larger data sets.
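
To illustrate the kind of factorization discussed above, the following is a minimal sketch of NMF with a random initialization. It is not the authors' implementation: the abstract does not state which NMF algorithm was used, so the standard multiplicative update rules for the squared-error objective are assumed here, and the ROC initialization is not reproduced. The names `nmf`, `matmul`, `transpose` and `frobenius_error`, and the small matrix `x`, are hypothetical and chosen only for this example.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols_b] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frobenius_error(x, g, f):
    """Sum of squared residuals between X and the product G F."""
    gf = matmul(g, f)
    return sum((xv - pv) ** 2 for xr, pr in zip(x, gf) for xv, pv in zip(xr, pr))

def nmf(x, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative X (n x m) into G (n x rank) and F (rank x m).

    Assumed algorithm: multiplicative updates for the squared-error objective.
    The starting point is random, so the result depends on `seed`.
    """
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive random start; this is the seed-sensitive step.
    g = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    f = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(x, g, f)]
    for _ in range(n_iter):
        # F <- F * (G^T X) / (G^T G F), element-wise
        gt = transpose(g)
        num = matmul(gt, x)
        den = matmul(matmul(gt, g), f)
        f = [[f[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # G <- G * (X F^T) / (G F F^T), element-wise
        ft = transpose(f)
        num = matmul(x, ft)
        den = matmul(g, matmul(f, ft))
        g = [[g[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
        errors.append(frobenius_error(x, g, f))
    return g, f, errors

# Hypothetical 4 x 3 data matrix built from two non-negative factors.
x = matmul([[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [1.0, 1.0]],
           [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
g, f, errors = nmf(x, rank=2)
print(f"initial error {errors[0]:.4f}, final error {errors[-1]:.6f}")
```

Running `nmf` with several values of `seed` and comparing `errors[0]` shows how much the starting prediction error can vary under random initialization, which is the sensitivity that an informed initialization is meant to reduce.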
