Abstract

We propose a variant of Principal Component Analysis (PCA) that is suited to real-time applications. In the real-time version of the PCA problem, we maintain a window over the most recent data and project every incoming row of data into a lower-dimensional subspace; the projected row is the output of the model. The goal is to reduce the reconstruction error between the input and the output while retaining major components pertaining to previous distributions of the data. We use the reconstruction error as the termination criterion for updating the eigenspace as new data arrive. We then propose two variants of this algorithm that are progressively more time-efficient. To verify whether the proposed model can capture the essence of the changing distribution of large datasets in real time, we implemented the algorithms and evaluated their performance on carefully designed simulations in which the distributions of the data sources change over time in a controllable manner. Furthermore, we demonstrate that the proposed algorithms can capture the changing distributions of real-life datasets by running them on datasets from a variety of real-time applications, e.g., localization, activity recognition, and customer expenditure. Results show that straightforward modifications that convert PCA to operate on a sliding window over the data do not work, because of the difficulty of determining the optimal window size. Instead, we propose algorithmic enhancements that rely on spectral analysis to improve dimensionality reduction. Results show that our methods successfully capture the changing distribution of data in a real-time scenario, thus enabling real-time PCA.
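
The windowed scheme described above can be illustrated with a minimal sketch. It is not the authors' algorithm or either of their faster variants: it assumes a fixed window size, a squared-error threshold, and a full SVD refit whenever the threshold is exceeded. The class name `WindowedPCA` and the parameters `window_size` and `error_threshold` are hypothetical names introduced only for this illustration.

```python
from collections import deque

import numpy as np


class WindowedPCA:
    """Illustrative sliding-window PCA (hypothetical, not the paper's method).

    Keeps the most recent rows in a window, projects each incoming row,
    and refits the eigenspace only when the reconstruction error of that
    row exceeds a threshold.
    """

    def __init__(self, n_components, window_size, error_threshold):
        self.k = n_components
        self.window = deque(maxlen=window_size)  # oldest rows drop out
        self.threshold = error_threshold         # assumed squared-error bound
        self.mean = None
        self.components = None                   # shape (k, d)
        self.n_refits = 0

    def _refit(self):
        X = np.asarray(self.window)
        self.mean = X.mean(axis=0)
        # Rows of vt are the principal directions of the centered window.
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.components = vt[: self.k]
        self.n_refits += 1

    def _project(self, row):
        return (row - self.mean) @ self.components.T

    def _error(self, row, z):
        recon = z @ self.components + self.mean
        return float(np.sum((row - recon) ** 2))

    def update(self, row):
        """Add one row; return its k-dimensional projection (None in warm-up)."""
        row = np.asarray(row, dtype=float)
        self.window.append(row)
        if self.components is None:
            if len(self.window) <= self.k:
                return None  # not enough rows to estimate k directions
            self._refit()
        z = self._project(row)
        if self._error(row, z) > self.threshold:
            # The current eigenspace no longer explains the data: refit.
            self._refit()
            z = self._project(row)
        return z
```

Each call to `update` returns the low-dimensional output for one incoming row. The sketch also makes the window-size difficulty concrete: a short window forgets earlier distributions quickly, while a long one reacts slowly to change, and this sketch offers no principled way to choose between them.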
