Abstract

Sequential, or online, dimensionality reduction is of interest because of the rapid growth of streaming-data applications and the need for adaptive statistical modeling in many emerging fields, such as the modeling of energy end-use profiles. Principal Component Analysis (PCA) is the classical approach to dimensionality reduction. However, traditional PCA coincides with a maximum-likelihood interpretation only when the data follow a Gaussian distribution. The Bregman divergence was introduced to extend PCA to maximum-likelihood estimation for exponential-family distributions. In this work, we study this generalized form of PCA for Bernoulli variables, which is called Logistic PCA (LPCA). We extend batch-mode LPCA (BLPCA) to a sequential version (SLPCA). We discuss the convergence properties of this algorithm in comparison with the batch version, as well as its performance in reducing the dimension of multivariate binary-state systems. Its application to building energy end-use profile modeling is also investigated.

Highlights

  • Sequential data mining has received considerable attention recently, as developments in wireless-sensor information technology facilitate the collection of huge amounts of streaming data. This brings several challenges for computational efficiency, storage, and the performance of statistical learning algorithms [2]

  • For the Bernoulli variables we are interested in, the generalized Principal Component Analysis (PCA) can be viewed as Logistic PCA (LPCA)

  • We extend LPCA to a sequential version based on sequential convex optimization theory [7] [8]


Summary

INTRODUCTION

Sequential data mining has received considerable attention recently, as developments in wireless-sensor information technology facilitate the collection of huge amounts of streaming data. This brings several challenges for computational efficiency, storage, and the performance of statistical learning algorithms [2]. Among dimensionality reduction techniques, Principal Component Analysis (PCA) is the most widely known. PCA coincides with maximum-likelihood reconstruction only when the data are Gaussian distributed. It is therefore natural to consider alternatives to traditional PCA when the data deviate substantially from a Gaussian distribution [4]. The Bregman divergence was introduced to obtain a generalized PCA framework for data drawn from exponential-family distributions (ePCA) [4]. This paper is organized as follows: In Section II, the background and the details of the algorithm are given, including PCA, the exponential family, the Bregman divergence, and finally the sequential LPCA (SLPCA) that we propose.
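To make the idea of a sequential logistic PCA more concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes that the natural parameters (logits) of each binary sample x are modeled as a low-rank product a V, that the loss is the negative Bernoulli log-likelihood, and that each incoming sample triggers a plain gradient update. The names `fit_scores` and `sequential_step`, the step sizes, and the number of inner iterations are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    # Numerically stable logistic function via the identity
    # sigmoid(z) = 0.5 * (1 + tanh(z / 2)).
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def bernoulli_nll(x, theta):
    # Negative Bernoulli log-likelihood with natural parameter theta (logit):
    # sum_j [ log(1 + exp(theta_j)) - x_j * theta_j ].
    return float(np.sum(np.logaddexp(0.0, theta) - x * theta))


def fit_scores(x, V, n_steps=50, lr=0.1):
    # Assumed inner step: with the loadings V (k x d) held fixed, estimate the
    # k-dimensional score vector a of one sample by gradient descent.
    a = np.zeros(V.shape[0])
    for _ in range(n_steps):
        residual = sigmoid(a @ V) - x      # gradient of the loss w.r.t. theta
        a -= lr * (V @ residual)           # chain rule: d theta / d a = V
    return a


def sequential_step(x, V, lr_v=0.05):
    # Assumed online update: score the new sample, then take one gradient
    # step on the loadings using only that sample.
    a = fit_scores(x, V)
    residual = sigmoid(a @ V) - x
    V_new = V - lr_v * np.outer(a, residual)
    return V_new, a


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 10, 2                            # observed and reduced dimensions
    V = 0.1 * rng.standard_normal((k, d))   # small random initial loadings
    stream = rng.integers(0, 2, size=(200, d)).astype(float)  # synthetic data
    for x in stream:                        # samples are processed one at a time
        V, a = sequential_step(x, V)
    print("final loadings shape:", V.shape)
```

Each sample is visited once and then discarded, which is what makes this update pattern suitable for streaming data; a batch method would instead revisit the full data matrix at every iteration.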

Principal Component Analysis
Exponential Family
Exponential Family PCA
CONVERGENCE ANALYSIS
Convergence Analysis
Simulated Binary-State System
Phase I
Building End-Use Energy Modeling
CONCLUSION