Abstract

In this paper, we develop a sparse method for unsupervised dimension reduction of data from an exponential-family distribution. Our approach extends previous work on Generalised Principal Component Analysis by adding L1 and SCAD penalties to induce sparsity. We demonstrate the significance and advantages of our method on synthetic and real data examples. We focus on the application to text data, which is high-dimensional and non-Gaussian by nature, and discuss the potential advantages of our methodology in achieving dimension reduction.
