Abstract

High-Dimensional (HD) processes have become prevalent in many data-intensive scientific domains and engineering applications. The monitoring of HD categorical data, where each variable of interest is evaluated by attribute levels or nominal values, however, has seldom been studied. As the joint distribution of HD categorical variables can be fully characterized by a high-way contingency table or a high-order tensor, we propose a Probabilistic Tensor Decomposition (PTD) which factorizes a huge tensor into a few latent classes (rank-one tensors) to dramatically reduce the number of model parameters. Moreover, to enable high interpretability of this latent-class-type PTD model, a novel polarization regularization is devised, which makes each latent class focus on only a few vital combinations of attribute levels of categorical variables. An Expectation-Maximization algorithm is designed for parameter estimation from a historical normal dataset in Phase I, and an exponentially weighted moving average control chart is built in Phase II to monitor the proportions of latent classes that act as surrogates for each original categorical vector. Extensive simulations and a real case study validate the superior inference and monitoring performance of our proposed efficient and interpretable method.
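
The latent-class structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: it omits the polarization regularization, the EM estimation step and any control limits, and it uses randomly drawn parameters instead of estimated ones. All names (`joint_tensor`, `class_posterior`, `ewma`, `levels`, `n_classes`, `lam`) are hypothetical, and using the posterior class proportions of each observation as the monitored surrogate is an assumption about how the chart is fed.

```python
import numpy as np

def joint_tensor(weights, factors):
    """Joint probability tensor as a weighted sum of rank-one tensors.

    weights: (K,) latent class proportions.
    factors: list of (d_j, K) arrays; column k holds the level
             probabilities of variable j within latent class k.
    """
    joint = np.zeros(tuple(f.shape[0] for f in factors))
    for k, w in enumerate(weights):
        rank_one = np.array(w)  # 0-d array, grown one mode at a time
        for f in factors:
            rank_one = np.multiply.outer(rank_one, f[:, k])
        joint += rank_one
    return joint

def class_posterior(x, weights, factors):
    """Posterior latent class proportions for one categorical vector x."""
    lik = np.array(weights, dtype=float)
    for j, level in enumerate(x):
        lik = lik * factors[j][level, :]
    return lik / lik.sum()

def ewma(vectors, start, lam=0.2):
    """Standard EWMA recursion z_t = lam * v_t + (1 - lam) * z_{t-1}."""
    z = np.array(start, dtype=float)
    path = []
    for v in vectors:
        z = lam * np.asarray(v) + (1.0 - lam) * z
        path.append(z)
    return np.array(path)

# Hypothetical setup: 4 categorical variables, 3 latent classes.
rng = np.random.default_rng(0)
levels = [3, 4, 2, 5]
n_classes = 3
weights = rng.dirichlet(np.ones(n_classes))
factors = [rng.dirichlet(np.ones(d), size=n_classes).T for d in levels]

joint = joint_tensor(weights, factors)

# Free parameters: full contingency table vs. latent class model.
full_params = int(np.prod(levels)) - 1
latent_params = (n_classes - 1) + n_classes * sum(d - 1 for d in levels)

# Smooth the posterior class proportions of a stream of observations.
samples = [tuple(rng.integers(0, d) for d in levels) for _ in range(20)]
posteriors = [class_posterior(x, weights, factors) for x in samples]
chart = ewma(posteriors, start=weights)
print(full_params, latent_params)
```

In this toy setting the full contingency table has 119 free parameters while the three-class model has 32, and the gap widens quickly as variables are added, since the first count grows multiplicatively and the second additively.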

