The hidden Markov model (HMM) has proven effective for inferential sensing of dynamical and non-Gaussian industrial processes. However, large-scale process data impose tremendous computational costs on training the HMM owing to its centralized serial learning mode. Worse still, industrial time-series data chains are frequently broken by sensor or communication system failures, an issue the current HMM learning paradigm cannot handle, resulting in poor generalization performance. To overcome these challenges, this article proposes a distributed semisupervised HMM (DisSsHMM). The DisSsHMM first divides the whole dataset into continuous data blocks (DBs), based on which the computations in both forward learning and backward learning are segmented. Then, based on the expectation–maximization (EM) algorithm, a fusion scheme is derived to integrate the information extracted from each DB. This enables distributed training and full utilization of the available data. The performance of the DisSsHMM is evaluated on both numerical and real-world industrial cases, demonstrating that, compared with the traditional serial learning paradigm, the DisSsHMM significantly improves both computational efficiency and estimation accuracy for inferential sensing.
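The block-then-fuse idea can be illustrated with a minimal sketch: each data block is treated as an independent sub-sequence, the forward–backward (E-step) recursions are run per block, and the expected sufficient statistics are summed before a shared M-step. This is an illustrative reconstruction under simplifying assumptions (discrete emissions, blocks treated as independent segments), not the paper's exact DisSsHMM derivation, which also handles the segmentation of the forward and backward passes across block boundaries; all function and variable names here are hypothetical.

```python
import numpy as np

def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass for one data block (discrete emissions).

    Returns per-step state posteriors gamma, summed pairwise-transition
    expectations xi, and the block log-likelihood.
    """
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K)); beta = np.zeros((T, K)); c = np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):                       # forward recursion
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):              # backward recursion
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    gamma = alpha * beta                        # state posteriors
    xi = np.zeros((K, K))                       # expected transition counts
    for t in range(T - 1):
        xi += np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A / c[t + 1]
    return gamma, xi, np.log(c).sum()

def em_step(blocks, pi, A, B, M):
    """One distributed EM iteration: per-block E-steps (parallelizable),
    then fusion by summing sufficient statistics before the M-step."""
    K = len(pi)
    stats = [forward_backward(obs, pi, A, B) for obs in blocks]
    total_ll = sum(ll for _, _, ll in stats)    # log-likelihood under input params
    # Fusion: aggregate expected counts contributed by every data block
    pi_new = sum(g[0] for g, _, _ in stats)
    pi_new /= pi_new.sum()
    A_new = sum(x for _, x, _ in stats)
    A_new /= A_new.sum(axis=1, keepdims=True)
    B_new = np.zeros((K, M))
    for (g, _, _), obs in zip(stats, blocks):
        for t, o in enumerate(obs):
            B_new[:, o] += g[t]
    B_new /= B_new.sum(axis=1, keepdims=True)
    return pi_new, A_new, B_new, total_ll
```

Because each block's E-step depends only on the current global parameters, the per-block computations can be distributed across workers, and broken data chains simply yield additional (shorter) blocks rather than invalidating the whole sequence.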