Multi-task learning by hierarchical Dirichlet mixture model for sparse failure prediction

Simon Luo, Jianlong Zhou, Raymond K Wong, Yang Wang, Victor W Chu, Zhidong Li, Fang Chen

doi:10.1007/s41060-020-00219-z

Abstract

Sparsity and noisy labels are inherent in real-world data. Previously, domain experts relied on strong assumptions, drawn from their experience and expertise, to select parameters for their models. A similar approach has been adopted in machine learning for hyper-parameter setting. However, these assumptions are often subjective and not necessarily optimal. To address this problem, we propose a data-driven approach that automates model parameter learning via a Bayesian nonparametric formulation. Specifically, we propose a hierarchical Dirichlet process mixture model (HDPMM) as a multi-task learning framework that learns the common parameters across different datasets in the same industry. In our experiments, we verified the capability of HDPMM for multi-task learning in infrastructure failure prediction by combining it with a hierarchical beta process, which serves as our failure prediction model. In particular, multi-task learning was used to gain additional knowledge from failure records of water supply networks managed by other utility companies, improving the prediction accuracy of our model. Notably, we achieved higher accuracy on sparse predictions than previous state-of-the-art models. Moreover, we demonstrated the capability of our proposed model to support preventive maintenance of critical infrastructure.

Full Text