Abstract
In real industrial settings, equipment often operates under varying working conditions, and the manufacturing system itself is complicated. However, traditional multi-label learning methods for fault diagnosis either rely on a pre-defined label sequence or predict all labels of an input sample simultaneously. Deep reinforcement learning (DRL) combines the perception ability of deep learning with the decision-making ability of reinforcement learning. Moreover, the curriculum learning mechanism follows the way humans learn, from easy to complex. Consequently, this paper proposes an improved proximal policy optimization (PPO) method, PPO being a typical DRL algorithm, as a novel method for multi-label classification. The improved PPO method can build a relationship between the several predicted labels of an input sample by means of an action history vector, which encodes all actions the agent has selected up to the current time step. In two rolling bearing experiments, the diagnostic results demonstrate that the proposed method achieves higher accuracy than traditional multi-label methods on fault recognition under complicated working conditions. In addition, compared with the same network using a pre-defined label sequence, the proposed method can distinguish the multiple labels of input samples by following the curriculum mechanism from easy to complex.
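The role of the action history vector can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`build_state`, `predict_labels`, `toy_policy`), the extra "stop" action, and the masking of already chosen labels are all assumptions made for illustration, and the hand-written `toy_policy` merely stands in for a trained PPO policy network.

```python
import numpy as np

def build_state(features, action_history):
    # Assumed state layout: sample features followed by the action history vector.
    return np.concatenate([features, action_history])

def predict_labels(features, policy, num_labels):
    # One slot per label plus a hypothetical "stop" action in the last slot.
    history = np.zeros(num_labels + 1)
    labels = []
    for _ in range(num_labels):
        scores = policy(build_state(features, history))
        # Mask labels that were already selected so they cannot be picked twice.
        scores = np.where(history > 0, -np.inf, scores)
        action = int(np.argmax(scores))
        if action == num_labels:  # the agent chose to stop
            break
        history[action] = 1.0     # record the action for later time steps
        labels.append(action)
    return labels

def toy_policy(state):
    # Hand-written stand-in for a trained policy (2 features, 3 labels + stop).
    features, history = state[:2], state[2:]
    scores = np.array([features[0], 0.1, 0.2, 0.5])
    # Label 2 becomes likely only after label 0 has been selected.
    scores[2] += 0.6 * history[0]
    return scores

print(predict_labels(np.array([0.9, 0.0]), toy_policy, num_labels=3))
```

Because the history vector is part of the state, the score of a later label can depend on the labels already chosen, which is the kind of relationship between predicted labels that the abstract describes. The order in which labels come out is decided by the agent rather than fixed in advance.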