Remaining useful life prediction models are a central aspect of developing modern and capable prognostics and health management systems. Such models are increasingly data-driven and based on machine learning techniques, in particular deep neural networks. These models are notoriously “data hungry”: achieving adequate performance requires a substantial amount of diverse training data. However, in several domains where one would like to deploy data-driven remaining useful life models, data are scarce or distributed among several actors, and these actors often cannot, for various reasons, share data among themselves. This paper presents a method for collaborative training of remaining useful life models based on federated learning. In this setting, actors do not need to share locally held confidential data, only model updates. Model updates are aggregated by a central server and sent back to each of the clients, and this process repeats until convergence. There are numerous strategies for aggregating clients’ model updates; two are explored in this paper: 1) federated averaging and 2) federated learning with personalization layers. Federated averaging is the common baseline federated learning strategy, in which the central server updates the global model by averaging the clients’ models. Federated averaging has been shown to have a limited ability to deal with data that are not independent and identically distributed (non-IID). To mitigate this problem, federated learning with personalization layers is explored: a strategy similar to federated averaging, but where each client is allowed to append custom layers to its local model. The two federated learning strategies are evaluated on two datasets: 1) run-to-failure trajectories from power cycling of silicon carbide metal-oxide-semiconductor field-effect transistors (SiC MOSFETs), and 2) C-MAPSS, a well-known simulated dataset of turbofan jet engines. Two neural network architectures commonly used in remaining useful life prediction are used for the evaluation: a long short-term memory network with multi-layer perceptron feature extractors, and a convolutional gated recurrent unit. It is shown that federated learning achieves similar or better performance compared to training only on local data.
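To make the aggregation step concrete, below is a minimal NumPy sketch of weighted federated averaging, with an optional set of personalization layers excluded from aggregation in the spirit of federated learning with personalization layers. All names and the toy data are illustrative assumptions, not taken from the paper or its implementation.

```python
import numpy as np

def federated_average(client_states, client_sizes, personal_layers=()):
    """Aggregate client model states by weighted averaging (FedAvg).

    client_states   : list of dicts mapping layer name -> weight array
    client_sizes    : local training-set size per client, used as the
                      averaging weight, as in standard federated averaging
    personal_layers : layer names excluded from aggregation, i.e. the
                      personalization layers that stay local to each client
    """
    total = sum(client_sizes)
    global_state = {}
    for name in client_states[0]:
        if name in personal_layers:
            continue  # personalization layers are never averaged
        global_state[name] = sum(
            (n / total) * state[name]
            for state, n in zip(client_states, client_sizes)
        )
    return global_state

# Toy example: two clients with a shared layer and a personal output head.
clients = [
    {"shared.w": np.ones((2, 2)), "head.w": np.zeros((2, 1))},
    {"shared.w": 3 * np.ones((2, 2)), "head.w": np.ones((2, 1))},
]
sizes = [100, 300]

# Plain federated averaging: every layer is averaged.
print(federated_average(clients, sizes)["shared.w"])  # 2.5 everywhere

# Personalization-layer variant: the head is excluded and stays client-local.
print(federated_average(clients, sizes, personal_layers={"head.w"}).keys())
```

In a full training loop, the server would broadcast `global_state` back to the clients, each client would load the shared layers while keeping its own personalization layers, perform local training, and return its updated state for the next aggregation round.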