Abstract

In recent years, the volume of data available from IoT devices has increased rapidly. Using a conventional machine learning solution to detect faults in these devices requires releasing device data to a central server. However, these data typically contain sensitive information, which creates the need for privacy-preserving distributed machine learning solutions such as federated learning, where a model is trained locally on the edge device and only the trained model weights are shared with a central server. Device failure data are typically imbalanced, i.e., the number of failure samples is very small compared to the number of normal samples. Re-balancing techniques are therefore needed to improve the performance of a machine learning model. In this paper, we present FL-M-SMOTE, a new approach that re-balances the data in different non-IID scenarios by generating synthetic data for the minority class in supervised learning tasks using a modified SMOTE method. Our approach takes k samples from the minority class and generates M new synthetic samples based on one of the nearest neighbors of each of the k samples. An experimental campaign on a real IoT dataset and three well-known public datasets shows that the proposed solution improves the balanced accuracy without compromising the model's accuracy.
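
The sampling step described above can be illustrated with a minimal sketch. This is not the authors' FL-M-SMOTE implementation, which the abstract does not specify in detail; it is a plain SMOTE-style interpolation run on a single client's local data. The function name `smote_like_oversample`, the `n_neighbors` and `seed` parameters, and the reading of M as "synthetic samples per selected minority sample" are all assumptions made for illustration.

```python
import numpy as np


def smote_like_oversample(X_min, k, M, n_neighbors=5, seed=0):
    """Hypothetical sketch of SMOTE-style oversampling on one client's data.

    X_min       : array of shape (n, d) holding the local minority-class samples
    k           : number of minority samples to use as starting points
    M           : synthetic samples generated per starting point (assumption)
    n_neighbors : size of the nearest-neighbor pool to pick from (assumption)
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")

    k = min(k, n)
    n_neighbors = min(n_neighbors, n - 1)

    # Draw k distinct minority samples to act as starting points.
    start_idx = rng.choice(n, size=k, replace=False)

    synthetic = []
    for i in start_idx:
        # Euclidean distance from sample i to every minority sample.
        dist = np.linalg.norm(X_min - X_min[i], axis=1)
        dist[i] = np.inf  # exclude the sample itself
        nearest = np.argsort(dist)[:n_neighbors]

        # Pick ONE of the nearest neighbors and interpolate M times toward it.
        j = rng.choice(nearest)
        gaps = rng.random((M, 1))  # interpolation factors in [0, 1)
        synthetic.append(X_min[i] + gaps * (X_min[j] - X_min[i]))

    return np.vstack(synthetic)
```

Each synthetic point lies on the line segment between a selected minority sample and its chosen neighbor, so the generated data stay inside the region already covered by the local minority class. In a federated setting, such a step would run on each device before local training, so that no raw data leave the device.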

