This research aims to find an optimal balance between privacy and performance in forecasting mental health sentiment. We investigate federated learning (FL) augmented with a novel data obfuscation (DO) technique in which synthetic data is used to "mask" real data points. Bidirectional Encoder Representations from Transformers (BERT) is used for sentiment analysis, yielding a new framework, FL-BERT+DO, that addresses the privacy-performance trade-off. With FL, data remains decentralized, so user-sensitive information stays on local devices rather than being shared with the FL server. Integrating BERT gives the system a stronger grasp of textual context, which makes the model well suited to emotion categorization tasks. Experiments were performed on combined datasets (real data plus synthetic replicas) containing emotions and showed significant improvements over baseline methods. The proposed FL-BERT+DO framework achieves a prediction accuracy of 82.74%, precision of 83.30%, recall of 82.74%, and F1-score of 82.80%. We further assessed the framework in an adversarial setting, using membership inference and linkage attacks, to confirm that performance under privacy preservation did not degrade substantially. The results demonstrate that, even for large datasets, privacy-preserving prediction is possible and can significantly improve existing methods for addressing personal issues such as mental health support. Based on these results, we propose developing secure decentralized learning systems that deliver high sentiment-analysis accuracy while meeting strict privacy constraints.
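To make the decentralization claim concrete, the following is a minimal sketch of size-weighted parameter averaging in the style of federated averaging, where the server receives only parameter vectors and sample counts, never raw text. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the aggregation rule, and the function name `federated_average`, the two-element vectors, and the client sizes are all hypothetical. The DO step is not shown because the abstract gives no details of it.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Size-weighted mean of client parameter vectors (FedAvg-style aggregation).

    Hypothetical helper for illustration; the aggregation rule used by
    FL-BERT+DO is not stated in the abstract.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client, and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must send vectors of the same length")
        # Each client contributes in proportion to its local sample count.
        for i, w in enumerate(weights):
            aggregated[i] += w * size / total
    return aggregated


# Hypothetical updates from three devices; only these vectors reach the server.
updates = [[0.2, 0.4], [0.6, 0.8], [1.0, 1.2]]
sizes = [100, 100, 200]
global_weights = federated_average(updates, sizes)
```

In a real deployment each vector would hold the flattened parameters of a locally fine-tuned BERT model, and the averaged result would be sent back to the devices for the next training round.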