Federated learning enables training models on distributed, privacy-sensitive medical imaging data. However, data heterogeneity across participating institutions degrades model performance and raises fairness concerns, especially for underrepresented datasets. To address these challenges, we propose leveraging the multi-head attention mechanism in Vision Transformers to align the representations of heterogeneous data across clients. By using the attention maps as the alignment objective, our approach aims to improve both the accuracy and the fairness of federated learning models in medical imaging applications. We evaluate our method on the IQ-OTH/NCCD Lung Cancer dataset, simulating varying levels of data heterogeneity with Latent Dirichlet Allocation (LDA) partitioning. Our results show that our approach achieves competitive performance compared with state-of-the-art federated learning methods across heterogeneity levels and improves performance for underrepresented clients, promoting fairness in the federated setting. These findings highlight the potential of the multi-head attention mechanism for addressing data heterogeneity in medical federated learning.
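The abstract does not spell out the form of the alignment objective, so the following is a minimal sketch of one plausible instantiation in PyTorch: each client regularizes its local ViT's per-head attention maps toward those of the frozen global model via a KL term added to the task loss. The `return_attention` flag, the `lambda_align` weight, and the KL formulation are illustrative assumptions, not the paper's confirmed method.

```python
import torch
import torch.nn.functional as F

def attention_alignment_loss(local_attn, global_attn, eps=1e-8):
    """KL divergence between per-head attention maps.

    local_attn, global_attn: tensors of shape
    (batch, num_heads, num_tokens, num_tokens) holding attention
    probabilities from corresponding ViT blocks.
    """
    # F.kl_div expects log-probabilities as input and probabilities
    # as target; "batchmean" averages the divergence over the batch.
    log_local = (local_attn + eps).log()
    return F.kl_div(log_local, global_attn, reduction="batchmean")

def local_update(model, global_model, loader, optimizer, lambda_align=0.1):
    """One client's local training pass: cross-entropy task loss plus
    attention alignment toward the frozen global model.

    Assumes a (hypothetical) `return_attention=True` forward flag that
    yields the stacked attention probabilities alongside the logits.
    """
    global_model.eval()
    for images, labels in loader:
        logits, local_attn = model(images, return_attention=True)
        with torch.no_grad():
            _, global_attn = global_model(images, return_attention=True)
        loss = F.cross_entropy(logits, labels)
        loss = loss + lambda_align * attention_alignment_loss(
            local_attn, global_attn
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

In this reading, the alignment weight `lambda_align` trades off local task fitting against consistency with the global model's attention, which is what would let underrepresented clients stay close to a shared representation rather than drifting toward their skewed local distributions.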