Graph generative models have become increasingly prevalent across diverse domains owing to their superior performance. However, as their adoption grows, particularly in high-risk decision-making scenarios, concerns about their fairness are intensifying within the community. Existing graph-based generation models mainly focus on synthesizing minority nodes to improve node classification performance. Yet by overlooking fairness in the node generation process, this strategy may intensify representational disparities among subgroups, further compromising the fairness of the model. Moreover, existing oversampling methods generate samples by drawing instances from the corresponding subgroups, risking overfitting to those subgroups owing to their underrepresentation. They also fail to account for the inherent imbalance in edge distributions among subgroups, thereby introducing structural bias when generating graph structure. To address these challenges, this paper elucidates how existing graph-based sampling techniques can amplify real-world bias and proposes a novel framework, Fair Graph Synthetic Minority Oversampling Technique (FG-SMOTE), aimed at achieving a fair balance in the representation of different subgroups. Specifically, FG-SMOTE first removes the identifiability of subgroup information from node representations. It then generates embeddings for synthetic nodes by sampling from these subgroup-information-desensitized representations. Finally, a fair link predictor generates the graph structure. Extensive experiments on three real-world graph datasets show that FG-SMOTE outperforms state-of-the-art baselines in fairness while maintaining competitive predictive performance.
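The abstract describes FG-SMOTE only at a high level; the Python sketch below is one illustrative reading of its three steps, not the authors' implementation. All names (`Desensitizer`, `smote_interpolate`, `LinkPredictor`) are hypothetical: it assumes the "removing identifiability" step can be instantiated as adversarial debiasing via gradient reversal, that synthetic embeddings come from SMOTE-style nearest-neighbor interpolation in the desensitized space, and it omits whatever fairness constraint the paper's fair link predictor actually imposes.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates gradients on the backward pass,
    so the encoder learns to defeat the subgroup discriminator."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out.neg()

class Desensitizer(nn.Module):
    """Hypothetical stand-in for FG-SMOTE's first step: learn embeddings
    from which an adversary head cannot recover the sensitive subgroup."""
    def __init__(self, in_dim, hid_dim, n_groups):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU(),
                                 nn.Linear(hid_dim, hid_dim))
        self.adv = nn.Linear(hid_dim, n_groups)  # subgroup discriminator

    def forward(self, x):
        z = self.enc(x)
        return z, self.adv(GradReverse.apply(z))  # train adv head with CE loss

def smote_interpolate(z, minority_idx, k=5):
    """Assumed SMOTE-style second step: move each minority embedding a random
    fraction of the way toward one of its k nearest minority neighbors."""
    zm = z[minority_idx]
    dist = torch.cdist(zm, zm)
    dist.fill_diagonal_(float('inf'))               # exclude self-matches
    knn = dist.topk(k, largest=False).indices       # (m, k) neighbor indices
    pick = knn[torch.arange(len(zm)), torch.randint(k, (len(zm),))]
    lam = torch.rand(len(zm), 1)                    # interpolation weights
    return zm + lam * (zm[pick] - zm)               # synthetic embeddings

class LinkPredictor(nn.Module):
    """Third step: score candidate edges from pairwise embedding products.
    The paper's *fair* link predictor adds a fairness term omitted here."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, 1))

    def forward(self, z_u, z_v):
        return torch.sigmoid(self.mlp(z_u * z_v)).squeeze(-1)

# Toy usage: 100 nodes, 16 features; nodes 0-19 form the minority subgroup.
x = torch.randn(100, 16)
model = Desensitizer(in_dim=16, hid_dim=32, n_groups=2)
z, group_logits = model(x)        # optimize task loss + CE(group_logits, groups)
z_syn = smote_interpolate(z.detach(), torch.arange(20), k=5)
lp = LinkPredictor(dim=32)
scores = lp(z_syn[:1].expand(100, -1), z)  # edge scores for one synthetic node
```

In this reading, interpolating in the desensitized space rather than in raw feature space is what keeps synthetic minority nodes from re-encoding subgroup identity, and each synthetic node would be wired into the graph by thresholding or sampling the predictor's edge scores.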