Topology Matters in Fair Graph Learning: a Theoretical Pilot Study
Abstract
Reviews and Discussion
In this paper, the authors study when and why bias enhancement happens in GCN-like aggregation within the context of CSBM. The authors define bias enhancement in terms of mutual information and give conditions under which it occurs within the CSBM. In addition, the authors propose a modified learning objective that can adjust this bias enhancement.
Strengths
- It is an interesting and important problem to study.
- There are some plausible ideas in the theoretical analysis.
Weaknesses
- There are substantial gaps between the claims and the results. The authors define and measure the topology bias of GCN as the "similarity" in the learnt representation. Since the representation/embedding is just an intermediate product of GCN, and the focus here is not expressiveness, it is unclear how this measure relates to other measures of interest, e.g., how it affects the prediction of GCN.
- The results are not novel and are somewhat self-evident from the setting. The authors start from a homophilous dataset in which nodes with similar features are more likely to be connected. Since GCN aggregates over a node's subgraph/neighbourhood, it is unsurprising that aggregation further magnifies this similarity (the input to each node contains more similar features).
- The analytical setting has notable weaknesses that limit its practicality. For example, it does not consider the effect on the model weights. It is not clear how the proposed result would influence the learning process of GCN and vice versa.
- Missing related work: Jiaqi Ma, Junwei Deng, and Qiaozhu Mei. "Subgroup generalization and fairness of graph neural networks." Advances in Neural Information Processing Systems 34 (2021): 1048-1061.
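The self-evidence concern above can be illustrated with a toy example. The sketch below is not the paper's CSBM analysis: it assumes row-normalised mean aggregation with self-loops and no learned weights (the very simplification questioned above), and the graph, features, and function names are hypothetical.

```python
from statistics import mean, pvariance

def mean_aggregate(x, edges):
    """One round of mean aggregation over each node's neighbours plus itself."""
    n = len(x)
    nbrs = [{i} for i in range(n)]  # self-loops (assumption)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return [mean([x[j] for j in sorted(nbrs[i])]) for i in range(n)]

def group_stats(x, s):
    """Gap between the two group means and the average within-group variance."""
    g0 = [xi for xi, si in zip(x, s) if si == 0]
    g1 = [xi for xi, si in zip(x, s) if si == 1]
    gap = abs(mean(g1) - mean(g0))
    within = (pvariance(g0) + pvariance(g1)) / 2
    return gap, within

# Hypothetical toy graph: two triangles (one per sensitive group) joined by one edge.
s = [0, 0, 0, 1, 1, 1]                # sensitive attribute
x = [0.0, 1.0, 2.0, 1.0, 2.0, 3.0]    # scalar features; the groups overlap
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

gap_before, within_before = group_stats(x, s)
h = mean_aggregate(x, edges)
gap_after, within_after = group_stats(h, s)
print(gap_before, within_before, gap_after, within_after)
```

In this toy case the gap between group means is unchanged by aggregation while the within-group variance collapses, so the sensitive attribute becomes easier to read off the aggregated features. Whether this carries over to trained GCN weights is exactly the open question raised in the weaknesses.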
Questions
- How is the "similarity in the learnt representation" measure of the topology bias of GCN related to other measures of interest, e.g., how it affects the prediction of GCN?
- How would the proposed result influence the learning process of GCN and vice versa?
In this paper, the authors provide a theoretical understanding of when and why bias enhancement happens in GCN-like aggregation within the contextual stochastic block model (CSBM) and propose a fair graph rewiring algorithm, named FairGR, which uses a topology-related fair loss to achieve fairness for node classification tasks.
Strengths
This paper theoretically analyzes when and why bias enhancement happens in GNNs and derives a provable condition for it in GCN-like aggregation under CSBM. Motivated by the derived data-centric insights, this paper develops a graph rewiring method to achieve fairness for node classification tasks.
Weaknesses
- The literature review in this paper is not exhaustive: recent relevant studies are not thoroughly discussed or summarized. The authors need to add a discussion and summary of recent relevant works, including but not limited to [1-4].
- The experiments are insufficient and lack comparison with related representative methods, such as [1-2]. The authors need to add comparison experiments with recent methods related to this paper.
- There are several works on the theoretical understanding of fairness in graph ML, such as [5]. The authors need to analyze how this paper differs from them, since this bears on the paper's contribution.
[1] Agarwal C, Lakkaraju H, Zitnik M. 2021. Towards a unified framework for fair and stable graph representation learning. In UAI.
[2] Dong Y, Liu N, Jalaian B, et al. 2022. Edits: Modeling and mitigating data bias for graph neural networks. In WWW.
[3] Yushun Dong, Song Wang, Yu Wang, Tyler Derr, and Jundong Li. 2022. On structural explanation of bias in graph neural networks. In KDD.
[4] Weihao Song, Yushun Dong, Ninghao Liu, and Jundong Li. 2022. Guide: Group equality informed individual fairness in graph neural networks. In KDD.
[5] Yushun Dong, Song Wang, Jing Ma, Ninghao Liu, and Jundong Li. 2023. Interpreting Unfairness in Graph Neural Networks via Training Node Attribution. In AAAI.
Questions
In this paper, the authors use mutual information to evaluate group fairness, but the motivation for choosing this metric is unclear. Why not use the more typical Wasserstein distance?
This paper investigates fairness in GNNs, focusing on the topology. The authors explore when and why bias enhancement occurs in GCN-like aggregation, both theoretically and empirically. Theoretically, they derive the bias-enhancement condition. Empirically, they generate synthetic graphs and explore the impacts of various metrics. Based on the findings, they propose a rewiring algorithm to learn an unbiased topology that is close to the original graph and has a high label homophily coefficient and a low sensitive feature homophily coefficient. Experiments show that the proposed preprocessing rewiring improves fairness when applied to the base GNNs and can be combined with other fairness methods to achieve a better fairness-accuracy tradeoff.
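For concreteness, the two homophily coefficients mentioned in this summary can be read as edge-level ratios. The following is a minimal sketch, assuming the common edge-homophily definition (the fraction of edges whose endpoints share an attribute value); the paper's exact definition may differ, and the function name and toy graph are hypothetical.

```python
def edge_homophily(edges, attr):
    """Fraction of edges whose two endpoints share the same attribute value."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if attr[u] == attr[v])
    return same / len(edges)

# Hypothetical toy graph: 4 nodes, each undirected edge listed once.
edges = [(0, 1), (2, 3), (0, 2)]
labels = [0, 0, 1, 1]      # task labels
sensitive = [0, 1, 0, 1]   # sensitive attribute

label_h = edge_homophily(edges, labels)     # 2 of 3 edges join same-label nodes
sens_h = edge_homophily(edges, sensitive)   # 1 of 3 edges joins same-group nodes
print(label_h, sens_h)
```

Under this reading, the rewiring objective described above would favour graphs like this toy one, where `label_h` is high and `sens_h` is low.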
Strengths
S1. Originality: This work is the first to theoretically investigate the reason for bias enhancement in GCN-like aggregation. This provides novel perspectives.
S2. Quality: In general the work is of good quality. The theoretical and empirical analyses of when and how bias enhancement occurs are convincing under the assumptions made for node feature and graph generation. Experiments validate the effectiveness of the rewiring method.
S3. Clarity: The paper is largely clear and well-structured with the logic flow being smooth and easy to follow. However, related to W3 below, there are some areas for improvement.
S4. Significance: This work will be interesting to the fair graph learning community. The theoretical analysis could potentially inspire future works to propose other fairness methods from the topology perspective. The proposed rewiring broadens existing preprocessing methods and can help improve fairness in GNNs.
Weaknesses
W1. Concerns about the Table 1 results (GAT): as the authors observe, GAT-GR suffers a large drop in accuracy. The authors note that “For GAT backbone, although the bias can be mitigated, the accuracy drop is significant due to the fact that GAT is more sensitive to graph topology rewire.” Beyond this explanation, how can the fairness-accuracy tradeoff be tuned if we want better accuracy at the cost of slightly worse fairness? The ablation studies reported in Figure 4 and Figure 5 in the appendix do not show accuracy comparable to the base GAT model.
W2. More discussion of the theoretical analysis is encouraged: The bias enhancement condition is derived under assumptions about the node features and the graph model. Although the empirical results show that the proposed method, which is designed based on insights from the theoretical analysis, is effective, it is unclear whether the theoretical analysis generalizes to more general cases.
W3. Clarity can be further improved: The paper organization can be adjusted for a better reading experience. (1) Some figures are far away from the text that mentions them. For example, Figure 1 is at the top of page 6 while its analysis is at the end of page 7; the same happens for Figure 2. Aligning figures more closely with the text that references them would enhance readability. (2) Some notations are confusing (refer to Q1 and Q2).
Questions
Q1. Definition 3.4 is unclear: it states that “the average connection degree for the node with the same sensitive attribute is the same”, but the average connection degree is first mentioned here without prior introduction. Additionally, why introduce another notation l to represent sensitive attributes that are originally denoted as s?
Q2. What is the meaning of n-1 in Theorem 3.5?
Unfortunately, the authors did not attempt to address the weaknesses or questions. Thus, I have updated my score from 6 to 5.
This paper delves into fairness issues in Graph Neural Networks (GNNs), specifically focusing on topology-related bias enhancement in GCN-like aggregation. The authors offer both theoretical insights, establishing conditions for bias enhancement within the contextual stochastic block model (CSBM), and empirical investigations using synthetic graphs and diverse metrics. Their findings lead to the development of a rewiring algorithm, FairGR, designed to achieve unbiased topology while maintaining graph similarity, high label homophily, and low sensitive feature homophily. Experimental results demonstrate that applying FairGR as a preprocessing step enhances fairness in base GNNs, offering an improved tradeoff between fairness and accuracy.
While the paper's theoretical analysis demonstrates some merit, there is room for improvement through a more comprehensive discussion of related work. Additionally, the concerns raised during the rebuttal process were not adequately addressed. Therefore, I encourage the authors to carefully revise the paper, incorporating insights from the reviewers' comments, and consider resubmitting it to a future venue.
Why Not a Higher Score
All reviewers vote for rejection; I agree with the rating and also vote for rejection.
Why Not a Lower Score
N/A
Reject