5.0

/10

Rejected3 位审稿人

最低3最高6标准差1.4

2.7

置信度

ICLR 2024

Understanding Community Bias Amplification in Graph Representation Learning

Shengzhong Zhang,Wenjie Yang,Yimin Zhang,Hongwei Zhang,Liang Yan,Zengfeng Huang

OpenReview PDF

提交: 2023-09-22更新: 2025-11-06

摘要

关键词

Graph Representation LearningCommunity Bias Amplification

评审与讨论

审稿意见

评分: 3置信度: 32023-10-24

For the graph learning, this paper discovers the phenomenon which this paper calls community bias amplification. This paper provides theoretical analysis on this phenomenon. Using this theoretical insights, this paper proposes a graph learning algorithm to mitigate this bias. The experimental results show that the proposed method outperforms the existing ones.

优点

A problem this paper focuses on is interesting. Seeing Fig.1, it is convincing that a bias between classes surely exists, and this itself is novel as far as I believe, at least in the GNN realm.

缺点

I do not agree with the authors claim on where the bias comes from. The datasets for illustrative examples in Fig.1 contain the independent components. Oono and Suzuki (2020) and its earlier work [1] shed a light on the phenomenon over-smoothing -- if we stack layers, the stacked adjacency matrix is dominant by the eigenspace associated with the largest eigenvalues. In the Cora and Citeseer cases, that space is a set of indicate vectors of independent components. Also, the claim is this worsens the performance, which is understandable, since the only this eigenspace of the independent components are too simplified as an underlying structure. However, the underlying graph structure does not necessarily correspond to the classes, while of course graph structure and the classes are loosely related. If they are, we observe somewhat comparative performance only using the graph of Cora and Citeseer, but we do not observe such performance by conducting for example the simple spectral clustering on graph. Thus, the community amplification bias of the underlying graphs is not the primal reason why we observe the unfairness of Fig. 1. Instead, I believe that the bias is more nuanced -- hope to see what is the dominant.
Also, even if the community bias were primal reason, the argument of Eq. (3) is weak since they only compare the value of the eigenvalues of two clusters. Also, how the example of Appendix A reflects the Cora and Citeceer dataset? Do we observe such things in the real datasets? How do you argue that?

[1] Li et al. Deeper Insights into Graph Convolutional Networks for Semi-Supervised Learning. Proc AAAI 2018.

问题

How do you defend that the community amplification bias of the underlying graph is the primal reason we observe an unfairness between classes? As stated in the weakness section, I feel like there exists some gap between them.

伦理问题详情

N/A

评论- Response to Reviewer vF6X[1/3]

2023-11-20

Thank you for your review. It seems you may have misunderstood some part of our motivation and main contribution. Therefore, we would like to clarify more in the rebuttal.

One of our main contribution is identifying a novel phenomenon in GCL named community bias amplification, which is the observation that class imbalance increases after applying GNN encoders (compared to MLP). Understanding this amplification effect is the problem we investigated. However, it seems that the reviewer has misunderstood the goal of our work is to interpret why class imbalance exists in the downstream task. Note that we focus on the "amplification" effect of GNN: whether there is bias in the original features or not, we always observe that GNN makes the accuracy imbalance between classes worse. Moreover, in the review, the reviewer keep referring to "community bias amplification" studied in this work as "community amplification bias", which suggests there may be some misunderstandings.

Research background and problem. We use the term "community bias amplification" to describe exacerbation of class performance gap caused by the graph structure information. Specifically, for instance, the maximum performance gap when classifying different classes of nodes using MLP is $\Delta_0$ , and the maximum performance gap when classifying the embeddings learned by the GNN encoder is $\Delta_1$ . We then studied the reasons why $\Delta_1 > \Delta_0$ and how to reduce $\Delta_1$ - $\Delta_0$ . We did not focus on the reasons for $\Delta_0$ .

Next, we will address each of the issues raised by the reviewer one by one.

W1.1: I do not agree with the authors claim on where the bias comes from. The datasets for illustrative examples in Fig.1 contain the independent components.

In our analysis, we always assume the graph is connected. Different communities in our analysis are not disconnected, but are clusters in a connected graph. Although real-world graph datasets may consist of multiple components, there is always a main component (the largest connected subgraph) and this largest connected subgraph contains the vast majority of nodes. Our analysis interprets the reason why community bias amplification occurs on this single component. This conclusion also holds for graphs with multiple connected components, as each component can be analyzed in the same way independently.

W1.2: Oono and Suzuki (2020) and its earlier work [1] shed a light on the phenomenon over-smoothing -- if we stack layers, the stacked adjacency matrix is dominant by the eigenspace associated with the largest eigenvalues. In the Cora and Citeseer cases, that space is a set of indicate vectors of independent components. Also, the claim is this worsens the performance, which is understandable, since the only this eigenspace of the independent components are too simplified as an underlying structure.

As clarified above, our analysis is for a single connected graph. So the comments here actually misinterprets our theoretical analysis. Moreover, we have emphasized in the submission that existing spectral analysis on over-smoothing, which assumes the number of layers goes to infinity, is not suitable in our setting. Here, we would like to clarify more on this issue.

Distinct from oversmoothing. We want to emphasize that what we are studying are the phenomena that occur when oversmoothing does not happen. Oversmoothing is a phenomenon that is only observed when there are enough stacked adjacency matrices present. In our theoretical analysis, we are considering more realistic shallow GCN encoders. This is also in line with reality, as in common GCL models, GCN encoders usually only have few layers of graph convolution, at which point oversmoothing does not occur. At a high level, our justification for bias amplification is that when oversmoothing has not occurred, the level of smoothness at different regions of the graph could differ significantly due to disparities in local structures. Overall, the reviewer's hypothesis that community bias amplification could be caused by oversmoothing is not valid.

评论- Response to Reviewer vF6X[2/3]

2023-11-20

To further explain this distinction, we tested the specific performance of each class for MLP and DGI (with graph convolution layers of 1, 2, 4, 16). We can see that for MLP, the maximum performance gap between classes is only 0.20. For DGI, When L=1 and 2, the performance of each class of nodes has improved, but the maximum performance gap between classes has risen to 0.24 and 0.25. Clearly, the performance gap between classes at this time does not come from oversmoothing, because oversmoothing has not occurred. When L=4 and L=16, oversmoothing gradually occurs (the overall performance decreases), but the maximum performance difference does not increase significantly. Therefore, the main problem we are studying has no relation to oversmoothing.

Table 1: Performance of each class for MLP and DGI.

	C0	C1	C2	C3	C4	C5	C6	$\Delta$
MLP	0.49	0.68	0.69	0.51	0.61	0.58	0.59	0.20	no oversmoothing
DGI(L=1)	0.69	0.87	0.93	0.73	0.81	0.75	0.82	0.24	no oversmoothing
DGI(L=2)	0.71	0.89	0.96	0.75	0.81	0.76	0.82	0.25	no oversmoothing
DGI(L=4)	0.69	0.86	0.94	0.74	0.80	0.74	0.83	0.25	mild oversmoothing
DGI(L=16)	0.66	0.78	0.92	0.68	0.72	0.65	0.80	0.27	oversmoothing

W1.3: If they are, we observe somewhat comparative performance only using the graph of Cora and Citeseer, but we do not observe such performance by conducting for example the simple spectral clustering on graph. Thus, the community amplification bias of the underlying graphs is not the primal reason why we observe the unfairness of Fig. 1. Instead, I believe that the bias is more nuanced -- hope to see what is the dominant.

These GCL models are usually applied to homophily graphs, in which intra-class edges significantly outnumber inter-class edges, hence the graph structure and classes are strongly correlated.

What we want to express through Figure 1 is that the performance gap $\Delta_1$ of the GCL model is significantly higher than the performance gap $\Delta_0$ of MLP. We are studying the reasons why $\Delta_1 > \Delta_0$ , and how to reduce this community bias amplification, but not the main reason for $\Delta_0$ .

W2:Also, even if the community bias were primal reason, the argument of Eq. (3) is weak since they only compare the value of the eigenvalues of two clusters. Also, how the example of Appendix A reflects the Cora and Citeceer dataset? Do we observe such things in the real datasets? How do you argue that?

After recognizing the community bias amplification in GCLs, we summarize the evidences for this phenomenon as three folds:

(1) Eq 3 shows that a upper bound of the representation density shrinks with a rate that depends on the second largest eigenvalue. We would like to point out that the rationale on two clusters can also be extended to multiple clusters with one-vs-the-rest (OvR) multiclass strategy. Secondly, The comparison of eigenvalues is not ``weak" due to the structural properties they imply [1], and it is widely used in the analysis of GNNs [2-5].

(2) For random graphs: In appendix A, we show the proof for bias amplification on the CSBM. Random graphs like CSBM is widely adopted for GNN analysis (e.g., [6-8]), and it is very suitable for illustrating bias amplification.

(3) Emprically: We would like to highlight Table 2 in our original submission. In Table 2, we show the standard deviation of community density, computed on the node representations produced by various GCL models. It is clear that, for the embeddings of baseline models, densities differ a lot among communities. This is a direct evidence of community bias on two real datasets, Cora and Citeceer. Furthermore, it shows that our method RGCCL effectively mitigates this issue.

评论- Response to Reviewer vF6X[3/3]

2023-11-20

Q1: How do you defend that the community amplification bias of the underlying graph is the primal reason we observe an unfairness between classes? As stated in the weakness section, I feel like there exists some gap between them.

In summary, we discover a phenomenon called community bias amplification: After applying a shallow GNN, the representation densities of communities differs a lot due to their distinct structure (which is validated both theoretically and empirically, c.f., our response to W2). Different representation densities are proved to be unfair in Proposition 1 and Figure 3. The reviewer's concern on oversmoothing and eigenvalues are carefully addressed in our response to W1.2 and W2, respectively: (1) oversmoothing will not occur in shallow GNN; (2) the second largest eigenvalue contains rich information of structural properties, thus comparison on it is not ''weak'' and is widely adopted in the literature.

Thank you again for your review! We have attempted to address your questions in our reply. If you have any further doubts, please contact us. we will gladly address any additional questions/concerns you may have.

[1] Cvetković, D., & Simić, S. (1995). The second largest eigenvalue of a graph (a survey). Filomat, 449-472.

[2] Chen, M., Wei, Z., Huang, Z., Ding, B. and Li, Y. Simple and deep graph convolutional networks. ICML 2020.

[3] Keriven, N. Not too little, not too much: a theoretical analysis of graph (over) smoothing. Neurips 2022.

[4] Rong, Y., Huang, W., Xu, T. and Huang, J. Dropedge: Towards deep graph convolutional networks on node classification. ICLR 2020.

[5] He, M., Wei, Z. and Xu, H. Bernnet: Learning arbitrary graph spectral filters via bernstein approximation. Neurips 2021.

[6] Wei, R., Yin, H., Jia, J., Benson, A. R., & Li, P. Understanding non-linearity in graph neural networks from the bayesian-inference perspective. Neurips 2022.

[7] Wu, X., Chen, Z., Wang, W. W., & Jadbabaie, A. A Non-Asymptotic Analysis of Oversmoothing in Graph Neural Networks. ICLR 2023.

[8] Su, J., Zou, D., Zhang, Z., & Wu, C. Towards Robust Graph Incremental Learning on Evolving Graphs. ICML 2023.

2023-11-21

Dear Reviewer vF6X,

We thank you again for your review. We have worked hard and have thoroughly addressed your comments in the rebuttal.

As the discussion period soon comes to an end, we are looking forward to your feedback to our response and revised manuscript. Many thanks again for your time and efforts.

Best regards,

Authors of Submission 5464

评论- Further Response to Reviewer vF6X[1/2]

2023-11-23

"especially on the observation on accuracy difference is big for some pair of the classes. Not on the accuracy itself."

Sorry, we have to emphasize again that we are not focusing on the phenomenon of "accuracy difference is big". We investigate the reason why the difference becomes even bigger after applying graph convolutions. The feed-forward pass of a GCL encoder is typically a combination of graph convolution and MLP. We compared the "accuracy difference" of MLP and that of GCL (i.e., graph convolution+MLP). The accuracy difference becomes bigger for GCL, and we hypothesize that the graph convolution operation "amplifies the bias". Please see our detailed explanation on the problem definition in the previous rebuttal.

Q1:"The major problem of this paper is inconsistent analyses around "connected" assumptions. ... For this inconsistency only, I vote for rejection."

We respectfully disagree with this. There is no inconsistency here, as we have already explained why analyzing a connected graph is without loss of generality in the previous rebuttal. Here we would like to clarify more.

Suppose the graph contains multiple components. In the computation of a GNN encoder, computations on different components are uncorrelated. For example, the propagation matrix corresponds to a graph with two connected components is a block diagonal matrix: $\begin{bmatrix} \hat{A_1} & \\\\ & \hat{A_2} \end{bmatrix}$ . Apply the operator $k$ times results in $\begin{bmatrix} \hat{A_1}^k & \\\\ & \hat{A_2}^k \end{bmatrix}$ .

So $\hat{A_1}$ and $\hat{A_2}$ can be analyzed independently. Our analysis works for both $\hat{A_1}$ and $\hat{A_2}$ , and thus our conclusion holds on each of the components, i.e., community bias amplification can happen on each component. To sum up, although we only present an analysis on connected graphs, the conclusion extends easily to disconnected graphs, so there is inconsistency between of analysis and the empirical results presented in Fig 1. To empirically support that our conclusion holds on each component, we single out the largest connected components on Cora and Citeseer; the accuracy on these two connected graphs are shown in Table 1 and Table 2, which is consistent with our analysis.

Table 1. The performance on Cora.

	C0	C1	C2	C3	C4	C5	C6	$\Delta$
MLP	0.47	0.66	0.65	0.46	0.54	0.57	0.45	0.21
GGD	0.71	0.90	0.96	0.80	0.85	0.80	0.86	0.25

Table 2. The performance on Citeseer.

	C0	C1	C2	C3	C4	C5	$\Delta$
MLP	0.21	0.23	0.41	0.32	0.56	0.34	0.35
GGD	0.28	0.59	0.83	0.71	0.85	0.75	0.55

Among the datasets used for the experiments, the largest dataset Ogbn-arxiv is connected (thus the statement of the reviewer "none of the datasets used for experiments is connected" is wrong). And in the remaining datasets, the largest connected component also dominates.

Q2: "Also, as far as I understand, in the response the authors does not rebut on "However, the underlying graph structure does not necessarily correspond to the classes, while of course graph structure and the classes are loosely related." Again, the classes for the dataset are not strictly corresponds to the community structure. Therefore, we do not know if $|\hat{B_1}|$ and $|\hat{B}_{2}|$ are close to 0. "

As we stated in the response to W1.3 (see Response to Reviewer vF6X[2/3]), the homophily of the graph is the most common property in graph datasets. These GCL models are usually applied to homophily graphs, in which intra-class edges significantly outnumber inter-class edges, hence the graph structure and classes are strongly correlated. And all the datasets we used also satisfy this condition. We acknowledge that the homophily assumption is not universal, and investigating bias amplification on heterophily graphs is an interesting future direction.

For the spectral analysis part, our conclusion is that when the connectivity of different parts of the graph differs drastically, the distributions of embeddings computed by GNN encoders can exhibit very different level of concentration. Note, this result only concerns embedding distributions and has nothing to do with classes of nodes.

评论- A friendly reminder for discussion

2023-11-23

Dear Reviewer vF6X,

The rebuttal phase ends today and we have not yet received feedback from you. We believe that we have addressed all of your previous concerns. We would really appreciate that if you could check our response and updated paper.

Looking forward to hearing back from you.

Best Regards,

Authors

评论- Thank you for the response

2023-11-23

Thank you for the review.

I apologize for the "community amplification bias." However, other than that, I believe that I understood in a way like the authors wrote, especially on the observation on accuracy difference is big for some pair of the classes. Not on the accuracy itself.

The major problem of this paper is inconsistent analyses around "connected" assumptions. The connection of underlying graph has a major impact on the eigenspace. However, while authors did analysis on a connected graph as stated in a rebuttal, the illustration Fig. 1 is demonstrated on Cora, whose graph is not connected. Also, none of the datasets used for experiments is connected. We may observe that the bias "community amplification bias" authors claim in Sec. 3, but in this paper we do not see this empirically since Fig.1 is demonstrated on disconnected graph and again none of the datasets in the experiments is connected. For this inconsistency only, I vote for rejection.

Also, as far as I understand, in the response the authors does not rebut on "However, the underlying graph structure does not necessarily correspond to the classes, while of course graph structure and the classes are loosely related." Again, the classes for the dataset are not strictly corresponds to the community structure. Therefore, we do not know if $|\hat{B}_{1}|$ and $|\hat{B}_{2}|$ are close to 0. For some case we may consider for the citation network is that writing is similar but citation topology is different; this is what we observe in the research community. My suggestion is that you would do this analysis not the typical GNN datasets which contains features and graph. Instead, authors may want to use community detection datasets which only contains graph.

评论- Further Response to Reviewer vF6X[2/2]

2023-11-23

Q3: For some case we may consider for the citation network is that writing is similar but citation topology is different; this is what we observe in the research community. My suggestion is that you would do this analysis not the typical GNN datasets which contains features and graph. Instead, authors may want to use community detection datasets which only contains graph.

The power of GCL is its ability to combine node feature and structure information. Node embeddings from GCL contains both topological and feature information, which attains much better performance than topology only or feature only methods. Loosely speaking, our motivation is that although structural information enhances performance, it also brings the risk of bias amplification. We verify this risk empirically, provide a theoretical explanation and an effective solution. That being said, GNN datasets which contains features and graph suit our purpose better that those only contain graphs.

审稿意见

评分: 6置信度: 32023-11-01

In this paper, the authors explore a phenomenon called "community bias amplification" in graph representation learning. This phenomenon refers to the exacerbation of performance bias between different classes by graph representation learning methods. The researchers conduct a thorough theoretical investigation of this phenomenon, approaching it from a novel spectral perspective. Their analysis reveals that the structural bias between communities leads to varying local convergence speeds for node embeddings, resulting in biased classification outcomes in downstream tasks. To address this issue, the authors propose a solution called random graph coarsening. This technique is demonstrated to be effective in mitigating the problem of bias amplification. Furthermore, the authors introduce a novel graph contrastive learning model named Random Graph Coarsening Contrastive Learning (RGCCL). This model utilizes random graph coarsening as a form of data augmentation and alleviates community bias by contrasting the coarsened graph with the original graph. Extensive experiments conducted on various datasets highlight the effectiveness of their proposed method in addressing community bias amplification in graph representation learning tasks.

优点

The problem of community bias amplification in GRL is very interesting and has not been extensively studied before.
The analysis is theoretically sound with appropriate discussion and remarks.
Experiments on several benchmark graphs show the effectiveness (better node representations) and efficiency (less memory usage) of the proposed method.

缺点

One of the research questions has not been answered in a good way, i.e., why community bias amplification exists in existing GCL method? Although some theoretical analyses have been provided in this paper, they are based on general (and simplified) graphs and it is not clear why there is such bias in GCL methods.
How to quantitatively measure the (community) bias amplification is not clear in the experiments. More analysis should be conducted to better illustrate why the proposed method is able to mitigate the issue of bias amplification. For example, some measures [1] can be used for the quantitative analysis.

[1] Angelina Wang and Olga Russakovsky.Directional Bias Amplification.ICML 2021

问题

What are the answers to the first research question, i.e., why community bias amplification exists in existing GCL method?
It is possible to give some quantitative analysis on how the proposed method can mitigate the issue of community bias amplification rather than simply comparing the performance between different classes/communities?
Discussion on extending this key idea to general GRL. I asked this question because the theoretical analysis is general enough on any GRL methods.

评论- Response to Reviewer bi59

2023-11-20

W1&Q1: One of the research questions has not been answered in a good way, i.e., why community bias amplification exists in existing GCL method? Although some theoretical analyses have been provided in this paper, they are based on general (and simplified) graphs and it is not clear why there is such bias in GCL methods.

Thanks for the comments. Yes, our analysis is for embeddings produced by general GCN encoders, which provides insights on why community bias amplification exists in typical GCN encoders. Since most mainstream and competitive GCL models all use GCN and its variants as the encoder; thus current GCL models suffers community bias amplification.

Likewise, supervised GNN as long as GCN is used as the encoder, community bias amplification will occur. The reason why we only focus on GCL is that supervised GNNs is also provided with label information at training time, and thus various techniques designed for handling class imbalance in supervised GNNs can be effectively applied, e.g., TAM [1] and GraphENS [2]. On the contrary, since there is no label information during the training process for unsupervised GCL, it is more difficult to mitigate the imbalance of embeddings from GCN encoders. And our paper is the first work that explicitly identifies this problem in GCL and provides an effective solution.

W2&Q2: It is possible to give some quantitative analysis on how the proposed method can mitigate the issue of community bias amplification rather than simply comparing the performance between different classes/communities?

Thanks for the question, and we agree that simply comparing the performance is not enough. We would like to highlight Table 2 in our original submission. In this table, we computed representation densities for each communities, and reported the average and standard deviation of these densities among communities. A lower standard deviation means that different communities are with similar degree of concentration, which is more fair for classification as illustrated in Figure 3 and Proposition 1 in our original submission. The results in Table 2 quantitatively shows that the proposed method RGCCL can mitigate the issue of community bias amplification (which is also proved by Lemma 4.1 in our original submission).

Since the metrics in [3] are primarily about fairness and causal inference, they are not applicable to our task. Furthermore, suggested by reviewer K8c3, we conducted experiments using the Matthew's coefficient to measure bias. We present the results of representative GCL models and our RGCCL regarding the Matthew's coefficient in Table 1. The results also demonstrate the effectiveness of RGCCL in mitigating the amplification of community bias.

Table 1: Matthew’s coefficient for RGCCL and baselines.

	Cora	CiteSeer	PubMed
DGI	73.0 $\pm$ 1.5	64.5 $\pm$ 1.2	57.8 $\pm$ 4.3
GRACE	67.9 $\pm$ 1.6	60.0 $\pm$ 2.0	62.3 $\pm$ 3.8
CCA-SSG	74.5 $\pm$ 1.6	64.6 $\pm$ 1.4	64.0 $\pm$ 4.1
GGD	77.0 $\pm$ 1.6	64.9 $\pm$ 1.2	63.7 $\pm$ 4.1
GRADE	75.7 $\pm$ 1.5	62.1 $\pm$ 1.3	58.1 $\pm$ 3.4
RGCCL	78.9 $\pm$ 0.9	66.3 $\pm$ 0.8	65.6 $\pm$ 4.2

Q3:Discussion on extending this key idea to general GRL. I asked this question because the theoretical analysis is general enough on any GRL methods.

As we answered in Q1, for general GRLs, such as supervised GCN, GAT models, etc., since we can use label information during the training process, we can refer to methods like TAM [1], GraphENS [2], etc., to mitigate the community bias amplification. As for classic unsupervised general GRLs like Node2vec [4] and SDNE [5], since they do not use GCN encoders, community bias amplification does not exist.

[1] Song, J., Park, J. and Yang, E. TAM: topology-aware margin loss for class-imbalanced node classification. ICML 2022.

[2] Park, J., Song, J. and Yang, E. Graphens: Neighbor-aware ego network synthesis for class-imbalanced node classification. ICLR 2022.

[3] Angelina Wang and Olga Russakovsky.Directional Bias Amplification.ICML 2021.

[4] Grover, A. and Leskovec, J. node2vec: Scalable feature learning for networks. KDD 2016.

[5] Wang, D., Cui, P. and Zhu, W. Structural deep network embedding. KDD 2016.

评论- A friendly reminder for discussion and we appreciate your input

2023-11-21

Dear Reviewer bi59,

We thank you again for your insightful and constructive review. We have worked hard and have thoroughly addressed your comments in the rebuttal.

As the discussion period soon comes to an end, we are looking forward to your feedback to our response and revised manuscript. Many thanks again for your time and efforts.

Best regards,

Authors of Submission 5464

评论- A friendly reminder for discussion

2023-11-23

Dear Reviewer bi59,

Looking forward to hearing back from you.

Best Regards,

Authors

评论- Thanks for the response

2023-11-23

Thanks for the authors' responses especially the added tables with more quantitative results. These responses addressed some of my concerns. For Q3, I may not agree with the claim that "since they do not use GCN encoders, community bias amplification does not exist". Overall, I would like to increase my rating.

审稿意见

评分: 6置信度: 22023-11-01

This paper identifies a problem in graph representation learning on graphs with community structures. When the communities have different strengths (e.g., edge densities), the representations of nodes from different communities convergence at different speeds. In the resulting representations, this may lead the representations of nodes from weaker communities to be further apart, which may result in poor classification results on these classes downstream. The paper introduces a contrastive learning approach using random graph coarsening to ameliorate this issue. Experiments show that this learning model outperforms existing representation learning methods.

优点

The paper is well written, with only few language errors and typos.

The problem that is identified is interesting and subtle.

The proposed solution seems elegant.

缺点

It took me quite a while into reading the paper before I understood what was actually meant with this community bias. I think the introduction of this bias can be made a bit more concrete to improve this.

The message passing operator $\hat{A}$ is mentioned but not introduced properly. Is this supposed to be the renormalized version of $\tilde{A}$ that is shown in (1) or should it be something else?

The performance in the experiments are measured by the Accuracy and Macro-F1 measure. Both of these methods have biases w.r.t. class sizes [1,2]. I understand why you use Accuracy, as you also use this to motivate the community bias, but perhaps you could replace Macro-F1 by the Matthew's coefficient.

[1] Chicco, D., & Jurman, G. (2020). The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC genomics, 21(1), 1-13. [2] Gösgens, M., Zhiyanov, A., Tikhonov, A., & Prokhorenkova, L. (2021). Good classification measures and how to find them. Advances in Neural Information Processing Systems, 34, 17136-17147.

问题

Consider removing "Understanding" from the title, as it sounds a bit generic and makes the title seem less strong.

Instead of using this contrastive learning approach, can't we just cluster the obtained representations by a density-based clustering method like DBSCAN? The problem of different clusterings having different densities doesn't seem like a new problem in clustering.

It would be interesting to compare the performance of these representation learning methods to community detection methods like the Louvain algorithm [1] or Bayesian community detection methods [2]. This latter method also addresses the issue of different communities having different densities, but does so in a statistical framework.

[1] Blondel, V. D., Guillaume, J. L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of statistical mechanics: theory and experiment, 2008(10), P10008. [2] Zhang, L., & Peixoto, T. P. (2020). Statistical inference of assortative community structures. Physical Review Research, 2(4), 043271.

评论- Response to Reviewer K8c3

2023-11-20

Thank you for your insightful comments. We would like to address your questions/concerns below:

W1&Q1: It took me quite a while into reading the paper before I understood what was actually meant with this community bias. I think the introduction of this bias can be made a bit more concrete to improve this. I think the introduction of this bias can be made a bit more concrete to improve this. Consider removing "Understanding" from the title, as it sounds a bit generic and makes the title seem less strong.

Thank you for your feedback. We will improve the readability of the paper in the revised version and appropritely modify our title.

W2:The message passing operator $\hat{A}$ is mentioned but not introduced properly. Is this supposed to be the renormalized version of $\tilde{A}$ that is shown in (1) or should it be something else?

Yes, $\hat{A}$ is the renormalized version of $\tilde{A}$ . We will add its introduction in the revised version.

W3:The performance in the experiments are measured by the Accuracy and Macro-F1 measure. Both of these methods have biases w.r.t. class sizes [1,2]. I understand why you use Accuracy, as you also use this to motivate the community bias, but perhaps you could replace Macro-F1 by the Matthew's coefficient.

We agree with your statement that Matthew's coefficient will better eliminate the bias of classification results w.r.t class sizes. We present the results of representative GCL models and our RGCCL regarding the Matthew's coefficient in Table 1. The results also demonstrate the effectiveness of RGCCL in mitigating community bias amplification.

Table 1: Matthew’s coefficient for RGCCL and baselines.

	Cora	CiteSeer	PubMed
DGI	73.0 $\pm$ 1.5	64.5 $\pm$ 1.2	57.8 $\pm$ 4.3
GRACE	67.9 $\pm$ 1.6	60.0 $\pm$ 2.0	62.3 $\pm$ 3.8
CCA-SSG	74.5 $\pm$ 1.6	64.6 $\pm$ 1.4	64.0 $\pm$ 4.1
GGD	77.0 $\pm$ 1.6	64.9 $\pm$ 1.2	63.7 $\pm$ 4.1
GRADE	75.7 $\pm$ 1.5	62.1 $\pm$ 1.3	58.1 $\pm$ 3.4
RGCCL	78.9 $\pm$ 0.9	66.3 $\pm$ 0.8	65.6 $\pm$ 4.2

Q2:Instead of using this contrastive learning approach, can't we just cluster the obtained representations by a density-based clustering method like DBSCAN? The problem of different clusterings having different densities doesn't seem like a new problem in clustering.

GCL and clustering methods have fundamental differences. The goal of GCL is to learn node embeddings for various downstream tasks, including node classification, clustering and link prediction. Clustering is typically an unsupervised method. Although GCL learns embeddings in an unsupervised manner, in a downstream task, say node classification, a classifier will be trained in a supervised manner, and thus can achieve much better accuracy than purely unsupervised methods. In this paper, our objective is to address the bias present in the learned node embeddings by GCL. We aim to make embeddings learned by GCLs more unbiased. We think combining ideas from density-based clustering methods, as suggested by the reviewer, with graph representation learning is an interesting direction to address the bias problem. We will investigate how to use such clustering methods in end-to-end GCL models in future work.

Q3:It would be interesting to compare the performance of these representation learning methods to community detection methods like the Louvain algorithm [1] or Bayesian community detection methods [2]. This latter method also addresses the issue of different communities having different densities, but does so in a statistical framework.

We appreciate the reviewer's insightful observation highlighting the common concern of community bias in representation learning and the variations in community densities encountered in community detection. However, it is crucial to acknowledge the inherent distinctions between these tasks. In graph representation learning, our focus is on learning node embeddings for various downstream tasks, not limited to clustering tasks alone. Integrating these community detection algorithms into the GCL model is one of our future directions of research.

评论- A friendly reminder for discussion and we appreciate your input

2023-11-21

Dear Reviewer K8c3,

We thank you again for your insightful and constructive review. We have worked hard and have thoroughly addressed your comments in the rebuttal.

As the discussion period soon comes to an end, we are looking forward to your feedback to our response and revised manuscript. Many thanks again for your time and efforts.

Best regards,

Authors of Submission 5464

评论- A friendly reminder for discussion

2023-11-23

Dear Reviewer K8c3,

Looking forward to hearing back from you.

Best Regards,

Authors

AC 元评审

2023-12-06

The paper addresses the issue of "community bias amplification" in GRL. This phenomenon refers to the magnification of performance bias between different classes by GRL methods, particularly when communities within graphs have different strengths (e.g., edge densities). The representations of nodes from different communities converge at different speeds, leading to biased classification results for nodes from weaker communities downstream. To mitigate this issue, the paper introduces the concept of random graph coarsening and an associated graph contrastive learning model. The proposed method effectively alleviates community bias amplification and outperforms existing GRL methods in experiments.

Strengths of the paper:

The paper addresses an interesting and relatively unexplored problem in GRL, namely, community bias amplification, which is important for real-world applications.
The paper presents experimental results that show the effectiveness of the proposed methods compared to existing GRL techniques on various benchmark datasets.

Weaknesses of the paper:

Reviewers mention that the concept of community bias amplification could be made clearer, and the introduction of this bias could be explained more concretely to enhance reader understanding.
Two reviewers note that the paper does not fully answer the question of why community bias amplification exists in existing GRL methods, and more quantitative analysis is needed to demonstrate how the proposed method mitigates this issue.
Reviewers suggest exploring alternative methods like density-based clustering (e.g., DBSCAN) and comparing the proposed GRL methods with community detection algorithms (e.g., Louvain or Bayesian community detection) to provide a broader context for the research.

为何不给更高分

While the paper addresses an intriguing problem, there is room for improvement in terms of clarity, theoretical analysis, and the choice of evaluation metrics.

为何不给更低分

N/A

最终决定Reject

2024-01-16

Reject