Graph Inverse Style Transfer for Counterfactual Explainability
Graph Inverse Style Transfer (GIST) is the first backtracking framework for graph counterfactuals, combining spectral style transfer with backward refinement to achieve realism and validity in the produced counterfactuals.
Abstract
Reviews and Discussion
The authors introduce GIST, a novel framework that generates counterfactual graph explanations. They leverage spectral style transfer to generate valid counterfactual explanations. Their architecture consists of two components: attention-based node embeddings, and edge probabilities obtained via the Gumbel-softmax trick.
After designing their algorithm, they implement it on several real-world and synthetic datasets: BBBP, BZR, ENZYMES, MSRC21, PROTEINS, BA-SHAPES, and COLORS-3. They conduct experiments against several SOTA baselines and compute relevant metrics such as validity and fidelity. They also conduct an ablation study on the interpolation factor, which adjusts the distance from the decision boundary. They show a remarkable improvement in both validity and fidelity.
Update after rebuttal
Thanks for the authors' efforts in the rebuttal. I intend to keep my rating.
Questions for the Authors
See above issues
Claims and Evidence
The claims in the paper are mostly correct. The claims are supported by theory and preliminary experiments that show the framework’s potentially ideal behavior.
Methods and Evaluation Criteria
The chosen datasets are good choices for their work. However, the authors should also consider adding datasets such as NCI1 and MUTAG, as these are common benchmarks. The authors have also left out a baseline counterfactual explanation method [1].
[1] Bajaj, Mohit, et al. "Robust counterfactual explanations on graph neural networks." Advances in Neural Information Processing Systems 34 (2021): 5644-5655.
Theoretical Claims
I briefly checked the theoretical claims and some proofs.
Experimental Design and Analysis
I checked the design of the experimental setups; they are sound.
Supplementary Material
I mostly reviewed the supplementary section and verified the claims.
Relation to Existing Literature
The paper misses a reference in counterfactual graph explanations; see [1].
[1] Bajaj, Mohit, et al. "Robust counterfactual explanations on graph neural networks." Advances in Neural Information Processing Systems 34 (2021): 5644-5655.
Essential References Not Discussed
Paper [1] from NeurIPS 2021 is quite relevant and is neither referenced nor used as a baseline.
[1] Bajaj, Mohit, et al. "Robust counterfactual explanations on graph neural networks." Advances in Neural Information Processing Systems 34 (2021): 5644-5655.
Other Strengths and Weaknesses
The paper is well written, theoretically rigorous and well founded.
Other Comments or Suggestions
n/a
We thank you for the effort made to review our paper, and for the nice score you chose to give it. Thank you for pointing out RCExplainer [2].
W1: Missed Bajaj reference: We were aware of the paper, and decided to review RCExplainer again to see whether we were missing something. We cross-validated it with what is described in [1], specifically Table 2 (page 12). As per [1], we confirm that it is a heuristic + learning-based method.
In our related work section, we specifically stated that we concentrate only on learning-based and generative methods, hence the choice of the methods we compared against. However, we see the value of RCExplainer, with its multiple learnt linear decision boundaries and the search over these boundaries to find robust explainers, and will include it in our related work at camera ready.
Unfortunately, the official code doesn't run (the authors load some Huawei third-party Python packages that aren't available anywhere), even after debugging and trying to port it to the GRETEL framework to have the same evaluation pipeline. We also tried to run an unofficial code repository at https://github.com/idea-iitd/gnn-x-bench/blob/main/source/rcexplainer.py (we are not the authors, so anonymity isn't breached if you want to take a look); however, that code doesn't support the BBBP, BZR, ENZYMES, MSRC21, and COLORS-3 datasets. It's fine that it doesn't support COLORS-3, since this is multiclass and RCExplainer only does binary classification. However, not supporting the other listed datasets prevents us from comparing it against other SoTA methods and GIST.
We believe a good compromise here is to recognize RCExplainer's validity as a heuristic + learned counterfactual explainer, list it in our related work section, and highlight the fact that we only treat purely learning-based approaches. What do you think?
Q1: Where are MUTAG and NCI1?: We excluded all datasets from TUDataset (https://chrsmrrs.github.io/datasets/docs/datasets/) that do not have any node attributes. In GNNs' message-passing mechanism, nodes share their feature vectors with their neighbors, thereby producing meaningful embeddings. Given an input graph, GIST overshoots to an intermediate graph whose node features go through TransConv layers. If node features are missing, the conv. layer doesn't produce anything meaningful from which to estimate the edge probabilities (see Fig. 2). To surpass this hurdle, we added 7 centrality-based features: node degree, betweenness, closeness, harmonic centrality, clustering coefficient, Katz centrality, and Laplacian centrality. In this way, at least we have something interesting to work with and do not rely only on the topology of the graphs. Below is the performance against SoTA in terms of validity and fidelity on 5-fold cross-validation, where the oracle is a 3-layer GCN with a test accuracy of 86.8% for MUTAG. Unfortunately, even after hyperparameter optimization on NCI1 with the introduced node features, no GCN (with any number of layers) nor UGFormer [3], with the hyperparameter search space introduced in the original paper, reaches more than 40% accuracy on the test set. We ran experiments with these oracles for NCI1; however, the fidelity of the explainers was negative, which suggests that the explainers are actually performing adversarial attacks on the oracle rather than explaining it [1]. Hence, we decided to discard NCI1 and show only MUTAG. We want to point out that these two datasets aren't suitable for benchmarking purposes since, again, message passing in GNNs relies on aggregating node features over neighbors. These two datasets don't have node features, and we are a bit puzzled how SoTA methods used them to compare against each other.
| | Validity | Fidelity |
|---|---|---|
| CF | 0.026±0.026 | 0.026±0.026 |
| CF-GNNExp | 0.447±0.026 | 0.237±0.132 |
| CLEAR | 0.921±0.026 | 0.395±0.079 |
| iRand | 0.026±0.026 | 0.026±0.026 |
| RSGG-CE | 0.947±0.000 | 0.737±0.158 |
| GIST | 1.0±0.000 | 0.737±0.105 |
[1] Prado-Romero et al. A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation, and Research Challenges. ACM CSUR 2024.
[2] Bajaj et al. Robust counterfactual explanations on graph neural networks. NeurIPS'21
[3] Nguyen et al. Universal graph transformer self-attention networks. WWW'22
GIST introduces a backtracking approach for graph counterfactual explainability using spectral style transfer. Unlike forward perturbation methods, it refines graphs to preserve global style and local content. GIST achieved excellent results experimentally.
Questions for the Authors
Please check above.
Claims and Evidence
Please refer to Strengths And Weaknesses.
Methods and Evaluation Criteria
Please refer to Strengths And Weaknesses.
Theoretical Claims
Please refer to Strengths And Weaknesses.
Experimental Design and Analysis
Please refer to Strengths And Weaknesses.
Supplementary Material
Please refer to Strengths And Weaknesses.
Relation to Existing Literature
Please refer to Strengths And Weaknesses.
Essential References Not Discussed
Please refer to Strengths And Weaknesses.
Other Strengths and Weaknesses
Strengths
- The paper presents a thorough spectral analysis, mathematically proving important properties including spectral gap bounds and Frobenius norm differences, which helps ensure the generated counterfactuals maintain coherence and semantic validity.
- GIST is evaluated across eight benchmark datasets, demonstrating its effectiveness in diverse graph settings.
Weaknesses
- The paper introduces an intermediary known graph G in the counterfactual generation process but does not explain how this graph is obtained or why it improves counterfactual accuracy. Since counterfactuals are based on assumptions without ground truth, the reliance on an intermediary known graph raises questions about its validity.
- The paper lacks a clear definition of what constitutes a counterfactual in different graph contexts. While structural transformations (e.g., spectral properties) are well-explained, the paper does not specify how counterfactuals are defined in user networks.
- The paper does not provide sufficient details on node embedding generation and how these embeddings change during counterfactual transformations.
- The paper evaluates on eight datasets but does not use commonly used benchmarks from prior counterfactual studies, such as Community and IMDB-M used in CLEAR [1]. It would be beneficial to explain why the selected datasets are appropriate for evaluating graph counterfactuals.
[1] Ma, Jing, et al. "CLEAR: Generative counterfactual explanations on graphs." Advances in Neural Information Processing Systems 35 (2022): 25895-25907.
Other Comments or Suggestions
None
We thank you for the effort made to review GIST. It's unfortunate you didn't see its value during your original review. With the following, we tackle the weaknesses you mentioned, and hope to convince you of the paper's value.
W1: Intermediate graph, and improved accuracy: The intermediate graph is obtained by overshooting the decision boundary of the oracle. This process is described in Sec. B and referenced in Sec. 4. To reiterate, we take the dataset and divide the instances into sets of the same class according to the oracle. Then, for each input with a given label, we take all the sets whose label differs, unify and shuffle them. Lastly, we pick the first instance from this shuffled set, and that is the intermediate graph. This simple yet effective mechanism guarantees the counterfactual begins from the correct region (i.e., it is already on the other side of the oracle's boundary w.r.t. the input). As a consequence, if we did not use the backtracking mechanism and just returned this graph as the counterfactual, we would have validity (aka explanation accuracy) equal to 1. However, it might be far from the input, and we don't want that. That's why GIST walks back towards the input and minimizes the spectral difference (Fig. 8 for Graph Edit Distance). By going back, GIST needs to learn not to recross the oracle's boundary and produce an invalid counterfactual; however, this might happen due to a poor boundary definition. We will acknowledge this limitation and make the above clearer at camera ready by allocating space via shrinking Sec. 5.3 as per XATo.
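The overshoot selection described above can be sketched as follows (a minimal illustration in plain Python; `oracle`, `dataset`, and `pick_overshoot` are our own placeholder names, not the paper's actual code):

```python
import random

def pick_overshoot(dataset, oracle, x, seed=0):
    """Pick a dataset instance on the other side of the oracle's boundary.

    dataset: list of graphs; oracle: callable mapping a graph to a class label.
    Mirrors the rebuttal's procedure: partition by predicted class, unify all
    classes different from oracle(x), shuffle, and take the first element.
    """
    target = oracle(x)
    # Unify every instance whose predicted class differs from the input's.
    pool = [g for g in dataset if oracle(g) != target]
    rng = random.Random(seed)
    rng.shuffle(pool)
    return pool[0]  # already "valid": it lies beyond the decision boundary
```

By construction, the returned instance always starts with validity 1; backtracking then trades some of that validity for similarity to the input.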
W3: Node embeddings: I don't completely understand the question; I'll try my best here. Fig. 2 shows our architecture, which takes as input a graph that is overshot to an intermediate graph. The node features are then fed to the TransConv layers, which project them to a latent space. The latent features are only used to estimate edge probabilities for the counterfactual candidate. What I think you meant is whether the node embeddings are then injected on the nodes of the selected incident edges via the Gumbel + Bernoulli sampling. The short answer is "no". Thus, one can't trace how the embeddings are "transformed". However, a simple fix would do the trick: once we have the latent embeddings, we can add a decoder network - e.g., a learned GCN - to map the embeddings back to the input space, and train this network jointly with Eq. 12 by adding, for instance, a reconstruction loss term. In this way, we can see which embeddings contribute to the lowest difference (a desideratum of counterfactuality [1]) between the input and the counterfactual. By doing this, we obtain a sparsity ↓ (Sec. E) as follows for the original GIST (top row) and GIST with the embedding decoder (bottom row). If you think this is valuable, we can add it at camera ready, in the appendix.
| | AIDS | BAShapes | BBBP | BZR | COLORS-3 | ENZYMES | MSRC21 | PROTEINS |
|---|---|---|---|---|---|---|---|---|
| GIST | 2.07 | .82 | .81 | .63 | 1.76 | .96 | .77 | 1.46 |
| GIST + decoder | 1.36 | .63 | .50 | .41 | .97 | .74 | .38 | .83 |
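For context on the Gumbel + Bernoulli edge sampling mentioned in W3, a generic binary Gumbel-softmax (logistic) relaxation for edge probabilities looks roughly like the following sketch (our own illustration with assumed names; not GIST's implementation):

```python
import numpy as np

def gumbel_sigmoid(logits, tau=1.0, seed=None):
    """Relaxed Bernoulli sampling of edges from edge log-odds.

    logits: (n, n) array of edge log-odds derived from latent node embeddings.
    Returns soft edge probabilities in (0, 1) plus hard 0/1 edges obtained by
    straight-through-style thresholding.
    """
    rng = np.random.default_rng(seed)
    # Binary Gumbel-softmax: add standard logistic noise (difference of two
    # Gumbels), then squash with a temperature-scaled sigmoid.
    u = rng.uniform(1e-9, 1 - 1e-9, size=logits.shape)
    noise = np.log(u) - np.log1p(-u)
    soft = 1.0 / (1.0 + np.exp(-(logits + noise) / tau))
    hard = (soft > 0.5).astype(float)  # discrete edges for the candidate graph
    return soft, hard

logits = np.zeros((4, 4))              # uniform edge odds, for illustration
soft, hard = gumbel_sigmoid(logits, tau=0.5, seed=0)
```

Lower temperatures `tau` push the soft samples toward 0/1, which is what makes the relaxation usable for discrete edge selection while keeping gradients during training.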
W4: Community & IMDB-M: Community was synthetically generated ad hoc in CLEAR. This is why we can't reproduce the same dataset as in that paper. So, we opted for TreeCycles and BAShapes to cover RSGG-CE [2] and CF-GNNExp. [3], which were already supported in the GRETEL framework [4]. The results for IMDB-M on 5-fold cross-validation are below. Note that we ran CLEAR from scratch and the reported validity (0.45) isn't the one reported in the original paper (0.96). GIST is the best in terms of validity and fidelity w.r.t. a 3-layer GCN whose test accuracy is 48%. Here, we use the version of GIST without node embeddings for consistency with what is reported for the other datasets in the paper (Sec. G.2).
| | GIST | iRand | CF | CLEAR | CF-GNNExp. | RSGG-CE |
|---|---|---|---|---|---|---|
| Validity | 0.87 | 0 | 0.71 | 0.45 | 0.67 | 0.69 |
| Fidelity | 0.17 | - | 0.09 | 0.05 | 0.17 | 0.09 |
iRand doesn't produce any valid counterfactuals, hence its fidelity cannot be measured. Also, all the methods have a low fidelity due to the oracle's "horrible" classification skills. We tried UGFormer [5], as the best SoTA in IMDB-M, however we got a test accuracy of 33% with the same hyperparams as in the original paper, instead of the reported 89.2% on paperswithcode.
[1] Wachter et al. Counterfactual explanations without opening the black box: Automated decisions and the GDPR.
[2] Prado-Romero et al. Robust stochastic graph generator for counterfactual explanations. AAAI'24
[3] Lucic et al. CF-GNNExplainer: Counterfactual explanations for graph neural networks. AISTATS'22
[4] Prado-Romero & Stilo. GRETEL: Graph counterfactual explanation evaluation framework. CIKM'22
[5] Nguyen et al. Universal graph transformer self-attention networks. WWW'22
The authors present a new method for generating counterfactual explanations for Graph Neural Networks (GNNs) based on an adaptation of neural style transfer. They then establish some theoretical results for the well-foundedness of their approach before presenting their method to learn the style transfer objective. Finally, the authors benchmark their method on synthetic and real-world datasets against current baselines.
Questions for the Authors
- Definition 4.1: I am puzzled about your definition of style and content. How do you motivate using the Laplacian of graphs to define style?
- Definition 4.1: Why not swap the roles, i.e., take the style from one graph and the content from the other?
- Part 5.1 and Appendix C: for the parameters of the other explainers, wouldn't a fair comparison try to optimize their parameters for validity?
- I am skeptical about your choice of metrics; wouldn't simply returning the dataset counterfactual achieve perfect validity and very good fidelity? Then what is the point of the analysis in 5.2?
- Part 5, Figure 4: Why are there so many zeros in those graphs? Is this related to the difference in the sizes of the two graphs?
- Part 5: Can you give some spectral analysis for your chosen parameter setting? This choice of parameter may make it harder for your method to perform according to the theoretical results.
- Part 5: How is your method faring compared to the baselines w.r.t. a similarity metric (e.g., GED)?
- I think an interesting direction could be to look into incorporating the GNN prediction to adjust the alpha parameter, hence actually balancing the similarity and validity metrics in your objective. Have you considered such improvements?
Claims and Evidence
The claims presented are supported by clear and convincing evidence.
Methods and Evaluation Criteria
The method, datasets, and baselines make sense for the problem studied. The baselines and datasets are also relevant. However, I am not convinced by the choice of metrics, as the comparison with the baselines does not seem entirely fair. You should include a measure of similarity, see [1].
[1] Counterfactual explanations and how to find them: literature review and benchmarking. Riccardo Guidotti, 2022.
Theoretical Claims
I have checked the correctness of all the proofs of the lemmas and theorems in the paper.
Experimental Design and Analysis
The experimental design is sound, and the analysis of part 5.2 and 5.3 is correct, although incomplete in my opinion, as a similarity metric is missing, which is key for counterfactual explanation (see essential references not discussed point).
Supplementary Material
I have reviewed Appendix A to F.
Relation to Existing Literature
The work seems interesting, although it is difficult to compare their method with the chosen baselines, since it focuses primarily on the validity metric.
Essential References Not Discussed
The similarity metric was not used to assess the method against the proposed baselines, as described in [1].
[1] Counterfactual explanations and how to find them: literature review and benchmarking. Riccardo Guidotti, 2022.
Other Strengths and Weaknesses
Strengths:
- The idea is interesting.
- The writing is good.
- The experiments are extensive and thorough.
Weaknesses:
- In my opinion, the modest theoretical insights do not really add any interpretability to the counterfactual explanations. They do, however, show that you can interpolate between two graphs.
- A similarity metric should be used to compare the baselines, and ideally would be addressed in the objective.
- Some weaknesses and limitations of the proposed method should be discussed in the paper.
Other Comments or Suggestions
- In Part 4.1, why do you assume that the Laplacian matrices commute? I know that in the proof of Theorem 4.4 you separate the commuting/non-commuting cases, but reusing the notation in Part 4.1 is confusing.
- Put "BCE" in Equation 6.
- Part 5: Clarify that you are backtracking from a dataset counterfactual.
- Part 5.3: in my opinion this part is too long. It echoes the basic interpolation results from 4.1, but it provides little in the characterization of counterfactual explanations.
- The proof of Weyl's inequality in the appendix is unnecessary.
- Overall, the proofs in Appendix A.1 should be much shorter (from 2 pages to 1/2 page); this is very basic linear algebra.
- Appendix A.2 is also very slow for an obvious property; only step 5 is necessary.
- Appendix A.3: another direct, simple application of Weyl's inequality; this is also very slow. Note that step 1 is poorly formulated.
- Appendix A.4 should be a one-liner.
- Overall, the theoretical insights claim appears weak. You are just interpolating between two graphs. The analysis proves that, indeed, your framework interpolates between two graphs in terms of eigenvalues and norm. How does it relate to counterfactual explanation?
- Appendix 2: I don't understand Equation (50). What is the minimum of a shuffled set? Also, one of the symbols is not defined.
Oh, this is awesome. Thanks for reading the paper and not using an LLM to generate your review :-)
Your suggestions will improve our paper. The proofs will be much shorter, and the clarifications on counterfactuality more thorough. Sec. 5.3 contains the two most important evaluation metrics as per [1]. We'll move the GED (now in Sec. G.1) into the main paper.
Just interpolation; no insights for counterfactuals: While the literature advocates for "hard-core" generative models, GIST learns to backtrack from a dataset example toward the input. We agree that the main contribution is interpolation, but its applicability is suitable for counterfactuality, as shown in the experiments. Interpolating the input and the dataset example guarantees the counterfactual is in-distribution, which promotes plausibility [2]. We'll clarify this better at camera ready, and measure plausibility.
Why Laplacian?: Laplacians capture global structural patterns, e.g., connectivity and symmetry, that are largely invariant to specific node identities. This follows a similar rationale to neural style transfer in images, where Gram matrices of feature activations are used to model style since they encode correlation patterns among features rather than spatial arrangements. Additionally, using the Laplacian aligns with prior work in spectral graph theory, where the eigenvalues and eigenvectors of the Laplacian are shown to be robust descriptors of global structure, and have been used in graph matching [3] and generation [4].
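The node-identity invariance claimed above can be checked directly: relabeling nodes permutes the adjacency matrix but leaves the Laplacian spectrum (the "style") untouched. A minimal NumPy sketch of this sanity check (our own illustration, not the paper's code):

```python
import numpy as np

def laplacian_spectrum(adj):
    """Eigenvalues of the combinatorial Laplacian L = D - A."""
    deg = np.diag(adj.sum(axis=1))
    return np.sort(np.linalg.eigvalsh(deg - adj))

def cycle_adj(n):
    """Adjacency matrix of an n-node cycle graph."""
    a = np.zeros((n, n))
    for i in range(n):
        a[i, (i + 1) % n] = a[(i + 1) % n, i] = 1.0
    return a

# Permuting node labels changes A but not the spectrum:
a = cycle_adj(6)
perm = np.random.permutation(6)
a_perm = a[np.ix_(perm, perm)]
assert np.allclose(laplacian_spectrum(a), laplacian_spectrum(a_perm))
```

This is exactly the property that makes the spectrum a style descriptor: it encodes global structure (connectivity, symmetry) rather than which particular node carries which index.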
Eq. 50?: (we missed this). The minimum was meant to emulate a for-loop over the shuffled set, i.e., we take its first element. In hindsight, this notation is also confusing to us. We'll rewrite it explicitly and define the missing symbol at camera ready.
Similarity metric, objective func and GED: By using the input as a pulling factor, governed by the interpolation factor, to produce the counterfactual candidate, the similarity is already addressed in the objective, although not explicitly as in the desideratum in [2]. In principle, the more the candidate moves toward the input, the lower the edit distance should be, as well as the validity (Fig. 3). We do report GED (Fig. 8, Tab. 5-12): GIST is better than CF-GNNExp (our main competitor in validity), though it's not the best across the board due to our choice of the interpolation factor. A lower value would improve similarity at the cost of validity, a trade-off we make explicit.
SoTA hyperparams: We used the hyperparams reported in their original papers. We could optimize them for validity, but this would require months of optimization. We can optimize them for one dataset (e.g., BAShapes) and use the same values for the rest. Do you think this is a good compromise?
Sec. 5.2 and the degenerate setting: We aim to balance validity and similarity. Simply returning the dataset counterfactual would ignore the counterfactual signal, leading to perfect validity but poor GED, since the backtracking term becomes zero and no learning happens. Our formulation indirectly encourages similarity through the interpolation factor, which controls how much the input influences the candidate. The analysis in Sec. 5.2 is necessary to demonstrate how this trade-off affects GED, even if it's not an explicit loss term.
Fig. 4: The zeros are indeed paddings used to compute the spectral differences, which require the adjacency and degree matrices to have the same dimensions. To account for different graph sizes, we could use the Wasserstein distance between the eigenvalue distributions instead of L1. This would need further theoretical investigation. We'll clarify this in the figure's caption.
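The padding-free alternative mentioned here could look like the following sketch (illustrative only; the function names are ours, not GIST's):

```python
import numpy as np
from scipy.stats import wasserstein_distance

def spectrum(adj):
    """Sorted eigenvalues of the combinatorial Laplacian L = D - A."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return np.sort(np.linalg.eigvalsh(lap))

def padded_l1(s1, s2):
    """Current approach: zero-pad the shorter spectrum, then take L1."""
    n = max(len(s1), len(s2))
    p1 = np.pad(s1, (0, n - len(s1)))
    p2 = np.pad(s2, (0, n - len(s2)))
    return float(np.abs(p1 - p2).sum())

def spectral_wasserstein(s1, s2):
    """Alternative: compare the eigenvalue distributions directly, no padding."""
    return wasserstein_distance(s1, s2)
```

The Wasserstein variant accepts spectra of different lengths natively, so graphs of different sizes can be compared without the artificial zeros visible in Fig. 4.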
Spectral analysis for the chosen interpolation factor: Your intuition is right; at this setting, GIST struggles between preserving content and matching style. We computed the Frobenius norms (expected vs. produced) as in Fig. 6 on AIDS and get an error of 0.537, two orders of magnitude higher than with the default value. We also emulate Fig. 5 and obtain an error of 0.013 instead of 0.005. Lastly, for Fig. 4 we have an error of 0.043, again higher than the original. Unfortunately, we can't show images here, but we'll include this analysis in the appendix and mention it in a new limitations section.
Switch content and style: Taking the style from the overshot graph and the content from the input would attempt to generate a graph structurally similar to a perturbed counterfactual but semantically identical to the original, which defeats the purpose of generating counterfactuals, since semantics usually drive the oracle's prediction. GIST is designed to minimally deviate from the input structurally, but still flip the class as the overshot graph does.
[1] Prado-Romero et al. A survey on graph counterfactual explanations: definitions, methods, evaluation, and research challenges. ACM CSUR'24
[2] Guidotti. Counterfactual explanations and how to find them: literature review and benchmarking.
[3] Yan et al. A short survey of recent advances in graph matching. ICMR'16
[4] Dwivedi et al. Benchmarking Graph Neural Networks. JMLR'23
This paper introduces Graph Inverse Style Transfer (GIST), a novel framework for counterfactual explainability in graph neural networks (GNNs). Unlike traditional forward perturbation-based counterfactual methods, GIST employs a backtracking mechanism inspired by style transfer in vision. By first overshooting a decision boundary and then refining the counterfactual graph to align with the original graph’s spectral properties, GIST aims to generate semantically valid and structurally faithful counterfactuals. The method is evaluated on eight benchmark datasets, where it demonstrates an increase in counterfactual validity and improvement in fidelity compared to state-of-the-art approaches.
Update after rebuttal
Thanks to the authors for their response. Some of my concerns have been addressed; I intend to maintain my original score.
Questions for the Authors
- How scalable is GIST in large-scale real-world graphs?
- How does GIST compare to causal-based counterfactual approaches rather than just perturbation-based baselines?
- Have you considered potential applications in real-world case studies?
Claims and Evidence
Most claims in the submission are well-supported.
Methods and Evaluation Criteria
Yes, the method and evaluation are decently designed and suitable for the studied problem. But it would be beneficial to include more baselines (as this field has been well studied in the past few years).
Theoretical Claims
The provided theory makes sense at a high level.
Experimental Design and Analysis
The experiments could be improved with more classical and SOTA baselines, and by including more complex real-world graphs for in-depth evaluation. Adding further use case studies for qualitative assessment would also be beneficial.
Supplementary Material
Yes, I reviewed the supplementary material.
Relation to Existing Literature
The proposed work relates to general counterfactual explainability in GNNs.
Essential References Not Discussed
N/A
Other Strengths and Weaknesses
The paper introduces a novel and well-theorized counterfactual framework for GNNs. But there are some other potential concerns here to address:
- While the experimental results show improvements over baselines, a more detailed analysis of why GIST outperforms existing methods—beyond just numerical results—would make the contributions clearer.
- More specific user studies would be beneficial to include.
Other Comments or Suggestions
N/A
We thank you for the effort made to review our paper, and for the nice score you chose to give it. With the following, we hope to answer your questions, and convince you of the value of GIST.
W2: Specific user studies. We want to point out that the scope of this paper did not include user studies (indeed, none are present in the paper) to assess the goodness of the produced counterfactuals, since graph counterfactuals can be quite complicated to illustrate, especially when many edges/nodes are added/removed from the original instance (see W1 for a proposal of qualitative analysis).
W1: Detailed analysis of GIST vs. SoTA beyond numerical results. As per [3] - the most up-to-date survey in graph counterfactuality - all methods, besides those designed specifically for molecule explanations where illustrations are possible and very clean visually, use the quantitative metrics we used (see Sec. E & G.2). However, we found in the literature (e.g., [1,2] seem to be consistent in styling) an interesting and visually pleasing way to show the differences between the original instance and the produced counterfactual by illustrating their adjacency matrices. We will extend this way of visualizing also to node feature perturbations, since GIST supports this mechanism, unlike the above-cited papers. We believe this would add value to the quantitative measures we provided in the paper, and would appreciate your input in this regard.
Q1: How scalable is GIST in large-scale real-world graphs?
In Sec. F, we have a time complexity analysis that treats both densely- and sparsely-stored graphs. To reiterate, since GIST needs an eigenvalue decomposition of the Laplacian, for dense graphs the complexity is O(n³), where n is the number of nodes in a graph; for sparse graphs, it amounts to O(k(m + n)), where m is the number of edges and k is the number of eigenvalues to find. Since k is a constant, this amounts to O(m + n). Since our implementation uses sparsely-stored graphs, and Table 3 in Sec. D shows that most graphs are sparse anyway, the execution time is linear in the number of nodes and edges. Thus, scaling GIST to huge graphs is not expensive, even on real-world graphs, which are generally sparse.
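The sparse route can be sketched with SciPy's Lanczos solver (our own illustration with assumed helper names, not GIST's implementation): the per-iteration cost is dominated by sparse matrix-vector products, i.e., it scales with the number of edges rather than n³.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def sparse_laplacian(src, dst, n):
    """Combinatorial Laplacian L = D - A from an undirected edge list."""
    data = np.ones(len(src))
    a = sp.coo_matrix((data, (src, dst)), shape=(n, n))
    a = a + a.T  # symmetrize: each edge is stored once in (src, dst)
    d = sp.diags(np.asarray(a.sum(axis=1)).ravel())
    return (d - a).tocsc()

# A 50-node ring: m = n edges, i.e., a very sparse graph.
n = 50
src = np.arange(n)
dst = (src + 1) % n
lap = sparse_laplacian(src, dst, n)

# Lanczos (ARPACK) finds the k extremal eigenvalues without ever densifying
# the matrix; with constant k this is linear in nodes plus edges.
vals = np.sort(eigsh(lap, k=4, which='SA', return_eigenvectors=False))
```

Because the ring is connected, the smallest of the returned eigenvalues is (numerically) zero, matching the standard Laplacian property.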
Q2: GIST vs. causal-based explainers: To the best of our knowledge, only CLEAR [4] is a causal-based explainer. We included it in our experiments, and GIST consistently performs better across the board in validity (Tab. 1), fidelity (Tab. 2), and counterfactual similarity with the input in terms of Graph Edit Distance (Fig. 8). This is also because CLEAR, when producing a counterfactual candidate, outputs a fully-connected stochastic graph which it then needs to match to the input. Note that graph matching is an NP-hard problem, and the approximations involved undermine CLEAR's performance. GIST, on the other hand, is very simple and understandable: overshoot the oracle's decision boundary and backtrack via graph transformers.
Q3: Real-world applications. One compelling real-world application of GIST is drug repurposing. Suppose we have a known drug A that effectively treats one disease. Our goal is to discover a new drug B that treats a different target disease, potentially by modifying A. However, directly identifying B is often difficult and financially expensive. GIST could help by first exploring compounds that are far from A - those that may not initially preserve A's chemical properties but show efficacy against the target disease (albeit with more side effects). From there, GIST can iteratively "walk back" toward A, gradually optimizing for lower side effects while preserving therapeutic relevance to the target disease. This path - from distant molecular candidates back toward A - could lead to a novel compound B that is both effective and safer. We find this direction highly promising and plan to investigate it further. However, applying GIST to drug discovery requires close collaboration with chemists and pharmaceutical experts, as well as human-in-the-loop evaluation to ensure the physical plausibility of generated compounds. Such interdisciplinary work would also naturally address the need for domain-specific user studies, which is a point you raised in your original review.
[1] Prado-Romero et al. Robust stochastic graph generator for counterfactual explanations. AAAI'24 (page 15-16, Figs. 10-11 of the suppl. material on arXiv)
[2] Prenkaj et al. Unifying Evolution, Explanation, and Discernment: A Generative Approach for Dynamic Graph Counterfactuals. KDD'24 (Figs. 7-8)
[3] Prado-Romero et al. A survey on graph counterfactual explanations: definitions, methods, evaluation, and research challenges. ACM CSUR'24.
[4] Ma, et al. CLEAR: Generative counterfactual explanations on graphs. NeurIPS'22.
The reviewers generally acknowledged that the proposed approach was novel and reasonable.
However, the reviewers still felt that the paper lacked clarity on how and why the process (particularly selecting the intermediate graph and interpolating) is good for generating counterfactuals. This seems to be in part because it is unclear what a good counterfactual would be in different contexts. For example, what does a meaningful counterfactual mean in user networks? Is it based on outcome label changes (e.g., loan approval vs. rejection), demographic information, or other characteristics? This lack of clarity makes it difficult to evaluate the practical relevance of the method.
Also, the paper does not seem to properly address the interaction between node features and graph structure during counterfactual generation. Real-world counterfactuals often involve complex interactions where changes in node attributes (e.g., changing a student's gender) would naturally affect their connections (e.g., friendship based on different interests). This interdependence between node features and graph structure is a critical consideration in counterfactual generation that appears overlooked [1,2].
Given this, I recommend that the authors revise their paper to more clearly explain what makes a good graph counterfactual in different contexts and then connect that definition with their proposed method. Why will this process produce better counterfactuals w.r.t. the specifically desired qualities of a graph counterfactual?
[1] Ma, Jing, et al. "Learning fair node representations with graph counterfactual fairness." Proceedings of the fifteenth ACM international conference on web search and data mining. 2022.
[2] Ling, Hongyi, et al. "Counterfactual Fairness on Graphs: Augmentations, Hidden Confounders, and Identifiability." (2024).