Enhancing Graph Classification Robustness with Singular Pooling
We theoretically study the effect of the pooling operation on adversarial robustness, which allows us to introduce RS-Pool, a new pooling method capable of smoothing the effect of attacks.
Abstract
Reviews and Discussion
The paper investigates the adversarial robustness of Graph Neural Networks (GNNs), specifically examining how different flat pooling methods (sum, average, and max pooling) affect their resilience in graph classification tasks. The authors first provide a theoretical analysis of these pooling operations, deriving upper bounds on their adversarial risks and identifying distinct vulnerabilities depending on the type of adversarial attack and graph structure.
Motivated by this analysis, the authors propose a novel pooling strategy named Robust Singular Pooling (RS-Pool). This method constructs robust graph-level representations by leveraging the dominant singular vector of the node embedding matrix. RS-Pool is theoretically shown to enhance robustness by minimizing the sensitivity to adversarial perturbations through spectral stability properties. The proposed pooling approach is efficient, model-agnostic, and practically implemented using the power iteration method. The paper highlights the critical yet overlooked role pooling plays in the robustness of graph classification tasks and introduces a practical and effective approach to improving robustness via pooling.
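For concreteness, the core pooling step can be sketched as follows (illustrative code with assumed names, not the authors' implementation):

```python
import torch

def rs_pool(H: torch.Tensor, num_iters: int = 10) -> torch.Tensor:
    """Pool node embeddings H (n x d) via the dominant right singular
    vector of H, estimated with power iteration instead of a full SVD."""
    v = torch.randn(H.shape[1], device=H.device)
    v = v / v.norm()
    for _ in range(num_iters):
        v = H.T @ (H @ v)        # one power-iteration step on H^T H
        v = v / v.norm()
    sigma1 = (H @ v).norm()      # dominant singular value
    return sigma1 * v            # d-dimensional graph-level readout
```

(The sign of the singular vector is only determined up to ±1, so a deployed implementation would additionally need a consistent sign convention.)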
Strengths and Weaknesses
Strengths
The paper provides a thorough theoretical treatment of adversarial robustness, establishing explicit bounds for standard pooling methods and the proposed RS-Pool. Such clear theoretical grounding significantly strengthens the paper’s contribution. The empirical results are robust and cover a diverse range of datasets (bioinformatics, molecular, social, and image-based graphs), enhancing the credibility of the claimed benefits of RS-Pool.
Given the increasing adoption of GNNs in critical applications (e.g., drug discovery, cybersecurity), enhancing robustness against adversarial attacks is highly impactful. Addressing the overlooked influence of pooling operations rather than focusing solely on message-passing layers introduces an important new dimension to the robustness discussion.
The paper is methodically organized, making complex theoretical derivations accessible. Definitions and results are clearly stated, and the flow from theoretical insights to experimental validation is logical and easy to follow.
Weaknesses (only minor, addressed during rebuttal)
The paper does not consider or discuss malicious bit-flip attacks (BFAs), a relevant and increasingly studied form of adversarial perturbation on GNNs (see references below). BFAs are not acknowledged or reviewed within the related work section, limiting the paper's comprehensiveness regarding current adversarial threats.
Conclusion
I think this is a very strong paper. The proposed pooling method has a solid theoretical foundation and comes with proven guarantees regarding its adversarial risk, yet it remains technically simple, efficient, and elegant. The write-up is exceptionally clear. I enjoyed reading this paper a lot and consider it inspiring and highly relevant to my own line of work. I would be genuinely surprised if it were not accepted, and I would attend the talk if it were accepted as an oral.
References
Kummer et al., "On the Relationship Between Robustness and Expressivity of Graph Neural Networks", AISTATS, 2025
Kummer et al., "Attacking Graph Neural Networks with Bit Flips: Weisfeiler and Leman Go Indifferent", SIGKDD, 2024
Wu et al., "Securing Graph Neural Networks in MLaaS: A Comprehensive Realization of Query-based Integrity Verification", IEEE Symposium on Security and Privacy (SP), 2024
Jiao et al., "PyGFI: Analyzing and Enhancing Robustness of Graph Neural Networks Against Hardware Errors", CoRR abs/2212.03475, 2022
Questions
The authors should address the mentioned weakness above. Specifically, since the theoretical framework and method proposed by the authors could potentially be adapted to also address BFAs on GNNs, I suggest that they briefly mention BFAs in the related work section.
Limitations
Mostly. The authors should note in the "Limitations" paragraph that investigating the robustness of their method against attacks other than the discussed evasion methods (e.g., BFAs) remains an open research topic.
EDIT: Addressed during rebuttal.
Final Justification
My initial review of the paper was highly positive, and it remained so throughout the rebuttal phase. I have read the other reviewers' remarks as well as the authors' replies, and while I do agree with some minor technical issues the other reviewers point out, I find the paper as a whole -- the fundamentally novel theoretical perspective and thorough analysis, the proposed elegant practical solution, as well as the exceptional write-up -- still sufficient to strongly recommend acceptance. I genuinely believe this work will have high impact across several subdomains of graph learning with GNNs and make a significant contribution towards rendering GNNs safer and thus more applicable in real-world settings. This expectation of broader impact has been reinforced by the authors' preliminary experiments during the rebuttal, which showed that RS-Pool also provides favorable robustness against BFAs, an attack concept orthogonal to what was originally considered by the authors.
Formatting Concerns
I have no concerns.
We thank the reviewer for their review and for acknowledging the novelty of our theoretical analysis and the effectiveness of our proposed RS-Pool. We are very grateful for the kind comments about our manuscript, and we hope that it can help increase the robustness of GNNs, especially in the critical applications identified by the reviewer.
We additionally thank the reviewer for pointing out bit-flip attacks (BFAs), of which we were not aware. This is indeed a nice direction of research for adversarial attacks, very relevant in the case of GNNs, and especially for graph classification given the limited number of attacks focusing on this specific task. We will add a thorough discussion of this line of research to our related work section. We have additionally found an implementation of this specific attack in the context of GNNs that covers the TU Dataset family, on which we base our experimental setup. We will accordingly run these experiments and provide them in the updated version of our manuscript.
Thanks for your reply! I would very much appreciate empirical results regarding the effectiveness of RS-Pool against BFAs, and I would expect that it alleviates their adverse effects as well, at least to some degree. For the experiments, I would recommend not quantizing the networks and instead adapting the BFA to floating-point precision, as that is (at least from the defender's perspective) the more relevant and difficult setting. BFA literature developing new attacks usually looks at quantized models as they are inherently more difficult to degrade for numerical reasons, but defenders - in my opinion - should always assume a more vulnerable floating-point model.
Looking forward to your results!
We thank the reviewer again for their positive evaluation and are grateful for the additional insightful feedback on our manuscript. We also appreciate their introduction of bit-flip attacks, which we had not previously considered, and their guidance on how to practically incorporate such a comparison into our work.
Following their recommendations, we conducted a preliminary study by adapting the publicly provided code from the following paper [1]. Specifically, we modified the Injectivity Bit Flip Attack (mainly by editing the “IBFA.py” implementation, with the corresponding edits to “BFABase.py”) to remove all quantization operations and run directly on our floating-point model on the considered TU datasets. We followed our original experimental setup, using the same training and test folds as described in our manuscript; we considered the GIN model and set the attack parameter to 5 (similar to the original authors' chosen parameters for the OGB-molHIV molecular dataset). The table below provides the results of our initial analysis and adaptation. Under these conditions, and as expected by the reviewer, our RS-Pool outperforms all the other pooling benchmarks.
| Dataset | Sum | Average | Max | SAG | TopK-P | PAN-P | Sort-P | RS-Pool |
|---|---|---|---|---|---|---|---|---|
| NCI1 | 64.7 ± 1.9 | 62.2 ± 1.9 | 65.9 ± 1.3 | 65.2 ± 1.3 | 61.2 ± 1.7 | 59.2 ± 1.7 | 63.5 ± 1.7 | 66.0 ± 1.6 |
| DD | 59.8 ± 4.1 | 61.8 ± 2.7 | 63.6 ± 2.6 | 63.1 ± 1.4 | 58.4 ± 2.1 | 56.2 ± 1.5 | 56.8 ± 2.5 | 68.3 ± 1.0 |
| ER_MD | 55.2 ± 2.5 | 52.3 ± 1.8 | 51.1 ± 1.8 | 52.4 ± 2.0 | 52.9 ± 2.7 | 52.2 ± 1.1 | 51.9 ± 2.7 | 55.5 ± 2.5 |
| PROTEINS | 57.0 ± 5.8 | 64.2 ± 1.5 | 64.8 ± 2.7 | 63.7 ± 0.9 | 61.2 ± 5.0 | 60.0 ± 4.9 | 66.7 ± 4.7 | 69.6 ± 1.1 |
We agree that a broader comparison will strengthen our claims. In the full revision, we will evaluate additional architectures (e.g. GCN), incorporate other BFA variants (RandomBFA and ProgressiveBFA), and report results across all datasets considered in our manuscript. We will also include our adapted attack scripts in the supplementary material to ensure full reproducibility. We thank the reviewer again for suggesting this direction, which further underscores the robustness advantages of RS-Pool. Please do not hesitate to let us know if you have any further comments. We are very grateful for your support.
[1] Attacking Graph Neural Networks with Bit Flips: Weisfeiler and Leman Go Indifferent. - Kummer et al. SIGKDD. 2024.
Thanks a lot for the additional results! I think, though, they may have implications way beyond RS-Pool itself, as they suggest a potential relationship between BFAs and attacks using adversarial inputs.
Regarding your methodology, how exactly did you adapt the BFA from [1] to operate on floating point parameters? Did you allow bit flips along the whole floating point semantic (exponent, mantissa, sign) or did you implement any restrictions?
In any case, I think the relationship between pooling and BFA resilience is very important and interesting. This question is of course way beyond the scope of the paper under review here, but I think it might even be possible to formally show that different pooling methods change the formal bounds for GNN resilience to BFAs published in [2].
I will reach out to you for further discussions after the double-blind review is over (i.e., either at the conference directly or once the proceedings are published).
[1] Attacking Graph Neural Networks with Bit Flips: Weisfeiler and Leman Go Indifferent. - Kummer et al. SIGKDD. 2024.
[2] On the Relationship Between Robustness and Expressivity of Graph Neural Networks. - Kummer et al. AISTATS. 2025.
We sincerely thank the reviewer for the dedicated time and the further interesting discussion.
In our initial study, we adapted the Injectivity Bit Flip Attack to operate directly on the 32-bit floats by viewing each weight tensor as an int32 bit pattern. In practice, the “flip_bit” function has been edited such that:
- it reinterprets “module.weight.data” as an np.int32 array (which is simply a list of integer bit patterns that can be used directly for the different bitwise operations);
- it selects the top-k entries by gradient magnitude (similar to the original implementation); instead of flipping a quantized bit, we now flip a specific bit position in the 32-bit float, by default the most significant mantissa bit;
- it restores everything back to float32 (first through np.float32 and then to torch.FloatTensor) so that the downstream computations can be performed. Overall, from the model's perspective, only a single bit in each chosen weight has changed, and the downstream computations see exactly those tiny, targeted perturbations to the floating-point parameters.
So in this case, we focus on the mantissa rather than allowing arbitrary flips across all bits. This perspective seemed sufficient for an initial study of the effect of BFAs on our proposed RS-Pool. We plan to extend the attack mechanism to other bit positions in our next evaluation, and we will provide all the experimental details both in the revised manuscript and in the provided code.
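For clarity, the core of this mechanism can be sketched as follows (the helper name and exact tensor handling below are illustrative, not our verbatim edits to the attack code):

```python
import numpy as np
import torch

def flip_msb_mantissa(weight: torch.Tensor, idx: int) -> None:
    """Flip the most significant mantissa bit (bit 22 of an IEEE-754
    float32) of the weight at flat index idx, in place."""
    flat = weight.detach().cpu().numpy().copy().ravel()  # float32 buffer
    bits = flat.view(np.int32)          # reinterpret the same bytes as ints
    bits[idx] ^= np.int32(1 << 22)      # toggle the mantissa's top bit
    weight.data.copy_(torch.from_numpy(flat).reshape(weight.shape))
```

In the attack loop, idx is chosen among the top-k entries by gradient magnitude, as in the original IBFA selection step.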
We agree with the reviewer that these results are likely to have interesting implications beyond RS-Pool itself (though we also agree that these implications are best left outside the scope of the submitted manuscript). Indeed, pooling onto the dominant singular vector seems like a good idea for these kinds of attacks too, and we are grateful to the reviewer for having given us the opportunity to explore this.
Overall, we agree with the reviewer's perspective on the interaction between the general pooling operation (including RS-Pool) and bit-flip attacks, and we would be very interested in pursuing such a direction (specifically, adapting the theoretical analysis of the provided reference [2] to take the pooling operation into account). We thank the reviewer again for this very interesting perspective and suggested direction, and we would be happy to discuss it further.
This paper investigates the adversarial robustness of Graph Neural Networks (GNNs) in graph classification tasks, focusing on the role of pooling operations. The authors present a theoretical analysis of standard flat pooling methods (sum, average, max). Motivated by these insights, they propose Robust Singular Pooling (RS-Pool), a novel pooling strategy that leverages the dominant singular vector of node embeddings to construct robust graph-level representations. Theoretical analysis shows that RS-Pool improves robustness by filtering noise-sensitive components, and empirical results on real-world benchmarks demonstrate its effectiveness against state-of-the-art adversarial attacks while maintaining competitive clean accuracy.
Strengths and Weaknesses
Strengths: The paper addresses a barely explored aspect of GNN robustness by focusing on pooling operations, providing both theoretical and empirical insights. The proposed method is model-agnostic and computationally efficient (using power iteration).
Weakness: Your attack settings are based on both features and topology, but the main results only consider feature-based attacks. Moreover, Theorems 4.2 and 4.3 appear to be simply extensions of Theorem 4 in [10] to the task of graph classification.
Questions
1. While the threat model allows perturbations of both the node features and the graph topology, the proof of Theorem 4.2 (line 553) only bounds the distance between the clean and perturbed feature matrices. Is this because the graph topology remains unchanged? Your settings and proofs are unclear in places, e.g., the punctuation (line 632) and the notation used for matrix and vector norms.
2. In Theorem 5.1, it seems that the values of $\tau$ and $\sigma_1 - \sigma_2$ are both small, and the bound still depends on the perturbation budget $\epsilon$. So, from a theoretical point of view, what is the advantage of the strategy proposed in this paper compared with the three strategies covered by Theorem 4.2?
3. Theorem 5.1 assumes that $\sigma_1 > \sigma_2$, but in practice the principal singular value may have multiplicity greater than one (e.g., for symmetric graph structures). Could you elaborate on this special case?
Limitations
See weakness and questions.
Final Justification
The rebuttal resolves my concerns about the overall attack framework and the meaning of Theorem 5.1. As the authors promise to include the related discussions and additional experiments in the manuscript, I think the paper can be accepted.
Formatting Concerns
No
We appreciate the reviewer’s time and effort in reviewing our paper and their recognition of the significance of studying the effect of pooling on adversarial robustness. In what follows, we address the raised questions and weakness:
Q1 - On the distance in the proof of Theorem 4.2 and the punctuation:
We apologize for the confusion in this part. Indeed, in Theorem 4.2 we consider feature-based adversarial attacks, which perturb only the node features; the topology therefore remains unaltered. Hence, the distance (in terms of the matrix norm) between the clean and perturbed feature matrices is bounded by the budget $\epsilon$. We will improve the writing clarity, fix the punctuation, and more clearly distinguish matrix and vector norms accordingly.
Q2 - On the derived bound in Theorem 5.1:
As noted by the reviewer, the bounds derived in Theorem 5.1 and Theorem 4.2 share several terms that depend on the weight norms, the message-passing framework, and the effect of the structure (expressed in terms of the walks within the graph). The distinctive contribution and main advantage of RS-Pool become more apparent in Corollary 5.2, where the bound is further refined to expose the influence of the hyperparameter $\tau$. This parameter, unique to RS-Pool, offers user control over the robustness-accuracy trade-off, allowing practitioners to directly control the upper bound on the adversarial risk, and consequently the model's underlying robustness. This is not possible with the classical flat pooling approaches considered in Theorem 4.2.
As highlighted in lines 286–287, a key insight is that RS-Pool functions as a robustness stabilizer, even when the preceding message-passing layers are not inherently robust. Unlike traditional pooling methods, which aggregate over potentially perturbed final node embeddings, our proposed pooling focuses on the dominant singular vector, thereby keeping the dominant, low-frequency signal and attenuating the high-frequency components where adversarial perturbations typically arise [1]. This filtering contributes directly to the tightening of the bound and the improved empirical robustness demonstrated in our experiments. We will clarify this point further in the revised version of our manuscript.
[1] A Frequency Perspective of Adversarial Robustness. Maiya et al. arXiv. 2021.
Q3 - On the special case of singular value degeneracy: We thank the reviewer for pointing out this special case. We note that RS-Pool is applied to the final node embedding matrix produced by the message-passing scheme. At this stage, having a dominant singular value with high multiplicity is highly unlikely, even in symmetric graphs, because the learned embeddings do not depend just on the graph structure but also on initial node features and the GNN parameters, which typically break such dominant symmetries. Moreover, GNNs (and actually also Transformers) are known for their inherent smoothing tendencies, which lead to rank-reduction in the feature matrix and result in a non-trivial spectral gap in practice [1, 2, 3, 4, 5].
Nonetheless, as pointed out by the reviewer, we agree that the provided upper bound does not cover this specific case, but an extension of Theorem 5.1 to the case where the dominant singular value has multiplicity greater than one is immediate. Specifically, following Wedin's theorem, which is the main basis of our proof, extending to a dominant singular value of multiplicity $r$ results in an additional multiplicity-dependent scalar in the derived upper bound, and the relevant spectral gap becomes $\sigma_1 - \sigma_{r+1}$. While such a special case is unlikely to arise in practice, we will provide the details of this derivation in the revised manuscript for completeness.
[1] Anisotropy Is Inherent to Self-Attention in Transformers.- Godey et al. - EACL 2024.
[2] How do vision transformers work? - Park N. & Kim S. - ICLR 2022.
[3] Anti-oversmoothing in deep vision transformers via the fourier domain analysis: From theory to practice. - Wang et al. - ICLR 2022.
[4] Attention is not all you need: Pure attention loses rank doubly exponentially with depth. - Dong et al. - ICML 2021.
[5] How expressive are transformers in spectral domain for graphs? - Bastos et al. - TMLR 2022.
I thank the authors for their detailed rebuttal and for answering some of my questions. I am specifically grateful for the clarification on the special case and the insights of Theorem 5.1.
However, the unified attack settings defined in line 163, along with the main theorem grounded in node attacks and experiments centered on structural attacks, create confusion for readers and necessitate further revisions.
Thus, I have no intention of altering the score.
We thank the reviewer for the time allocated to our rebuttal and for engaging in the discussion. We are happy that our rebuttal has answered some of your questions and concerns regarding Theorem 5.1.
We also believe there has been a misunderstanding. In the response, Line 163 is referenced as “unified attack settings”. However, Line 163 does not appear to be related to our current conversation (“However, this step can introduce information loss, prompting the development of various pooling strategies”), and there has been no mention of a “unified attack setting” in our paper or in the subsequent discussion.
We discussed the fact that our theoretical study concerns node feature attacks, while the experiments focus on graph structure attacks, with Reviewer RqyR. This has allowed us to extend our experimental evaluation to feature-based attacks (see the reply "Q2 & W2 - Regarding feature-based attacks", specifically Table 1, to Reviewer RqyR). In agreement with Reviewer RqyR, we believe that these additional results, our accompanying explanation, and the promise to include this clarification in the manuscript resolve any ambiguity about the types of attacks that we study.
Please let us know if any confusion remains; we are more than happy to further explain any aspect of our work.
I thank the authors for the further clarification. To elaborate, I place greater emphasis on the unity and completeness of the paper—specifically, the consistency between definitions, theoretical results, and experiments.
Having reviewed the additional experiments on feature-based attacks, I find they address some of my confusions. However, I remain concerned about the theoretical analysis on structural attacks. I think an additional analysis of this aspect would enhance the paper’s coherence and strengthen its overall presentation.
We thank the reviewer for their continued engagement and constructive comments during the discussion phase. We are glad that we managed to clarify some of the confusion.
We would like to take this opportunity to clarify the final elements regarding the scope of our paper and structural versus node-feature-based attacks. The main purpose of our work is to understand the effect of the pooling operation on the robustness of an underlying graph classifier (to our knowledge, the first work to consider this connection). We hence upper bound the expected robustness of different flat pooling families and of our proposed RS-Pool under node feature perturbations. Node feature perturbations directly impact the final node representations produced via message passing (as do structure-based attacks, as empirically validated by the additional feature-based results). Providing theoretical results for only one kind of perturbation is common in the graph adversarial literature: e.g., [1-3] focus on structural perturbations, while [4-7] focus on node-feature-based attacks. In practice, such a theoretical focus does not preclude empirical generalization of the proposed method to the other, non-considered attacks (for RS-Pool, we empirically show that the provided insights extend to both structural and feature-based attacks).
We hope the reviewer agrees that, within this defined scope and the set of theoretical choices we have made, our paper (after the promised rebuttal edits) offers a principled theoretical foundation on the effect of pooling on robustness, a practical method (RS-Pool), and comprehensive empirical support, which includes node-feature-based attacks and an extension to structural perturbations. We appreciate and are grateful for the reviewer's suggestions, which enhanced the clarity and consistency of our manuscript.
—
[1] Robust Graph Convolutional Networks Against Adversarial Attacks. - Zhu et al. - KDD 2019.
[2] Robustness of Graph Neural Networks at Scale. - Geisler et al. - NeurIPS 2021.
[3] Adversarial Training for Graph Neural Networks: Pitfalls, Solutions, and New Directions. - Gosch et al. - NeurIPS 2023.
[4] Graph Neural Networks with Adaptive Residual. - Liu et al. - NeurIPS 2021.
[5] Node Feature Kernels Increase Graph Convolutional Network Robustness. - Seddik et al. - AISTATS 2022.
[6] Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks. - Abbahaddou et al. - ICLR 2024.
[7] Randomized Message-Interception Smoothing: Gray-box Certificates for Graph Neural Networks. - Scholten et al. - NeurIPS 2022.
Thanks for the further discussions, which dispel my concerns to a certain extent. While I think there are still some elements missing from the manuscript, the overall contribution is very useful and can be of some value to the community. Given the authors' rebuttal, I increase my score to 4.
We thank the reviewer for the interesting discussion and their engagement in reviewing and discussing our work. Your recognition of the value of our work to the community means a lot, and we appreciate your positive score. We will carefully incorporate the above discussions into the revised paper, and we thank you again for the suggestions that enhanced the quality of our manuscript.
The paper first defines average-case robustness as the l2 distance in the output space between the clean and perturbed graphs, averaged over all perturbations within a given distance. It then derives the effect of different pooling operators on this quantity. Next, the authors propose a new pooling method, RS-Pool, that projects the final node embeddings onto the first right singular vector of the embedding matrix. They also derive an upper bound on the "adversarial" risk for this pooling and argue that it is more favorable.
Strengths and Weaknesses
The proposed approach is scalable, easy to implement, and well motivated. The parameter $\tau$ allows for some control over the trade-off between clean accuracy and robustness. The focus on the pooling aspect is very interesting and understudied in the literature. The theoretical results seem correct, but I did not check the proofs in detail.
One of my main problems with the paper is the mismatch between the actual and the claimed results. This shows up in several different places:
- As the authors already mention, they analyse the average case and not the worst case. Calling this adversarial robustness is highly misleading and not in line with the existing terminology in the literature. To be clear, I think the average-case results are interesting and useful, but we should not be calling this adversarial. Relatedly, the statement "(ϵ, γ)-robust in the average case sense, it is also (ϵ, γ)-robust under the worst case criterion" is not quite accurate. The worst-case criterion would presumably have a $\sup$ instead of an expectation in Eq. 1; just because we have a low value on average doesn't mean we have a low value in the worst case.
- When they set up the threat model, the authors define perturbations w.r.t. both the adjacency matrix and the node features; however, the theorems say "Under a feature-based adversarial attack". Even more confusingly, in the experiments the authors state "For all attacks, we apply a perturbation budget of ϵ = 0.3, allowing up to 30% of the edges in each graph to be modified.". However, as far as I understood, Theorem 5.1 and Theorem 4.2 only apply to feature-based perturbations. This discrepancy is not ideal (see question 2). Moreover, the threat model looks at the L2 norm of the adjacency matrix, which is different from the number of perturbed edges (L0 norm).
The experimental evaluation is limited. For example, Table 1 shows results for a single attack budget, which is relatively large. While they consider different budgets in Figure 1, this should be extended to the rest of the "main" experiments.
Some relevant works are not (properly) discussed. For example, while [1, 2] are not exactly addressing the same problem, they are relevant enough to warrant a short discussion. Moreover, standard (non-adversarial) generalisation bounds for GNNs often have terms that depend on the product of (spectral) norms of the weights, similar to the expression derived in this paper. Showing how exactly robustness changes the bounds would be valuable. See e.g. [3] and related references.
You should label the y-axis in Figure 1 for improved clarity. Assuming that you are showing the "adversarial" risk, how do you actually compute it? The expectation defining it in Eq. 1 is intractable to compute exactly as far as I know. Section 6.1 states "compute the distance between the pooled representations of clean and adversarially perturbed graphs", but how are these adversarially perturbed graphs obtained?
Depending on the matrix dimensions and the number of iterations, it might be a bit cheaper to compute $H^\top(Hv)$ rather than precomputing $H^\top H$, since matrix-vector products are cheaper than matrix-matrix products, especially for a small number of power-iteration steps. This "trick" is also used in the standard implementation of spectral normalisation (available in PyTorch), which also computes the right singular vector as a side product.
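Concretely, the two variants of a single power-iteration step look like this (illustrative, not your implementation; each step would be followed by renormalising v):

```python
import torch

def power_step_precomputed(M: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # M = H.T @ H is formed once at O(n d^2) cost; each step is then O(d^2).
    return M @ v

def power_step_matvec(H: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Two matrix-vector products at O(n d) per step; H.T @ H is never formed.
    return H.T @ (H @ v)
```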
References:
- Cao et al., "Adversarial Robust Generalization of Graph Neural Networks"
- Sun and Lin, "PAC-Bayesian Adversarially Robust Generalization Bounds for Graph Neural Network"
- Liao et al., "A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks"
Questions
- Is the PGD attack adaptive? That is, do you directly attack RS-Pool, differentiating through the proposed pooling?
- Can you show empirical results against feature-based attacks that match the setup for the theory?
- Provide a detailed description of how the derived bounds differ from existing (not necessarily adversarial) bounds (see weakness)?
- Can you provide additional results for an even larger value of $\tau$? Can you show a similar plot for other datasets?
Limitations
The limitations section is there but could be expanded, e.g. about worst-case vs. average case.
Final Justification
The authors addressed all of my concerns, hence I'm increasing my score.
Formatting Concerns
None.
We thank the reviewer for their constructive feedback.
W1 - On “average” versus “worst-case” robustness: Indeed, as pointed out by the reviewer, we approach the subject through what we define as the “expected adversarial risk”, which differs from the more commonly used definition of adversarial robustness. In this perspective, rather than considering only the worst-case adversarial examples, we consider the whole neighborhood of the input. We believe that this quantity is of complementary value to the typically considered “worst-case” robustness, since in certain settings it is preferable to be robust to “average” attacks rather than to worst-case scenarios [1]. To resolve this ambiguity, we will replace the term “adversarial robustness” with “expected adversarial robustness” throughout our revised manuscript.

We further note that in our proof, we place no assumption on the sampling distribution of the perturbed graphs within the neighborhood defined by the budget $\epsilon$. Hence, the upper bound applies to any point within this neighborhood (refer to the equation following Line 553 in Appendix A). Therefore, with a suitable choice of the sampling distribution of the perturbed graphs, the derived bounds also apply to the worst case (as defined by different attack schemes), since by definition the worst-case perturbations are themselves part of the neighborhood. Hence, while we have focused on the expected adversarial case, our proofs and the derived bounds can be adapted to the worst-case notion by adapting the definition of the expected adversarial risk in Eq. 1. We will revise the manuscript accordingly to clear up this ambiguity and clarify that the provided bound can also be adapted to the worst case. We thank the reviewer for pointing this out.
[1] Robustness between the worst and average case. Rice, Bair, Zhang & Kolter. - NeurIPS 2021
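Schematically, writing $f$ for the model, $\mathcal{B}_\epsilon(G)$ for the perturbation neighborhood of a graph $G$, and $\mathcal{D}$ for a sampling distribution over it, the two notions compare as follows (notation ours, paraphrasing Eq. 1):

$$\mathrm{Risk}_{\mathrm{exp}}(G) = \mathbb{E}_{\tilde{G} \sim \mathcal{D}(\mathcal{B}_\epsilon(G))}\left[\, \|f(\tilde{G}) - f(G)\|_2 \,\right], \qquad \mathrm{Risk}_{\mathrm{wc}}(G) = \sup_{\tilde{G} \in \mathcal{B}_\epsilon(G)} \|f(\tilde{G}) - f(G)\|_2 .$$

Since our proof bounds $\|f(\tilde{G}) - f(G)\|_2$ pointwise for every $\tilde{G} \in \mathcal{B}_\epsilon(G)$, the same bound upper-bounds both quantities.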
Q2 & W2 - Regarding feature-based attacks:
Our empirical results focused on structural perturbations, as structural attacks and defenses are well implemented, well documented, and more commonly studied. Nonetheless, we agree with the reviewer that extending our experiments to feature-based adversarial attacks shows a direct connection between our theoretical and practical results, and we thank the reviewer for this suggestion.
We ran the requested experiments with two main attacks: the first is a random attack (which adds randomly sampled noise to the node features), and the second is the PGD attack (which uses the gradients). Table 1 provides the mean accuracy (and standard deviation) of the different considered pooling methods. These new results show that our proposed RS-Pool also outperforms all the other pooling strategies under feature-based attacks. We will revise the manuscript accordingly and add these results.
Table 1: Results on feature-based attacks.
| Dataset | Attack | Sum | Average | Max | SAG | TopK-P | PAN-P | Sort-P | RS-Pool |
|---|---|---|---|---|---|---|---|---|---|
| PROTEINS | PGD | 68.4 ± 4.0 | 64.6 ± 2.2 | 66.4 ± 4.1 | 64.3 ± 3.6 | 64.9 ± 6.9 | 68.8 ± 2.9 | 66.7 ± 2.9 | 69.9 ± 4.1 |
| PROTEINS | Random | 69.5 ± 5.2 | 67.6 ± 2.9 | 68.4 ± 3.2 | 69.0 ± 4.0 | 68.8 ± 6.5 | 69.2 ± 3.2 | 70.2 ± 2.6 | 70.9 ± 2.8 |
| D&D | PGD | 36.9 ± 3.6 | 36.9 ± 6.8 | 46.4 ± 6.4 | 49.9 ± 1.8 | 28.7 ± 3.7 | 37.5 ± 4.1 | 47.9 ± 4.8 | 51.8 ± 1.1 |
| D&D | Random | 49.0 ± 5.2 | 44.0 ± 6.5 | 45.7 ± 4.6 | 54.1 ± 3.0 | 40.9 ± 5.1 | 51.0 ± 4.6 | 49.7 ± 4.9 | 54.9 ± 2.3 |
| NCI1 | PGD | 38.6 ± 2.8 | 36.4 ± 3.5 | 37.9 ± 5.3 | 36.6 ± 2.2 | 35.1 ± 2.2 | 38.4 ± 1.8 | 36.2 ± 3.6 | 39.4 ± 1.9 |
| NCI1 | Random | 41.1 ± 2.0 | 38.1 ± 2.9 | 40.9 ± 5.5 | 40.1 ± 2.9 | 38.5 ± 1.0 | 40.9 ± 2.4 | 38.5 ± 4.8 | 41.7 ± 1.3 |
| ER_MD | PGD | 41.9 ± 3.2 | 45.3 ± 1.4 | 45.3 ± 4.6 | 45.7 ± 3.5 | 45.3 ± 3.8 | 45.3 ± 2.1 | 43.1 ± 3.5 | 55.1 ± 4.9 |
| ER_MD | Random | 54.3 ± 2.3 | 50.2 ± 4.3 | 52.1 ± 6.4 | 53.9 ± 5.5 | 52.1 ± 4.1 | 51.3 ± 5.2 | 53.6 ± 7.7 | 56.2 ± 5.1 |
| MSRC_9 | PGD | 84.8 ± 2.8 | 85.6 ± 4.0 | 85.6 ± 2.8 | 87.1 ± 1.1 | 83.3 ± 1.1 | 85.6 ± 3.9 | 84.1 ± 1.9 | 88.3 ± 1.1 |
| MSRC_9 | Random | 86.4 ± 3.7 | 88.6 ± 1.9 | 87.9 ± 2.8 | 89.4 ± 1.1 | 87.1 ± 2.1 | 87.9 ± 3.9 | 88.6 ± 3.2 | 90.2 ± 1.1 |
Q1 - Is the PGD attack adaptive?
Yes. For each pooling method (including RS-Pool), we differentiate through the pooling, making the attack adaptive. We will add an algorithm describing the attack framework in the appendix of our revised manuscript (note that our implementation is available in the supplementary materials).
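For clarity, a minimal sketch of this adaptive loop is given below; the model/pooling interfaces, the classifier head, and the L2 projection are illustrative placeholders rather than our exact implementation:

```python
import torch
import torch.nn.functional as F

def adaptive_pgd(model, pool, X, A, y, eps=0.3, alpha=0.05, steps=20):
    """Feature-based PGD that differentiates through the pooling layer.
    X: node features, A: adjacency, y: graph label of shape (1,).
    `model` maps (X, A) to node embeddings; `model.head` classifies
    the pooled graph representation."""
    delta = torch.zeros_like(X, requires_grad=True)
    for _ in range(steps):
        H = model(X + delta, A)              # message passing
        logits = model.head(pool(H))         # gradient flows through pooling
        loss = F.cross_entropy(logits.unsqueeze(0), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascent step
            norm = delta.norm()                  # project onto the L2 ball
            if norm > eps:
                delta *= eps / norm
        delta.grad.zero_()
    return (X + delta).detach()
```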
Q3 & W4 - Linking to the existing bounds:
We thank the reviewer for the provided references; we agree that citing and discussing them in detail is important, and we will edit the manuscript accordingly. We note that (adversarial) generalization bounds quantify the difference in terms of the loss between training graphs and unseen (either perturbed or non-perturbed) test graphs. In our work, we instead consider the specific perturbation effect of the pooling-layer choice at inference time. We hence consider a set of flat pooling methods, theoretically study their effect, and accordingly propose a new pooling method, RS-Pool, which we show theoretically and empirically to enhance the overall robustness of the model. We believe that studying the generalization bounds of the pooling layer (including our RS-Pool) would be a great way to combine both directions in future work.
W3 - On the considered attack budgets:
We agree with the reviewer that extending to other budgets is very useful to further showcase the value of our proposed method. We have therefore extended the results to 10% and 20% attack budgets for the PGD attack (the most performant attack in Table 1 of our paper). We will additionally extend this study to the other considered attacks and models and update the appendix accordingly. Table 2 reports the mean and standard deviation of the resulting attacked classification accuracy. We see that our RS-Pool still outperforms the other pooling methods at smaller budgets. We furthermore point out that for the binary adjacency matrices we consider, the $\ell_0$ and $\ell_2$ norms are equivalent, and hence there is no discrepancy between the norm considered by the threat model and the number of perturbed edges.
Table 2: Results for different attack budgets.
| Dataset | Budget | Sum | Average | Max | SAG | TopK-P | PAN-P | Sort-P | RS-Pool |
|---|---|---|---|---|---|---|---|---|---|
| PROTEINS | 0.1 | 63.6 ± 3.0 | 53.6 ± 1.9 | 48.2 ± 2.9 | 60.1 ± 1.5 | 60.7 ± 2.7 | 62.2 ± 2.3 | 62.5 ± 1.5 | 65.1 ± 2.3 |
| PROTEINS | 0.2 | 55.1 ± 2.6 | 40.2 ± 1.9 | 27.1 ± 0.8 | 55.7 ± 1.8 | 50.3 ± 2.8 | 45.2 ± 1.1 | 56.2 ± 2.9 | 56.9 ± 3.6 |
| D&D | 0.1 | 21.7 ± 1.0 | 45.4 ± 4.0 | 6.8 ± 3.0 | 44.2 ± 5.4 | 22.2 ± 4.2 | 23.9 ± 0.9 | 28.2 ± 4.1 | 51.3 ± 3.1 |
| D&D | 0.2 | 11.5 ± 3.2 | 32.3 ± 3.4 | 6.8 ± 1.8 | 26.2 ± 5.6 | 11.3 ± 2.7 | 10.1 ± 4.3 | 15.2 ± 5.0 | 38.6 ± 3.3 |
| ER_MD | 0.1 | 49.1 ± 4.3 | 53.9 ± 2.3 | 44.2 ± 2.6 | 53.2 ± 2.9 | 48.7 ± 2.5 | 47.9 ± 3.2 | 42.3 ± 2.6 | 58.4 ± 2.7 |
| ER_MD | 0.2 | 40.4 ± 4.1 | 47.2 ± 2.2 | 43.4 ± 2.9 | 49.1 ± 1.1 | 43.1 ± 2.8 | 36.0 ± 2.5 | 36.7 ± 2.3 | 54.3 ± 3.9 |
| MSRC_9 | 0.1 | 84.1 ± 1.9 | 87.9 ± 2.8 | 84.8 ± 1.1 | 86.4 ± 1.9 | 84.1 ± 1.9 | 85.6 ± 2.8 | 85.6 ± 2.8 | 88.6 ± 1.9 |
| MSRC_9 | 0.2 | 82.6 ± 2.8 | 84.8 ± 3.4 | 83.3 ± 4.3 | 85.6 ± 2.8 | 79.5 ± 2.1 | 85.6 ± 2.4 | 75.8 ± 3.5 | 86.4 ± 3.2 |
| ENZYMES | 0.1 | 10.6 ± 2.8 | 10.0 ± 1.4 | 9.4 ± 2.8 | 11.1 ± 3.4 | 6.1 ± 0.8 | 7.8 ± 1.6 | 12.2 ± 1.6 | 14.4 ± 2.8 |
| ENZYMES | 0.2 | 5.0 ± 2.7 | 3.9 ± 3.4 | 5.0 ± 2.7 | 8.9 ± 3.9 | 1.7 ± 1.4 | 3.9 ± 2.1 | 7.2 ± 1.6 | 12.8 ± 3.5 |
Q4 - On the effect of $\tau$ and extending the study to other datasets:
We note that $\tau$ is a hyper-parameter controlling the trade-off between clean and attacked accuracy (see Corollary 5.2, where $\tau$ directly enters the bound). Typically, depending on the use case and the user's willingness to sacrifice clean accuracy for better robustness, this parameter should be tuned. While we focused on the PROTEINS dataset in Figure 1(e), the insights translate to other datasets. In response to your question, we considered 3 additional datasets and higher values of $\tau$. We will add the corresponding figures to the appendix (based on the table below).
| Dataset | Metric | $\tau$ = 1 | 50 | 100 | 200 | 500 | 1000 | 2000 |
|---|---|---|---|---|---|---|---|---|
| D&D | Clean Accuracy | 74.6 | 75.1 | 74.8 | 74.7 | 74.7 | 74.3 | 73.6 |
| D&D | Success Rate | 66.1 | 51.3 | 50.8 | 50.4 | 45.6 | 40.5 | 28.4 |
| NCI1 | Clean Accuracy | 70.1 | 67.9 | 65.4 | 65.1 | 64.6 | 62.1 | 61.8 |
| NCI1 | Success Rate | 44.9 | 43.4 | 41.2 | 40.5 | 38.5 | 37.5 | 37.6 |
| ER_MD | Clean Accuracy | 65.9 | 66.2 | 64.0 | 63.8 | 63.4 | 58.8 | 56.9 |
| ER_MD | Success Rate | 16.4 | 16.3 | 15.1 | 14.8 | 14.6 | 5.9 | 4.1 |
On the $H^\top(Hv)$ trick: We warmly thank the reviewer for the proposed "trick"; indeed, such an adaptation can reduce the training time of our method. We will edit the code accordingly and provide more details on this implementation in the revised manuscript.
Thank you for the detailed response and the additional experiments. Many of my concerns were addressed. I will update my score at the end of the rebuttal period based on this and the other reviews.
I still think that even the term “expected adversarial risk” is not appropriate. The term "adversarial" immediately implies worst-case in the majority of the literature, so this would just be confusing for readers. Terms like "expected robustness" seem much more appropriate. Note that [1], which you cite, nicely summarises this in its abstract: "The focus has largely been on two extremes of robustness: the robustness to perturbations drawn at random from within some distribution (i.e., robustness to random perturbations), and the robustness to the worst case perturbation in some set (i.e., adversarial robustness)". Your study focuses on the former setting, which is fine, but this is not adversarial in my opinion.
You also say "with a suitable choice of the sampling distribution of the perturbed graphs, the derived bounds also applies to the worst case". Can you give me an example of a suitable choice? One thing I can think of is a distribution that puts all the mass on the worst-case inputs (those that maximize the risk) and zero mass on all others. But this is just a round-about way of calculating worst-case adversarial risk and the difficulty is in actually finding these worst-case inputs in order to be able to specify this degenerate distribution.
[1] Robustness between the worst and average case. Rice, Bair, Zhang & Kolter. - NeurIPS 2021
We thank the reviewer for the time allocated to our rebuttal. We are happy that we managed to answer some of your concerns and questions. We will update the manuscript accordingly by adding the different elements, and we are grateful for the recognition of the value of our manuscript.
Regarding the definition, we appreciate the helpful clarification. We had originally used the term "adversarial" to refer to any perturbation that can alter the model's output (not necessarily in a worst-case sense), which has been used in this way before in our domain of graph-structured data. However, we recognize that in the broader adversarial robustness literature, particularly outside of graphs, the term is more narrowly associated with worst-case perturbations. We see the value in aligning our terminology with that of the wider deep learning community, and we agree that using the term "expected robustness" in our work is preferable. We will accordingly revise the manuscript to reflect this terminology and will include additional details on how this definition relates to the worst-case notion. We thank the reviewer for their constructive suggestion.
Regarding the theoretical link of our results to worst-case adversarial robustness, we agree with the point raised. To elaborate, suppose we have access to an oracle that can produce the worst-case adversaries within the considered input neighborhood defined by the attack budget $\epsilon$ (we note that the worst case is not necessarily unique). In practice, such an oracle can be any adversarial perturbation in which the user has some interest (which is not necessarily guaranteed to find the specific worst-case points). Then, as predicted by the reviewer, simply taking a Dirac distribution over the produced worst-case set adapts the "expected robustness" to a "worst-case adversarial robustness".
In this perspective, and as pointed out by the reviewer, while such a theoretical link between the two quantities exists, it assumes access to the oracle and may be computationally infeasible (and therefore of little relevance in practice). We will make sure to clarify this link and its scope of relevance in our paper. We will furthermore clarify that we focus on the "expected robustness". We sincerely thank the reviewer for their insights.
We sincerely thank the reviewer for their thoughtful feedback and engagement with our paper. We will carefully incorporate the different elements discussed in the rebuttal and our discussion into the revised paper and we thank again the reviewer for their suggestions that enhanced the quality and the clarity of our manuscript.
We hope that our responses have sufficiently addressed the questions and concerns raised, and that the clarifications provided may support a positive reconsideration of the score as the discussion period concludes.
This paper tackles the adversarial vulnerability of GNNs in graph classification by focusing on the pooling stage. After theoretically analyzing the weaknesses of standard pooling methods like sum, average, and max, the authors introduce Robust Singular Pooling (RS-Pool). This novel, model-agnostic method constructs a graph representation from the dominant singular vector of the node embeddings, which is inherently more stable against adversarial perturbations. Implemented efficiently via power iteration, RS-Pool is empirically shown to significantly enhance robustness against attacks while maintaining competitive accuracy on clean data.
Strengths and Weaknesses
Strengths:
- Novel Focus and Theoretical Grounding: The paper provides a novel contribution by focusing on the often-overlooked pooling stage of GNNs in the context of adversarial robustness. Its arguments are supported by a strong theoretical framework that derives robustness bounds for common pooling methods before proposing a solution.
- Well-Motivated and Efficient Solution: The proposed RS-Pool method is elegantly based on established matrix perturbation theory, which shows that dominant singular vectors are stable under perturbations. It is also computationally efficient, using the power iteration method to avoid a costly full SVD, making it practical for real-world applications.
- Model-Agnostic and Versatile: RS-Pool is designed to be model-agnostic, and the paper demonstrates its effectiveness with different GNN backbones (GCN and GIN) and even when combined with other existing adversarial defense methods.
Questions
- What is the semantic meaning of the dominant singular vector? The paper uses the dominant singular vector as a robust feature summary. However, it doesn't investigate what this vector represents in terms of the graph's structure or semantics. Does it capture information about community structure, hub nodes, or specific functional motifs? Answering this could lead to more interpretable GNNs.
- What information is lost by discarding other singular vectors? RS-Pool exclusively uses the top singular vector to achieve robustness. A key unaddressed question is what valuable, albeit less dominant, information is being discarded. These secondary components might be crucial for fine-grained classification tasks where subtle features matter more than overall structural stability.
- How does network depth affect the spectral gap? The method's effectiveness hinges on a significant spectral gap in the node embedding matrix. The paper doesn't explore how the number of GNN layers impacts this gap. Deeper GNNs often lead to "over-smoothing," where node embeddings become more similar, potentially shrinking the spectral gap and reducing RS-Pool's effectiveness.
Limitations
- Dependence on Spectral Gap: The effectiveness of RS-Pool, both theoretically and in practice, relies on a sufficiently large spectral gap between the first and second singular values of the node embedding matrix. The paper acknowledges that a small gap, which can occur in some graph topologies, weakens the robustness guarantee and slows the convergence of the power iteration.
- Computational Overhead: Despite being more efficient than a full SVD, RS-Pool incurs more computational overhead than standard pooling methods. This is especially true on very dense graphs, which could limit its use in resource-constrained applications.
Formatting Concerns
N/A
We thank the reviewer for their thoughtful comments and constructive feedback. We especially appreciate their recognition of the novelty and value of our proposed theoretical study and the proposed pooling method. Please find our responses to the raised concerns below:
Q1 - On the semantic meaning of the dominant singular vector:
We thank the reviewer for raising this very interesting question. We first recall that our RS-Pool method relies on the dominant singular vector of the node embedding matrix produced by several iteratively applied GNN message-passing layers. In general, a GNN learns a function that maps from the graph space (often represented via the adjacency matrix) and the node feature space (node feature matrix) to the label space as accurately as possible. Consequently, the semantic properties of the dominant singular vector are highly dependent on the label structure that we attempt to map to, and we cannot provide a general semantic interpretation for this vector without regard to the learning task and dataset.
As you rightly point out, we have nice semantic interpretations for the singular vectors obtained from the node feature and graph structural spaces. For the node feature matrix, the singular vectors can be interpreted via the long-standing theory of principal components as the directions that capture maximal variance of the node features. And, as you mention, there is a rich theory on the spectral properties of adjacency and Laplacian matrices; these singular vectors (equal to the eigenvectors for symmetric matrices) have indeed been shown to encode community structure [1] and, in some instances, hubs [2]. Our pooling strategy likewise works with the principal singular vector, which captures the direction in which the learned embeddings vary maximally. However, what kind of semantic information the trainable GNN layers place in this vector depends entirely on the labels that the GNN is trained to predict. This flexibility to some degree marks the strength of GNNs, as is visible from the strong empirical performance of RS-Pool that we observe in Table 1.
In general, we agree that the semantic analysis of the dominant singular vector in the context of specific learning tasks and dataset is of interest and is a promising direction for future research on the interpretability of GNNs. We will make sure to mention this direction in the revised manuscript.
[1] A tutorial on spectral clustering. U. von Luxburg. Statistics and Computing. 2007.
[2] Authoritative Sources in a Hyperlinked Environment. J. Kleinberg. Journal of the ACM (JACM). 1999
Q2 - What kind of information is lost when considering only the dominant singular vector:
While it is correct that focusing on the dominant singular vector discards certain information, particularly high-frequency, node-local variations, we emphasize that in practice, this loss has limited impact on standard graph classification tasks. Empirically, as shown in Table 1 (GCN) and Table 2 (GIN), RS-Pool achieves clean accuracy comparable to that of sum pooling, which retains the full spectrum of signals. This suggests that the fine-grained information encoded in the subdominant singular vectors is rarely essential for achieving high classification performance on clean (non-attacked) graphs.
On the other hand, our motivation centers on robustness. Recent findings [3] indicate that adversarial perturbations tend to manifest in high-frequency components, precisely the directions discarded by RS-Pool. Thus, although some high-frequency information is indeed lost, it is this removal that enables RS-Pool to enhance adversarial robustness. In this sense, our method intentionally trades a marginal amount of discriminative capacity (which is not always needed for graph classification tasks) for a significant gain in robustness. We thank the reviewer for raising this question, and we will clarify this trade-off in the manuscript.
[3] A Frequency Perspective of Adversarial Robustness. Maiya et al. arXiv. 2021.
Q3 - On the effect of depth and over-smoothing: We appreciate the reviewer's suggestion to investigate the role of depth in our framework. Traditionally, the adversarial robustness literature does not emphasize depth or other model hyperparameters, as models under attack are typically assumed to be well trained and performant.
Nonetheless, we agree that in the context of GNNs, over-smoothing is an important subject to take into consideration. However, we would like to clarify that over-smoothing is primarily a concern in node classification tasks, where discriminative power between node embeddings is critical. In contrast, in graph classification, the focus of our work, the effect of over-smoothing is less pronounced and can even be regarded as positive [4] since the collapse of the node embeddings is only problematic if the collapsed state is not label-informative.
In the over-smoothing regime, node embeddings across the graph tend to become similar, effectively collapsing the rank of the node representation matrix towards 1. The information then concentrates in the dominant singular direction, so the first singular value becomes dominant while the subsequent singular values tend to vanish. Therefore, instead of shrinking, the spectral gap increases, resulting in a stronger robustness guarantee for RS-Pool as derived in Theorem 5.1, where the bound is inversely proportional to this gap.
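As a toy illustration of this effect (synthetic data, not our experimental pipeline): the closer the rows of the embedding matrix get to a shared vector, the larger the gap $\sigma_1 - \sigma_2$ becomes.

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=(1, 16))     # shared embedding all nodes collapse to
for noise in [1.0, 0.3, 0.05]:      # decreasing node diversity (more smoothing)
    H = base + noise * rng.normal(size=(32, 16))
    s = np.linalg.svd(H, compute_uv=False)
    print(f"noise={noise:.2f}  gap sigma1-sigma2 = {s[0] - s[1]:.2f}")
```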
To further illustrate our discussion, we conducted an ablation by increasing the depth of the GCN encoder and evaluated the impact on accuracy for both RS-Pool and Sum Pooling. We increased the number of layers up to 10, the average graph diameter in the considered TUDataset benchmarks, at which point most nodes have already aggregated information from the entire graph. Table 1 provides the mean and standard deviation of the accuracy; we see that depth has little effect on the resulting clean accuracy. We will update the manuscript accordingly.
Table 1: Performance of a GCN on different graph classification dataset for different number of layers (depth) for RS-Pool and Sum Pooling.
| Depth | Pooling | PROTEINS | NCI1 | MSRC_9 |
|---|---|---|---|---|
| 2 | Sum | 74.2 ± 3.1 | 70.6 ± 0.8 | 88.6 ± 3.2 |
| 2 | RS-Pool | 73.5 ± 2.9 | 70.1 ± 1.2 | 89.7 ± 1.1 |
| 4 | Sum | 73.2 ± 3.2 | 72.6 ± 2.0 | 89.4 ± 1.3 |
| 4 | RS-Pool | 72.6 ± 1.7 | 72.2 ± 1.5 | 90.6 ± 0.9 |
| 6 | Sum | 73.3 ± 2.1 | 72.8 ± 2.2 | 91.4 ± 0.8 |
| 6 | RS-Pool | 72.3 ± 2.6 | 72.6 ± 1.1 | 91.7 ± 1.1 |
| 8 | Sum | 73.8 ± 4.2 | 72.9 ± 2.6 | 87.1 ± 2.1 |
| 8 | RS-Pool | 72.6 ± 2.2 | 72.8 ± 0.8 | 87.9 ± 2.8 |
| 10 | Sum | 73.4 ± 3.4 | 72.9 ± 1.7 | 89.4 ± 5.7 |
| 10 | RS-Pool | 72.4 ± 2.6 | 72.8 ± 0.8 | 90.9 ± 3.2 |
[4] Understanding Virtual Nodes: Oversquashing and Node Heterogeneity. Southern et al. ICLR. 2025.
I thank the authors for addressing my questions; I am happy to keep my score.
We thank the reviewer for the time dedicated to reviewing the paper and engaging with our rebuttal. We are glad that we managed to answer their concerns and questions, and we appreciate the positive score. We will carefully incorporate the elements discussed above into the revised paper.
Dear Reviewer kMFs,
This is a friendly reminder that the reviewer–author discussion phase is approaching its end, and we notice you have not yet responded in the discussion thread. Please take a moment to engage with the authors' rebuttal by sharing your thoughts or follow-up questions. If your concerns have been addressed, kindly acknowledge that in the discussion thread. Before submitting the “Mandatory Acknowledgement,” please ensure that you have engaged in actual communication with the authors. Simply clicking the button without participating in the discussion is not permitted. Thank you again for your time and efforts.
Regards,
AC
This paper proposes a singular pooling mechanism to improve the robustness of graph classification models under adversarial perturbations. The method is well motivated, theoretically grounded, and shows consistent improvements over baselines across multiple benchmarks. Reviewers appreciate the clarity of presentation, the solid empirical and theoretical evidence, and the potential impact on robust graph learning. Several concerns are raised about the scope of experiments and broader generality, but these are well addressed in the rebuttal. Overall, the paper's contribution is novel, technically sound, and it establishes a useful and well-substantiated direction for developing more robust graph classification methods.