PaperHub
Average Rating: 7.0 / 10 (Spotlight · 4 reviewers · min 5, max 8, std 1.2)
Individual Ratings: 7, 5, 8, 8
Confidence: 4.0 · Correctness: 3.0 · Contribution: 3.5 · Presentation: 2.8
NeurIPS 2024

Enhancing Robustness of Graph Neural Networks on Social Media with Explainable Inverse Reinforcement Learning

OpenReview · PDF
Submitted: 2024-05-15 · Updated: 2024-11-06

Abstract

Keywords
graph adversarial attack, reinforcement learning, inverse reinforcement learning, graph neural networks

Reviews and Discussion

Official Review
Rating: 7

The paper presents a novel approach to enhancing the robustness of Graph Neural Networks (GNNs) against adversarial attacks, specifically in social media contexts such as rumor detection. The authors propose an enhanced maximum entropy inverse reinforcement learning (IRL) method with a mixture-of-experts approach to tackle multi-source graph adversarial attacks. This method aims to reconstruct attack policies, integrate various attack models, and generate additional adversarial samples to improve the robustness of GNN-based detection models.

Strengths

The application of inverse reinforcement learning to reconstruct adversarial attack policies is novel and offers a highly interesting perspective on enhancing GNN robustness.

Combined with the Mixture-of-Experts, the method allows for the integration of various attack models, providing comprehensive feature-level explanations and robust adversarial samples for use in adversarial training.

The generation of good additional adversarial samples for training improves the GNN’s resilience to attacks, which is a significant step towards robust social media analysis.

The authors use real-world social media datasets to validate the proposed method.

Weaknesses

The proposed method involves multiple components (IRL, mixture-of-experts, bidirectional updates), which can increase the computational complexity and may not be easily scalable.

The focus is primarily on rumor detection in social media, which, while important, might limit the generalizability of the method to other types of graphs and applications.

Some sections, particularly those involving the theoretical underpinnings of IRL and mixture-of-experts, could be more clearly explained to enhance understanding and accessibility.

No code is provided. This hinders the exact reproduction of results.

I think the authors use the term "Threat model" in an incorrect or at least unorthodox way that will likely be misunderstood in the security community and potentially beyond. Specifically, in line 229, the authors start a paragraph called "Threat model" and proceed to describe the model they use (GCN), the number of hidden dimensions, the optimizer, etc. This is not what is typically understood as a threat model in the literature: a model of a threat actor's capabilities, the possible courses of action they may take, and how these will impact the operation of a computer system [1]. Speaking of it, including an actual threat model (or rather making the implicitly existing model explicit) would certainly strengthen the paper and increase acceptance in the security community.

Minor issues: Typos/grammar in lines 13, 69, 109, 111

[1] https://www.sciencedirect.com/science/article/abs/pii/S0167404818307478

Questions

Could this approach be transferred (with respective modifications) to bit flips, e.g., [1]?

Did you analyze the impact of varying Rich-club coefficients in the datasets?

[1] https://arxiv.org/abs/2311.01205

Limitations

The method assumes that the perturbations captured during training are representative of real-world adversarial attacks.

Further testing on different graph types and applications is required to confirm the broader applicability of the proposed method.

Author Response

Q1: Could this approach be transferred to bit flips?

A1: Thank you for your constructive question. We believe it is theoretically feasible.

(1) Bit flip attack: This attack disrupts neural network operations by flipping bits in parameters or intermediate results. For example, [1] degrades GNN performance by selectively flipping bits in quantized weights and biases, targeting those most impactful to the model's injectivity.

(2) To transfer our work to bit flips, the key is to model bit flip attacks as an MDP for RL. Suppose $T$ bit flips are required to achieve an effective attack, defining an attack trajectory with $T$ steps. The RL elements are:

  • State: The current GNN parameters;
  • Action: Flipping the specific bit in the parameters, which could be represented by a mask matrix;
  • Reward: Estimated by the IRL module;
  • Policy: Given the current state, output the next action. A deep RL policy, such as Actor-Critic (AC), might be effective for handling the state from the deep models.

For IRL, it is suggested to use a neural network as the reward function because it is difficult to design interpretable linear features for bit flip attacks when targeting the internals of an end-to-end deep model. Given the target GNN model and some expert bit flip attack trajectories, the learner policy produces the attack actions and updates with the reward from the IRL module to approximate the expert policy.
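To make the MDP formulation above concrete, here is a minimal sketch of such an environment. The class name, the integer parameter encoding, and the dummy reward callable are hypothetical illustrations for this discussion, not part of our implementation.

```python
import numpy as np

class BitFlipAttackEnv:
    """Hypothetical MDP sketch for bit-flip attacks on a quantized GNN.

    State  : the current quantized parameter vector of the target GNN.
    Action : (parameter index, bit position) of the bit to flip.
    Reward : estimated externally by an IRL reward model (passed in as a callable).
    """

    def __init__(self, quantized_params, bit_width=8):
        # Quantized weights are stored as integers so single bits can be flipped.
        self.params = np.asarray(quantized_params, dtype=np.int64).copy()
        self.bit_width = bit_width

    def reset(self):
        return self.params.copy()

    def step(self, action, irl_reward_fn):
        param_idx, bit_pos = action
        self.params[param_idx] ^= (1 << bit_pos)      # the attack action
        next_state = self.params.copy()
        reward = irl_reward_fn(next_state, action)    # reward from the IRL module
        return next_state, reward

# Hypothetical usage: a 4-parameter "model" and a dummy stand-in for the IRL reward.
env = BitFlipAttackEnv([3, 12, 7, 255])
state = env.reset()
next_state, r = env.step((1, 2), lambda s, a: float(np.abs(s).sum()))
```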

[1] Attacking Graph Neural Networks with Bit Flips: Weisfeiler and Lehman Go Indifferent

Q2: Did you analyze the impact of varying Rich-club coefficients in the datasets?

A2: Thank you for your valuable suggestion. The rich-club coefficients of Weibo and Pheme are shown in Figure 1 in the PDF. The analysis reveals that:

(1) As the degree increases, the rich-club coefficient first rises and then falls, indicating that nodes with moderate influence in the social network have higher connectivity with each other.

(2) Pheme exhibits higher rich-club coefficients than Weibo, suggesting that the social network in Pheme is more densely connected overall. As a result, attacks on Pheme generate more pronounced cascading effects [1], thereby complicating the recovery of attack policies, which aligns with the experimental results presented in Table 2: the model's performance on Pheme, which has higher rich-club coefficients, is lower than on Weibo, which has smaller ones.
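For reference, the rich-club coefficient can be computed with NetworkX's `rich_club_coefficient`. The sketch below uses a generated placeholder graph rather than the actual Weibo/Pheme social graphs, which would be loaded from the datasets in practice.

```python
import networkx as nx

# Placeholder graph; in practice this would be the social following graph of
# Weibo or Pheme built from the dataset.
G = nx.barabasi_albert_graph(n=1000, m=3, seed=0)
G.remove_edges_from(nx.selfloop_edges(G))          # the metric assumes no self-loops

# Normalized rich-club coefficient per degree k (normalization via random rewiring).
rc = nx.rich_club_coefficient(G, normalized=True, Q=100, seed=0)
for k in sorted(rc)[:5]:
    print(f"degree > {k}: rich-club coefficient = {rc[k]:.3f}")
```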

[1] Cascading failures in complex networks

Q3: The proposed method involves multiple components, which can increase the computational complexity and may not be easily scalable.

A3: Thank you for your valuable suggestion. Please see Q1 in the global rebuttal for further details.

Q4: The focus on rumor detection might limit the generalizability of the method to other types of graphs and applications.

A4: Thank you for your suggestion. To verify the generalizability, we have adapted and implemented our method for the recommendation system task.

  • Dataset: Gowalla [1], which includes 13,591 users, 14,322 items, and 227,193 interactions.
  • Interpretable Features: The domain-specific features used for rumor detection in our work were modified to suit recommendation systems.
  • Metric: ΔNDCG, which measures the reduction in the NDCG@20 value after $T$-step attacks (a computation sketch is given after the results table below).
  • Target Model: PinSage [2]. Without attacks, the NDCG value on the training set is 0.4139.
  • Attack Method: We enhanced AdRumor-RL with the interpretable features described above and generated three expert trajectories, each with 5 steps. The attacker conducts a global attack and is updated with the ΔNDCG reward. The average attack performance (ΔNDCG) is 0.0325.
  • Performance: Results of policy recovery are as follows. The learner policy achieved 90% of the performance of experts. Our method is effective in the recommendation system task.
|        | Expert | Apprenticeship | EntIRL | MoE-BiEntIRL |
|--------|--------|----------------|--------|--------------|
| ΔNDCG  | 0.0325 | 0.0067         | 0.0268 | 0.0294       |
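For clarity, here is a minimal sketch of how a ΔNDCG-style metric can be computed from ranked relevance lists; the relevance vectors are hypothetical placeholders and the per-list normalization is simplified, so this is an illustration rather than our evaluation pipeline.

```python
import numpy as np

def ndcg_at_k(relevances, k=20):
    """Simplified NDCG@k for one user, given relevance of the ranked items."""
    rel = np.asarray(relevances, dtype=float)[:k]
    if rel.sum() == 0:
        return 0.0
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    idcg = float((np.sort(rel)[::-1] * discounts).sum())   # ideal ordering
    return dcg / idcg

# Hypothetical relevance of top-ranked items before and after a T-step attack.
rel_before = np.array([1, 0, 1, 0, 1, 0, 0, 1])
rel_after = np.array([0, 0, 1, 0, 0, 1, 0, 0])
delta_ndcg = ndcg_at_k(rel_before) - ndcg_at_k(rel_after)
print(f"ΔNDCG@20 = {delta_ndcg:.4f}")
```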

[1] Friendship and mobility: User movement in location-based social networks

[2] Graph convolutional neural networks for web-scale recommender systems

Q5: The theoretical underpinnings of IRL and MoE could be more clearly explained.

A5: Thank you for your suggestion. We provide explanations for EntIRL and MoE as follows.

(1) EntIRL infers the reward function of the environment by maximizing the entropy of the path distribution. The EntIRL objective is therefore $\max \sum_{\tau} -P(\tau) \log P(\tau)$ over trajectories $\tau$. It assumes that $P(\tau \mid \theta) \propto \exp(R_\theta(\tau))$, where $R_\theta(\tau)$ is the reward function with parameter $\theta$. The loss function for EntIRL is the log-likelihood $L(\theta) = \sum \log P(\tau \mid \theta)$, because maximizing likelihood and maximizing entropy share the same goal [1].

In our work, we consider locally optimal examples, segmenting the trajectory into state-action pairs. The previous assumption becomes $p(a \mid s) \propto \exp(Q^*(s, a))$ [1]. With a discount factor of 0, the action probability $p(a \mid s)$ is given by Eq. (2). The loss function then becomes $L(\theta) = \sum \log P(a \mid s, \theta)$.
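As a numerical illustration of this objective, the sketch below performs one gradient-ascent step on $L(\theta)$ for a linear reward over a discrete candidate-action set. The feature data, learning rate, and candidate sets are hypothetical placeholders, not the paper's actual feature design.

```python
import numpy as np

def maxent_irl_step(theta, expert_feats, candidate_feats, lr=0.01):
    """One gradient-ascent step on L(theta) = sum_t log p(a_t | s_t, theta),
    where p(a | s) is proportional to exp(theta^T f(s, a)) over a candidate set."""
    grad = np.zeros_like(theta)
    for f_exp, F in zip(expert_feats, candidate_feats):
        logits = F @ theta
        probs = np.exp(logits - logits.max())    # softmax over candidate actions
        probs /= probs.sum()
        # d/dtheta log p(a|s) = f(s, a_expert) - E_{p(a|s)}[f(s, a)]
        grad += f_exp - probs @ F
    return theta + lr * grad

# Hypothetical toy data: 4-dim features, 3 expert steps, 5 candidate actions per state.
rng = np.random.default_rng(0)
theta = np.zeros(4)
expert_feats = [rng.normal(size=4) for _ in range(3)]
candidate_feats = [rng.normal(size=(5, 4)) for _ in range(3)]
theta = maxent_irl_step(theta, expert_feats, candidate_feats)
```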

(2) MoE consists of a gating network $\alpha(\cdot)$ and $K$ experts $p_1(\cdot), \ldots, p_K(\cdot)$, combined as $p(y \mid x) = \sum_k \alpha_k(x)\, p_k(y \mid x)$. We model the RL policy as an MoE as stated in Eq. (4), with each expert corresponding to an action probability function as shown in Eq. (5). The IRL loss function for each expert is $L(\theta_k) = \sum \log p(a \mid s, \theta_k)$, which aligns with the M-step objective of the EM algorithm as stated in Eq. (11). Thus, the IRL policy is optimized with the EM algorithm.
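To illustrate how the gate and experts interact, here is a minimal sketch of the responsibility computation (the E-step) for an MoE of this form; the inputs are hypothetical placeholder values rather than quantities from our experiments.

```python
import numpy as np

def responsibilities(gate_weights, expert_log_probs):
    """gamma[n, k] is proportional to alpha_k(x_n) * p_k(y_n | x_n): the posterior
    probability that expert k generated sample n (E-step of the EM algorithm).

    gate_weights     : (N, K) array, rows sum to 1 (output of the gate alpha(x)).
    expert_log_probs : (N, K) array, log p_k(y_n | x_n) under each expert.
    """
    log_joint = np.log(gate_weights + 1e-12) + expert_log_probs
    log_joint -= log_joint.max(axis=1, keepdims=True)     # numerical stability
    gamma = np.exp(log_joint)
    return gamma / gamma.sum(axis=1, keepdims=True)

# Hypothetical example with N = 2 state-action samples and K = 3 experts.
gate = np.array([[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]])
logp = np.log(np.array([[0.10, 0.60, 0.30], [0.40, 0.40, 0.20]]))
gamma = responsibilities(gate, logp)   # each row sums to 1
```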

[1] Maximum entropy inverse reinforcement learning

Q6: No code is provided.

A6: Thank you for your valuable suggestion. Please see Q2 in the global rebuttal.

Q7: Wrong description about “Threat model”.

A7: Thank you for your correction. "Threat model" should be replaced with "Target model".

Comment

Thank you for your rebuttal. My concerns have been mostly alleviated and uncertainties clarified. I am raising my rating from 5 -> 7 and my confidence from 2 -> 3. I maintain my recommendation to briefly summarize the attackers' assumed capabilities and goals in a short paragraph, as this information is currently scattered throughout the paper.

Comment

Dear Reviewer sELL,

Thank you very much for your thoughtful feedback and for taking the time to review our rebuttal. We are pleased to hear that our responses have addressed most of your concerns and that you found our clarifications helpful.

We appreciate your suggestion to summarize the attackers' assumed capabilities and goals in a concise paragraph. We will ensure this is clearly presented in the revised manuscript.

Thank you again for your valuable insights and for raising your rating and confidence level. Your input has been instrumental in improving our work.

Best regards,

Submission 14020.

Official Review
Rating: 5

This work studies the problem of reconstructing attack policies using collected adversarial samples to enhance the robustness of GNN-based models in social network tasks, specifically rumor detection. The authors propose the MoE-BiEntIRL framework, which employs a mixture-of-experts approach to learn optimal policies from diverse adversaries, and provides feature-level explanations by estimating interpretable linear reward functions. Experiments on two real-world rumor detection datasets validate the effectiveness of MoE-BiEntIRL.

Strengths

  1. The authors investigate the rumor detection problem from the novel perspective of reconstructing attack policies.

  2. The paper is well-written and well-organized, with motivating illustrations of the problem.

Weaknesses

While the proposed problem and approach are generally novel and intriguing, the following issues regarding experiments require further clarification:

  1. Table 2: What makes the policies on Pheme significantly harder to recover than the policies on Weibo?

  2. Table 3: The results are not clearly illustrated and explained.

    • For instance, it appears that the column under "w/o Att." reflects test accuracy (%), while results under other columns reflect accuracy decline in actual numbers. Please align the representations for consistency.
    • If "w/o Att." refers to GCN's rumor detection performance without adversarial attacks, it is surprising to see that GCN only achieves ~70% test accuracy on the Weibo dataset with binary rumor / non-rumor labels. The authors claim that the Weibo dataset is adopted from existing work [1], which reported over 80% test accuracy on Weibo even using simple models such as TF-IDF or GRU. Please elaborate on the causes for this significant performance discrepancy, e.g, data differences and model structure differences.
  3. Computational Efficiency: Given the complexity of the model structure illustrated in Figure 2, it would be beneficial to benchmark the computational efficiency of the proposed approach against the baselines in Table 3.

[1] Changhe Song, Cheng Yang, Huimin Chen, Cunchao Tu, Zhiyuan Liu, and Maosong Sun. Ced: Credible early detection of social media rumors. TKDE, 33(8):3035–3047, 2021.

Questions

Please refer to the weaknesses section.

Limitations

The authors have adequately addressed the limitations.

Author Response

Q1: What makes the policies on Pheme significantly harder to recover than the policies on Weibo in Table 2?

A1: Thank you for your insightful question. We posit that the difficulty in policy recovery is related to the complexity of the underlying graph structures. As indicated in Table II of the PDF, we have devised several metrics to assess the complexities of the Weibo and Pheme datasets [1-3]. The data reveals that the graph structure of Pheme is more complex than that of Weibo, characterized by a larger variance in node degree and a higher graph density.

Returning to the proposed model, the success of Inverse Reinforcement Learning (IRL) in recovering expert policies is influenced by several factors, including the quality and quantity of expert demonstrations, the complexity of the expert policies, and the variability of the environment. In our humble view, the attack sequences within complex social graphs exhibit significant cascading effects [4], complicating the recovery of these policies. The more intricate the graph structure, the more substantial the cascading effect due to edge-flipping attacks. Such an attack can propagate from a target node to multiple others, thereby impacting the overall system performance because of the high interdependence and connectivity among nodes. Complex graphs increase the likelihood of cascading reactions triggered by attack behaviors.

Furthermore, sequential attacks on these graphs intensify cascading reactions, potentially altering the characteristics of numerous nodes and causing environmental changes within the reinforcement learning process. Attackers may also optimize their strategies, adding more decision conditions and steps to navigate the complex social graph effectively. Together, these factors contribute to the increased difficulty of IRL in recovering expert policies.

As presented in Table II, all baseline methods, including Apprenticeship Learning and EntIRL, exhibit suboptimal performance in recovering policies due to the complexity of the social graph in Pheme. In contrast, our approach, which integrates a mixture of experts to accommodate the attack sequences generated by complex social graphs, consistently achieves state-of-the-art performance on these challenging datasets.

[1] Complex networks: Structure and dynamics. 2006.

[2] The structure and function of complex networks. 2003.

[3] Graph measures and network robustness. 2013.

[4] Cascading failures in complex networks. 2020.

Q2: The unclear description and significant GCN performance discrepancy with [1] in Table 3. (Weakness 2 in reviews)

A2: Thank you for your correction and constructive question.

(1) We apologize for not clearly describing the columns in Table 3. As you mentioned, the column under "w/o Att." reflects test accuracy (%), while results under other columns reflect accuracy decline in actual numbers.

(2) The reasons for this discrepancy in attack effectiveness are multifaceted:

  • a) Dataset Differences: The weibo-all dataset referenced in [1] is sourced from the CED repository on GitHub, which includes the Rumdect dataset and the ChineseRumorDataset. However, the URL for Rumdect is no longer valid. Moreover, the weibo-all dataset in the CED repository is pre-processed and lacks user IDs for message reposts, preventing the construction of the social following graph. Therefore, we utilized the original ChineseRumorDataset, which comprises 3,387 Weibo posts (1,538 rumors and 1,849 non-rumors), in contrast to the weibo-all dataset's 8,050 Weibo posts (3,581 rumors and 4,199 non-rumors).
  • b) Preprocessing Differences: In previous work [1], all reposting and comment information was either retained or filtered with the early rate (ER). Our approach retains early repost messages and comments published within the first 25% of the average time interval and selectively keeps those with high information entropy. To achieve this, we removed stop words and punctuation, segmented the text, calculated the word information entropy of each text, and retained only those texts with entropy values higher than 90% of the highest observed entropy (a sketch of this filtering step is given after this list).
  • c) Model Differences: We used pre-trained word2vec embeddings and an embedding layer to encode the text content, whereas [1] used TF-IDF + NN as the encoder. We also supplemented the experimental results using TF-IDF encoding, as shown in Table III in the PDF. TF-IDF slightly outperforms word2vec embeddings; pre-trained word2vec embeddings might suffer from poor representation of low-frequency words or from differences between the training corpora and the target task domain. Additionally, our proposal consistently outperforms the baselines with various text encoders, highlighting the effectiveness of the MoE framework.
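The entropy-based filtering step mentioned in (b) can be sketched as follows; the tokenized example texts and the 90% threshold handling are simplified placeholders, not our exact preprocessing code.

```python
import math
from collections import Counter

def word_entropy(tokens):
    """Shannon entropy (bits) of the word distribution of one text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def filter_by_entropy(tokenized_texts, ratio=0.9):
    """Keep texts whose entropy exceeds `ratio` of the highest observed entropy."""
    entropies = [word_entropy(t) for t in tokenized_texts]
    threshold = ratio * max(entropies)
    return [t for t, h in zip(tokenized_texts, entropies) if h > threshold]

# Hypothetical example: texts already segmented and stripped of stop words.
texts = [["rumor", "spread", "fast"], ["ok"], ["breaking", "news", "report", "source"]]
kept = filter_by_entropy(texts)
```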

[1] Ced: Credible early detection of social media rumors. 2018.

Q3: Computational Efficiency: Given the complexity of the model structure illustrated in Figure 2, it would be beneficial to benchmark the computational efficiency of the proposed approach against the baselines in Table 3.

A3: Thank you for your valuable suggestion. Please see Q1 in the global rebuttal for further details.

Comment

Dear Reviewer QAYk,

I hope this message finds you well.

I am writing to kindly inquire about the status of your review of our submission 14020, as the rebuttal deadline is swiftly approaching (just one day remaining).

We understand that you may have a busy schedule, and we sincerely appreciate your time and effort in reviewing our work. If you have any additional comments or concerns, we would be grateful to discuss them as soon as possible.

Thank you for your understanding and support.

Best regards,

Submission 14020.

Comment

Thanks for your detailed response.

My concerns have been addressed, and I will raise my score from 4 to 5. Please incorporate all responses to enhance the clarity of your work.

Comment

Dear Reviewer QAYk,

Thank you very much for your positive feedback and for taking the time to reconsider your evaluation of our work. We greatly appreciate your constructive comments, which have undoubtedly helped us improve the clarity and quality of our paper. Your input has been instrumental in improving our work.

Best regards,

Submission 14020.

Official Review
Rating: 8

This paper addresses the challenge of adversarial attacks on Graph Neural Networks (GNNs) employed in social media tasks, such as rumor detection. The authors introduce MoE-BiEntIRL, a method that leverages a mixture-of-experts approach combined with inverse reinforcement learning (IRL) to reconstruct and explain adversarial attack policies. The objective of this method is to enhance the robustness of GNNs by generating additional adversarial samples for training, thereby improving resilience against attacks. MoE-BiEntIRL incorporates mechanisms for precise sample guidance and bidirectional updates, which are designed to optimize both the accuracy and the speed of policy learning.

Strengths

  1. Innovative Approach: The introduction of MoE-BiEntIRL represents a significant innovation, particularly through its application of a mixture-of-experts approach to manage diverse adversaries and provide detailed feature-level explanations.
  2. Real-world Validation: The method is validated on actual datasets from Weibo and Pheme, demonstrating its practical applicability for improving the robustness of GNNs in social media rumor detection scenarios.
  3. Experimental results, focusing on policy reconstruction and adversarial training, effectively illustrate the method’s robustness and efficacy.
  4. The approach facilitates a deeper understanding of attack behaviors through feature-level explanations, aiding platform operators in enhancing system defenses.

Weaknesses

  1. The proposed method involves multiple stages and sophisticated mechanisms, potentially complicating its implementation.
  2. Scalability Discussion: The paper would benefit from a more extensive discussion on the scalability of the method, particularly concerning its applicability to large social media graphs.
  3. Experimental Setup Details: Enhancing the description of the experimental setup would significantly improve the reproducibility of the study and aid other researchers in replicating the results.
  4. Typos and grammar errors could be avoided.

Questions

  1. Scalability Analysis: Could you elaborate on how your method scales when applied to very large social media graphs? Any additional insights or preliminary results on this matter would be highly informative.
  2. Could you provide more detailed information regarding your experimental setup? Additional details would aid in understanding how to replicate your study effectively.

Limitations

The paper's limitations are adequately addressed in the supplementary material. There are no further suggestions at this time.

Author Response

Q1: Scalability Analysis: Could you elaborate on how your method scales when applied to very large social media graphs? Any additional insights or preliminary results on this matter would be highly informative.

A1: Thank you for your valuable suggestion.

For scalability on large social media graphs, we analyze the time complexity of our method. Please refer to Table I in the attached PDF, where we detail the time complexity and runtime of MoE-BiEntIRL alongside two baseline models.

Based on Table I in the accompanying PDF, the time complexity of our proposal and the baseline methods primarily diverges during the reward acquisition phase of IRL. Our approach introduces $K$ experts to manage multi-source attack trajectories with diverse motivations, thereby increasing the time complexity. Notably, the complexity associated with IRL reward acquisition is independent of the input graph size and hinges solely on the predefined hyperparameter $K$, typically ranging from 1 to 10, rendering the increased complexity manageable. Moreover, the predominant computational effort across all models is concentrated in the attack stage, which further mitigates the impact of introducing multiple experts. Table I details the runtime of complete episodes and the IRL module. Despite the extended runtime of the MoE-BiEntIRL approach in the IRL module, it exerts negligible influence on the total runtime. Both the analysis of time complexity and experimental findings emphasize that the actual runtime is largely influenced by the attack stage.

Please see Q1 in the global rebuttal for further details about the time complexity analysis.

Based on the aforementioned time complexity analysis, the IRL module is easily scalable, as its complexity exhibits no direct correlation with the size of the graph. When applied to larger graphs, parameters such as $T$, $K$, $N$, and $S$ may indirectly increase due to the diverse nature of attacks within these larger contexts. As illustrated in Table I, the runtime of the IRL module shows minimal variation across datasets of varying sizes (e.g., Weibo with 10,280 nodes and Pheme with 2,708 nodes). As discussed in the time complexity analysis, the predominant factor influencing runtime is the attack stage. Thus, in our humble opinion, the scalability of the proposed model is acceptable.

Furthermore, for the hyperparameters indirectly affected by large graphs, various methods can mitigate the time complexity and runtime of the Inverse Reinforcement Learning (IRL) module through careful control of hyperparameters: (1) $K$: Reduce the number of experts via phased experimentation, employing a coarse-to-fine division of experts. We designed this method and conducted experiments; details are given in Q1 of the global rebuttal. (2) $T$: Segment lengthy expert trajectories using predefined rules or models to minimize state correlation before and after each segmentation. (3) $N$: Employ pre-classification or clustering of multiple expert samples to reduce the number of expert trajectories.

Q2: Could you provide more detailed information regarding your experimental setup? Additional details would aid in understanding how to replicate your study effectively.

A2: Thank you for your valuable suggestion. Please see Q2 in the global rebuttal for further details.

Comment

Thank you for your thorough rebuttal. The complexity analysis and additional experiments have effectively addressed my concerns. I am inclined to maintain my original score. Thanks again.

Comment

Dear Reviewer M7DL,

Thank you very much for your positive feedback and for taking the time to review our rebuttal. We greatly appreciate your thorough assessment and are pleased that our additional analysis and experiments have addressed your concerns.

We value your insights and are grateful for your support throughout this review process.

Best regards,

Submission 14020.

Official Review
Rating: 8

The paper presents a novel method, MoE-BiEntIRL, which combines a mixture-of-experts approach with inverse reinforcement learning to enhance the robustness and explainability of adversarial attacks on GNNs. The method addresses the critical issue of stabilizing GNNs used in social media for rumor detection, demonstrating significant practical relevance. Strengths include its innovative approach, comprehensive mechanisms for improving attack policy accuracy, and robust evaluation results. However, the paper could benefit from clearer explanations of the method, a detailed parameter sensitivity analysis, enhanced experimental reproducibility, and expanded comparative baselines. Despite these minor weaknesses, the overall contribution and practical importance of the research are compelling.

Strengths

  1. The MoE-BiEntIRL method presents a highly novel application of a mixture-of-experts approach combined with inverse reinforcement learning to address adversarial attacks on Graph Neural Networks (GNNs). This innovative approach stands out in its ability to not only enhance the robustness of GNNs but also to provide explainability to the attack policies.
  2. The inclusion of precise sample guidance mechanisms and a bidirectional update mechanism demonstrates thoroughness in approach, aiming to improve both the accuracy of attack policy reconstruction and the speed of policy learning. This comprehensive approach adds substantial value to the proposed solution.
  3. The evaluation methods employed in this study are robust, validating the effectiveness of the proposed method. The results are compelling, showing notable improvements in the robustness of GNNs.

Weaknesses

  1. Although the proposed method is innovative, some aspects of the algorithm could benefit from clearer explanations.
  2. A minor issue is that the sensitivity of the model to various parameters is not thoroughly explored. A brief analysis or guidance on parameter selection could aid in the practical application of the method.
  3. While the method is novel, there is little discussion on its computational complexity. Including an analysis of the computational cost and suggesting optimizations could enhance the practical feasibility of the approach.

Questions

  1. Could you provide a simple illustrative example or additional details to clarify how the precise sample guidance mechanism and the bidirectional update mechanism work in the MoE-BiEntIRL method?
  2. Can you provide a brief overview of the computational complexity of your proposed method, along with any potential optimizations that could be considered to enhance practical feasibility?

Limitations

The authors have listed the limitations in the appendix.

Author Response

Q1: Could you provide a simple illustrative example or additional details to clarify how the precise sample guidance mechanism and the bidirectional update mechanism work in the MoE-BiEntIRL method?

A1: Thank you for your suggestion. We have provided code and algorithms in global rebuttal Q2, which illustrate when and how the precise sample guidance mechanism and the bidirectional update mechanism operate.

During the early stages of learning, precise sample guidance and inverse updates are performed periodically. In an inverse update episode, the steps for executing the RL policy are replaced with precise sample guidance steps, and attacks are carried out according to the given expert samples. As for the reward, it is assigned the maximum historical reward. Given the sample and reward, the gradient ascent update of the IRL is replaced with the inverse update after the responsibility calculation.

Q2: Can you provide a brief overview of the computational complexity of your proposed method, along with any potential optimizations that could be considered to enhance practical feasibility?

A2: Thank you for your valuable suggestion. Please see Q1 in the global rebuttal for further details.

Q3: A minor issue is that the sensitivity of the model to various parameters is not thoroughly explored. A brief analysis or guidance on parameter selection could aid in the practical application of the method.

A3: Thank you for your suggestions. Here, we provide guidelines for selecting some parameters:

  • The sampling upper bound $\mu$: This parameter controls the similarity between the negative sample and the expert sample. Considering the validity of hard negative samples, we tend to choose samples with higher similarity, setting $\mu = 0.8$.
  • The learning rates of the learner policy, inverse update, and gate function: These are determined using grid search.

We have included a sensitivity analysis of parameters in Figure 2 in the PDF. Our findings are as follows:

  • Sampling similar instances can be advantageous for policy reconstruction; however, selecting instances that are too similar can lead to the failure of Inverse Reinforcement Learning (IRL).
  • The learning rates for both the learner policy and the inverse update process are more sensitive compared to that of the gate function.
Comment

After reviewing other referees' comments and the manuscript, I agree that the idea is novel, the technical content solid, and the contributions significant.

Author Response (Global Rebuttal)

Q1: The complexity and scalability analysis of the proposed model.

A1: Thank you for your valuable suggestion. Please refer to Table I in the attached PDF, where we detail the time complexity and runtime of the MoE-BiEntIRL, alongside two baseline models. Herein, we will present and discuss these specifics.

(1) Time Complexity Analysis

  • The time complexity of MoE-BiEntIRL. The three principal stages of MoE-BiEntIRL are as follows:
    • Attack: It conducts $T$-step attacks. Each attack step requires feature updates for the involved subgraphs and nodes, including a forward pass on the target model. For the $L$-layer GCN as the target model, the time complexity of the forward pass is approximately $O(LN_{\text{edges}}d + LN_{\text{nodes}}d^2)$, where $N_{\text{nodes}}$ and $N_{\text{edges}}$ denote the numbers of nodes and edges in the input graph, and $d$ is the feature dimension.
    • Reward Acquisition: The reward is estimated with the IRL module. The time complexity for an episode is $O(NTKd(S+K))$, where $N$ is the number of expert trajectories, $T$ is the trajectory length, $K$ is the number of experts, $d$ is the feature dimension, and $S$ is the number of negative samples.
    • Policy Update: With the LinUCB algorithm, the policy update time complexity is $O(d^2)$.
    • Overall, the total time complexity for an episode is approximately $O(TLN_{\text{edges}}d + TLN_{\text{nodes}}d^2 + NTKd(S+K) + d^2)$.
  • The time complexity of baselines. As in Table I, the complexities of the attack stage and policy update stage in the baselines are identical to those in the proposed model, differing only in the reward acquisition stage. The time complexity of the reward acquisition stage in the baselines is:
    • Apprenticeship Learning: $O(NTd)$;
    • EntIRL: $O(NTdS)$.
  • Summary. Based on Table I, the time complexity of our proposal and the baselines primarily diverges during the reward acquisition stage. We introduce $K$ experts to manage multi-source attack trajectories with diverse motivations, thereby increasing the time complexity. Notably, the complexity associated with IRL reward acquisition is independent of the input graph size and hinges solely on the predefined $K$, typically ranging from 1 to 10, rendering the increased complexity manageable. Moreover, the main computational effort is concentrated in the attack stage, which further mitigates the impact of introducing multiple experts. Table I details the runtime of complete episodes and of the IRL module. Despite the extended runtime of the MoE-BiEntIRL approach in the IRL module, it exerts negligible influence on the total runtime. Both the time complexity analysis and the experimental findings emphasize that the actual runtime is largely determined by the attack stage. Therefore, the additional complexity introduced by employing multiple experts remains manageable.
  • Possible Solutions. Here we propose MoE-BiEntIRL-K, which focuses on reducing $K$. It utilizes DBSCAN to determine $K$ and the pre-cluster labels. By varying the intensity of clustering, we obtain coarse and fine pre-cluster labels. During the IRL process, we initially use the number of experts and pre-cluster labels corresponding to the coarse clustering. As learning progresses, we increase the number of experts to align with the fine clustering, initializing the new expert parameters by weighting the previous expert parameters with the current responsibility values. We applied this method on Weibo with $T=20$ and $N=3$; the values of $K$ determined through coarse and fine clustering were 5 and 26, respectively. The results indicate a 19% reduction in runtime (10.67s to 8.64s) while achieving 92% of the original IRL's performance (19.936 to 18.381 in $\Delta L_A$). A minimal code sketch of this coarse-to-fine clustering follows.
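The coarse-to-fine clustering described above could be sketched as follows; the trajectory embeddings and the DBSCAN eps values are hypothetical placeholders chosen for illustration, not the settings used in our experiments.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_experts(trajectory_feats, eps, min_samples=2):
    """Pre-cluster expert trajectories; K = number of clusters found by DBSCAN."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(trajectory_feats)
    k = len(set(labels)) - (1 if -1 in labels else 0)   # ignore the noise label (-1)
    return k, labels

# Hypothetical trajectory embeddings (e.g., mean of step features per trajectory).
rng = np.random.default_rng(0)
feats = rng.normal(size=(30, 16))

k_coarse, coarse_labels = cluster_experts(feats, eps=2.0)   # fewer, broader experts
k_fine, fine_labels = cluster_experts(feats, eps=0.8)       # more, finer experts
# Training starts with k_coarse experts and later switches to k_fine, initializing
# new expert parameters as responsibility-weighted mixtures of the old ones.
```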

(2) Scalability Analysis

Based on the time complexity analysis, the scalability of the IRL module is robust, as it shows no direct correlation with the size of the graph. When applied to larger graphs, parameters such as $T$, $K$, $N$, and $S$ may indirectly increase due to the diverse nature of attacks within these larger contexts. As shown in Table I, the runtime of the IRL module shows minimal variation across datasets of varying sizes (e.g., Weibo with 10,280 nodes and Pheme with 2,708 nodes). As discussed in the time complexity analysis, the predominant factor influencing runtime is the attack stage. Thus, in our humble opinion, the scalability of the proposed model is acceptable.

Q2: It would be beneficial to provide a clearer explanation of the algorithm, experimental setup, and code. (Reviewers sELL, E98Q, M7DL)

A2: Thank you for your suggestion. The algorithm and code are provided here.

(1) Algorithm: MoE-BiEntIRL

Input:

Expert demonstration set $D$, number of experts $K$, length of trajectories $T$, number of episodes $E$, gate function $\alpha$, reward function parameters $\theta = [\theta_1, \ldots, \theta_K]$, learner policy $\pi$, negative sample set $D'$, responsibility matrix $\gamma$, inverse update episode set $\Lambda$

Procedure:

  1. For $e = 1, \ldots, E$:
    1. $s = \text{env.reset}()$
    2. For $t = 1, \ldots, T$:
      1. If $e \in \Lambda$:
        • $s', a = \text{PreciseSampleGuidance}(D)$
        • $r = \text{MaxReward}()$
      2. Else:
        • $s', a = \text{env.step}(s, \pi)$
        • $r = \text{ObtainReward}(\alpha, s, a, \theta)$ as Eq. (14)
      3. $\pi = \text{UpdatePolicy}(s, a, s', r)$
      4. $s = s'$
    3. $\gamma = \text{CalculateResponsibility}(\alpha, D, D', \theta)$ as Eq. (8)
    4. For $k = 1, \ldots, K$:
      1. If $e \in \Lambda$:
        • $\theta_k = \text{InverseUpdate}(\alpha, D, \theta_k)$ as Eq. (15)
      2. Else:
        • $\theta_k = \text{GradientAscent}(\gamma, D, D', \theta_k)$ as Eq. (12)
    5. $\alpha = \text{GradientAscent}(\gamma)$ with the loss as Eq. (10)
  • Note: Eq. (14) should be corrected to $r_\theta(s, a) = \sum_{k=1}^{K} \alpha_k(s)\, \theta_k^\top f(s, a)$ (a numerical sketch of this reward follows).
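The corrected Eq. (14) amounts to the following computation; this is a minimal NumPy sketch with hypothetical shapes and values, shown only to make the formula concrete.

```python
import numpy as np

def moe_reward(gate_weights, thetas, f_sa):
    """r_theta(s, a) = sum_k alpha_k(s) * theta_k^T f(s, a).

    gate_weights : (K,) softmax output of the gate alpha(s), sums to 1.
    thetas       : (K, d) linear reward parameters, one row per expert.
    f_sa         : (d,) interpretable feature vector of the state-action pair.
    """
    return float(gate_weights @ (thetas @ f_sa))

# Hypothetical example with K = 3 experts and d = 4 features.
alpha = np.array([0.2, 0.5, 0.3])
thetas = np.full((3, 4), 0.1)
reward = moe_reward(alpha, thetas, np.array([1.0, 0.0, 2.0, 0.5]))
```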

(2) Code and Experimental Setup have been provided to the AC in an anonymized link.

Comment

Dear Reviewers,

Thank you for your thorough review and valuable feedback on our paper. Following the suggestion from the Area Chair (AC), we have shared our code through an anonymous link to facilitate a better evaluation of our work.

Please access our code and related resources via the following link: [https://anonymous.4open.science/r/Rumor_IRL-B11D/README.md]

We look forward to your further feedback and are happy to address any related questions.

Thank you!

Sincerely, Submission 14020.

Final Decision

This paper proposes a method to enhance the robustness of rumor detection models by using a mixture-of-experts approach within maximum entropy inverse reinforcement learning to reconstruct attack policies from adversarial samples, thereby improving the generation of adversarial samples and strengthening model defenses through adversarial training and data augmentation. All reviewers conclude that the proposed approach is highly novel, in particular the innovative application of a MoE approach combined with inverse RL to address adversarial attacks on GNNs. The reviewers also praised the explainability of the attack policy reconstructions, the strong experimental results, and the clear presentation. (Somewhat minor) issues were raised about the complexity of the suggested approach, especially since multiple components such as IRL, MoE, and bidirectional updates may complicate implementation, but the authors managed to address these concerns to a satisfactory degree. After the author rebuttal, all reviewers unanimously recommended acceptance; the AC read through all reviews and exchanges between the authors and the reviewers and concurs with this recommendation. The novelty of the approach and the practical impact on rumor detection on social media are potentially intriguing to a broader audience at NeurIPS, and the AC additionally recommends that this paper be highlighted as a spotlight at the conference.