PaperHub
Score: 5.3 / 10 · Rejected · 4 reviewers
Individual scores: 5, 6, 5, 5 (min 5, max 6, std 0.4)
Confidence: 3.8
ICLR 2024

Directional Rank Reduction for Backdoor Defense

OpenReview · PDF
Submitted: 2023-09-24 · Updated: 2024-02-11

Abstract

Keywords
backdoor defense, backdoor attack, neuron pruning

Reviews and Discussion

Review
Rating: 5

This paper argues that existing pruning-based defense methods can be ineffective at times and introduces Directional Rank Reduction (DRR) to identify toxic directions. The method approximates the target direction by maximizing the third central moment, supported by rigorous theoretical justification, and constructs a projection matrix to eliminate the toxic direction. DRR demonstrated outstanding performance in terms of both accuracy (ACC) and attack success rate (ASR).

Strengths

  1. This study shows an interesting finding that backdoor trigger effects are not always aligned with fixed dimensions of the feature space, in which case pruning-based methods are usually ineffective.
  2. The proposed DRR method performed well on both ACC and ASR compared to other methods.

Weaknesses

  1. In the first equation on Page 3, it seems feasible to do the defense by reducing the norm of the residual matrix to align the benign and poisoned features. The features from benign examples move towards the backdoored features. Does the movement hurt the model's clean performance?

  2. The last equation on Page 4 has a strong assumption that all the clean examples are centered around the mean of them. Namely, the method assumes that the distances from all the clean examples to the example center are the same. The examples marked as yellow in Figure 1 are distributed like a circle. However, the real-world data distribution often deviates from the assumption. The distribution could be elliptical-like. In this case, the obtained v is not optimal anymore.

  3. In the third row of Table 2, DRR achieves a better trade-off. Why does it demonstrate a higher accuracy (ACC) rather than a lower ASR?

  4. This approach requires the optimization of a vector in each layer, which could be expensive.

Minor: none of the equations are numbered!

Questions

  1. How is the direction vector v initialized in the paper, and do different initialization methods lead to varying results?

  2. In Figure 2, the value of C for certain layers is not significant. Is it possible to skip some layers when computing v?

Comment

Weaknesses:

W1: In the first equation on Page 3, it seems feasible to do the defense by reducing the norm of the residual matrix to align the benign and poisoned features. The features from benign examples move towards the backdoored features. Does the movement hurt the model's clean performance?

Answer: As we point out in our paper, rank reduction is an extension of neuron pruning (which is also a modification of the weight matrix). Established neuron pruning techniques, such as CLP and EP, have been demonstrated to maintain model performance effectively post-pruning. Our rank reduction 1) removes only one rank (instead of the multiple ranks removed in neuron pruning) and 2) targets specifically the direction related to the backdoor behavior, so it should intuitively affect performance less than pruning methods in general. Empirically, we show that our method degrades performance less than previous methods.
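
To make the rank-reduction step concrete, here is a minimal numpy sketch of the operation as we understand it from the discussion: given a unit "toxic" direction v in a layer's output space, the weight matrix W is multiplied by the projector I − vvᵀ, which removes exactly one rank. W and v below are random placeholders, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 8, 16
W = rng.normal(size=(d_out, d_in))   # a layer's weight matrix (placeholder)
v = rng.normal(size=d_out)
v /= np.linalg.norm(v)               # unit direction to remove (placeholder)

P = np.eye(d_out) - np.outer(v, v)   # projector onto the complement of v
W_new = P @ W                        # rank reduction: removes exactly one rank

assert np.linalg.matrix_rank(W_new) == np.linalg.matrix_rank(W) - 1
x = rng.normal(size=d_in)
assert abs(v @ (W_new @ x)) < 1e-9   # outputs have no component along v
```

Because only a single rank per layer is removed, the perturbation to the weight matrix is small, which is consistent with the limited clean-accuracy drop the authors report.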

W2: The last equation on Page 4 has a strong assumption that all the clean examples are centered around the mean of them. Namely, the method assumes that the distances from all the clean examples to the example center are the same. The examples marked as yellow in Figure 1 are distributed like a circle. However, the real-world data distribution often deviates from the assumption. The distribution could be elliptical-like. In this case, the obtained v is not optimal anymore.

Answer: We appreciate the opportunity to clarify a potential misunderstanding highlighted by the reviewer regarding the assumptions underlying our method. Contrary to the reviewer's interpretation, our approach does not assume uniform distances of all clean examples from their mean, nor does it presuppose a circular distribution of data points as depicted in Figure 1. The shape of the ellipse in a multivariate Gaussian distribution is determined by its covariance matrix. Only when the covariance matrix is isotropic, meaning the variance is the same along all directions, is the data distributed like a circle as mentioned by the reviewer. However, we do not put any constraints on whether the covariance is isotropic or not. The only assumption we make about the covariance matrix is Assumption 2, which limits the maximum variance of the data distribution. Hence, an elliptical-like distribution is covered by our method: the Gaussian distribution may have a covariance matrix that is not isotropic.

The equation on page 4, which the reviewer refers to, formulates an optimization problem. The objective of this problem is to determine an optimal unit direction vector that maximizes the third central moment when data is projected onto this vector. The stipulation that the vector be of unit length is a constraint applied to the direction vector itself, rather than to the data samples. This is a standard practice in such optimization problems to ensure the direction vector is normalized and hence, the focus is on the direction rather than the magnitude.
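
As a sketch of this optimization problem (our illustrative reading, not the paper's exact optimizer), one can search for a unit vector v maximizing the third central moment of the projected data by gradient ascent with re-normalization onto the unit sphere; `optimal_direction` and the toy mixture below are our own constructions:

```python
import numpy as np

def third_central_moment(X, v):
    """Third central moment of the data projected onto unit vector v."""
    z = (X - X.mean(axis=0)) @ v
    return np.mean(z ** 3)

def optimal_direction(X, steps=500, lr=0.1, seed=0):
    """Projected gradient ascent on the unit sphere (illustrative)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=X.shape[1])
    v /= np.linalg.norm(v)
    Xc = X - X.mean(axis=0)
    for _ in range(steps):
        z = Xc @ v
        grad = 3 * (z ** 2) @ Xc / len(X)   # d/dv E[(x_c . v)^3]
        v = v + lr * grad
        v /= np.linalg.norm(v)              # re-project onto the unit sphere
    return v

# toy mixture: many "benign" points plus a few shifted "poisoned" points
rng = np.random.default_rng(1)
benign = rng.normal(size=(950, 10))
poisoned = rng.normal(size=(50, 10)) + 6.0 * np.eye(10)[0]
X = np.vstack([benign, poisoned])

v = optimal_direction(X)
assert abs(v[0]) > 0.8   # aligns with the benign/poisoned mean difference
```

Note the unit-norm constraint appears only as the re-normalization of v, exactly as the answer states: it constrains the direction vector, not the data samples.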

W3: In the third row of Table 2, DRR achieves a better trade-off. Why does it demonstrate a higher accuracy (ACC) rather than a lower ASR?

Answer: We appreciate the reviewer's insightful observation and recognize the necessity of providing a clearer explanation in our manuscript. The comparatively minimal impact on model performance observed in our study can be attributed to our method's strategy of removing at most one rank per layer. This approach is considerably more conservative than traditional neuron pruning methods, which often entail the removal of multiple ranks. If the number of removed ranks is tuned upward for a specific scenario, DRR can achieve an even lower ASR.

W4: This approach requires the optimization of a vector in each layer, which could be expensive.

Answer: We appreciate the inquiry regarding the computational efficiency of our approach. Primarily, the optimization process can be parallelized across different network layers, offering a substantial decrease in the required time for optimization. Furthermore, in scenarios with high feature dimensions or large dataset sizes, dimensionality reduction through Principal Component Analysis (PCA) can be employed prior to the directional learning phase. This step effectively reduces the computational burden. Subsequently, the learned direction within this reduced space is projected back onto the original space, thereby economizing on memory and computational demands while preserving the integrity of the optimization process.

Comment

Questions:

Q1: How is the direction vector v initialized in the paper, and do different initialization methods lead to varying results?

Answer: The reviewer's comment is well-taken. Within our manuscript, we initialize the vector v as the standard basis vector that corresponds to the maximal third central moment upon projection of the data. To thoroughly examine the impact of various initialization strategies, we have implemented two distinct approaches: (1) random initialization, and (2) initialization using the original standard basis vector that exhibits the highest third central moment. The findings from these methodologies are presented in the following section:

| Model | Attack | Backdoored (ACC / ASR) | DRR random (ACC / ASR) | DRR original (ACC / ASR) |
| --- | --- | --- | --- | --- |
| ResNet-18 | BadNets | 94.83 / 98.79 | 92.20 / 1.87 | 92.01 / 3.18 |
| ResNet-18 | BadNets(A2A) | 94.81 / 86.52 | 93.11 / 1.30 | 92.42 / 1.93 |
| ResNet-18 | Blended | 95.03 / 99.99 | 90.57 / 3.42 | 94.40 / 2.31 |
| ResNet-18 | CLA | 94.82 / 16.18 | 93.60 / 0.96 | 93.40 / 0.98 |
| ResNet-18 | Average | 94.87 / 75.37 | 92.37 / 1.89 | 93.06 / 2.10 |
| WideResNet-28-1 | BadNets | 92.46 / 99.92 | 89.97 / 2.34 | 89.56 / 1.69 |
| WideResNet-28-1 | BadNets(A2A) | 92.43 / 79.94 | 88.60 / 1.80 | 91.94 / 1.29 |
| WideResNet-28-1 | Blended | 92.55 / 99.80 | 91.32 / 100.00 | 91.50 / 2.74 |
| WideResNet-28-1 | CLA | 92.39 / 3.49 | 90.57 / 1.07 | 91.41 / 6.19 |
| WideResNet-28-1 | Average | 92.46 / 70.79 | 90.12 / 26.30 | 91.10 / 2.98 |

The results indicate that different initializations work well in different scenarios. However, in general, initializing the vector with the original standard basis vector with the highest criterion value is more robust than random initialization.
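
The "original standard basis" initialization described above could be sketched as follows; `basis_init` and the toy data are our own illustrative reading, not code from the paper. The idea is simply to pick the coordinate axis along which the centered data has the maximal third central moment:

```python
import numpy as np

def basis_init(X):
    """Initialize v as the standard basis vector whose projection of the
    centered data has the maximal third central moment (illustrative
    reading of the strategy described above)."""
    Xc = X - X.mean(axis=0)
    m3 = np.mean(Xc ** 3, axis=0)        # per-coordinate third central moments
    return np.eye(X.shape[1])[np.argmax(m3)]

# toy check: a small cluster shifted along axis 2 makes that axis the most
# asymmetric coordinate, so it should be the one picked
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
X[:50, 2] += 6.0
assert np.argmax(basis_init(X)) == 2
```

Unlike random initialization, this start point is already correlated with the asymmetry the optimizer seeks, which is consistent with the robustness difference reported in the table.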

Q2: In Figure 2, the value of C for certain layers is not significant. Is it possible to skip some layers when computing v?

Answer: The reviewer's suggestion is indeed a compelling proposition. Skipping layers would reduce the computational cost and the harm to the model's normal performance. However, the exact calculation of C currently requires access to both poisoned and benign data separately, which is impractical in our scenario. Nonetheless, we acknowledge the potential benefits of such a strategy and consider it an intriguing avenue for future exploration.

Comment

Thank the authors for providing a detailed rebuttal. However, I still have concerns regarding the motivation of the method.

Figure 1 illustrates why the proposed method works, which is only valid when the data distribution is circle-like. In real-world datasets, this is not the case. Namely, the motivation or the explanation of the proposed method is no longer convincing.

I understand that the authors do not make any assumptions about the data explicitly. But please check the motivation illustrated in Figure 1. The illustration does not make sense for real-world data distribution.

Hence, I tend to keep my original score.

Review
Rating: 6

This paper proposes a novel backdoor defense method, which utilizes rank reduction to mitigate backdoors in the model. The idea of rank reduction is interesting and brings a new insight into the area.

Strengths

  1. The idea is novel and provides a new insight.
  2. This paper is technically sound and easy to follow.
  3. The experimental results demonstrate its effectiveness in backdoor defense.

Weaknesses

  1. Although this work is interesting, it has a limitation. This paper assumes the defender can get access to the backdoored images. However, these are hard to obtain in actual situations, which greatly limits its use. I wonder whether it works without the backdoored data.
  2. The backdoor attacks that this paper tests are not enough. I suggest the authors test the newest input-specific backdoor attacks from 2022. It is important to identify whether this method can achieve SOTA.

Questions

  1. Does it work without the attacker's backdoored data?

Comment

We would like to thank the reviewer for their appreciation of our work. Below, we have provided our response to the reviewer's concerns.

Weaknesses:

W1: Although this work is interesting, it has a limitation. This paper assumes the defender can get access to the backdoored images. However, these are hard to obtain in actual situations, which greatly limits its use. I wonder whether it works without the backdoored data.

Answer: First, to answer the reviewer's final question: the rank reduction framework can be adapted to scenarios with or without backdoored data, but the metric we adopt to obtain the direction, i.e., the third central moment, requires access to the backdoored data. If other metrics are later invented to obtain the direction, the method could become backdoored-data-free.

Second, we want to argue that the scenario in which the defender has access to the full dataset is a prevalent assumption within the backdoor attack research community, exemplified by the concept of adopting a third-party dataset as discussed in [1]. Some methodologies that employ this setting are outlined in [2, 3, 4, 5].

[1] Li, Y., Jiang, Y., Li, Z. and Xia, S.T., 2022. Backdoor learning: A survey. IEEE Transactions on Neural Networks and Learning Systems.
[2] Chen, B., Carvalho, W., Baracaldo, N., Ludwig, H., Edwards, B., Lee, T., Molloy, I. and Srivastava, B. Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering.
[3] Zheng, R., Tang, R., Li, J. and Liu, L., 2022. Pre-activation Distributions Expose Backdoor Neurons. Advances in Neural Information Processing Systems, 35, pp. 18667-18680.
[4] Tran, B., Li, J. and Madry, A., 2018. Spectral signatures in backdoor attacks. Advances in Neural Information Processing Systems, 31.
[5] Hayase, J., Kong, W., Somani, R. and Oh, S., 2021. SPECTRE: Defending against backdoor attacks using robust statistics. In International Conference on Machine Learning (pp. 4129-4139). PMLR.

W2: The backdoor attacks that this paper tests are not enough. I suggest the authors test the newest input-specific backdoor attacks from 2022. It is important to identify whether this method can achieve SOTA.

Answer: The experiments with IAB and WaNet, which are both input-specific backdoor attacks, are in fact in the paper. Please refer to Table 2 of our submission for the results on the input-aware dynamic attack (IAB) and the warping-based backdoor attack (WaNet).

Moreover, we conducted more experiments on other types of attacks (AdaptiveBlend, SIG, and Smooth); the results are shown below:

| Model | Attack | Backdoored (ACC / ASR) | FP (ACC / ASR) | ANP (ACC / ASR) | EP (ACC / ASR) | CLP (ACC / ASR) | DRR (ACC / ASR) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | AdaptiveBlend | 94.79 / 100.00 | 89.47 / 5.29 | 82.18 / 0.30 | 94.43 / 1.74 | 93.68 / 33.52 | 90.25 / 3.79 |
| ResNet-18 | SIG | 94.01 / 98.22 | 88.94 / 45.70 | 89.38 / 2.36 | 87.36 / 30.84 | 89.75 / 94.28 | 87.10 / 0.07 |
| ResNet-18 | Smooth | 94.59 / 100 | 87.12 / 100 | 92.65 / 81.23 | 94.24 / 3.99 | 87.24 / 89.03 | 94.03 / 3.58 |
| WideResNet-28-1 | AdaptiveBlend | 92.37 / 100.00 | 84.77 / 51.88 | 82.06 / 42.70 | 90.45 / 5.18 | 84.40 / 74.57 | 91.13 / 0.86 |
| WideResNet-28-1 | SIG | 84.03 / 96.20 | 82.65 / 5.22 | 81.37 / 0.00 | 83.82 / 0.00 | 84.04 / 0.00 | 82.85 / 0.00 |
| WideResNet-28-1 | Smooth | 92.19 / 100 | 84.52 / 6.32 | 89.98 / 100 | 91.45 / 8.78 | 91.29 / 9.03 | 91.88 / 2.74 |
Review
Rating: 5

The paper presents a fascinating new method for backdoor defense in neural networks. The key idea of projecting out the "toxic direction" that maximizes the difference between clean and poisoned features is novel and seems promising.

The theoretical analysis provides valuable insights into the limitations of standard neuron pruning approaches. Framing the problem as rank reduction along arbitrary directions rather than fixed neuron directions is a significant conceptual shift.

Strengths

  1. The idea of maximizing the third central moment is enjoyable. This idea yields a novel insight.
  2. The connection between neuron pruning and rank reduction is also an exciting topic.
  3. The visualization of the separation constant C provides good justification for the theoretical assumptions.

Weaknesses

  1. More experiments can be conducted (BadNet, Blended, CLA, WaNet, and IAB are insufficient.) The authors can consider attacks like SIG [1] and low frequency (Smooth) [2]. Since your method also took latent separability as an assumption, Adapt-blend and Adapt-patch attacks [3] should also be considered. Evaluating robustness to adaptive attacks that try to evade the defense would be useful to understand limitations.
  2. The references and notations should be clarified. For example, what is the reference to Proposition 1?
  3. Also, the readability and organization of this paper need to be improved. It is better if an algorithm is provided.

[1] A new backdoor attack in cnns by training set corruption ICIP 2019

[2] Rethinking the Backdoor Attacks’ Triggers: A Frequency Perspective ICCV2021

[3] Revisiting the Assumption of Latent Separability for Backdoor Defenses, ICLR 2023

Questions

  1. The memory and computational complexity could be analyzed more thoroughly, especially how the approach scales with larger datasets/models. Are there ways to make the optimization more efficient?
  2. How many extension directions v_i have you used?
  3. Modifying the weight matrix may cause a performance drop in many cases. How can your projection keep the performance?
  4. The proof needs to be more rigorous. Why use the consequence of the proof in the middle of the proof?
Comment

We would like to thank the reviewer for the detailed review. We will make changes based on the feedback. Please see our responses:

Weaknesses:

W1: More experiments can be conducted (BadNet, Blended, CLA, WaNet, and IAB are insufficient.) The authors can consider attacks like SIG [1] and low frequency (Smooth) [2]. Since your method also took latent separability as an assumption, Adapt-blend and Adapt-patch attacks [3] should also be considered. Evaluating robustness to adaptive attacks that try to evade the defense would be useful for understanding limitations.

Answer: We've conducted experiments according to the reviewer's suggestions. The results are shown below:

| Model | Attack | Backdoored (ACC / ASR) | FP (ACC / ASR) | ANP (ACC / ASR) | EP (ACC / ASR) | CLP (ACC / ASR) | DRR (ACC / ASR) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | AdaptiveBlend | 94.79 / 100.00 | 89.47 / 5.29 | 82.18 / 0.30 | 94.43 / 1.74 | 93.68 / 33.52 | 90.25 / 3.79 |
| ResNet-18 | SIG | 94.01 / 98.22 | 88.94 / 45.70 | 89.38 / 2.36 | 87.36 / 30.84 | 89.75 / 94.28 | 87.10 / 0.07 |
| ResNet-18 | Smooth | 94.59 / 100 | 87.12 / 100 | 92.65 / 81.23 | 94.24 / 3.99 | 87.24 / 89.03 | 94.03 / 3.58 |
| WideResNet-28-1 | AdaptiveBlend | 92.37 / 100.00 | 84.77 / 51.88 | 82.06 / 42.70 | 90.45 / 5.18 | 84.40 / 74.57 | 91.13 / 0.86 |
| WideResNet-28-1 | SIG | 84.03 / 96.20 | 82.65 / 5.22 | 81.37 / 0.00 | 83.82 / 0.00 | 84.04 / 0.00 | 82.85 / 0.00 |
| WideResNet-28-1 | Smooth | 92.19 / 100 | 84.52 / 6.32 | 89.98 / 100 | 91.45 / 8.78 | 91.29 / 9.03 | 91.88 / 2.74 |

Note that the latent separability mentioned in paper [3] only considers the latent space at the penultimate layer, while our method utilizes the separability within each layer of the model. This makes our method effective even when the penultimate-layer features are inseparable.

W2: The references and notations should be clarified. For example, what is the reference to Proposition 1?

Answer: The reviewer's request for clarification on the references and notations is acknowledged. However, regarding the reference for Proposition 1, it should be noted that, to the best of our understanding, Proposition 1 is introduced for the first time in our manuscript. Consequently, there are no prior publications to cite for this proposition.

W3: Also, the readability and organization of this paper need to be improved. It is better if an algorithm is provided.

Answer: We appreciate the feedback regarding the clarity and structural aspects of our manuscript. Recognizing the value that an algorithmic representation would add, we have taken the suggestion into consideration and will include a detailed algorithm in the revised draft to enhance comprehension of our proposed method.

Comment

Questions:

Q1: The memory and computational complexity could be analyzed more thoroughly, especially how the approach scales with larger datasets/models. Are there ways to make the optimization more efficient?

Answer: We appreciate the inquiry regarding the computational efficiency of our approach, particularly in the context of scalability to larger datasets and models. Our method indeed incorporates strategies to enhance optimization efficiency.

Primarily, the optimization process can be parallelized across different network layers, offering a substantial decrease in the required time for optimization. Furthermore, in scenarios with high feature dimensions or large dataset sizes, dimensionality reduction through Principal Component Analysis (PCA) can be employed prior to the directional learning phase. This step effectively reduces the computational burden. Subsequently, the learned direction within this reduced space is projected back onto the original space, thereby economizing on memory and computational demands while preserving the integrity of the optimization process.
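
The PCA shortcut described above could look roughly like this (a sketch under our assumptions; the inner criterion — picking the principal axis with the largest third central moment — is a simplified stand-in for the full directional optimization):

```python
import numpy as np

def learn_direction_with_pca(X, k=10):
    """Learn a direction in a k-dim PCA subspace, then map it back to the
    original feature space (illustrative sketch of the described shortcut)."""
    Xc = X - X.mean(axis=0)
    # top-k principal axes via SVD
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:k]                  # (k, d) orthonormal basis of the subspace
    Z = Xc @ B.T                # (n, k) reduced representation
    # direction in reduced space: principal axis with the largest |third
    # central moment| (abs because the sign of a principal axis is arbitrary)
    m3 = np.mean(Z ** 3, axis=0)
    w = np.eye(k)[np.argmax(np.abs(m3))]
    v = B.T @ w                 # project the learned direction back to R^d
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))
X[:25] += 8.0 * np.eye(64)[0]   # asymmetric cluster along the first axis
v = learn_direction_with_pca(X)
assert abs(v[0]) > 0.8          # the recovered direction survives the round trip
```

The optimization runs in the k-dimensional space regardless of the original feature dimension, and mapping `v` back through the PCA basis costs a single matrix-vector product, which is where the memory and compute savings come from.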

Q2: How many extension directions v_i have you used?

Answer: Across all experiments conducted, we use only one direction.

Q3: Modifying the weight matrix may cause a performance drop in many cases. How can your projection keep the performance?

Answer: The reviewer has raised an important point. As we point out in our paper, rank reduction is an extension of neuron pruning (which is also a modification of the weight matrix). Established neuron pruning techniques, such as CLP and EP, have been demonstrated to maintain model performance effectively post-pruning. Our rank reduction 1) removes only one rank (instead of the multiple ranks removed in neuron pruning) and 2) targets specifically the direction related to the backdoor behavior, so it should intuitively affect performance less than general pruning methods. However, the exact relationship between the way we modify the weight matrices and the resulting performance is hard to describe precisely and can only be shown experimentally; here we provide only an intuitive explanation.

Q4: The proof needs to be more rigorous. Why use the consequence of the proof in the middle of the proof?

Answer: We are grateful to the reviewer for highlighting a critical aspect of our proof's construction. To address this issue, we are committed to revising and strengthening our theorem to ensure its logical soundness and rigor. The revised version of our paper has included these modifications in detail. Please kindly refer to it.

Review
Rating: 5

The paper proposes a rank-reduction-based defense against backdoor attacks. Specifically, it first gives a feature-based objective characterizing the optimal solution for the best defense effect. It then discusses the problems of previous defenses based on this objective and proposes DRR, a rank-reduction-based defense that aims to find a vector maximizing the 3rd central moment of the mixed distribution. The proposed method has been verified on CIFAR-10 against several backdoor attacks. The results show the proposed method achieves slightly better performance than state-of-the-art defenses.

Strengths

  1. The paper is well-written and easy to follow with only several typos.
  2. The proposed method has some good theoretical analysis and could be meaningful for future work.

Weaknesses

  1. Some of the theoretical analysis might not be accurate. The utility function is defined using ||R-\gamma_r(R)|| and also ||R||-||\gamma_r(R)||. However, these two values are not strictly equivalent. The same issue occurs in the definition of E(R).
  2. It is unclear why the 3rd central moment would best measure the difference. In other words, would the 2nd or 1st central moment work as well? Since the 3rd central moment is the main metric selected, the authors should explain the choice in detail.
  3. The experiments are quite insufficient. They only cover one dataset with only one poisoning rate. I suggest the authors provide more comprehensive experiments to show their proposed method's effectiveness. The standard settings in https://github.com/SCLBD/backdoorbench are recommended.

Minor typo: missing \hat{x} in the definition of E(R^{(l)}).

Questions

Please refer to the weaknesses part. To summarize:

  1. Why does ||R-\gamma_r(R)|| = ||R||-||\gamma_r(R)||, and likewise in E(R)?
  2. Why is the 3rd central moment selected?
Comment

We appreciate the reviewer's thorough review and have taken their comments into consideration. Here are our responses to their concerns:

Weaknesses:

W1: Some of the theoretical analysis might not be accurate. The utility function is defined using ||R-\gamma_r(R)|| and also ||R||-||\gamma_r(R)||. However, these two values are not strictly equivalent. The same issue occurs in the definition of E(R).

Answer: We thank the reviewer for pointing this out. Indeed, the equation ||R-\gamma_r(R)|| = ||R||-||\gamma_r(R)|| doesn't hold in the general case. However, it does hold when we use the proposed L_{1,1} norm, as defined in Definition 1. The equation makes sense once the norm is specified before the equation is introduced. We acknowledge that the organization of this section needs to be corrected to avoid misleading the reader.
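
To illustrate why the identity can hold under an entrywise norm: if \gamma_r(R) keeps a subset of R's entries and zeroes the rest (our reading of the operator; the masking below is illustrative, not the paper's exact definition), the absolute values split additively under L_{1,1}, while a general norm such as the spectral norm does not satisfy the identity:

```python
import numpy as np

def l11(M):
    """Entrywise L_{1,1} norm: sum of absolute values of all entries."""
    return np.abs(M).sum()

rng = np.random.default_rng(0)
R = rng.normal(size=(6, 4))

# gamma_r modeled as a masking operator that keeps some rows (neurons)
# of R and zeroes the rest (assumption for illustration)
mask = np.zeros((6, 1))
mask[[0, 2, 5]] = 1.0
gamma_R = mask * R

# under L_{1,1} the identity holds exactly, because each entry of R lands
# in exactly one of gamma_R and R - gamma_R
assert np.isclose(l11(R - gamma_R), l11(R) - l11(gamma_R))

# ...but it generally fails for, e.g., the spectral norm
spec = lambda M: np.linalg.norm(M, 2)
assert not np.isclose(spec(R - gamma_R), spec(R) - spec(gamma_R))
```

This is the sense in which the norm must be fixed before the equation is stated, as the answer above notes.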

W2: It is unclear why the 3rd center moment would show the best performance to measure the difference. In other words, would 2nd order moment or 1st order work as well? Since 3rd order is the main metric selected, the author should explain the choice in detail.

Answer: We have clarified our choice in the revision of the paper. The first central moment (the mean of the data) doesn't make sense in this context, because the mean only affects the position of the data center, which is unrelated to the direction of the mean difference. The second central moment yields similar conclusions when constrained by the assumptions made in the paper. However, in practice, we find the third central moment works much better. To some extent, the third central moment increases not only with the mean difference but also with the asymmetry of the two clusters. In backdoor attacks, the amount of benign data is usually much larger than that of poisoned data, which cannot be captured by the second central moment. That is why the third central moment performs better than the second central moment in our scenarios.
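
A toy illustration of this asymmetry argument (the synthetic mixture below is ours, only meant to mirror the benign/poisoned imbalance): two axes are tuned to have the same variance, yet only the imbalanced-mixture axis shows a large third central moment:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
# axis 0: imbalanced mixture -- 95% "benign", 5% "poisoned" shifted by 6
axis0 = np.where(rng.random(n) < 0.05,
                 rng.normal(6.0, 1.0, n),
                 rng.normal(0.0, 1.0, n))
# axis 1: symmetric Gaussian tuned to the same variance as axis 0
axis1 = rng.normal(0.0, axis0.std(), n)
X = np.stack([axis0, axis1], axis=1)

Xc = X - X.mean(axis=0)
m2 = np.mean(Xc ** 2, axis=0)   # second central moments (variances)
m3 = np.mean(Xc ** 3, axis=0)   # third central moments

# variance cannot separate the two axes...
assert np.isclose(m2[0], m2[1], rtol=0.05)
# ...but the third central moment clearly singles out the backdoor-like axis
assert abs(m3[0]) > 10 * abs(m3[1])
```

The second moment is blind to the sign of deviations, so an imbalanced two-cluster mixture and a symmetric distribution of the same spread look identical to it; the third moment responds precisely to the few-large-positive-deviations pattern that a small poisoned cluster creates.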

W3: The experiments are quite insufficient. They only cover one dataset with only one poisoning rate. I suggest the authors provide more comprehensive experiments to show their proposed method's effectiveness. The standard settings in https://github.com/SCLBD/backdoorbench are recommended.

Answer: Following the reviewer's suggestion, we conducted additional experiments on GTSRB. The results indicate that our method performs well in this setting too. It is noteworthy that the CLA attack on GTSRB did not succeed, which is also the case in BackdoorBench.

| Model | Attack | Backdoored (ACC / ASR) | FP (ACC / ASR) | ANP (ACC / ASR) | EP (ACC / ASR) | CLP (ACC / ASR) | DRR (ACC / ASR) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | BadNets | 95.28 / 100.00 | 91.74 / 0.35 | 91.10 / 4.64 | 94.49 / 0.33 | 94.64 / 0.79 | 94.93 / 0.68 |
| ResNet-18 | BadNets(A2A) | 95.31 / 95.94 | 88.56 / 10.85 | 94.13 / 1.79 | 94.74 / 0.06 | 94.79 / 0.13 | 95.11 / 0.48 |
| ResNet-18 | Blended | 95.87 / 99.88 | 90.44 / 3.00 | 91.91 / 3.20 | 94.92 / 0.29 | 94.68 / 0.97 | 95.04 / 0.79 |
| ResNet-18 | CLA | 96.29 / 0.09 | 90.82 / 0.61 | 96.14 / 0.80 | 95.53 / 0.13 | 94.32 / 0.14 | 96.21 / 0.10 |
| ResNet-18 | Average | 95.69 / 73.98 | 90.39 / 3.70 | 93.32 / 2.61 | 94.92 / 0.20 | 94.61 / 0.51 | 95.32 / 0.51 |
| WideResNet-28-1 | BadNets | 94.37 / 99.99 | 90.86 / 11.94 | 80.70 / 100.00 | 93.90 / 0.33 | 87.09 / 11.11 | 93.00 / 0.45 |
| WideResNet-28-1 | BadNets(A2A) | 92.27 / 91.72 | 89.37 / 32.90 | 75.48 / 48.18 | 90.56 / 1.01 | 92.13 / 0.97 | 90.70 / 1.71 |
| WideResNet-28-1 | Blended | 94.68 / 99.75 | 90.51 / 11.35 | 86.06 / 99.52 | 92.86 / 10.26 | 93.96 / 0.26 | 92.82 / 3.07 |
| WideResNet-28-1 | CLA | 95.31 / 0.09 | 89.09 / 0.71 | 94.98 / 0.50 | 94.57 / 0.10 | 95.27 / 0.20 | 95.41 / 0.09 |
| WideResNet-28-1 | Average | 94.16 / 72.89 | 89.96 / 14.23 | 84.31 / 62.05 | 92.97 / 2.93 | 92.11 / 3.14 | 92.98 / 1.33 |
Comment

We also provide experimental results for different poisoning ratios:

1%:

| Model | Attack | Backdoored (ACC / ASR) | FP (ACC / ASR) | ANP (ACC / ASR) | EP (ACC / ASR) | CLP (ACC / ASR) | DRR (ACC / ASR) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | BadNets | 94.83 / 98.79 | 88.44 / 95.16 | 92.74 / 3.02 | 93.95 / 5.00 | 93.04 / 1.03 | 94.69 / 0.97 |
| ResNet-18 | BadNets(A2A) | 94.81 / 86.52 | 89.19 / 9.97 | 91.35 / 5.32 | 94.15 / 8.90 | 93.68 / 0.82 | 94.52 / 0.66 |
| ResNet-18 | Blended | 95.03 / 99.99 | 89.61 / 100.00 | 93.63 / 1.68 | 91.18 / 99.73 | 93.03 / 35.66 | 94.88 / 1.84 |
| ResNet-18 | CLA | 94.82 / 16.18 | 89.94 / 9.84 | 92.08 / 9.26 | 94.43 / 16.80 | 91.14 / 1.22 | 94.86 / 6.34 |
| ResNet-18 | Average | 94.87 / 75.37 | 89.30 / 53.74 | 92.45 / 4.82 | 93.43 / 32.61 | 92.72 / 9.68 | 94.74 / 2.45 |
| WideResNet-28-1 | BadNets | 92.46 / 99.92 | 86.11 / 40.66 | 73.90 / 59.50 | 90.93 / 68.04 | 90.97 / 45.56 | 90.74 / 4.41 |
| WideResNet-28-1 | BadNets(A2A) | 92.43 / 79.94 | 86.25 / 2.45 | 75.37 / 15.96 | 86.49 / 25.92 | 91.68 / 1.32 | 91.46 / 1.47 |
| WideResNet-28-1 | Blended | 92.55 / 99.80 | 84.74 / 99.34 | 83.70 / 16.33 | 89.43 / 95.73 | 88.81 / 99.81 | 89.89 / 0.70 |
| WideResNet-28-1 | CLA | 92.39 / 3.49 | 85.49 / 23.51 | 90.89 / 3.32 | 90.93 / 4.49 | 92.29 / 3.52 | 92.39 / 3.49 |
| WideResNet-28-1 | Average | 92.46 / 70.79 | 85.65 / 41.49 | 80.97 / 23.78 | 89.45 / 48.55 | 90.94 / 37.55 | 91.12 / 2.52 |

5%:

| Model | Attack | Backdoored (ACC / ASR) | FP (ACC / ASR) | ANP (ACC / ASR) | EP (ACC / ASR) | CLP (ACC / ASR) | DRR (ACC / ASR) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | BadNets | 94.14 / 99.99 | 87.31 / 100.00 | 87.92 / 1.76 | 93.60 / 2.73 | 93.05 / 1.92 | 93.90 / 1.58 |
| ResNet-18 | BadNets(A2A) | 94.18 / 84.71 | 88.13 / 37.80 | 94.18 / 84.71 | 93.28 / 12.14 | 92.97 / 1.02 | 93.76 / 0.94 |
| ResNet-18 | Blended | 94.81 / 100.00 | 88.42 / 98.11 | 93.23 / 6.03 | 93.50 / 64.18 | 90.00 / 0.18 | 94.01 / 1.73 |
| ResNet-18 | CLA | 94.86 / 89.94 | 88.66 / 28.94 | 90.18 / 50.23 | 92.28 / 3.10 | 91.62 / 1.88 | 94.78 / 0.94 |
| ResNet-18 | Average | 94.50 / 93.66 | 88.13 / 66.21 | 91.38 / 35.68 | 93.17 / 20.54 | 91.91 / 1.25 | 94.11 / 1.30 |
| WideResNet-28-1 | BadNets | 92.22 / 99.99 | 86.41 / 30.49 | 87.38 / 48.26 | 84.29 / 10.30 | 92.14 / 5.92 | 90.14 / 1.51 |
| WideResNet-28-1 | BadNets(A2A) | 92.17 / 91.26 | 84.06 / 74.53 | 84.24 / 77.23 | 91.65 / 1.53 | 91.87 / 1.45 | 92.06 / 1.70 |
| WideResNet-28-1 | Blended | 91.72 / 93.76 | 84.66 / 3.23 | 83.93 / 2.23 | 89.27 / 4.39 | 89.80 / 2.06 | 89.89 / 0.70 |
| WideResNet-28-1 | CLA | 92.87 / 36.23 | 85.49 / 23.51 | 86.43 / 5.09 | 85.15 / 7.87 | 90.23 / 1.90 | 91.44 / 4.63 |
| WideResNet-28-1 | Average | 92.25 / 80.31 | 85.16 / 32.94 | 85.50 / 33.20 | 87.59 / 6.02 | 91.01 / 2.83 | 90.88 / 2.14 |

and new attacks:

| Model | Attack | Backdoored (ACC / ASR) | FP (ACC / ASR) | ANP (ACC / ASR) | EP (ACC / ASR) | CLP (ACC / ASR) | DRR (ACC / ASR) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ResNet-18 | AdaptiveBlend | 94.79 / 100.00 | 89.47 / 5.29 | 82.18 / 0.30 | 94.43 / 1.74 | 93.68 / 33.52 | 90.25 / 3.79 |
| ResNet-18 | SIG | 94.01 / 98.22 | 88.94 / 45.70 | 89.38 / 2.36 | 87.36 / 30.84 | 89.75 / 94.28 | 87.10 / 0.07 |
| ResNet-18 | Smooth | 94.59 / 100 | 87.12 / 100 | 92.65 / 81.23 | 94.24 / 3.99 | 87.24 / 89.03 | 94.03 / 3.58 |
| WideResNet-28-1 | AdaptiveBlend | 92.37 / 100.00 | 84.77 / 51.88 | 82.06 / 42.70 | 90.45 / 5.18 | 84.40 / 74.57 | 91.13 / 0.86 |
| WideResNet-28-1 | SIG | 84.03 / 96.20 | 82.65 / 5.22 | 81.37 / 0.00 | 83.82 / 0.00 | 84.04 / 0.00 | 82.85 / 0.00 |
| WideResNet-28-1 | Smooth | 92.19 / 100 | 84.52 / 6.32 | 89.98 / 100 | 91.45 / 8.78 | 91.29 / 9.03 | 91.88 / 2.74 |

The results of our experiments indicate that the performance of our proposed method is robust across diverse scenarios.

Comment

We thank all reviewers for their insightful feedback on our paper. We are glad that our novelty and algorithm are widely recognized by all reviewers. Most of the concerns focus on the presentation and insufficient experiments, along with some misunderstandings of our method, which we have carefully addressed in our rebuttals. Particularly regarding the experimental section, we have significantly expanded our experimental validation, which is included in the revision of the paper. We hope the newly added rebuttal material can address your concerns.

AC Meta-Review

In this paper, the authors first explored the limitations of pruning-based defenses through theoretical and empirical investigations, and then proposed a Directional Rank Reduction method, a so-called extended neuron pruning framework, to address these limitations.

The authors have addressed the comments from the reviewers. However, after several rounds of discussion with Reviewer x3hw, he/she is still not satisfied with some responses. In particular, the motivation or explanation of the proposed method is questionable. The authors should present the proposed method clearly without creating any possible misunderstanding! Besides, Reviewer niUP raised two concerns to which the authors failed to give satisfying answers: "Modifying the weight matrix may cause a performance drop in many cases. How can your projection keep the performance?" and "The proof needs to be more rigorous. Why use the consequence of the proof in the middle of the proof?"

Why not a higher score

However, after several rounds of discussion with Reviewer x3hw, he/she is still not satisfied with some responses. In particular, the motivation or explanation of the proposed method is questionable. The authors should present the proposed method clearly without creating any possible misunderstanding! Besides, Reviewer niUP raised two concerns to which the authors failed to give satisfying answers: "Modifying the weight matrix may cause a performance drop in many cases. How can your projection keep the performance?" and "The proof needs to be more rigorous. Why use the consequence of the proof in the middle of the proof?"

Why not a lower score

none

Final Decision

Reject