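PaperHub
Rating: 5.0/10
Decision: Poster · 3 reviewers
Individual ratings: 7, 4, 4 (lowest 4, highest 7, standard deviation 1.4)
Confidence: 3.7
Correctness: 2.7 · Contribution: 2.0 · Presentation: 3.3
NeurIPS 2024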

WaveAttack: Asymmetric Frequency Obfuscation-based Backdoor Attacks Against Deep Neural Networks

Submitted: 2024-05-15 · Updated: 2024-11-06

Keywords
backdoor attack

Reviews and Discussion

Review
Rating: 7

This paper introduces a backdoor attack that leverages DWT to create highly stealthy backdoor triggers, named WaveAttack. The attack employs an asymmetric frequency obfuscation technique to improve the impact and effectiveness of these triggers during both training and inference stages.

Strengths

The paper proposes using DWT, with clear reasoning for preferring it over other frequency-based transformations, and the resulting backdoor triggers are highly stealthy. The imperceptibility of the generated triggers is validated across numerous image fidelity metrics.

Weaknesses

The frequency-based transformation, DWT, is the major contribution of the proposed attack. However, the paper only experiments with the Haar wavelet, although dozens of other wavelet variants are available.

Questions

  1. Can the method be adapted to other types of wavelet transforms, and how would this affect the attack's effectiveness and stealthiness?
  2. While the authors explain their rationale for choosing DWT over DCT, there are no experiments demonstrating the superiority of DWT. Can DCT, DFT, or other frequency-based transformations work in the proposed method? Can they be swapped in directly?

Limitations

See weaknesses and questions.

Author Response

Response to Reviewer gVJn

Adaptation to Other Types of Wavelet Transforms

We thank the reviewer for the insightful comments. Our method remains applicable when different wavelets are used in the Discrete Wavelet Transform (DWT). We initially adopted the commonly used Haar wavelet for its simplicity and computational efficiency. We have additionally evaluated the Daubechies wavelet, which has stronger orthogonality, to assess its impact on our method. The experimental results are summarized in the following table. The table shows that the choice of wavelet has a limited influence on our method's performance, indicating that WaveAttack maintains its effectiveness and stealthiness across different wavelet transformations.

WaveAttack Performance with Different Wavelets

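| Wavelet Type | Dataset | IS ↓ | PSNR ↑ | SSIM ↑ | BA ↑ | ASR ↑ |
|---|---|---|---|---|---|---|
| WaveAttack-Haar | CIFAR10 | 0.011 | 47.49 | 0.9979 | 94.55 | 100 |
| | CIFAR100 | 0.005 | 50.12 | 0.9992 | 75.41 | 100 |
| | GTSRB | 0.058 | 40.67 | 0.9877 | 99.30 | 100 |
| WaveAttack-DB | CIFAR10 | 0.007 | 47.53 | 0.9989 | 94.77 | 95.60 |
| | CIFAR100 | 0.005 | 50.32 | 0.9994 | 76.64 | 80.43 |
| | GTSRB | 0.022 | 41.95 | 0.9881 | 98.21 | 99.50 |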
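
To make the idea of swapping wavelets concrete, the following is a minimal sketch of a one-level 1D DWT in plain Python in which the wavelet is just a list of low-pass filter coefficients. It is not the authors' code. The choice of the 4-tap Daubechies filter (often called db2) is an assumption, since the response does not state which Daubechies order was used, and the helper names `high_pass`, `dwt1`, and `energy` as well as the periodic boundary handling are illustrative choices.

```python
import math

SQRT2, SQRT3 = math.sqrt(2.0), math.sqrt(3.0)

# Low-pass (scaling) filters of two orthogonal wavelets.
HAAR = [1 / SQRT2, 1 / SQRT2]
# 4-tap Daubechies filter (assumed here; the Daubechies order is not given in the source).
DB2 = [(1 + SQRT3) / (4 * SQRT2), (3 + SQRT3) / (4 * SQRT2),
       (3 - SQRT3) / (4 * SQRT2), (1 - SQRT3) / (4 * SQRT2)]


def high_pass(low):
    """Build the matching high-pass filter: g[k] = (-1)^k * h[L-1-k]."""
    n = len(low)
    return [(-1) ** k * low[n - 1 - k] for k in range(n)]


def dwt1(signal, low):
    """One DWT level with periodic extension; the signal length must be even."""
    high = high_pass(low)
    n = len(signal)
    approx, detail = [], []
    for start in range(0, n, 2):
        window = [signal[(start + k) % n] for k in range(len(low))]
        approx.append(sum(h * x for h, x in zip(low, window)))
        detail.append(sum(g * x for g, x in zip(high, window)))
    return approx, detail


def energy(values):
    """Sum of squares, used to check that the transform is orthogonal."""
    return sum(v * v for v in values)


# A linear ramp: the same signal analysed with two different wavelets.
ramp = [float(i) for i in range(8)]
results = {name: dwt1(ramp, filt) for name, filt in (("haar", HAAR), ("db2", DB2))}
for name, (approx, detail) in results.items():
    print(name, "detail:", [round(d, 4) for d in detail])
```

Both filters preserve the signal energy, but they distribute it differently: the Haar detail coefficients are nonzero on the ramp, while the 4-tap Daubechies details vanish away from the point where the periodic extension wraps around. Library implementations such as PyWavelets use their own filter ordering and boundary modes, so their coefficients will not match this sketch value for value.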

The Superiority of DWT

We believe DWT cannot be directly swapped for other frequency transformation methods, because the choice of transformation ultimately affects the performance of frequency-domain backdoor attacks. The following table shows the impact of different frequency-domain transformations on backdoor attacks. Compared with the other transformations, i.e., FTrojan (DCT) [1] and Fiba (DFT) [2], the DWT-based trigger generation method in this paper significantly improves the effectiveness and stealthiness of frequency-domain backdoor attacks.

Frequency Domain Conversion Methods Comparison

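| Method | Dataset | IS ↓ | PSNR ↑ | SSIM ↑ | BA ↑ | ASR ↑ |
|---|---|---|---|---|---|---|
| FTrojan (DCT) [1] | CIFAR10 | 0.019 | 44.07 | 0.9976 | 94.29 | 100 |
| | CIFAR100 | 0.0187 | 47.7728 | 0.995 | 75.37 | 100 |
| | GTSRB | 0.089 | 40.44 | 0.9879 | 98.83 | 100 |
| Fiba (DFT) [2] | CIFAR10 | 0.061 | 26.08 | 0.9734 | 93.80 | 75.40 |
| | CIFAR100 | 0.055 | 26.24 | 0.9688 | 74.87 | 80.36 |
| | GTSRB | 0.079 | 23.41 | 0.9130 | 99.12 | 85.18 |
| WaveAttack (DWT) | CIFAR10 | 0.011 | 47.49 | 0.9979 | 94.55 | 100 |
| | CIFAR100 | 0.005 | 50.12 | 0.9992 | 75.41 | 100 |
| | GTSRB | 0.058 | 40.67 | 0.9877 | 99.30 | 100 |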

References

  • [1] Wang T, Yao Y, Xu F, et al. Backdoor attack through frequency domain. ECCV, 2022.
  • [2] Feng Y, Ma B, Zhang J, et al. Fiba: Frequency-injection based backdoor attack in medical image analysis. CVPR, 2022.
Review
Rating: 4

This paper proposes a novel frequency-based backdoor attack method named WaveAttack, which can effectively generate the backdoor residuals for the high-frequency component based on DWT, thus ensuring the high fidelity of poisoned samples.

Strengths

  1. The paper is well-written and well-structured.
  2. Extensive experiments are conducted to validate the attack method.
  3. The performance of the attack is surprising.

Weaknesses

The paper is well-written and includes sufficient experiments. However, I am concerned about the limited novelty of this work, as several frequency-domain backdoor attacks [1-4] have already been proposed. These works leverage different components or ranges of the frequency domain; notably, [4] also uses a high-frequency trigger. While the design is different, the high-level ideas are quite similar.

References:

[1] Feng Y, Ma B, Zhang J, et al. Fiba: Frequency-injection based backdoor attack in medical image analysis[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 20876-20885.
[2] Wang T, Yao Y, Xu F, et al. Backdoor attack through frequency domain[J]. arXiv preprint arXiv:2111.10991, 2021.
[3] Check your other door! Creating backdoor attacks in the frequency domain.
[4] Zeng Y, Park W, Mao Z M, et al. Rethinking the backdoor attacks' triggers: A frequency perspective[C]//Proceedings of the IEEE/CVF international conference on computer vision. 2021: 16473-16481.

Minor Points:

  1. In section 3.2, the authors mention decomposing the image into four components: LL, LH, HL, HH. Since this is a key idea of the paper, it would be beneficial to explain their meanings and possibly show some examples in the Appendix.

  2. In Figure 2, the authors use "encoder" and "decoder," but in the text, they use "generator." It would be better to be consistent with terminology.

  3. Asymmetric frequency obfuscation is an important method in this paper. It would be better to describe/introduce it and its motivation in detail earlier in the paper.
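
As a reading aid for the LL, LH, HL, and HH components mentioned in the first minor point, the following is a minimal sketch of a single-level 2D Haar DWT in plain Python. It is not the authors' implementation: the function names `haar_dwt2` and `haar_idwt2`, the toy 4x4 image, and the hand-picked residual added to HH are illustrative assumptions, and the LH/HL naming convention varies between libraries.

```python
def haar_dwt2(img):
    """Single-level 2D Haar DWT of a list-of-rows image with even height and width."""
    h, w = len(img), len(img[0])
    LL = [[0.0] * (w // 2) for _ in range(h // 2)]
    LH = [[0.0] * (w // 2) for _ in range(h // 2)]
    HL = [[0.0] * (w // 2) for _ in range(h // 2)]
    HH = [[0.0] * (w // 2) for _ in range(h // 2)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]          # top row of the 2x2 block
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom row of the 2x2 block
            LL[i // 2][j // 2] = (a + b + c + d) / 2  # local average (low frequency)
            LH[i // 2][j // 2] = (a + b - c - d) / 2  # top-vs-bottom difference
            HL[i // 2][j // 2] = (a - b + c - d) / 2  # left-vs-right difference
            HH[i // 2][j // 2] = (a - b - c + d) / 2  # diagonal difference (high frequency)
    return LL, LH, HL, HH


def haar_idwt2(LL, LH, HL, HH):
    """Inverse of haar_dwt2: rebuild the image from the four sub-bands."""
    h2, w2 = len(LL), len(LL[0])
    img = [[0.0] * (2 * w2) for _ in range(2 * h2)]
    for i in range(h2):
        for j in range(w2):
            ll, lh, hl, hh = LL[i][j], LH[i][j], HL[i][j], HH[i][j]
            img[2 * i][2 * j] = (ll + lh + hl + hh) / 2
            img[2 * i][2 * j + 1] = (ll + lh - hl - hh) / 2
            img[2 * i + 1][2 * j] = (ll - lh + hl - hh) / 2
            img[2 * i + 1][2 * j + 1] = (ll - lh - hl + hh) / 2
    return img


# Toy grayscale image (hypothetical values).
image = [
    [10, 12, 50, 52],
    [11, 13, 51, 53],
    [90, 90, 20, 24],
    [90, 90, 28, 32],
]
LL, LH, HL, HH = haar_dwt2(image)
restored = haar_idwt2(LL, LH, HL, HH)

# Perturb only the high-frequency HH band (a hand-written stand-in for a trigger residual).
HH_mod = [row[:] for row in HH]
HH_mod[0][0] += 2.0
poisoned = haar_idwt2(LL, LH, HL, HH_mod)
residual = [[poisoned[i][j] - image[i][j] for j in range(4)] for i in range(4)]
```

In this sketch the transform is perfectly invertible, and a change of 2.0 in a single HH coefficient moves four neighbouring pixels by only ±1 while leaving LL untouched, which gives some intuition for why residuals placed in the high-frequency component can be hard to see.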

Questions

Please see above.

Limitations

Yes

Author Response

Response to Reviewer h6uR

We are sincerely grateful for the reviewer's insightful feedback and constructive comments. We offer comprehensive responses to all inquiries and concerns below.

Detailed Descriptions of References

Unlike our paper, which is based on the Discrete Wavelet Transform (DWT), reference [1] introduces a frequency-domain attack method for medical images named Fiba, which employs the Discrete Fourier Transform (DFT). Using its GitHub project (FIBA), we reproduced this method on three image-classification datasets. The table below compares the attack and stealthiness performance of WaveAttack and Fiba. Compared with Fiba, WaveAttack achieves a higher attack success rate (ASR) and higher image fidelity in terms of PSNR, SSIM, and IS.

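| Metric | CIFAR10 (Fiba) | CIFAR10 (WaveAttack) | CIFAR100 (Fiba) | CIFAR100 (WaveAttack) | GTSRB (Fiba) | GTSRB (WaveAttack) |
|---|---|---|---|---|---|---|
| IS ↓ | 0.061 | 0.011 | 0.055 | 0.005 | 0.079 | 0.058 |
| PSNR ↑ | 26.08 | 47.49 | 26.24 | 50.12 | 23.41 | 40.67 |
| SSIM ↑ | 0.9734 | 0.9979 | 0.9688 | 0.9992 | 0.9130 | 0.9877 |
| BA ↑ | 93.80 | 94.55 | 74.87 | 75.41 | 99.12 | 99.30 |
| ASR ↑ | 75.40 | 100 | 80.36 | 100 | 85.18 | 100 |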
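
For readers unfamiliar with the PSNR fidelity metric reported in this comparison, here is a minimal sketch of how it is typically computed between a clean image and its poisoned counterpart. The function name `psnr`, the 8-bit peak value of 255, and the toy images are assumptions for illustration; the thread does not say which implementation or pixel range the authors used.

```python
import math


def psnr(clean, poisoned, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized list-of-rows images."""
    squared_error, count = 0.0, 0
    for row_c, row_p in zip(clean, poisoned):
        for c, p in zip(row_c, row_p):
            squared_error += (c - p) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Toy 2x2 images (hypothetical values).
clean = [[100, 100], [100, 100]]
subtle = [[101, 101], [101, 101]]   # every pixel off by 1
visible = [[110, 110], [110, 110]]  # every pixel off by 10

psnr_subtle = psnr(clean, subtle)
psnr_visible = psnr(clean, visible)
```

A smaller perturbation gives a higher PSNR, which is why the upward arrow marks higher values as better in the table.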

[2] is a frequency-domain attack method named FTrojan that employs the Discrete Cosine Transform (DCT); it is also the SOTA baseline in our paper.

Since [3], like reference [1], is based on DFT and its code has not been released, we did not choose it as a baseline for the experimental results in our responses.

[4] is actually not a backdoor attack method but a frequency-domain backdoor detection method. It proposes a frequency-domain detection metric named BDR (Backdoor Detection Rate) for backdoor attack detection. The attack performance comparison against [4] is in Section 6.3 of our Appendix (Line 539). Table 6 in our paper's appendix shows that under BDR detection, compared with FTrojan (BDR: 78.11%) and the frequency trigger generation method in [4] (BDR: 99.94%), WaveAttack obtains better attack effectiveness and stealthiness (BDR: 5.71%).

Furthermore, we would like to state the following: although many works have contributed to frequency-domain backdoor attacks, WaveAttack is the first attempt to generate backdoor triggers for the high-frequency component obtained through DWT, and the first method to achieve such superior attack performance against three kinds of detection methods (sample-quality-based, latent-space-based, and frequency-domain-based detection methods). By employing our proposed asymmetric frequency obfuscation, WaveAttack not only attains backdoor attack effectiveness but also achieves high stealthiness, in terms of image quality and latent space, against backdoor detection methods.

Minor Revisions

  • Q: Explanation of frequency components in the Appendix.

    • A: Thank you for your insightful feedback. We will add the frequency components of images in the Appendix.
  • Q: Typo errors.

    • A: Thank you for pointing these out. We will fix these errors in the next version of our paper.
  • Q: Motivation of asymmetric frequency obfuscation.

    • A: Thank you for your insightful suggestion. We will further clarify the motivation of our asymmetric frequency obfuscation in the paper.

References

  • [1] Feng Y, Ma B, Zhang J, et al. Fiba: Frequency-injection based backdoor attack in medical image analysis. CVPR, 2022.
  • [2] Wang T, Yao Y, Xu F, et al. Backdoor attack through frequency domain. ECCV, 2022.
  • [3] Check your other door! Creating backdoor attacks in the frequency domain.
  • [4] Zeng Y, Park W, Mao Z M, et al. Rethinking the backdoor attacks' triggers: A frequency perspective. ICCV, 2021.
Comment

Thanks to the authors for the detailed rebuttal. I appreciate the effort and work put into this paper. While the performance does show improvement over previous frequency-based methods, the use of another off-the-shelf algorithm to generate high-frequency components (instead of DCT or Fourier Transformation), though effective, may not fully meet the novelty expectations for a NeurIPS submission. Therefore, I will give at most my current score.

Comment

We sincerely appreciate the reviewer’s enthusiastic and generous responses.

We would like to highlight that our contribution extends beyond merely utilising the DWT method.

Specifically, this paper is the first to propose an asymmetric frequency obfuscation method within DWT-based frequency backdoor attacks. To the best of our knowledge, this introduction of the obfuscation method on DWT enables the frequency-domain-based backdoor attack method to evade the defences of three kinds of detection methods (sample-quality-based detection methods, latent-space-based detection methods, and frequency-domain-based detection methods) simultaneously for the first time.

Review
Rating: 4

This paper investigates backdoor attacks, aiming to improve the fidelity of poisoned samples. A novel frequency-based backdoor attack method named WaveAttack is proposed to generate highly stealthy backdoor triggers. The experiments show that the poisoned images generated by WaveAttack achieve high attack effectiveness and fidelity.

Strengths

The proposed attack creates higher-fidelity poisoned samples through the Discrete Wavelet Transform (DWT) while maintaining the attack success rate.

Weaknesses

[Threat model] The threat model illustrated in Section 3.1 is unclear. In the implementation, WaveAttack requires that the attacker fully controls the training process. Therefore, the statement “They can embed backdoors into the DNNs by poisoning the given training dataset” is ambiguous.

[Stealthiness] I recommend that the authors conduct more experiments to confirm the advantage of WaveAttack in terms of attack stealthiness. As shown in Figure 3, the difference between FTrojan and WaveAttack is negligible.

[Baselines] Most baselines are poisoning-based backdoor attacks, e.g., Adapt-Blend and WaNet. It is not fair to compare WaveAttack with these poisoning-based attacks.

Questions

The threat model is unclear.

The stealthiness of the attack should be further confirmed.

More comparable baselines should be included, such as LIRA [1].

[1] LIRA: Learnable, Imperceptible and Robust Backdoor Attacks. ICCV, 2021.

Limitations

The authors adequately discuss the potential security issues related to the proposed backdoor attack. Also, the limitations of the proposed method in terms of computing cost have been mentioned.

Author Response

Response to Reviewer vbLQ

We sincerely appreciate the reviewer's valuable feedback and insightful comments on our paper. We have carefully considered each issue raised and provided detailed responses to all questions and concerns below.

Threat Model

Thank you for your feedback on the threat model. Similar to the configurations in LIRA [1] and Adapt-Blend, the threat model of WaveAttack indeed assumes that attackers have significant control during the training procedure. Meanwhile, to demonstrate the superiority of WaveAttack, we will include the LIRA method in our experimental results.

Stealthiness

Thank you for your insightful comments. We acknowledge that the differences between FTrojan and WaveAttack are not particularly significant to the human eye; the residual images (Figure 3, Line 283) must be slightly magnified before significant differences between the two poisoned images become visible. However, this only applies to sample-quality-based backdoor detection by human visual inspection.

Currently, backdoor detection algorithms are mainly divided into three categories:

  • Sample-quality-based detection methods: As shown in Table 3 (Line 289), the differences between FTrojan and WaveAttack can be detected in terms of PSNR, SSIM, and IS. Thus, compared with the state-of-the-art (SOTA) attack methods considered in our paper, WaveAttack achieves superior poisoned image quality.
  • Latent-space-based detection methods: As illustrated in Figure 4 (Line 290), latent-space-based detection methods can effectively detect FTrojan but not WaveAttack.
  • Frequency-domain-based detection methods: These include various frequency-domain filtering methods and BDR (Backdoor Detection Rate) detection [2], as shown in Tables 4 (Line 365) and 6 (Line 544). WaveAttack has a higher ASR and BA with a lower BDR than FTrojan. This means that WaveAttack obtains better attack effectiveness and stealthiness against frequency-domain-based detection methods than FTrojan.

In summary, compared to FTrojan (the SOTA frequency-domain backdoor attack method), WaveAttack has superior attack performance against all three kinds of detection methods in terms of effectiveness, stealthiness, and fidelity.

Baselines

Attack Performance and Stealthiness Comparison

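| Metric | CIFAR10 (LIRA) | CIFAR10 (WaveAttack) | CIFAR100 (LIRA) | CIFAR100 (WaveAttack) | GTSRB (LIRA) | GTSRB (WaveAttack) |
|---|---|---|---|---|---|---|
| IS ↓ | 0.019 | 0.011 | 0.018 | 0.005 | 0.089 | 0.058 |
| PSNR ↑ | 46.77 | 47.49 | 47.77 | 50.12 | 40.44 | 40.67 |
| SSIM ↑ | 0.9979 | 0.9979 | 0.9995 | 0.9992 | 0.9879 | 0.9877 |
| BA ↑ | 93.57 | 94.55 | 73.09 | 75.41 | 10.74 | 99.30 |
| ASR ↑ | 99.96 | 100 | 99.98 | 100 | 99.03 | 100 |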

Using the BackdoorBox and LIRA implementations, we have added the LIRA backdoor attack as a baseline for a more complete comparison. The comparison results are shown in the table above. The table shows that, compared with LIRA, WaveAttack still achieves the best attack effectiveness and fidelity of the poisoned images.
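
The BA and ASR columns in these comparisons can be read with the following minimal sketch, which computes both from lists of predictions. It is only an illustration: the function names, the toy labels, and the convention of excluding samples whose true class already equals the target when computing ASR are assumptions, since the thread does not spell out the authors' exact evaluation protocol.

```python
def benign_accuracy(clean_preds, labels):
    """BA: percentage of clean test samples that the backdoored model classifies correctly."""
    correct = sum(p == y for p, y in zip(clean_preds, labels))
    return 100.0 * correct / len(labels)


def attack_success_rate(poisoned_preds, labels, target):
    """ASR: percentage of triggered samples classified as the attacker's target class.

    Samples whose true label is already the target are skipped (assumed convention).
    """
    hits = [p == target for p, y in zip(poisoned_preds, labels) if y != target]
    return 100.0 * sum(hits) / len(hits)


# Toy example with four classes and target class 0 (hypothetical values).
labels = [0, 1, 2, 3, 0, 1, 2, 3]
clean_preds = [0, 1, 2, 3, 0, 1, 2, 0]     # predictions on clean inputs
poisoned_preds = [0, 0, 0, 0, 0, 0, 0, 3]  # predictions on inputs carrying the trigger

ba = benign_accuracy(clean_preds, labels)
asr = attack_success_rate(poisoned_preds, labels, target=0)
```

A stealthy and effective attack aims for a high value of both: high BA means the backdoor does not degrade normal behaviour, and high ASR means the trigger reliably redirects predictions to the target class.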

References

  • [1] LIRA: Learnable, Imperceptible and Robust Backdoor Attacks. Proceedings of ICCV, 2021.
  • [2] Zeng Y, Park W, Mao Z M, et al. Rethinking the backdoor attacks' triggers: A frequency perspective. Proceedings of ICCV, 2021.
Comment

I thank the authors for their response. However, I am still concerned about the slight improvement over the previous SOTA methods, and about the novelty of this paper, as raised by Reviewer h6uR. Therefore, I tend to keep my initial score.

Comment

Dear reviewer,

Thanks for your reply. We still want to clarify two things in our discussion.

Significant Improvement Over State-of-the-art Methods

Compared to the state-of-the-art methods, LIRA and FTrojan, the performance improvement of WaveAttack is significant. For instance, WaveAttack maintains excellent BA performance of backdoored DNNs while achieving 100% ASR on all three datasets, whereas LIRA's BA performance of backdoored DNNs on GTSRB drops to only 10.47%. Moreover, neither of these attacks accounts for latent-space-based detection methods, which leaves both FTrojan and LIRA fully exposed to such defences.

Contributions

Regarding the novelty of this paper, we would like to emphasize that we are the first to propose an asymmetric frequency obfuscation method within DWT-based frequency backdoor attacks. To the best of our knowledge, this obfuscation method allows frequency-domain-based backdoor attacks to evade detection by three different types of methods (sample-quality-based, latent-space-based, and frequency-domain-based detection methods) simultaneously for the first time.

Thank you very much for your review. If you have any further comments or questions, please feel free to contact us.

Best regards,

Authors

Final Decision

This paper was reviewed by three reviewers. It received mixed recommendations, with 1 Accept and 2 Borderline Rejects. The authors addressed the reviewers' concerns in the rebuttal, but the reviewers decided to keep their scores unchanged. The paper was generally applauded for its good writing, extensive experiments, and strong results. However, Reviewers vbLQ and h6uR kept their Borderline Reject decision, considering the improvement over SOTA methods small and the novelty weak.

The ACs thoroughly checked the paper, the review, and the discussion. We agree with the authors that the proposed technique has a considerable improvement over previous approaches in terms of stealthiness in the image (Table 3) and latent space (Fig. 4). We also agree on the claimed contributions and believe the novelty of the paper is significant. Hence, the decision is to recommend the paper for acceptance to NeurIPS 2024. The authors are encouraged to include the discussions from the rebuttal to the final camera-ready version of the paper. We congratulate the authors on the acceptance of their paper!