PaperHub
Score: 6.1/10
Poster · 4 reviewers
Ratings: 4, 3, 3, 3 (min 3, max 4, std 0.4)
ICML 2025

TtBA: Two-third Bridge Approach for Decision-Based Adversarial Attack

OpenReview · PDF
Submitted: 2025-01-14 · Updated: 2025-07-24
TL;DR

We propose a novel Two-third Bridge Attack (TtBA) for decision-based black-box attack.

Abstract

A key challenge in black-box adversarial attacks is the high query complexity in hard-label settings, where only the top-1 predicted label from the target deep model is accessible. In this paper, we propose a novel normal-vector-based method called Two-third Bridge Attack (TtBA). An innovative bridge direction is introduced, which is a weighted combination of the current unit perturbation direction and its unit normal vector, controlled by a weight parameter $k$. We further use binary search to identify $k = k_\text{bridge}$, which has an identical decision boundary as the current direction. Notably, we observe that $k = \tfrac{2}{3} k_\text{bridge}$ yields a near-optimal perturbation direction, ensuring the stealthiness of the attack. In addition, we investigate the critical importance of local optima during the perturbation direction optimization process and propose a simple and effective approach to detect and escape such local optima. Experimental results on MNIST, FASHION-MNIST, CIFAR10, CIFAR100, and ImageNet datasets demonstrate the strong performance and scalability of our approach. Compared to state-of-the-art non-targeted and targeted attack methods, TtBA consistently delivers superior performance across most experimented datasets and deep learning models. Code is available at https://anonymous.4open.science/r/TtBA-6ECF.
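As a quick illustration of the bridge direction described in the abstract, the following sketch combines a unit perturbation direction with its unit normal vector under a weight $k$. This is my own toy rendering, not the authors' code; the value `0.36` for $k_\text{bridge}$ is an assumed example, and the exact weighting in the paper may differ.

```python
import numpy as np

def bridge_direction(d_hat, n_hat, k):
    """Weighted combination of the unit perturbation direction d_hat
    and its unit normal vector n_hat, re-normalized to unit length."""
    d = (1.0 - k) * d_hat + k * n_hat
    return d / np.linalg.norm(d)

# Toy 2-D example; the paper sets k = (2/3) * k_bridge.
d_hat = np.array([1.0, 0.0])
n_hat = np.array([0.0, 1.0])
d = bridge_direction(d_hat, n_hat, (2.0 / 3.0) * 0.36)  # unit vector
```

The re-normalization keeps every candidate direction on the unit sphere, so only the direction (not the step size) is being optimized.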
Keywords
Adversarial Attacks, Black Box Adversarial Attacks, Hard Label Attacks, Decision-Based Attacks, Machine Learning Security, Artificial Intelligence Security, Robustness

Reviews and Discussion

Official Review (Rating: 4)

The paper proposes a novel decision-based black-box attack against image classifiers. The attack is called TtBA and is based upon exploiting the geometry of the decision boundary. It introduces the notion of the $k_\text{bridge}$ metric and discusses how it helps in constructing an efficient adversarial example. The attack is demonstrated to outperform the existing methods on a wide range of datasets and models (undefended and defended) in the untargeted and targeted attack settings.

Questions for Authors

  1. Have you observed any difference when analysing the curve of the decision boundary (e.g., Figure 3) for undefended and adversarially trained models?

  2. Does the sensitivity analysis in Tables 2 and 3 include adversarially robust models? Would you expect some other hyperparameters to work better for them?

Claims and Evidence

The claims made in the paper are supported by empirical observations, formal analysis (Appendix) and extensive experimental results (Section 5 and Appendix).

Methods and Evaluation Criteria

In line 169, the targeted attack is formulated via a target image $x_\text{target}$ with its corresponding label $f(x_\text{target})$, rather than the more typical formulation via a target label $y_\text{target}$, i.e., achieving $f(x) = y_\text{target}$. It is not clear whether evaluating attack efficiency in the targeted setting with this formulation is reasonable.
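The two formulations the reviewer contrasts can be written as simple success predicates. This is a toy stand-in for illustration (the classifier `f` below is hypothetical, not the paper's model):

```python
# Toy "hard-label" classifier: predicts the index of the largest
# feature (hypothetical stand-in for a real deep model).
def f(x):
    return max(range(len(x)), key=lambda i: x[i])

def success_via_target_image(x_adv, x_target):
    """Formulation from line 169 (per the review): succeed when
    x_adv receives the same label as a given target image."""
    return f(x_adv) == f(x_target)

def success_via_target_label(x_adv, y_target):
    """More typical formulation: succeed when x_adv is classified
    as a specified target label."""
    return f(x_adv) == y_target

x_target = [0.1, 0.9]   # classified as label 1
x_adv = [0.2, 0.8]      # also classified as label 1
```

The two predicates coincide once $y_\text{target} = f(x_\text{target})$; the practical difference is that the image-based variant hands the attacker a valid starting point inside the target class.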

Other than that the evaluation methods and criteria seem to make sense for this problem.

Theoretical Claims

There appear to be no proofs in the main part. The derivations in the Appendix were not carefully checked.

Experimental Design and Analyses

It is not entirely clear whether the formulation of a targeted attack is reasonable (see the Methods And Evaluation Criteria Section for details).

Other than that there appear to be no issues.

The adversarial example definition in line 199 does not include clipping the resulting image to [0, 1], probably to save some space in the text. However, in the code provided along with the submission, I checked that the clamping does happen.

Supplementary Material

I have mostly reviewed Appendices A and G and Figure 6.

Relation to Broader Literature

The paper relates to previous analyses of decision boundary curvature and normal-vector-based attacks.

Essential References Not Discussed

There appear to be no missing essential references.

Other Strengths and Weaknesses

Strengths

  1. A novel black-box attack outperforming state of the art, as demonstrated in extensive experiments.
  2. Studying decision-based attacks is important because they are more practical than other types of black-box attacks and can pose a significant threat to safety-critical machine learning applications.

Weaknesses

  1. Figure 1 is a bit overloaded with notation. It would be good to simplify it.

Other Comments or Suggestions

No further comments.

Author Response

We sincerely appreciate the reviewer's meticulous evaluation and valuable comments, which have greatly helped improve our manuscript.

  1. The problem formulation is reasonable for two key reasons. First, in hard-label black-box attacks (e.g., HSJA, TA, CGBA), where only the model's output label is available and gradient information is inaccessible, generating an initial adversarial example for a target label in targeted attacks is nearly impossible. As a result, these methods require a target image $x_\text{target}$ as the initial adversarial example. We will explicitly highlight this methodological distinction in our revision to prevent potential confusion for readers. Second, HSJA, TA, and CGBA all evaluate attack efficiency under this targeted setting in their experiments. To ensure a fair comparison, we adopt the same setting in our experiments.

  2. We agree that the adversarial example definition should explicitly include clipping to [0,1]. This was properly implemented in our code but inadvertently omitted from the text. We will add this clarification in the revision. We sincerely thank the reviewer for carefully checking our code.

  3. Following your valuable advice, we will carefully simplify Figure 1 in the revised paper.

  4. Agreeing with the reviewer, in Figures 2 and 3, we observe that models without defense (i.e., undefended models) usually have decision boundaries with low curvature and large $k_\text{bridge}$, while robust models normally have high curvature and small $k_\text{bridge}$. The table below shows the relationship between average $k_\text{bridge}$ and average $\ell_2$ distortion for successful attacks on the CIFAR100 and CIFAR10 datasets across undefended ViT, CNN, and defended WRN models. It demonstrates that robust WRN models generally have higher average $\ell_2$ distortion and lower $k_\text{bridge}$, suggesting that adversarially trained models are associated with high decision boundary curvature.

| Model (Dataset) | ViT (CIFAR100) | WRN (CIFAR100) | CNN (CIFAR10) | WRN (CIFAR10) |
|---|---|---|---|---|
| AVG ℓ₂ | 0.779 | 0.991 | 0.180 | 1.198 |
| AVG k_bridge | 0.370 | 0.350 | 0.362 | 0.341 |
  5. The sensitivity analysis in Tables 2 and 3 does not include adversarially robust models. Here we add a sensitivity analysis on robust models. By adjusting the hyperparameters of TtBA, it can certainly yield better performance for robust models, as demonstrated in the following table. Specifically, we modify the setting of $k = \check{b} \cdot k_\text{bridge}$ by varying the default value of $\check{b} = 2/3$ across {0.55, 0.575, 0.60, 0.625, 0.65, 2/3, 0.70}, and evaluate the AUC of two WRN models on the CIFAR-100 and TinyImageNet datasets. The results demonstrate that, for robust models, the setting $\check{b} = 0.625$ achieves the best performance in 3 out of 4 experiments, clearly surpassing the $\check{b} = 2/3$ setting. This difference likely arises because robust models can effectively conceal gradient information, causing normal vector estimation to become less reliable. Consequently, assigning a smaller weight to the normal vector can enhance the effectiveness of perturbation optimization.
| Dataset (model) | Attack type | b̌=0.55 | b̌=0.575 | b̌=0.60 | b̌=0.625 | b̌=0.65 | b̌=2/3 | b̌=0.70 |
|---|---|---|---|---|---|---|---|---|
| CIFAR100 (WRN) | Non-targeted | 8763.6 | 8790.2 | 8657.4 | 8605.4 | 8681.8 | 8784.6 | 8816.2 |
| CIFAR100 (WRN) | Targeted | 22786 | 22288 | 21977 | 20799 | 22806 | 22973 | 23172 |
| TinyImageNet (WRN) | Non-targeted | 31864 | 31230 | 30898 | 29437 | 29978 | 30026 | 30569 |
| TinyImageNet (WRN) | Targeted | 121442 | 120874 | 115260 | 115681 | 116891 | 116976 | 117997 |
Reviewer Comment

I would like to thank the authors for addressing the points raised in my review. I have no further questions and I am keeping my score.

Author Comment

Thank you again for your careful review and for checking our code. Your comment reminded us to clarify the targeted attack formulation and the definition of adversarial examples, which is crucial for eliminating misunderstandings and improving our work.

Official Review (Rating: 3)

This paper introduces a decision-based black-box adversarial attack, termed Two-third Bridge Approach (TtBA), that focuses on optimizing perturbation directions for attack queries by leveraging normal vectors and the bridge direction, to relieve query complexity. This bridge direction is a weighted combination of the current perturbation direction and its normal vector, where the weight parameter is $k$. Through empirical evaluations, the authors show $k = \frac{2}{3} k_\text{bridge}$ offers the optimal directional alignment. With validations on various datasets, TtBA shows improved performance over existing non-targeted and targeted attack methods.

Questions for Authors

N/A

Claims and Evidence

The claims made in the submission are partially supported by empirical evidence. The reviewer's major concern is whether the hypothesis that the decision boundary of DNNs is smooth and locally concave still holds for robust models. Only a robust WideResNet was studied in the experiments; the reviewer finds this to be slightly lacking. Are other adversarial defenses, such as input transformation-based ones and adversarial training techniques tailored for ViTs, relevant in this context?

Methods and Evaluation Criteria

The evaluation setups are reasonable and supportive of this work.

Theoretical Claims

The reviewer briefly went through the proofs in the Appendix and finds that they are supportive of the claims made, but the reviewer did not check the correctness of the proofs.

Experimental Design and Analyses

Experimental designs are valid and supportive of the effectiveness of this proposed method.

Supplementary Material

The reviewer briefly went through the appendix but did not check any proof in detail.

Relation to Broader Literature

The paper makes significant contributions to the broader scientific literature on black-box adversarial attacks, where research on decision-based hard label attacks is essential in theoretical and practical advancements.

Essential References Not Discussed

A few major decision-based attacks introduced in [1-2] are missing in the current experimental comparisons. The authors should discuss the relationship of TtBA to these works and/or explain why they were not included.

[1] Wan, Jie, Jianhao Fu, Lijin Wang, and Ziqi Yang. “BounceAttack: A Query-Efficient Decision-Based Adversarial Attack by Bouncing into the Wild.” In 2024 IEEE Symposium on Security and Privacy (SP), 1270–86, 2024. https://doi.org/10.1109/SP54263.2024.00068.

[2] Park, Jeonghwan, Paul Miller, and Niall McLaughlin. “Hard-Label Based Small Query Black-Box Adversarial Attack.” In 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 3974–83. Waikoloa, HI, USA: IEEE, 2024. https://doi.org/10.1109/WACV57701.2024.00394.

Other Strengths and Weaknesses

Strengths:

  • The authors are to be congratulated on the extensive experiments on all 5 datasets with various common model architectures, including CNNs and ViTs.
  • One of the major concerns, i.e., why 2/3 is the optimal parameter, has been explained in detail and empirically validated (in Figure 4 and Appendix G).

Weaknesses:

  • Apart from the concerns mentioned above, the reviewer finds that the experiments are quite focused on comparing perturbation size ($\ell_2$ distance) under the same query budget. The reviewer was expecting the work to also report the inverse, that is, the reduction in query complexity under the same perturbation budget.

Other Comments or Suggestions

N/A

Author Response

Thank you for your comments.

  1. SOTA studies, including HSJA, TA, and CGBA, strongly support the hypothesis that the decision boundary of DNNs remains smooth and locally concave even for many robustly trained models. This is because the robust training process does not interfere with normal vector estimation. We conduct additional experiments using the Towards Robust Vision Transformer (RVT) defense from [1], which enhances ViT robustness with position-aware attention scaling and patch-wise augmentation. The corresponding decision boundary is plotted in a figure at [decision boundary of robustly trained models](https://anonymous.4open.science/r/TtBA-6ECF/DecisionBoundaryofRobustViT.pdf). In the figure, the decision boundary remains smooth and locally concave.

    However, for input transformation-based defenses such as RandResizePad in [2], the normal vector estimation process can be disrupted. This causes the decision boundary to lose its smoothness, which is shown at abnormal decision boundary of RandResizePad. It is important to note that this issue is not specific to our approach. All normal vector-based attacks, including HSJA, TA, and CGBA, encounter similar challenges and currently lack effective solutions. Since addressing these specific challenges is beyond the primary scope of our study, we did not include experiments on transformation-based defenses in our evaluation.

    To strengthen our revised paper, we will expand the literature review to discuss attack methods specifically designed for input transformation-based defenses and highlight their differences from normal vector-based attacks. Additionally, we will update our future work section to explore potential extensions of our method to handle such defenses effectively.

  2. We thank the reviewer for pointing out important attack methods such as BounceAttack and SQBA, which we have now incorporated in the literature review section.

  • While BounceAttack improves upon HSJA by using orthogonal gradient components and introduces momentum/smooth search mechanisms, it does not address the local optima problem caused by high-curvature decision boundaries, which is the core focus of our work. We would gladly compare with BounceAttack, but the official code is currently inaccessible, preventing us from presenting detailed results.

  • SQBA uses pre-trained surrogate models for gradient estimation and relies on access to the target model's training dataset. In contrast, our decision-based attacks assume no access to the training dataset. Therefore, a direct experimental comparison with SQBA will not be included. Thank you for your understanding!

  3. The reduction in query complexity under the same perturbation budget is shown below. Following the setup of CGBA, we set the query budget to 10,000 and the maximum $\ell_2$ perturbation strength to $\epsilon = 2.5$. We then randomly choose 500 images from ImageNet and compare the Attack Success Rate (ASR) and the average (median) number of queries.
| Attack | Metric | VGG-19 | ResNet-50 | Inception-V3 | ViT-B32 |
|---|---|---|---|---|---|
| HSJA | Query (avg (median)) | 2051.1 (1071.8) | 1833.8 (1209.5) | 2851.1 (2080.1) | 1873.9 (947.5) |
| HSJA | ASR | 61.0% | 38.8% | 57.2% | 59.6% |
| CGBA | Query (avg (median)) | 2500.9 (1528.5) | 3450.7 (2679.0) | 3169.3 (2363.0) | 2447.8 (1797.0) |
| CGBA | ASR | 88.2% | 52.0% | 74.4% | 79.6% |
| TtBA | Query (avg (median)) | 2350.8 (1481.0) | 3546.6 (2754.0) | 3098.8 (2175.0) | 2384.4 (1781.5) |
| TtBA | ASR | 93.2% | 61.8% | 80.0% | 80.4% |
  • The results show that TtBA achieves the highest ASR across all models. HSJA has the lowest average (median) number of queries, but this is due to its much lower ASR. As is well known, some images contain robust features that require more queries to attack. TtBA, with its significantly higher ASR, is able to successfully attack these robust images, thus requiring more queries on average. Meanwhile, with a similar ASR, TtBA outperforms CGBA in terms of average (median) queries. On ResNet-50, TtBA also achieves a significantly higher ASR (61.8%) compared to CGBA (52.0%). We will include these results in our revised paper.

We believe we have satisfactorily addressed all the concerns raised in our rebuttal. If the reviewer agrees, would you please kindly consider adjusting your rating?

[1] Mao, Xiaofeng, et al. "Towards robust vision transformer." In Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition. 2022.

[2] Xie, C., Wang, J., et al. "Mitigating adversarial effects through randomization." arXiv preprint arXiv:1711.01991.

Reviewer Comment

Thanks for the clarifications. I feel most of my concerns are resolved. I will be keeping my score of 3: Weak accept.

Author Comment

Thank you very much for your insightful and constructive comments and for providing essential references. Your suggestions have significantly enhanced our work.

Official Review (Rating: 3)

The paper proposes the TtBA method for decision-based black-box adversarial attacks. It introduces a new bridge direction, a weighted combination of the current direction and its normal vector, controlled by a weight parameter $k$. Experiments on multiple datasets and models show that TtBA outperforms state-of-the-art methods in both targeted and non-targeted attacks.

Questions for Authors

n/a

Claims and Evidence

Yes

Methods and Evaluation Criteria

Yes

Theoretical Claims

Yes

Experimental Design and Analyses

Yes

Supplementary Material

Yes.

Relation to Broader Literature

n/a

Essential References Not Discussed

n/a

Other Strengths and Weaknesses

Strength

  • The proposed method improves the performance of decision-based adversarial attacks.

  • The paper is well written.

Weakness

  • The contribution of this paper is a little weak, from my perspective.

  • Some settings are somewhat empirical and without rational explanation, for example, $k = \tfrac{2}{3} k^i_\text{bridge}$.

Other Comments or Suggestions

n/a

Author Response
  1. Thank you for raising concerns regarding the strength of our contributions. We introduce a fundamentally new and practically valuable metric, $k_\text{bridge}$, specifically designed to quantify decision boundary curvature, a critical but previously unexplored factor in adversarial attacks. This metric reveals vital insights into how geometric properties of decision boundaries directly affect the effectiveness of decision-based attacks. Existing SOTA decision-based attack methods such as HSJA, TA, qFool, GeoDA, QEBA, and CGBA overlook the critical issue of local optima caused by high-curvature boundaries, significantly reducing their attack effectiveness. Addressing this gap, we make three substantial contributions:

    (1) We propose $k_\text{bridge}$, the first quantitative metric in the literature to rigorously measure boundary curvature, enabling systematic analysis and a deeper understanding of adversarial optimization.

    (2) Using insights from $k_\text{bridge}$, we uncover a previously unidentified linear relationship between boundary curvature and optimal perturbation directions. Leveraging this discovery, we develop the TtBA method for highly effective decision-based black-box attacks.

    (3) We identify a low attack efficiency problem caused by high boundary curvature and propose a robust mechanism to detect and escape them, significantly enhancing optimization efficiency and attack success rates.

    Our extensive experiments across multiple widely-used datasets and models clearly demonstrate the substantial practical impact of our contributions, representing a significant advancement in adversarial machine learning. We will further clarify the above discussion in the revised paper.

  2. While the setting $k = \tfrac{2}{3} k_\text{bridge}^{i}$ is empirically motivated, it is also supported by an extensive sensitivity analysis in Appendix G. Specifically, we analyze the sensitivity of our method to different settings of $k$ by varying the default value of $2/3$ across {0.55, 0.60, 0.65, 0.70, 0.75} in Table 2, and similarly adjusting other parameters in Table 3. Our results reveal two key findings: first, the current configuration achieves the best performance in 10 out of 16 experimental scenarios; second, alternative parameter values maintain comparable effectiveness. These results demonstrate that, while $k = 2/3$ represents the most effective choice, TtBA's performance remains robust to parameter variations, ensuring methodological reliability across different configurations.

We believe we have satisfactorily addressed all the concerns raised in our rebuttal. If the reviewer agrees, would you please kindly consider adjusting your rating?

Official Review (Rating: 3)

The manuscript introduces an innovative bridge direction to optimize the adversarial perturbation by linearly combining the current unit perturbation direction with its unit normal vector. Via experimental observation, $k = \tfrac{2}{3} k_\text{bridge}$ can yield a near-optimal perturbation direction. Besides, the paper designs a simple and effective approach to detect and escape local optima, making the proposed method better than the SOTA.

Questions for Authors

Overall, the proposed method is only an improvement of existing techniques, mainly based on HSJA, TA, qFool, GeoDA, QEBA, and CGBA. No brand-new insight is found to contribute to the AI security field. Finding $k = \tfrac{2}{3} k_\text{bridge}$ by experiments matched with some theoretical verification, designing an escape scheme to skip local optima, and so on are not too challenging or very innovative. The performance improvement is not very significant; some results are even below the SOTA.

Claims and Evidence

The novelty claims and theoretical derivations are reasonable.

Methods and Evaluation Criteria

The proposed method and the used criteria, including evaluation datasets, are common and representative.

Theoretical Claims

I checked the theoretical claims, and the corresponding proofs are correct.

Experimental Design and Analyses

The experiments and results are convincing.

Supplementary Material

Supplementary material is helpful.

Relation to Broader Literature

The optimization strategy may be somewhat insightful to other fields.

Essential References Not Discussed

Not enough.

Other Strengths and Weaknesses

Strengths: For targeted attacks, narrow adversarial regions lead to being more easily trapped in local optima.

Weaknesses: Why can $d_\text{bridge}^{i}$ be ensured to have an identical decision boundary as $\hat{d}^{i}$, as shown in Figure 1?

Other Comments or Suggestions

The paper includes certain novelty components, but they seem insufficient to meet the bar of ICML.

Author Response

Thank you for your valuable comments.

  1. We perform a binary search for $k = k_\text{bridge}^{i} \in (0,1]$ to identify $d_k = d_\text{bridge}^{i}$, which has an identical decision boundary as $\hat{d}^{i}$.
  • According to Figure 1, when $k$ is very small, the direction $d_k$ approaches $\hat{d}^{i}$, and the decision boundary of $d_k$ is smaller than that of $\hat{d}^{i}$.

  • When $k = 1$, $d_k = \hat{N}^i$, and its decision boundary is significantly larger than that of $\hat{d}^{i}$. By the intermediate value theorem, there must exist $k \in (0,1]$ such that $d_k$ yields the same decision boundary as $\hat{d}^{i}$. We will clarify the above in the revised paper.
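The binary search the authors describe can be sketched as follows. This is an illustrative stand-in, not the authors' implementation: `dist_fn` is a hypothetical interface for the boundary distance along a direction (in the actual attack it would itself be obtained by querying the hard-label model along the ray), and the toy distance function below is invented purely so the search has a single crossing to find.

```python
import numpy as np

def find_k_bridge(d_hat, n_hat, dist_fn, tol=1e-6):
    """Binary-search k in (0, 1] so that the bridge direction
    d_k = normalize((1 - k) * d_hat + k * n_hat) reaches the
    decision boundary at the same distance as d_hat."""
    target = dist_fn(d_hat)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        k = 0.5 * (lo + hi)
        d_k = (1.0 - k) * d_hat + k * n_hat
        d_k = d_k / np.linalg.norm(d_k)
        if dist_fn(d_k) < target:
            lo = k  # boundary still closer than along d_hat: raise k
        else:
            hi = k  # boundary farther than along d_hat: lower k
    return 0.5 * (lo + hi)

# Toy 2-D check: the distance dips below the target for small
# normal-vector components and exceeds it past 0.3 (made-up curve).
d_hat = np.array([1.0, 0.0])
n_hat = np.array([0.0, 1.0])
dist_fn = lambda d: 1.0 + d[1] * (d[1] - 0.3)
k_bridge = find_k_bridge(d_hat, n_hat, dist_fn)
```

The bisection relies only on the single sign change guaranteed by the intermediate value theorem argument above, not on the distance being monotone in $k$.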

  2. Thank you for raising concerns regarding the "brand-new insight". Our primary innovation lies in identifying and rigorously analyzing the previously unknown relationship between decision boundary curvature and the optimization of adversarial perturbations. Existing SOTA decision-based methods (HSJA, TA, QEBA, and CGBA) have largely overlooked how boundary curvature influences the occurrence of local optima, which seriously impacts optimization efficiency and effectiveness.

    In contrast, our work introduces a novel and practical curvature metric, $k_\text{bridge}$, which provides the first systematic means to quantify and interpret decision boundary geometry. This new understanding allows us to pinpoint precisely why and how adversarial attacks fail under certain geometric conditions, delivering useful insights previously missing from the literature.

    Leveraging this discovery, we developed TtBA, a significantly more effective and efficient decision-based attack method. Further, we introduced a robust mechanism specifically designed to detect and escape local optima induced by boundary curvature, directly addressing an important limitation unexplored by previous studies. In summary, rather than merely improving upon prior methods, our research introduces brand-new conceptual understanding and practical tools with significant contributions to the AI security field. We will further clarify the above discussion in the revised paper.

  3. We acknowledge that the proposed techniques may appear conceptually simple at first glance. However, identifying critical yet overlooked issues, including the presence of local optima due to high-curvature decision boundaries, is far from trivial. Developing practical, intuitive, and effective solutions to address such issues further highlights the strength and novelty of our contributions. Hence, the simplicity of our solution does not diminish its novelty or importance; rather, it underscores the clarity and practical value of our research.

    The novelty of our technical contributions lies precisely in uncovering significant issues that have not received sufficient attention in existing literature. Despite extensive research, prior SOTA decision-based methods have largely ignored how boundary curvature leads to local optimization traps that severely hinder adversarial attacks. Our research uniquely identifies this critical gap and proposes robust, intuitive, and demonstrably effective methods to address it.

    Therefore, while the proposed solutions might seem intuitive after being introduced, we argue that recognizing and formulating these specific problems and subsequently developing simple yet powerful techniques constitute substantial and novel contributions to the AI security community. We will further clarify the above discussion in the revised paper.

  4. Our evaluation is rigorous, covering extensive experiments across five datasets and seven distinct model architectures, representing a comprehensive and highly challenging benchmark. It is noteworthy that consistently surpassing SOTA performance across all tests is exceptionally difficult, which is a challenge similarly faced by recent leading methods such as HSJA, TA, and CGBA. Despite this inherent difficulty, our method achieves substantial performance improvements. Specifically, in Table 1, TtBA clearly outperforms existing SOTA methods in 103 out of 108 experimented scenarios, with the few remaining cases closely matching the best performance.

    Furthermore, for robust models evaluated in Figure 5, we intentionally refrained from fine-tuning parameters to rigorously test our method's robustness and generalization capability. Even under this conservative setting, we demonstrated superior performance in 37 out of 40 cases. Additional targeted parameter tuning can further enhance the effectiveness of TtBA. However, we deliberately emphasized our method's strong general performance and broad applicability across diverse settings, thereby reinforcing the substantial practical value and robustness of our contributions.

We believe we have satisfactorily addressed all the concerns raised in our rebuttal. If the reviewer agrees, would you please kindly consider adjusting your rating?

Reviewer Comment

Thank the authors for the careful feedback. After reading the rebuttals of the authors, I think most of my concerns are addressed. I will be willing to raise my rating.

Author Comment

Thank you very much for your positive and encouraging review. We sincerely appreciate your valuable comments and constructive suggestions, which have helped improve our work.

Final Decision

Reviewers identified some relatively minor issues regarding additional related work, clarification of the threat model and its access to data, and other clarifying questions. These were all addressed during the response period. Still, it is not entirely obvious why this adjusted threat model, which deviates from prior work, is of utmost importance to the broader ML community. Beyond being merely a methodological difference, the work would be a lot more valuable if connected to real-world challenges or considerations reflecting actual use cases of the threat model. As a result, while the work is good and could be interesting, its general value is not totally clear even after reading the discussion, leading to my recommendation of weak accept.