AdvLoRA: Adversarial Low-Rank Adaptation of Vision-Language Models
Abstract
Reviews and Discussion
This paper introduces a parameter-efficient method to enhance the adversarial robustness of VLMs. Traditional adaptation methods like full fine-tuning and LoRA are vulnerable to adversarial attacks, leading to performance drops. AdvLoRA improves robustness by utilizing low-rank adaptation, parameter clustering, and adaptive update strategies, reducing computational costs. Experiments show that AdvLoRA outperforms other methods, especially in adversarial scenarios.
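For context, the low-rank adaptation that the reviews refer to can be sketched as follows. This is a generic LoRA update, not the authors' actual implementation: a frozen weight `W0` is adapted by a trainable low-rank residual `B @ A` scaled by `alpha / r`, so only `r * (d_in + d_out)` parameters are trained. All names and dimensions here are illustrative assumptions.

```python
import numpy as np

# Generic LoRA sketch (illustrative; not the paper's code).
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 6, 2, 4.0

W0 = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection (zero init)

def lora_forward(x):
    """Adapted forward pass: W0 x + (alpha / r) * B A x."""
    return W0 @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialized to zero, the adapted model matches the frozen one,
# so training starts exactly from the pretrained behavior.
assert np.allclose(lora_forward(x), W0 @ x)
```

The zero initialization of `B` is the standard LoRA choice; the reviews below question how AdvLoRA's clustering-based reparameterization replaces or improves this initialization.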
Strengths
- The writing is clear and the formulas are correct.
- The experiments are extensive and multi-dimensional.
- The research topic is important for VLMs.
Weaknesses
- While the method is effective, there is no analysis explaining the necessity of reparameterization.
- The rationale behind using clustering to establish a connection with the parameters in W is insufficiently analyzed.
- The justification for employing an adaptive update parameter is also lacking.
Questions
Please see the Weaknesses section above.
This paper proposes a LoRA-based adversarial training method for vision-language models. Unlike applying LoRA directly, this method improves the efficiency and robustness of adversarial adaptation by designing a novel reparameterization method based on parameter clustering and parameter alignment. Through extensive experiments, the article demonstrates the effectiveness of AdvLoRA.
Strengths
The paper provides a detailed introduction to the method, making it easy to understand.
It also conducts numerous experiments to demonstrate the effectiveness of the approach.
Weaknesses
- In terms of writing, the paper does not appear to use the correct citation format; the ICLR template requires \citep. A thorough review of the paper is therefore necessary to meet the writing standards.
- In lines 177-181, the symbol L is not cross-referenced via \ref.
- It is well known that adversarial training on adversarial samples can degrade model performance, and the description of Table 1 does not make clear which model was trained.
- If I am not mistaken, AdvLoRA seems only to improve the initialization of LoRA, which makes its contribution appear relatively small.
- It is necessary to compare this method with other adversarial training approaches, such as RobustCLIP [1].
[1] Schlarmann, Christian, et al. "Robust CLIP: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models." arXiv preprint arXiv:2402.12336 (2024).
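For readers unfamiliar with the adversarial training that this comparison concerns, a minimal PGD-style attack loop can be sketched as below. This is an illustrative toy, not the paper's or RobustCLIP's actual code; the loss, model, and hyperparameters are assumptions chosen only to make the mechanics concrete.

```python
import numpy as np

def pgd_perturb(x, grad_fn, eps=0.1, step=0.03, n_steps=5):
    """L_inf PGD sketch: ascend the loss gradient by sign steps,
    clipping the perturbation to an eps-ball around x."""
    delta = np.zeros_like(x)
    for _ in range(n_steps):
        delta += step * np.sign(grad_fn(x + delta))
        delta = np.clip(delta, -eps, eps)
    return x + delta

# Toy loss: squared distance of a linear embedding from a target.
W = np.array([[1.0, -2.0], [0.5, 1.0]])
target = np.array([0.0, 0.0])

def loss(x):
    return 0.5 * np.sum((W @ x - target) ** 2)

def grad(x):
    return W.T @ (W @ x - target)  # analytic gradient of the toy loss

x = np.array([0.3, -0.2])
x_adv = pgd_perturb(x, grad)
assert loss(x_adv) >= loss(x)                    # the attack raises the loss
assert np.max(np.abs(x_adv - x)) <= 0.1 + 1e-12  # stays in the eps-ball
```

In adversarial fine-tuning, such perturbed inputs `x_adv` are fed back into the training loss; the reviews debate how doing this with LoRA-style adapters compares to methods like RobustCLIP.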
Questions
See Weaknesses. I would like to see the authors provide further clarification on the contributions of their work to confirm whether my understanding is correct.
The authors propose a parameter-efficient adversarial adaptation method called AdvLoRA, based on Low-Rank Adaptation. Initially, they investigate and reveal the intrinsic low-rank properties present in adversarial adaptation for vision-language models (VLMs). Unlike LoRA, AdvLoRA enhances the efficiency and robustness of adversarial adaptation through a novel reparameterization method that leverages parameter clustering and alignment. Additionally, an adaptive parameter update strategy is introduced to further enhance robustness. With these innovations, AdvLoRA addresses issues related to model security and excessive resource consumption. Extensive experiments demonstrate the effectiveness and efficiency of AdvLoRA.
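To illustrate the clustering-and-alignment idea this summary describes, one plausible realization is sketched below: cluster the rows of the frozen weight and use the assignments and centroids to initialize the low-rank factors so that `B @ A` is aligned with `W0`. This is a hypothetical reading, not the paper's actual Eqs. 8-12; every function and shape here is an assumption.

```python
import numpy as np

def cluster_init(W0, r, n_iter=20, seed=0):
    """Hypothetical clustering-based low-rank init (plain k-means on rows):
    A holds the r cluster centroids, B the one-hot row assignments, so
    B @ A is a piecewise-constant approximation aligned with W0."""
    rng = np.random.default_rng(seed)
    centroids = W0[rng.choice(W0.shape[0], r, replace=False)]
    for _ in range(n_iter):
        # Assign each row of W0 to its nearest centroid.
        dists = ((W0[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        # Recompute centroids as cluster means (skip empty clusters).
        for k in range(r):
            if (labels == k).any():
                centroids[k] = W0[labels == k].mean(0)
    B = np.eye(r)[labels]  # (d_out, r) one-hot assignments
    A = centroids          # (r, d_in) cluster centers
    return B, A

W0 = np.random.default_rng(1).standard_normal((16, 5))
B, A = cluster_init(W0, r=3)
# Unlike a zero init, B @ A starts as a nontrivial approximation of W0.
assert np.linalg.norm(W0 - B @ A) < np.linalg.norm(W0)
```

Under this reading, "alignment" means the product of the factors approximates the pretrained weight at initialization instead of starting from zero; the reviews below ask why such alignment with W_0 is necessary.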
Strengths
- This paper presents AdvLoRA, a novel parameter-efficient adversarial adaptation method that improves the adversarial robustness of vision-language models (VLMs) through low-rank adaptation, representing an interesting avenue for research.
- The paper presents comparative results across several mainstream datasets.
- The method proposed in this paper is practical and applicable.
Weaknesses
- The comparison between the proposed method and existing adversarial robustness techniques is insufficient, particularly regarding performance across different attack types.
- The paper lacks an analysis of the proposed method's efficiency; the clustering step may be time-consuming in theory.
- Ablation experiments should be a key component of the study, as evaluating the effectiveness of each module of the proposed method is crucial. The current content does not adequately demonstrate the method's effectiveness and lacks a detailed comparative analysis.
- The reparameterization method lacks theoretical support.
Questions
- Why does AB need to be aligned with W_0?
This paper focuses on the adversarial robustness of VLMs during PEFT. The authors improve the efficiency and robustness of adversarial adaptation by designing a reparameterization method based on parameter clustering and parameter alignment.
Strengths
This paper investigates an important problem and shows that the proposed AdvLoRA can improve the adversarial robustness of BLIP-like VLMs in a parameter-efficient manner.
Weaknesses
- The novelty is very limited, since AdvLoRA is proposed by combining adversarial training and LoRA. Also, the proposed parameter clustering is not well motivated.
- The pipeline of AdvLoRA is unclear. I hope the authors can further clarify the purpose of Eqs. 8-12. Are they used for initialization, or updated in each iteration?
- How should the parameter, which is newly introduced compared to the original LoRA, be chosen?
- The authors only investigate BLIP, whereas there are many other VLMs, such as CLIP.
- The citation format should be revised, and there are many typos, such as "Eq. equation" in Algorithm 1.
Questions
- What is the purpose of Eqs. 8-12?
- How should the parameter be chosen?
- Does AdvLoRA work on other types of VLMs?
I have read and agree with the venue's withdrawal policy on behalf of myself and my co-authors.