Perturb a Model, Not an Image: Towards Robust Privacy Protection via Anti-Personalized Diffusion Models
We prevent unauthorized personalization of diffusion models at the model level.
Abstract
Reviews and Discussion
This work proposes Anti-Personalized Diffusion Models (APDM) to hinder the unauthorized personalization of specific subjects in diffusion models, addressing privacy risks arising from high-quality subject synthesis. Unlike prior methods that generate adversarially perturbed samples but fail under simple transformations or with a few clean images, APDM shifts the protection target to the diffusion model itself. The authors first theoretically demonstrate that existing loss functions are inherently incapable of ensuring convergence for robust anti-personalization in diffusion models. To address this, they introduce Direct Protective Optimization (DPO), a novel loss function that effectively disrupts subject personalization without degrading generative quality. Furthermore, they propose a Learning to Protect (L2P) dual-path optimization strategy, which alternates between personalization and protection paths to simulate future personalization trajectories and adaptively reinforce protection. Experiments show that the proposed framework achieves state-of-the-art performance in preventing unauthorized personalization while maintaining generation quality.
Strengths and Weaknesses
Strengths
- The proposed approach demonstrates strong effectiveness in mitigating unauthorized personalization.
- The experimental evaluation is thorough. The paper includes both comparative studies with existing baselines and detailed ablation studies.
- Figure 1 is particularly well-designed. It clearly illustrates how personalized models can lead to privacy breaches by revealing sensitive identity information.
Weaknesses
1.Limited Novelty: The novelty of this paper may be insufficient, as the idea of using model fine-tuning for anti-personalization has already been explored in the community. The work does not appear to be the first to propose using new model parameters to achieve robust anti-personalization. For example, see: Liu X, Jia X, Xun Y, et al. PersGuard: Preventing Malicious Personalization via Backdoor Attacks on Pre-trained Text-to-Image Diffusion Models. arXiv preprint arXiv:2502.16167, 2025.
2.Lack of Clarity in the Anti-Personalization Process: The paper lacks a clear explanation of the anti-personalization process. Specifically, would a malicious personalization user continue to use DreamBooth to personalize the Anti-Personalized Diffusion Model (APDM)? If the personalized training involves images from the negative samples (as in Equation 5), I am concerned that the protection efficacy may be significantly compromised. Furthermore, this degradation could worsen with more fine-tuning epochs.
3.Missing Threat Model and Assumptions: The methodology section does not clearly state the assumptions regarding the knowledge and capabilities of both malicious personalization users and anti-personalization users. Under what conditions does the defender have control over the model? Is this assumption realistic in practical scenarios? The authors are encouraged to refer to the problem definition section in Anti-DreamBooth to better articulate the necessary assumptions.
4.Dependency on Complete Knowledge of Protected Images: Building on the previous point, does the anti-personalization user need to have full access to all the protected images to construct negative samples? If only partial knowledge (e.g., a subset of images) is available, will the effectiveness of protection degrade significantly?
5.The theoretical justification provided in the paper appears trivial and redundant. Based on Equations 5 and 8, the gradient directions of Lper and Lprotect are nearly opposite, so it is naturally impossible to optimize them simultaneously. This result seems self-evident and may not require formal proof.
6.Strong Assumption on Identifier Consistency: The paper assumes that the unique identifier (e.g., "person") and the generic prompt such as "a photo of [V*] person" are shared between the personalization and anti-personalization users. However, this is a strong assumption. In practice, defenders are unlikely to know the exact identifier used by downstream users, who may instead use terms like "girl", "woman", or "guy". If the identifiers are inconsistent, the protective effect could be significantly weakened.
7.Need for Visual Comparison in Table 2: The protection effectiveness shown in Table 2 would benefit from visual comparisons. Since there are still some noticeable differences in quantitative metrics between APDM and the standard diffusion model (SD), visual examples could help illustrate these differences more intuitively.
8.Scalability to Multiple Identities: Can APDM protect multiple identities within the same model? Deploying a separate model for each protected identity is impractical. The paper should clarify whether APDM can scale to handle multiple target identities simultaneously.
Questions
- Is the novelty sufficient? The idea of using model fine-tuning for anti-personalization is not new, with prior works (e.g., PersGuard) exploring similar directions. How does this work differentiate itself in a meaningful way?
- What are the threat model and assumptions? The paper lacks a clear definition of the attacker and defender capabilities, the conditions under which the defender controls the model, and whether these assumptions are realistic in practice.
- Does the method require full knowledge of protected images and consistent identifiers? If the defender does not have access to all protected images or if the attacker uses different identifiers during personalization, will the effectiveness of APDM degrade significantly?
- Can APDM scale to protect multiple identities within a single model? The practicality of deploying APDM depends on whether it can handle multiple identities simultaneously, as maintaining separate models for each identity is not feasible.
Limitations
Yes
Final Justification
The authors have resolved my concerns, so I have decided to raise my score to 4.
Formatting Issues
NA
We appreciate Reviewer y2uu for your constructive feedback. We hope our response effectively addresses your concerns and questions.
W1: About novelty
As discussed in the manuscript, our work is motivated by a key limitation of existing data-poisoning approaches: they require access to all of a subject's images to be effective. This is often impractical, as controlling every image of a subject on the internet is impossible. As shown in Table 1 of our main paper, the presence of even a single unprotected image can cause the protection to fail.
PersGuard operates under the same assumption of requiring full data access for its attack, thus sharing the same critical limitation as previous data-poisoning approaches. Furthermore, the authors of PersGuard note that their method's effectiveness is reduced by simple prompt variations (their "Gray-box setting"), whereas APDM demonstrates robustness not only to prompt variations but also to the presence of clean images and even entirely unseen images (as shown in Tab. R1 in our response W2).
W2 & W4: Clarification of anti-personalization process and results on the unseen data
(Process Clarification) Our work is grounded in a practical scenario where a service provider offers a diffusion-based personalization service to users. Within this system, a malicious user would naturally attempt to personalize our protected APDM using the personalization pipeline provided by the service, which we model using the powerful and widely adopted DreamBooth. In addition, we also conducted experiments with another personalization method, Custom Diffusion, and report the results in Table 3 of the supplementary.
(Robustness to Unseen Data) In our paper, the images used for personalization are the same as those used in our protection process, i.e., the negative samples in DPO. However, APDM also remains robust regardless of the number of unseen images. As shown in Tab. R1, APDM still prevents personalization with 4-12 unseen images that contain the protected target but were not seen during training.
Table R1. Personalization with unseen data.
| Method | # of unseen | DINO(↓) | BRISQUE(↑) |
|---|---|---|---|
| DreamBooth | - | 0.6869 | 16.69 |
| APDM | - | 0.1375 | 40.25 |
| APDM | 4 | 0.1616 | 38.14 |
| APDM | 8 | 0.1994 | 38.87 |
| APDM | 12 | 0.1873 | 38.87 |
(Robustness to Extended Fine-Tuning) Additionally, we test APDM on a larger number of personalization iterations (Tab. R2). As seen below, we found that APDM is still robust to prolonged fine-tuning.
Table R2. Personalization with larger iterations.
| Method | # iters | DINO(↓) | BRISQUE(↑) |
|---|---|---|---|
| APDM (paper) | 800 | 0.1375 | 40.25 |
| APDM | 1600 | 0.1671 | 39.85 |
W3: Threat Model and Assumptions
(Knowledge and Capabilities of Users)
- Malicious User: We assume a strong adversary. They can attempt personalization using either clean images (if leaked/off‑platform) or perturbed images after bypass transforms (e.g., flip, blur). During inference, they have full freedom to vary prompts and can use any identifier (e.g., t@t), not just the default sks token.
- Protector (Service Provider): The provider's capability is demonstrated by applying protection effectively across various models (e.g., SD v1.5, SD v2.1) and against multiple personalization methods (e.g., DreamBooth, Custom Diffusion). The negative samples used for protection are proprietary to the provider and are not exposed to any user.
(Defender's Control Conditions) The core condition under which the defender has control is a provider-controlled deployment. In this model, a user requests protection for their identity, and the provider updates its model to a protected version. The provider maintains full control over the model's lifecycle.
(Realism of this Assumption) We argue that this assumption is highly realistic in many practical scenarios. Service providers inherently control their own models and deployment environments. They are responsible for the service they offer, but not for user activity outside their platform. This setting is a practical expansion of the "uncontrolled" setting considered in Anti-DreamBooth.
Furthermore, in the rebuttal, we also add studies on variant class names, multi‑subject protection, and unseen‑data protection; these results highlight APDM’s robustness, and we will include them in the final version.
W5: About theoretical analysis
The goal of our theoretical justification is to provide a formal, rigorous proof that moves beyond intuition, and we would like to elaborate on its non-trivial nature. The reviewer's intuition seems to be that, because the gradients of L_per and L_protect point in nearly opposite directions, their joint optimization must naturally fail. However, this high-level view oversimplifies the problem and overlooks the core of our analysis.
The convergence behavior of L_protect is not self-evident for a critical reason: L_protect is not simply the negative of L_per. The key difference lies in the shared perceptual loss term. The presence of this positive term in L_protect creates a complex optimization dynamic: it acts as a regularizer pulling the optimization in a direction different from that of the reversed reconstruction term. Therefore, it is not immediately obvious whether the optimization will:
- fail to converge and oscillate,
- converge to a trivial solution where one of the two terms dominates, or
- find some other unintended local minimum.
Our proof provides the answer to this ambiguity. It is not trivial because it mathematically analyzes this internal conflict. We prove that the necessary conditions for convergence (derived in Proposition 1) lead to mutually contradictory requirements on the gradients (Theorem 1, Eqs. 11-12). This specific mathematical deadlock is the non-obvious reason for its failure. Furthermore, this naive approach is a well-motivated and relevant baseline. Similar gradient ascent techniques are widely adopted in fields like privacy protection to remove specific knowledge from models. Our analysis is therefore valuable as it provides a formal explanation for the limitations of such intuitive approaches.
Thus, our analysis is essential. It moves beyond the simplistic "it must fail" intuition to provide a formal, rigorous proof of why and how it fails, justifying the need for a more principled approach like ours.
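To make the conflict concrete, consider the following simplified sketch (an illustration only, using generic symbols: L_rec for the reconstruction term and L_sh for the shared positive term; it is not a restatement of Eqs. 5, 8, or Theorem 1):

```latex
% Simplified illustration only: generic symbols, not the paper's exact Eqs. 5/8.
\begin{aligned}
\mathcal{L}_{\mathrm{per}}(\theta)   &= \mathcal{L}_{\mathrm{rec}}(\theta) + \lambda\,\mathcal{L}_{\mathrm{sh}}(\theta),
&\qquad
\nabla_\theta \mathcal{L}_{\mathrm{per}} = 0
  &\;\Longleftrightarrow\; \lambda\,\nabla_\theta \mathcal{L}_{\mathrm{sh}} = -\nabla_\theta \mathcal{L}_{\mathrm{rec}},\\
\mathcal{L}_{\mathrm{naive}}(\theta) &= -\mathcal{L}_{\mathrm{rec}}(\theta) + \lambda\,\mathcal{L}_{\mathrm{sh}}(\theta),
&\qquad
\nabla_\theta \mathcal{L}_{\mathrm{naive}} = 0
  &\;\Longleftrightarrow\; \lambda\,\nabla_\theta \mathcal{L}_{\mathrm{sh}} = \nabla_\theta \mathcal{L}_{\mathrm{rec}}.
\end{aligned}
```

The two stationarity conditions are compatible only when $\nabla_\theta \mathcal{L}_{\mathrm{rec}} = \nabla_\theta \mathcal{L}_{\mathrm{sh}} = 0$ simultaneously, which illustrates, in miniature, the mutually contradictory gradient requirements that Theorem 1 establishes rigorously for the actual losses.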
W6: Protection when the identifier (class name) changes
Table R3. Variation of identifier (class name).
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.6869 | 16.69 |
| APDM ("person") | 0.1375 | 40.25 |
| APDM ("(wo)man") | 0.1084 | 38.93 |
We investigate the effectiveness of APDM when the identifier (class name) differs between protection and personalization. Note that, for APDM, the identifier "person" is used during protection. As shown in Tab. R3, APDM is robust to identifier changes (third row), as well as to changes in the unique identifier (Table 5 in the supplementary).
W7: Additional qualitative results
Due to the policy, we cannot attach a visual comparison in this rebuttal. Nevertheless, we argue that APDM can still produce high-quality and text-aligned images compared to the pre-trained Stable Diffusion. We will attach a visual comparison in the camera-ready version.
W8: Multiple identities scenario
We completely agree that deploying a separate model for each identity is impractical, and a multi-identity solution is vital for any real-world application. While we designated this as a key area for future work (as mentioned in Supplementary G), the reviewer's comment motivated us to conduct a new preliminary experiment. Our goal was to assess the inherent robustness of our current, unmodified APDM in a multi-identity scenario. To this end, we constructed a challenging negative set with four distinct subjects (cat, sneaker, glasses, clock) and applied our method without any tailoring. For comparison, we also present the results of PAP, for which the evaluation was performed on the perturbed data (the "# of clean images: " setting in our main paper).
Table R4. Multi-subject protection.
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.5833 | 20.32 |
| PAP | 0.5699 | 25.72 |
| APDM | 0.3924 | 32.97 |
These results highlight two critical points regarding scalability:
(Effective Multi-Identity Protection) Despite the increased complexity of protecting four subjects simultaneously, APDM still outperforms the comparison. The resulting DINO score of 0.3924 is significantly better than the unprotected baseline (0.5833) or PAP (0.5699), demonstrating that our core protective mechanism is not limited to a single identity.
(Inherent Robustness and Scalability) Crucially, this performance was achieved without any specific modifications to our algorithm and with the same number of training steps (i.e., the same training time). This showcases the inherent robustness of our approach, suggesting it is not a fragile, single-purpose solution but a generalizable framework with the potential to scale.
In summary, while a fully-optimized multi-identity model remains an important future direction, this experiment confirms that APDM's fundamental design is not inherently limited to a single identity. It directly addresses the reviewer's concern about practicality by demonstrating the robustness and scalability of our core method. This provides a strong foundation for the future work we have planned.
The authors have resolved my concerns, so I have decided to raise my score to 4. Good luck.
Dear reviewer y2uu,
We sincerely appreciate your positive evaluation of our responses. We are pleased to hear that our responses have been satisfactory.
Thank you again!
This paper proposes APDM to prevent the model from being personalized, thereby protecting the privacy of personal images. The optimization of APDM is conducted by the introduced Learning to Protect process, which iteratively optimizes the parameters of the protected diffusion model. During protection, the Direct Protective Optimization loss is applied to maximize the distance between the original model and the protecting model on negative samples while minimizing it on positive samples. The DPO loss is combined with a ppl loss to preserve the general performance of diffusion models, constructing the protection loss used to update the model parameters. Results show that APDM cannot be personalized with the protected identity, even when all training images are clean, whereas protection methods that perturb images fail to provide successful protection.
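For concreteness, the alternation described above can be sketched roughly as follows (a hypothetical Python sketch inferred only from this summary; all names, signatures, and loss forms are illustrative assumptions, not the authors' implementation):

```python
# Hypothetical sketch of the dual-path idea summarized above (alternating a
# personalization path and a protection path); all names, signatures, and loss
# forms are illustrative assumptions, not the authors' implementation.
import torch

def dpo_loss(protected, frozen, x_neg, x_pos):
    """Push the protected model away from the frozen original on negative
    (protected-subject) samples while keeping it close on positive samples."""
    with torch.no_grad():                                # the original model stays fixed
        ref_neg, ref_pos = frozen(x_neg), frozen(x_pos)
    d_neg = torch.dist(protected(x_neg), ref_neg)        # distance to be maximized
    d_pos = torch.dist(protected(x_pos), ref_pos)        # distance to be minimized
    return -d_neg + d_pos

def l2p_step(protected, frozen, simulated, x_neg, x_pos, opt_sim, opt_protect, lam=1.0):
    # Personalization path: simulate a future attacker fine-tuning a copy on the
    # protected subject (a plain reconstruction loss stands in for the actual
    # diffusion denoising objective).
    per_loss = ((simulated(x_neg) - x_neg) ** 2).mean()
    opt_sim.zero_grad()
    per_loss.backward()
    opt_sim.step()

    # Protection path: DPO combined with a preservation ("ppl") term on positives.
    with torch.no_grad():
        ref_pos = frozen(x_pos)
    ppl_loss = ((protected(x_pos) - ref_pos) ** 2).mean()
    protect_loss = dpo_loss(protected, frozen, x_neg, x_pos) + lam * ppl_loss
    opt_protect.zero_grad()
    protect_loss.backward()
    opt_protect.step()
```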
Strengths and Weaknesses
Strengths
- The paper provides a novel solution to protect the privacy of personal images, which could give new insights into the privacy/copyright protection issues, as the previous perturbation-based protection lacks practicality in real-world scenarios.
- The results of the protection performance are strong. APDM is capable of preventing personalization with all clean images while almost retaining the original performance.
Weakness
- The robustness of APDM is not explored. The paper assumes 4-6 images like DreamBooth, but determined adversaries might use 100+. Is the protection still effective under these vastly larger subject sets?
- The assessment of preserving personalization capabilities focuses on different subjects; it is unclear whether the model can still personalize similar subjects of the same category, e.g., when the protected subject and the personalization subject are both male persons but have different identities.
Questions
See weaknesses.
Limitations
yes
Formatting Issues
N/A
We appreciate Reviewer gqhK for the positive feedback. We are glad the reviewer recognized both the novelty of our practical approach and the strength of its empirical results. We address gqhK's comments and questions below.
W1: Using 100+ images for personalization
To directly address this concern, we conducted an additional experiment using a much larger dataset of 100 images for personalization. For this experiment, we manually collected a dataset of 100 images of a public figure from various online sources. We then evaluated APDM against DreamBooth and PAP in this setting. The results for PAP were generated using all 100 perturbed images, which corresponds to the most favorable configuration for PAP (the "# of clean images: " setting in our main paper).
Table R1. Personalization with 100 images.
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.7983 | 1.06 |
| PAP | 0.7811 | 5.96 |
| APDM | 0.3747 | 35.01 |
The results clearly demonstrate that APDM remains highly effective even in this more challenging "100-image" scenario. APDM's DINO score (0.3747) indicates significantly stronger protection compared to both unprotected DreamBooth (0.7983) and PAP (0.7811). Furthermore, the performance gap over PAP is even more pronounced in the BRISQUE (35.01 vs. 5.96). This confirms the superior robustness of our method.
[1]: To mitigate potential privacy and copyright concerns, this collected set was used exclusively for this rebuttal experiment. The data was handled securely and has been permanently deleted after use.
W2: Personalization on the same categories
To answer this question, we present results for personalizing subjects from the same category (person → person). In this experiment, we protected one female subject and personalized three other distinct female subjects (ID1-3). We report quantitative results following the evaluation protocol of DreamBooth.
(The training ID and ID1-3 are all white female subjects, but their identities are different.)
Table R2. Personalization on the same subject.
| Method | DINO-ID1 | DINO-ID2 | DINO-ID3 | CLIP-ID1 | CLIP-ID2 | CLIP-ID3 |
|---|---|---|---|---|---|---|
| DreamBooth | 0.6016 | 0.6471 | 0.6672 | 0.6602 | 0.6919 | 0.6611 |
| APDM | 0.6303 | 0.6383 | 0.6877 | 0.7134 | 0.6943 | 0.6460 |
As shown in Tab. R2, APDM effectively maintains high personalization performance even for subjects within the same category (female → female); the DINO and CLIP scores are comparable. This demonstrates that our method successfully preserves the model's utility for personalizing new, similar subjects while protecting the target identity.
Thanks to the authors for the detailed responses and extended results. The authors have addressed some of my concerns, and I will keep my score, as it already indicates acceptance.
Dear reviewer gqhK,
We are grateful for your considerate follow-up and for reviewing the rebuttals. We are glad to hear that your concerns have been addressed. If any additional clarifications are needed, we hope you will let us know.
Thank you again for your positive feedback!
This paper introduces the Anti-Personalization Diffusion Model (APDM), a framework designed to prevent the unauthorized generation of specific individuals' images. Instead of protecting specific images, APDM aims to protect the diffusion model, making it inherently resistant to personalization. To achieve this, the framework utilizes a novel loss function called Direct Protection Optimization (DPO) to block personalization without harming the generation quality of the model. Moreover, they propose an adaptive training strategy, Learning to Protect (L2P), to strengthen the protection performance in potential personalization. Extensive experiments show that this work achieves state-of-the-art protection performance.
Strengths and Weaknesses
Strengths:
- The motivation is reasonable and novel. Unlike methods that fail with clean images or simple transformations, this framework's model-level protection is inherently more robust and reliable for real-world use.
- The proposed DPO loss function and L2P optimization strategy are interesting and effective.
- Extensive experiments demonstrate that APDM achieves state-of-the-art performance in preventing unauthorized personalization.
- This paper is well-written and easy to follow.
Weaknesses:
- This paper lacks a comparison of protection efficiency. Given that the method requires training a diffusion model, the computational cost could be a significant concern. The 9-hour training time per identity makes the approach impractical for large-scale protection. A discussion or a comparison of the efficiency against other methods is needed.
- The comparison should be expanded to include recent protection methods, such as Mist [1] and SDS [2]. It is better to include more baseline methods in Table 1.
- The scalability of APDM is a concern. What if protecting multiple identities simultaneously? An experiment is needed to evaluate its performance when protecting multiple identities simultaneously.
[1] Liang, Chumeng, and Xiaoyu Wu. "Mist: Towards improved adversarial examples for diffusion models." arXiv preprint arXiv:2305.12683 (2023).
[2] Xue, Haotian, et al. "Toward effective protection against diffusion-based mimicry through score distillation." The Twelfth International Conference on Learning Representations. 2023.
Questions
Please see the weakness part.
Limitations
yes
Final Justification
Thanks for the detailed rebuttal. My questions have been addressed so I decided to keep my rating.
Formatting Issues
N/A
We appreciate Reviewer qkFB's positive evaluation of our novel motivation, effective methodology, and experimental results. We hope our response addresses your concerns and questions.
W1: Discussion about protection efficiency
To address this concern, we evaluate our approach against baselines based on both theoretical complexity and practical cost.
(Theoretical Complexity) The primary distinction lies in their scaling properties. APDM requires a one-time, fixed training cost (9 GPU-hours), making its computational complexity O(1), as it is independent of the number of images to be protected (N). In contrast, data poisoning methods require per-image processing, leading to an O(N) complexity in which the total cost grows linearly with N.
(Practical Cost) The real-world implications of this difference are significant. Tab. R1 shows the processing times of existing methods:
Table R1. Time consumption.
| Method | AdvDM | Anti-DB | SimAC | PAP |
|---|---|---|---|---|
| Time | 262s | 288s | 594s | 297s |
As Tab. R1 illustrates, these methods require approximately 5 minutes to poison each image. This allows for a direct cost-benefit analysis: APDM becomes cost-effective once the number of images (N) exceeds roughly 110. In real-world applications driven by user-generated content such as social media, where image sets are not only large but continually expanding, the scalability of our approach offers a substantial advantage in overall efficiency. We will add a detailed discussion of this complexity analysis to the final manuscript.
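For transparency, the break-even figure follows from simple arithmetic; a minimal sketch, assuming the Tab. R1 times are per-image poisoning costs:

```python
# Back-of-the-envelope check of the break-even point quoted above, assuming the
# Tab. R1 entries are per-image poisoning times and APDM's cost is a fixed 9 GPU-hours.
APDM_COST_S = 9 * 3600                          # one-time cost, independent of N
PER_IMAGE_S = {"AdvDM": 262, "Anti-DB": 288, "SimAC": 594, "PAP": 297}

for method, t in PER_IMAGE_S.items():
    n_break_even = APDM_COST_S / t              # N at which O(N) poisoning overtakes O(1) APDM
    print(f"{method:8s}: APDM becomes cheaper once N exceeds ~{n_break_even:.0f} images")
# PAP, for example, gives ~109 images, consistent with the ~110 figure quoted above.
```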
W2: Additional comparisons
As requested, we conducted additional experiments comparing APDM with the suggested methods, Mist and SDS. We report results for the setting containing a single perturbed image (the "# of clean images: " setting in the main table).
Table R2. Additional comparisons.
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.6162 | 10.77 |
| Mist | 0.5903 | 30.74 |
| SDS | 0.5194 | 20.03 |
| APDM | 0.0559 | 45.87 |
Our findings reveal two crucial points:
(Shared Vulnerability) In this setting, the new baselines are also highly vulnerable. This confirms that the presence of clean images is a critical challenge for existing data-level protection methods.
(Superior Robustness) APDM continues to significantly outperform all baselines, demonstrating its superior robustness again.
W3: Multiple identities scenario
We agree with the reviewer that protecting multiple identities is an important direction, and we have noted this as future work in Supplementary G. However, to proactively explore this, we conducted a preliminary experiment to assess the robustness of APDM in such a scenario. For this, we constructed a challenging multi-concept negative set with four distinct subjects (cat, sneaker, glasses, clock) and applied our method without modifications. For comparison, we evaluated PAP using its most favorable configuration (the "# of clean images: " setting in our main paper).
Table R3. Multi-subject protection.
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.5833 | 20.32 |
| PAP | 0.5699 | 25.72 |
| APDM | 0.3924 | 32.97 |
The results demonstrate two key aspects of our method:
(Effective Protection) APDM maintains substantial protective capabilities, achieving a DINO score of 0.3924—far superior to both unprotected DreamBooth (0.5833) and data poisoning approach PAP (0.5699).
(Remarkable Robustness) This strong performance was achieved with an unmodified APDM, showcasing its inherent ability to generalize its protective mechanism to a task of greater complexity.
Therefore, this experiment not only confirms the strong potential for multi-identity protection but also validates the robustness of its core design. We believe this serves as an excellent starting point for developing a dedicated multi-concept protection method in our future work.
Thanks for the detailed rebuttal. My questions have been addressed so I decided to keep my rating.
Dear reviewer qkFB,
Thank you for the update and for carefully reviewing our rebuttal. We appreciate your time and effort and will incorporate your suggestions into the final version.
Thank you again!
The paper attempts to provide protection against personalization by focusing on the model rather than on poisoning the data. The goal is therefore to obtain a protected model that cannot be personalized on clean images, without the need for poisoned data. First, it shows that a simple modification to the current loss function used for training such models is not suitable for this purpose. A new loss function, along with a novel optimization algorithm, is provided, which ensures that the model weights are modified so that the model can no longer personalize the targeted concept but still works properly for non-targeted concepts. Thereby, the goal is achieved: a protected model that cannot personalize a fixed entity while still retaining its generative capability and the ability to personalize other entities.
Strengths and Weaknesses
Strengths
- Shifts the focus of protection against personalization in text-to-image diffusion models from data-level poisoning to model-level change.
- Provides a proof explaining why a naive change of direction of the Reconstruction loss in the total loss function used in DreamBooth (the training loss comprises the Reconstruction loss and the Class-Specific Prior Preservation loss) will not lead to convergence of the optimization.
- Provides an alternative loss function, the Direct Protective Optimization (DPO) loss, for personalization protection while preserving generation capability.
- Provides a dual-path optimization strategy, Learning to Protect (L2P), to enhance the optimization of the new loss function.
- The empirical results, based on a few subjects, are shown to work well on clean images, thereby supporting the claim that the protected model can be trained to provide effective protection for the concept.
- The method is shown to work even with different prompts.
Weaknesses
Conceptual:
- A major concern is with the main approach of perturbing the model rather than the data. L135 clearly states that the adversary needs to use the model with the X, which is impractical. The adversary scraping the internet and accessing the poisoned data is more realistic than requiring the adversary to ONLY use the altered model, which seems unrealistic.
- L48-50: Instead of changing the model, the service provider could add the protective watermark and return it to the user.
- Many of the concerns raised in Supplementary F could be circumvented by the service provider watermarking the data which could further prevent other models from personalization.
- There needs to be a solid supporting study on the computational expense comparing data poisoning vs. model change to support this work's approach, even for the few-concept case studied in this paper.
- Currently, this paper appears to be a work in progress, as the results cover one or two concepts while the multi-concept setting is considered future work.
Theoretical:
- Notation-wise, equations (2) and (3) appear to be the same. Equation (3) does not reflect the inclusion of a new text token for personalization. Other than a change in the loss symbol, the expression on the right appears to be the same. Are the notations fine, or is any change needed?
Empirical:
- To make the empirical results stronger, newer methods like ACE [2] and CAAT [3] need to be included to obtain results in Fig. 3 and Table 1.
- While results for APDM are provided with clean images, how the model responds to perturbed data (either adversarial or Gaussian noise) needs to be included.
- For the perturbation/data poisoning based methods, an important parameter of perturbation budget is currently missing in this paper. What was the noise budget used to conduct these studies?
- There is a quantitative study in Table 3; can the paper provide qualitative results with subjects that are not protected, to show that personalization of other subjects is still possible and that the generated images are visually good?
Writing:
- Recent works [1], [2] need to be included to keep the literature updated.
References:
[1] Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI - Robert Hönig, Javier Rando, Nicholas Carlini, Florian Tramèr - ICLR 2025
[2] Targeted Attack Improves Protection against Unauthorized Diffusion Customization - Boyang Zheng, Chumeng Liang, Xiaoyu Wu - ICLR 2025
[3] Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models - Jingyao Xu, Yuetong Lu, Yandong Li, Siyang Lu, Dongdong Wang, Xiang Wei - CVPR 2024
Questions
- Assuming the user (instead of self-applying protection, as assumed in Supplementary L217-226) has access to a service provider who returns protected images via watermarking, what would be the strong reason for the approach of perturbing the model followed in this paper?
- Can the authors provide a clear computational study with a simple case comparing the data-poisoning and model-change approaches to observe the advantage of one over the other?
- In the paper, APDM's results are presented only with clean images; can results with perturbed images also be shown?
Limitations
Yes
Final Justification
I am supportive of the paper's approach of focusing on the model rather than the data, which differs from the bulk of existing efforts on data poisoning/watermarking. Despite finding the current results a work in progress, as the paper's main results are with a single concept and the multi-concept setting is considered future work, I am willing to support the approach.
I hope the paper is updated with experiments that keep it current with the best methods (e.g., recent methods like ACE and CAAT, along with a lower noise budget) and that it incorporates all the relevant suggestions by the reviewers.
I am raising the score from 3 to 4.
Formatting Issues
No
We appreciate Reviewer 3bmf for your constructive feedback. We hope our response effectively addresses your concerns and questions.
W1: Necessity of model-level protection
It is important to clarify who is responsible for this privacy risk. Under regulations such as GDPR, the service provider—not end users—must guarantee protection for their service model. Crucially, providers already control the life‑cycle of the model and they can control the risk by updating their model. In this circumstance, the model-level protection is a highly practical solution.
Even if your suggestion (i.e., watermarking) could relieve the burden on users, it still has limitations: (i) newly uploaded content from other sources remains unprotected, and (ii) simple transforms can weaken the watermark, reviving the privacy risk (Bypass issue).
The concerns in Supplementary F (and illustrated in Figure 1) are motivated by the above problems. In those parts, we identify four key problems, and service-provider-side solutions like watermarking fail to address three of them. As mentioned above, watermarking suffers from limitations (i) and (ii). Furthermore, another critical problem arises in the service phase: the provider has no way of knowing which user-provided data requires protection (a conflict with regulation). In other words, if malicious users use clean images, the watermarking approach cannot handle this. For this reason, with watermarking, the service provider cannot fulfill their responsibility to protect sensitive data.
This analysis underscores the necessity of a model-level approach like APDM. However, this does not render data perturbation methods obsolete. As you mentioned, methods like watermarking are effective in out-of-service cases. However, they cannot fully address the protection challenges, such as the new content and bypass problems, which APDM is designed to solve. Therefore, a complementary deployment of data‑level and model‑level protections is the most efficient and robust strategy.
W2: Discussion about computational cost
To provide a solid analysis of the computational expense, we compare our approach with existing methods by addressing the fundamental difference in their computational complexity and providing quantitative results.
The core difference lies in the scaling properties. APDM requires only a fixed cost (9 GPU-hours) for protection, which is independent of the number of images to be protected; therefore, its complexity is O(1). In contrast, poisoning approaches require processing each image individually. While the per-image cost is relatively low, the total cost scales linearly with the number of images (N), resulting in an O(N) complexity.
Table R1. Time consumption.
| Method | AdvDM | Anti-DB | SimAC | PAP |
|---|---|---|---|---|
| Time | 262s | 288s | 594s | 297s |
As shown in Tab. R1, existing poisoning approaches need nearly 5 minutes to process each image. This allows for a direct computational cost comparison: APDM becomes more cost-effective when the number of images (N) exceeds roughly 110. In real-world applications, such as social media platforms or cloud services where users continuously upload new photos, N is not only large but also constantly growing. In such scenarios, APDM is significantly more efficient in terms of overall cost. We will add a detailed discussion of this computational complexity analysis to the final manuscript to clarify this point.
W3: Multi-concept protection
Table R2. Multi-subject protection.
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.5833 | 20.32 |
| PAP | 0.5699 | 25.72 |
| APDM | 0.3924 | 32.97 |
As noted in Supplementary G, we aim to extend our work to multi-concept protection scenarios. Motivated by your comment, we conducted a preliminary experiment to test the robustness of our current APDM on a multi-concept task. For this experiment, we constructed a multi-concept negative set by combining samples from four distinct subjects (cat, sneaker, glasses, clock) and applied our APDM without any modification, treating them as a single conceptual group. For comparison, we report the results of PAP using all perturbed data (the "# of clean images: " setting in our main table). As shown in Tab. R2, APDM still provides substantial protection in the multi-concept setting compared to PAP. This demonstrates the inherent robustness of our approach, even for a task it was not originally designed for, and highlights the potential of APDM for future work.
W4: Eq. (2) and (3)
We thank you for this suggestion. While we initially adopted the notation from the original DreamBooth paper for consistency, we agree that Eq. (3) can be improved. We will revise Eq. (3) in the final version for better clarity by explicitly representing the personalized text token.
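For illustration, the revised equation could take a form along the following lines (a schematic in standard DreamBooth notation; the symbols are placeholders, and the exact formulation in the camera-ready may differ):

```latex
% Schematic only: a DreamBooth-style personalization loss with the token [V*] made
% explicit; notation follows the original DreamBooth paper, not necessarily ours.
\mathcal{L}_{\mathrm{per}}(\theta) =
\mathbb{E}_{x,\epsilon,\epsilon',t,t'}\Big[
  \big\| \hat{x}_\theta\!\left(\alpha_t x + \sigma_t \epsilon,\; c_{[V^*]}\right) - x \big\|_2^2
  \;+\; \lambda\, \big\| \hat{x}_\theta\!\left(\alpha_{t'} x_{\mathrm{pr}} + \sigma_{t'} \epsilon',\; c_{\mathrm{pr}}\right) - x_{\mathrm{pr}} \big\|_2^2
\Big]
```

Here, $c_{[V^*]}$ conditions on the instance prompt containing the personalized token (e.g., "a photo of [V*] person"), while $c_{\mathrm{pr}}$ conditions on the class prompt, making the distinction from Eq. (2) explicit.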
W5: Additional comparisons
Table R3. Additional comparisons.
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.6162 | 10.77 |
| ACE | 0.6846 | 28.18 |
| CAAT | 0.5825 | 22.71 |
| APDM | 0.0559 | 45.87 |
We present additional results in Tab. R3. We report the results for the setting that contains a single perturbed image (the "# of clean images: " setting in our main table). The results show that APDM still outperforms these stronger competing methods.
W6: Response to perturbed data
Table R4. Personalization with perturbed data.
| Method | DINO(↓) | BRISQUE(↑) |
|---|---|---|
| DreamBooth | 0.6869 | 16.69 |
| Anti-DreamBooth | 0.5646 | 22.50 |
| APDM | 0.1375 | 40.25 |
| APDM (perturbed) | 0.1702 | 40.20 |
To answer this question, we conducted an additional experiment. We generated perturbed data using the Anti-DreamBooth method. As shown in Tab. R4, APDM is robust to the input data, even when the data are perturbed. Furthermore, APDM with perturbed data still outperforms the original Anti-DreamBooth by a large margin, even when Anti-DreamBooth utilizes all perturbed images (the "# of clean images: " setting in our main table). This demonstrates the powerful protective performance of APDM.
W7: Noise Budget
As mentioned in Line 225 (Experimental Setup) in our main paper, the perturbation intensity (noise budget) was set to 5e-2. We selected this value based on the Anti-DreamBooth paper, and for a fair comparison, this value was applied to all baselines.
W8: Qualitative results in Table 3
Table R5. Additional quantitative results.
| Method | CLIP-I(↑) | CLIP-T(↑) |
|---|---|---|
| DreamBooth | 0.7576 | 0.2374 |
| APDM | 0.7961 | 0.2329 |
Unfortunately, we are unable to provide qualitative figures due to the rebuttal policy. However, we argue that Table 3 in our main paper already provides strong evidence to address this concern. The DINO score is a key metric that directly evaluates the success of personalization. The fact that APDM outperforms DreamBooth on this score indicates that APDM both (i) successfully personalizes the unprotected subjects and (ii) avoids over-fitting to the protected target. Because APDM achieves a higher score, any potential degradation in quality is minimized. Furthermore, as shown in Tab. R5, APDM also achieves competitive scores on both CLIP-I and CLIP-T, indicating that its outputs remain well aligned in both the image and text domains. We hope this explanation resolves your concern.
W9: About Recent works (Writing)
Thank you for the suggestions. We will update the camera-ready version to include these recent related works as follows:
- [1] investigates the vulnerability of existing protection methods. We will cite this work to provide stronger support for our problem definition, particularly for the Bypass problem.
- [2] proposes a targeted attack approach. As we have already demonstrated in Tab. R3 of this rebuttal, this method's effectiveness is still limited in our scenario. We will incorporate [2] as an advanced baseline in our comparisons to further highlight the robustness of our method.
[1] Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI - Robert Hönig, Javier Rando, Nicholas Carlini, Florian Tramèr - ICLR 2025
[2] Targeted Attack Improves Protection against Unauthorized Diffusion Customization - Boyang Zheng, Chumeng Liang, Xiaoyu Wu - ICLR 2025
We thank the authors for the detailed reply to all the questions raised. In view of the rebuttal, the model-level protection is intriguing; I find myself more positive about the approach and will update my final justification after finer consideration of the points raised by other reviewers.
If time permits (apologies for the last-minute request), I would appreciate it if the authors could answer this follow-up question on reply W7: while the results are obtained at a noise budget of 5e-2 based on Anti-DreamBooth, more recent work like ACE achieves good results at a ~2e-2 noise budget. Would it be possible to provide minimal results (ACE and APDM) at the noise level 4/255 (similar to W5)?
Thank you for your positive remarks about our motivation and our model-level protection approach. We appreciate your openness to revisiting your evaluation, and we hope our additional clarifications and extended results are helpful as you finalize your decision.
We also appreciate your thoughtful follow-up question and are happy to have the opportunity to clarify this. As requested, we conducted an experiment on ACE with a noise level of 4/255, keeping all other experimental settings the same as before. The results are shown below:
Table R6. Additional results on the various noise budgets.
| Method | Noise Budget | DINO(↓) | BRISQUE(↑) |
|---|---|---|---|
| DreamBooth | - | 0.6162 | 10.77 |
| ACE | 5e-2 | 0.6846 | 28.18 |
| ACE | 4/255 (≈ 1.56e-2) | 0.5793 | 29.26 |
| APDM | - | 0.0559 | 45.87 |
As shown in Tab. R6, although ACE performs better at the smaller noise budget than at 5e-2, it still inherits the intrinsic drawbacks of perturbation-based methods. In contrast, APDM still outperforms both ACE variants. This experiment demonstrates that, despite tuning, perturbation-based defenses remain constrained, whereas APDM provides robust model-level protection.
If you have any questions or need further clarification, please let us know at any time.
I appreciate the authors' quick response with additional empirical results showing ACE at the lower budget of 4/255 and comparing it with APDM. I have no more follow-up questions. Thank you.
Dear reviewer 3bmf,
Thank you for your thoughtful review. We appreciate your time and feedback.
Thank you again!
This paper introduces Anti-Personalized Diffusion Models (APDM), a framework that attempts to provide protection against personalization by focusing on the model rather than on poisoning the data.
Reviewers consistently highlighted the novel solution of model-level protection, the clarity of writing and presentation, and the strong empirical performance. Reviewer 3bmf also recognized the theoretical analysis as a notable contribution, while qkFB praised the paper’s readability and overall organization. Reviewers generally agreed that the work tackles a more challenging and practically relevant task compared to data-level perturbation methods.
During review, concerns centered on scalability (multi-subject protection), computational efficiency, broader comparisons, and generalizability (including perturbed data, large-scale personalization, prompt variations, and unseen data). The authors provided additional experiments—including 100+ image personalization, multi-subject protection, comparisons with recent baselines, and efficiency analysis—along with clarifications of assumptions and theoretical contributions. Reviewers acknowledged that these responses satisfactorily resolved their major concerns, with no further objections raised.