Hidden in the Noise: Two-Stage Robust Watermarking for Images
Abstract
Review & Discussion
The paper presents a novel two-stage watermarking framework, named WIND, for images generated by diffusion models. It leverages the initial noise used in the diffusion process as a distortion-free watermarking method. The approach aims to enhance the robustness of watermark detection against both removal and forgery attacks by embedding group identifiers through Fourier patterns, thereby improving the efficiency of watermark retrieval.
Strengths
The paper presents a novel two-stage watermarking framework, named WIND, for images generated by diffusion models. It leverages the initial noise used in the diffusion process as a distortion-free watermarking method. The approach aims to enhance the robustness of watermark detection against removal and forgery attacks by embedding group identifiers through Fourier patterns, thereby improving the efficiency of watermark retrieval.
Weaknesses
(-) Some parts of the paper are overclaimed. For example, in Lines # 78-84, these contents are already addressed in Tree-ring [1], yet the author seems to want to claim it as their contribution. The introduced method is built upon Tree-ring [1] and [2]. However, the introduction does not discuss these works [1-2].
(-) The problem of the previous method and the motivation of this work are not clearly delivered. The relationship between the third and fourth paragraphs is confusing and lacks logic.
(-) The effectiveness of the proposed method may be limited to specific types of diffusion models, and its performance on other diffusion models remains untested.
(-) Although the paper suggests methods to reduce runtime, the need for searching through a large number of initial noises could still pose challenges in real-time applications, particularly on resource-constrained devices.
Some typos:
For example, in lines 88 and 137, add parentheses ( ) around the citations.
[1] Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust. In NeurIPS 2023.
[2] Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious? In NeurIPS 2024.
Questions
How does the performance of the WIND framework compare with existing state-of-the-art watermarking techniques when applied to diverse generative models beyond diffusion models?
What specific measures have been taken to ensure that the watermarking process does not compromise the visual quality of the generated images?
Could the authors elaborate on the potential implications of attackers successfully inverting the model and how that might affect the watermarking effectiveness?
Thank you very much for your detailed review. We appreciate that you found our method novel, and recognized our enhancements of efficiency and robustness. We address each of your questions below:
“Some parts of the paper are overclaimed … The introduced method is built upon Tree-ring [1] and [2]. However, the introduction does not discuss these works [1-2].”
We thank you for pointing this out. We had previously moved some of the discussion of prior works to the appendix due to page limitations. Yet, we see how this might have resulted in an overclaim. To avoid overclaiming and to better acknowledge prior works, we revised the introduction according to your comment (please see the revised manuscript).
“The problem of the previous method and the motivation of this work are not clearly delivered.”
The main problem in the Tree-Ring method we aim to improve is the vulnerability to watermark forgery (in the black-box setting) and to watermark removal (in the gray-box setting) [2]. Additionally, our method allows the usage of an order of magnitude more watermark identities: different watermark instances allow embedding more meta-data on a given image to allow better validation of the image source.
We revised the manuscript according to your comments to make the motivation clearer.
“The effectiveness of the proposed method may be limited to specific types of diffusion models” , “WIND framework compare with existing state-of-the-art watermarking techniques when applied to diverse generative models beyond diffusion models?”
We thank the reviewer for raising this point. We are happy to report that the diffusion model we investigated is already sufficient to successfully watermark images from other sources. We report the robustness of the watermark applied to other models below:
| Clean | Rotate | JPEG | C&S | Blur | Noise | Bright | Avg |
|---|---|---|---|---|---|---|---|
| 1.000 | 1.000 | 1.000 | 0.880 | 1.000 | 0.950 | 0.950 | 0.969 |
Other watermarking techniques are not as robust to these attacks [2,3,4].
We also expect our watermark to be directly effective for any model for which some inversion to the original noise is possible. We also validated this on images generated by another model (SD 1.4 [1]): our watermarking method detects the correct watermark (initial noise) among 10,000 watermark identities at a rate of 97%.
We added this discussion to the revised manuscript.
“the need for searching through a large number of initial noises could still pose challenges in real-time applications, particularly on resource-constrained devices.”
Generally, private-key watermarking is less commonly deployed in real-time applications on edge devices. In any case, a user interested in reducing resource requirements may use a smaller value of N, allowing a much leaner search while maintaining the other properties of our model when generating sets of up to N images.
“What specific measures have been taken to ensure that the watermarking process does not compromise the visual quality of the generated images?”
Our method relies on using an initial random noise drawn from the same distribution of initial noises already used by the model. Therefore, the core of our method does not compromise the visual quality of the generated images at all.
The only effect on visual quality comes from the group identifier stage, where we use existing off-the-shelf watermarking images. In our implementation, we used the RingID [5] method that slightly distorts the generation. We reported that the FID of our model is the lowest (24.33) among different watermarking methods. We further report that our watermark has a negligible impact on the CLIP score:
| CLIP Before Watermark | CLIP After Watermark |
|---|---|
| 0.366 | 0.360 |
When a model owner wishes to preserve image quality even better, they may use any other existing watermarking method for the group identifier embedding. This will still not compromise the security provided by the initial noise stage. A discussion was added in App. D in the revised manuscript.
“Could the authors elaborate on the potential implications of attackers successfully inverting the model”
Yes. As detailed in [6] and briefly covered in Sec. 2 of our paper, accurately inverting the model is as difficult as replicating the forward process of the model (image generation). While hard, an attacker able to do so effectively is also capable of generating novel images using the same diffusion process. Therefore, at this stage, the model itself is effectively compromised (and not only the watermark signature). We believe that being as hard to forge as the model itself is a reasonable level of security for almost all use cases.
Yet, approximately inverting the model might also be a threat. While even approximate inversion is very hard, it might be easier than stealing the model. Still, we would like to emphasize that our method is more secure than other diffusion-process-based watermarking techniques: in those techniques, the image distortion itself may allow easier forging [2], even without learning the inversion process. We added this discussion to the revised manuscript.
Thank you very much for your comments. We respectfully ask that if you feel more positive about our paper, to please consider updating your score. If not, please let us know what can be further improved; we are happy to continue the discussion any time until the end of the discussion period. Thank you!
[1] Rombach, Robin, et al. "High-resolution image synthesis with latent diffusion models." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
[2] Yang, Pei, et al. "Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?." (2024).
[3] Zhao, Xuandong, et al. "Invisible image watermarks are provably removable using generative ai." (2023).
[4] Jiang, Zhengyuan, Jinghuai Zhang, and Neil Zhenqiang Gong. "Evading watermark based detection of AI-generated content." 2023.
[5] Ci, Hai, et al. "Ringid: Rethinking tree-ring watermarking for enhanced multi-key identification." European Conference on Computer Vision. 2025.
[6] Keles, Feyza Duman, and Chinmay Hegde. "On the Fine-Grained Hardness of Inverting Generative Models." (2023).
Thanks for your response. However, this work is too similar to existing work [1]. They both embed copyright messages into initial noise. Most importantly, [1] does not change the distribution of the generated image.
Thus, I cannot raise my score. The author needs to thoroughly discuss the difference between their WIND with established methods like [1,2,3].
[1] Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models. In CVPR 2024.
[2] Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust. In NeurIPS 2023.
[3] Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious? In NeurIPS 2024.
Thank you for acknowledging our rebuttal and continuing the discussion.
We would like to clarify the main differences between our work and [1]:
- While [1] proposes a watermark that is distortion-free for a single image, it is not distortion-free when examining sets of images (and is therefore vulnerable to attacks such as [3]). We aim to be robust to attacks even when images are examined together.
- To allow using very large values of N efficiently, we propose a two-stage framework, allowing a more efficient way to recover the correct key for each image.
- Our work also studies applying our watermark to non-synthetic (natural) images, or images coming from other generative models.
Yet, we agree that [1,2] are very relevant prior works. We revised our manuscript to better discuss the relation of our paper to these approaches.
We would like to note that [3] is not a watermarking technique but an attack method. In fact, the main motivation of our work is to propose a watermark that is more robust to vulnerabilities, such as the one [3] found in previous methods [1] and [2] (and other content-agnostic watermarking methods). By allowing the use of a very large number of different initial noises, our technique is more robust against such attacks (see, for example, Fig. 2 in [3] compared to Fig. 3 in our paper).
Thank you once again for your suggestions for our work as well as your follow-up comment!
Thank you for your response and updated manuscript.
The current version is much clearer, and I acknowledge the novelty and insights of this work. This paper points out that the previous method is vulnerable to attacks [1]. This is due to the inconsistency between the distribution of the random noise and the reconstructed noise from the watermarked data. The author should stress this point and the consequence caused by the lower number of initial noises in the manuscript, to better deliver motivation for this work.
For the current version, it would be better if the author can
- Rewrite 87-92 for better connection and logic.
- Redesign Fig. 1, which does not highlight this work; it looks almost the same as previous work [2].
- Carefully check the typos of citations, for example, in line 41, 223
I am willing to increase my rating to 6, good luck.
[1] Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious? In NeurIPS 2024.
[2] Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust. In NeurIPS 2023.
We thank the reviewer for continuing the discussion and for their additional suggestions. We are glad that the reviewer has a more positive view of our work.
Please find our answers below:
- We have revised the introduction to better deliver the point regarding the number of initial noises.
- As today is the final day for manuscript revisions, we may not have sufficient time to upload a revised figure. However, we want to assure the reviewer that we have carefully considered their feedback. We will revise the figure in the final version to better deliver motivation for this work and to more clearly differentiate it from prior works.
- We thank the reviewer for pointing this out. We fixed it in the revised manuscript.
Thank you once again for a very dedicated review.
The paper introduces the WIND framework for embedding watermarks in AI-generated images. WIND leverages the initial noise of the diffusion process as a distortion-free watermark, embedding group-specific Fourier patterns in the noise to enable efficient watermark verification. The proposed two-stage approach aims to reduce search complexity, improve robustness against various attacks, and optimize verification accuracy.
Strengths
- The two-stage approach employed in WIND is similar to a multi-level indexing mechanism. Initially, group-specific Fourier patterns narrow down the search scope (Tier-1), and subsequently, an exact matching process with the original noise ensures precise verification (Tier-2). This hierarchical structure not only enhances efficiency but also serves as a novel adaptation of multi-level indexing in watermarking, supporting scalable, high-speed identification in large datasets.
- The framework demonstrates robustness against several attacks, including forgery, removal, and distortive transformations. The two-tiered verification and Fourier embedding make WIND resilient to various manipulations, ensuring watermark persistence even under adversarial conditions.
Weaknesses
- While WIND presents a promising adaptation of existing noise-based watermarking techniques, it does not fundamentally diverge from the concepts established in Tree-Ring and RingID. The addition of a two-stage detection process is a novel enhancement but builds on the same theoretical foundation of Fourier-based watermarking in noise. The primary innovation lies in the grouping and search-space reduction, which increases efficiency and robustness but does not introduce a fundamentally new approach to noise-based watermarking.
- The multi-stage approach, while innovative, introduces complexity that may hinder ease of adoption. Managing group identifiers and applying Fourier-based transformations within the initial noise may pose challenges for large-scale implementations. Additionally, as the number of watermarked images grows, the demand for managing extensive initial noise records could impact computational scalability.
Questions
Please see Weaknesses.
Thank you very much for your detailed review. We appreciate that you found our method novel, and recognized our method's resilience under adversarial conditions. We address each of your questions below:
“The primary innovation lies in the grouping and search-space reduction, which increases efficiency and robustness but does not introduce a fundamentally new approach to noise-based watermarking”
We would like to emphasize what we believe is our main technical novelty. While noise-based watermarking indeed already exists, to the best of our knowledge, it was used to watermark with special patterns embedded into the noise. We found that the random noise itself, already used by the model, could be used as a watermark as well. Surprisingly, we show the initial noise itself can encode more than 100,000 different patterns. This was not known and is the main technical insight driving our work.
We revised the manuscript to give a clearer acknowledgment of prior works.
“The multi-stage approach, while innovative, introduces complexity that may hinder ease of adoption.”
Our multi-stage approach is built for a model owner who wishes to optimize fast detection, robustness, and the ability to distinguish between many different watermarked images.
For a user with simpler demands, we suggest a simpler approach that maintains the robustness of our full watermarks: one may simply record all used initial noises and run an exhaustive search over them (namely, using only the second stage of our method). While slower when watermarking billions of images, this approach is much simpler to adopt and is still very effective for watermarking tens of millions of images.
Alternatively, a user may apply a similar algorithm to the one described above using only a few possible random noises. This would replace the distinguishability of many different watermarks with the ability to rapidly and simply detect the watermarked images.
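For concreteness, the simplified single-stage variant described above might be sketched as follows (a minimal illustration under our own assumptions, not the authors' implementation; the latent shapes and the noisy-copy stand-in for diffusion inversion are hypothetical):

```python
import numpy as np

def detect_by_exhaustive_search(reconstructed_noise, recorded_noises):
    """Return the index of the recorded initial noise closest (in L2)
    to the noise recovered by inverting the diffusion process."""
    dists = [np.linalg.norm(reconstructed_noise - n) for n in recorded_noises]
    return int(np.argmin(dists))

# Toy usage: 1,000 recorded noises; the "reconstruction" is a noisy
# copy of noise #42, mimicking an imperfect diffusion inversion.
rng = np.random.default_rng(0)
recorded = rng.standard_normal((1000, 4, 64, 64))
reconstructed = recorded[42] + 0.5 * rng.standard_normal((4, 64, 64))
print(detect_by_exhaustive_search(reconstructed, recorded))  # 42
```

Because independent high-dimensional Gaussian noises are nearly orthogonal, even a heavily perturbed reconstruction stays far closer to its own initial noise than to any other recorded noise, which is what makes this brute-force matching reliable.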
We added this discussion to the revised manuscript.
Thank you again for your excellent comments. We respectfully ask that if you now feel even more positively about our paper, to consider slightly increasing your score. We are happy to continue the discussion at any time until the end of the discussion period. Thank you!
Dear Reviewer,
We sincerely appreciate your valuable feedback and the time you have dedicated to reviewing our submission. Your insights have been instrumental in shaping the final version of our submission.
We would like to kindly remind you that the discussion period is set to conclude on December 2nd. If there are any additional questions, concerns, or clarifications, we would be delighted to continue the discussion.
Thank you once again for your attention. We look forward to hearing from you!
Dear Reviewer,
As the deadline for uploading paper revisions is approaching, we wanted to check if our rebuttal has satisfactorily addressed your concerns. We have carefully considered and responded to all the points you raised in your review.
We would greatly appreciate it if you could provide your feedback on our rebuttal. This would allow us the opportunity to address any remaining questions you might have. Thus far, three out of six reviewers have engaged with us and updated their scores during the discussion period.
Thank you very much for your time and consideration.
The paper introduces a watermarking method using initial noise in diffusion processes as a robust, distortion-free way to mark AI-generated images. The authors propose a two-stage framework called WIND, which employs initial noise and Fourier patterns to facilitate efficient watermark detection and withstand forgery and removal attacks.
Strengths
- The methodology is easy to follow.
- Evaluates several settings.
- The two-stage approach efficiently narrows down the search space by using Fourier pattern group identifiers, significantly improving detection runtime and scalability.
Weaknesses
- The idea of picking the initial noise for diffusion is not new. And this method changes the initial noise a lot compared to the Tree-Ring watermark.
- The evaluation of robustness is not comprehensive.
- Potential vulnerability to more advanced attacks.
Questions
- The organization of the paper appears somewhat disorganized. For example, Section 5.2 and the Algorithms section should be relocated to precede Section 5, which discusses the experiments. This adjustment would provide better coherence, as watermarking non-synthetic images is an integral part of the methodology.
- The evaluation of different watermarking methods is somewhat limited. The focus of this work is primarily on watermarks that involve selecting initial noise for the diffusion process, while other watermarking techniques, such as HiDDeN, are excluded. The authors should provide an explanation for this choice to give context to the scope of their evaluation.
- While the paper makes significant claims regarding robustness, the robustness evaluation itself is not comprehensive. Beyond image transformations and regeneration attacks, there are numerous adaptive attacks where the adversary possesses greater capability. Examples include transfer-based, query-based, and white-box attacks such as WEvade. These types of evaluations should be considered to strengthen the robustness analysis.
- How is the threshold determined in the experiments?
Thank you for your review. We appreciate that you found our methodology easy to follow, and recognized our technique as significantly improving scalability. We address each of your questions below:
“The idea of picking the initial noise for diffusion is not new”
While the technique of using patterns in the initial noise indeed follows from previous works, the idea of using the random noise itself (rather than some pattern embedded in it) is novel to the best of our knowledge.
We added a clearer explanation of this distinction in the introduction of the revised manuscript.
“this method changes initial noise a lot compared to Tree-Ring watermark.”
Respectfully, we believe there might have been a misunderstanding here, and we would like to clarify. The random initial noise we use comes from the same random function already used by the diffusion model. Therefore, our main watermarking technique is actually distortion-free: the watermarked images come from the very same distribution as the original, non-watermarked images.
The pattern we use for group identifiers is the same Tree-Ring type pattern. Taken together, the distortion is no larger than that of Tree-Ring.
“The evaluation of robustness is not comprehensive” “...transfer-based, query-based, and white-box attacks such as WEvade.”
We use the same evaluation as previous similar papers [1,2] and even extend beyond them in Sec. 3 and 5.
Per the reviewer's request, we also evaluated WIND against additional attacks of the types mentioned, including transfer-based, query-based, and white-box methods. Specifically, we employ the WEvade white-box attack [4], the transfer attack described in [6], a black-box attack utilizing NES queries [7], and a random search approach discussed in [5], adapted to attempt watermark removal. The success rates of these attacks are detailed in the table below:
| WeVade | Random Search | Transfer Attack | NES Query |
|---|---|---|---|
| 1% | 2% | 3% | 2% |
We added these results to the revised manuscript.
“For example, Section 5.2 and the Algorithms section should be relocated to precede Section 5, … This adjustment would provide better coherence, as watermarking non-synthetic images is an integral part of the methodology.”
We thank the reviewer for this suggestion and applied it to the revised manuscript.
“other watermarking techniques, such as HiDDeN, are excluded”
HiDDeN, as well as other watermarking techniques, are more vulnerable to removal and forgery attempts [3,4] as evaluated by previous works. However, we recognize their contributions to the field and their applicability in different settings and have mentioned their methods in our related works section.
“How is the threshold determined in the experiments?”
We thank the reviewer for pointing out that this detail is missing from the manuscript. In the first variant (WIND_fast, Sec. 4.1), we use a threshold of 160 on the minimum L2 norm. This threshold is chosen to contain the empirical distribution of matched patterns (true positives).
In the second variant (WIND_full), we do not use a threshold; rather, we choose the noise pattern within the group that has the lowest L2 as our candidate for the identified noise.
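The two decision rules might be sketched as follows (our own illustration with hypothetical tensor shapes; we assume here that a smaller L2 distance indicates a match and reuse the quoted value 160 as the cutoff — the paper's exact decision rule may differ):

```python
import numpy as np

def wind_full_identify(reconstructed, group_noises):
    """WIND_full: no threshold -- pick the candidate noise in the
    retrieved group with the lowest L2 distance to the reconstruction."""
    dists = np.array([np.linalg.norm(reconstructed - n) for n in group_noises])
    return int(np.argmin(dists))

def wind_fast_detect(reconstructed, group_noises, threshold=160.0):
    """WIND_fast: additionally require the best match to clear a
    threshold (assumed direction: match when the distance is below it)."""
    dists = np.array([np.linalg.norm(reconstructed - n) for n in group_noises])
    i = int(np.argmin(dists))
    return i if dists[i] < threshold else None

# Toy usage with 100 candidate noises of shape (4, 64, 64):
rng = np.random.default_rng(0)
group = rng.standard_normal((100, 4, 64, 64))
watermarked = group[7] + 0.5 * rng.standard_normal((4, 64, 64))
unrelated = rng.standard_normal((4, 64, 64))
print(wind_full_identify(watermarked, group))  # 7
print(wind_fast_detect(unrelated, group))      # None
```

In this toy setting, a perturbed copy of a group noise sits at distance roughly 64 from its source, while unrelated noises sit near 181, so a cutoff between the two cleanly separates watermarked from unwatermarked inputs.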
Thank you very much for your comments. We respectfully ask that if you feel more positive about our paper, to please consider updating your score. If not, please let us know what can be further improved; we are happy to continue the discussion any time until the end of the discussion period. Thank you!
[1] Wen, Yuxin, et al. "Tree-ring watermarks: Fingerprints for diffusion images that are invisible and robust." (2023).
[2] Ci, Hai, et al. "Ringid: Rethinking tree-ring watermarking for enhanced multi-key identification." European Conference on Computer Vision, 2025.
[3] Zhao, Xuandong, et al. "Invisible image watermarks are provably removable using generative ai." (2023).
[4] Jiang, Zhengyuan, Jinghuai Zhang, and Neil Zhenqiang Gong. "Evading watermark based detection of AI-generated content." 2023.
[5] Andriushchenko, Maksym, Francesco Croce, and Nicolas Flammarion. "Jailbreaking leading safety-aligned llms with simple adaptive attacks." (2024).
[6] Hu, Yuepeng, et al. "A Transfer Attack to Image Watermarks." (2024).
[7] Ilyas, Andrew, et al. "Black-box adversarial attacks with limited queries and information." International conference on machine learning. PMLR, 2018.
Thanks for the authors’ response, particularly regarding the choice of initial noises. While I find the evaluation of different watermarking methods not bad, it remains somewhat limited. Since this paper proposes a new watermarking method, the authors should more thoroughly consider other state-of-the-art approaches in addition to TreeRing. Nevertheless, most of my concerns have been partially addressed, so I will raise my score to 6.
Thank you for acknowledging our rebuttal and for taking the time to review our work!
We are glad that you have a better view of our work.
You mentioned that the paper should more thoroughly consider other state-of-the-art approaches in addition to TreeRing. To address this, we have added three additional baselines:
| Method | Keys | Clean | Rotate | JPEG | C&S | Blur | Noise | Bright | Avg |
|---|---|---|---|---|---|---|---|---|---|
| DwtDct | 1 | 0.974 | 0.596 | 0.492 | 0.640 | 0.503 | 0.293 | 0.519 | 0.574 |
| DwtDctSvd | 1 | 1.000 | 0.431 | 0.753 | 0.511 | 0.979 | 0.706 | 0.517 | 0.702 |
| RivaGan | 1 | 0.999 | 0.173 | 0.981 | 0.999 | 0.974 | 0.888 | 0.963 | 0.854 |
| Tree-Ring | 32 | 0.790 | 0.020 | 0.420 | 0.040 | 0.610 | 0.530 | 0.420 | 0.404 |
| Tree-Ring | 128 | 0.450 | 0.010 | 0.120 | 0.020 | 0.280 | 0.230 | 0.170 | 0.183 |
| Tree-Ring | 2048 | 0.200 | 0.000 | 0.040 | 0.000 | 0.090 | 0.070 | 0.060 | 0.066 |
| RingID | 32 | 1.000 | 1.000 | 1.000 | 0.530 | 0.990 | 1.000 | 0.960 | 0.926 |
| RingID | 128 | 1.000 | 0.980 | 1.000 | 0.280 | 0.980 | 1.000 | 0.940 | 0.883 |
| RingID | 2048 | 1.000 | 0.860 | 1.000 | 0.080 | 0.970 | 0.950 | 0.870 | 0.819 |
| WIND_fast_128 | 100000 | 1.000 | 0.780 | 1.000 | 0.470 | 1.000 | 1.000 | 0.960 | 0.887 |
| WIND_fast_2048 | 100000 | 1.000 | 0.870 | 0.960 | 0.060 | 0.960 | 0.950 | 0.900 | 0.814 |
| WIND_full_128 | 100000 | 1.000 | 0.780 | 1.000 | 0.850 | 1.000 | 1.000 | 1.000 | 0.947 |
| WIND_full_2048 | 100000 | 1.000 | 0.880 | 1.000 | 0.930 | 1.000 | 0.990 | 0.980 | 0.969 |
(Tab. 16 in the new revised version)
We will include additional baselines in the final version of the manuscript.
Thank you once again for your suggestions as well as your follow-up comment!
This paper presents an innovative watermarking method that uses the initial noise in diffusion models as a distortion-free watermark, showcasing its potential application in addressing societal challenges such as deepfakes. The two-stage watermarking framework proposed by the authors incorporates Fourier patterns during the generation process to enhance the robustness of the initial noise and effectively retrieves relevant noise groups during detection. Despite the significant contributions, this paper has several shortcomings. Firstly, the structure and logical flow of the paper could be improved, particularly in the abstract section. Additionally, some technical terms and concepts should be explained in more detail to aid reader comprehension. Furthermore, the depth of the results analysis could be enhanced, suggesting an exploration of the potential reasons for changes in results and their impact on the overall robustness of the method.
Strengths
- The two-stage framework proposed in this paper addresses the vulnerabilities of current watermarking techniques, effectively demonstrating the combination of deep learning and traditional watermarking methods. This innovation offers a fresh perspective in the watermarking field.
- The research provides specific solutions, emphasizing the robustness of the method against various attacks. This contribution not only enhances the practical applicability of the paper but also lays a valuable foundation for future research in the area.
Weaknesses
- The research motivation is unclear. In the Abstract section, the structure could be further optimized. There is a lack of smooth logical connection between the two paragraphs of the abstract. The first paragraph primarily discusses deepfakes and the limitations of current watermarking methods, but there is no clear transition when moving to the new framework in the second paragraph. This makes it difficult for readers to understand why the new method is necessary and how it relates to the issues discussed earlier.
- The interpretation of the experimental results is insufficient. Descriptions of the tables and figures are very clear, but a deeper analysis of these results could enhance the paper. For example, in Table 2, while the changes in similarity are displayed, it would be beneficial to explore the potential reasons behind these changes and how they impact the overall robustness of the method.
- In the experimental section, while images generated by your framework are presented, there is a lack of quantitative analysis of image quality.
- There is a lack of the performance across a broader range of inference steps beyond the 50-step setting currently used. Testing various inference step counts (e.g., 20, 50, 100, and 200 steps) can help determine how the step count affects model performance in terms of robustness.
Questions
See weakness.
Thank you very much for your detailed review. We appreciate that you found our method effective, our work innovative and offering a fresh perspective, and our contribution valuable and foundational for future research! We address each of your questions below:
“There is a lack of smooth logical connection between the two paragraphs of the abstract”
We thank the reviewer for raising this point. We added a sentence describing the source of some of the limitations of existing methods, and edited another sentence to keep consistency:
”... Yet, current state-of-the-art methods in image watermarking remain vulnerable to forgery and removal attacks. This vulnerability occurs in part because watermarks distort the distribution of generated images, unintentionally revealing information about the watermarking techniques.
In this work, we first demonstrate a distortion-free watermarking method for images, based on a diffusion model's initial noise. …”
We updated the revised manuscript accordingly.
“...a deeper analysis of these results could enhance the paper. For example, in Table 2, … it would be beneficial to explore the potential reasons behind these changes”
We thank the reviewer for this suggestion. We added an exploration of the changes in similarity when iteratively applying the regeneration attack reported in the paper [1]:
| Iteration | Cosine Similarity | Detection Rate |
|---|---|---|
| 10 | 0.493 | 100% |
| 20 | 0.342 | 100% |
| 30 | 0.243 | 100% |
| 40 | 0.170 | 100% |
| 50 | 0.121 | 100% |
Please see the revised manuscript (Fig. 6) to observe the effect of these attacks on the image quality.
We see that iterative regeneration indeed decreases the similarity between the original noise and the reconstructed one (Fig. 7 in the revised manuscript). This happens as the image becomes less and less correlated to the original generation.
Yet, the detection rate of our algorithm remains very high. We attribute this to the fact that even a slight remaining correlation between the attacked image and the initial noise is significant with respect to the correlation expected from non-watermarked images. This happens due to the very low correlation between random (unrelated) initial noises (Fig. 2).
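The point above — that even a small residual correlation is highly significant relative to the near-zero correlation of unrelated noises — can be illustrated numerically (a sketch under our own assumptions, using a hypothetical latent dimensionality):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4 * 64 * 64  # hypothetical latent-noise dimensionality

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Cosine similarity between unrelated standard Gaussian noises
# concentrates around 0 with standard deviation roughly 1/sqrt(d) ~ 0.008.
unrelated = np.array([cosine(rng.standard_normal(d), rng.standard_normal(d))
                      for _ in range(100)])
print(unrelated.std())        # roughly 0.008
print(max(abs(unrelated)))    # well below 0.05

# Even the 0.121 similarity measured after 50 regeneration iterations is
# therefore many standard deviations away from the unrelated-noise
# distribution, which is consistent with the 100% detection rate above.
```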
“there is a lack of quantitative analysis of image quality.”
To further assess the effect of WIND watermark on image quality we report the CLIP score [2] before and after watermarking.
| CLIP Before Watermark | CLIP After Watermark |
|---|---|
| 0.366 | 0.360 |
Results indicate that adding the watermark has a negligible effect on the CLIP score.
“There is a lack of performance across a broader range of inference steps beyond the 50-step setting currently used:
Please find below our performance evaluation across a large range of inference steps:
| Steps | Clean | Rotate | JPEG | C&S | Blur | Noise | Bright | Avg |
|---|---|---|---|---|---|---|---|---|
| 20 | 1.000 | 0.780 | 1.000 | 0.880 | 0.920 | 1.000 | 0.960 | 0.934 |
| 50 | 1.000 | 0.930 | 1.000 | 0.940 | 1.000 | 0.980 | 0.980 | 0.976 |
| 100 | 1.000 | 0.930 | 1.000 | 0.940 | 1.000 | 1.000 | 0.990 | 0.980 |
| 200 | 1.000 | 0.850 | 1.000 | 0.940 | 1.000 | 1.000 | 1.000 | 0.970 |
Thank you again for your excellent comments. We respectfully ask that, if you now feel more positively about our paper, you consider slightly increasing your score. We are happy to continue the discussion at any time until the end of the discussion period. Thank you!
[1] Zhao, Xuandong, et al. "Invisible image watermarks are provably removable using generative ai." (2023).
[2] Hessel, Jack, et al. "Clipscore: A reference-free evaluation metric for image captioning." (2021).
Dear Reviewer,
We sincerely appreciate your valuable feedback and the time you have dedicated to reviewing our submission. Your insights have been instrumental in shaping the final version of our submission.
We would like to kindly remind you that the discussion period is set to conclude on December 2nd. If there are any additional questions, concerns, or clarifications, we would be delighted to continue the discussion.
Thank you once again for your attention. We look forward to hearing from you!
Dear Reviewer,
As the deadline for uploading paper revisions is approaching, we wanted to check if our rebuttal has satisfactorily addressed your concerns. We have carefully considered and responded to all the points you raised in your review.
We would greatly appreciate it if you could provide your feedback on our rebuttal. This would allow us the opportunity to address any remaining questions you might have. Thus far, three out of six reviewers have engaged with us and updated their scores during the discussion period.
Thank you very much for your time and consideration.
- This paper introduces a novel distortion-free, diffusion-model-based watermarking method for images. The method is distortion-free as it employs the initial random noise already used by the model, whereas previous methods distort the distribution of generated images, which negatively affects their robustness.
- The method tackles both watermark removal attacks and forgery attacks.
- To make watermark detection more efficient, grouped noises are augmented during generation with distinct Fourier patterns/identifiers. During detection, the group identifier is recovered first and then the exact match within it is found, making detection much faster than comparing the noise to all previously used initial noises.
Strengths
originality and significance:
- The method presented by the paper is able to tackle both forgery and removal attacks unlike previous works.
- Shows that initial noise used by diffusion models can be a watermark and improves robustness.
quality and clarity: The paper is generally well written; it maintains a good flow of ideas, has very few typos, and explains concepts and background where necessary.
Weaknesses
- Quantitative results only cover Watermark Detection Accuracy in Table 1, but it is important to add other detection metrics such as TP/AUC for a stronger, more convincing evaluation.
- Also, no ablation on perturbation strengths vs detection accuracy.
- Section 5.2 Watermarking non-synthetic images: Issues with this section
- Quantitative evaluation: only evaluates FID, but does not touch upon other image similarity and quality metrics of watermarked images like CLIP score, SSIM, and PSNR.
- Qualitative: Lacks side by side comparison of watermarked images results from different models
- Experiments: details of how the experiments were performed are lacking. Ex: how was the detection threshold picked?
- Supplementary (nit, optional): would be good to see side-by-side comparisons of competing methods on runtime, or at least approximate numbers that demonstrate the factor by which this method is slower.
Questions
Please see the weaknesses section above.
Questions:
- In Section 5.2 Watermarking non-synthetic images: It does not seem like the image quality is actually close to the non-watermarked images - In images which contain text in Fig 5 and Fig 9, the watermarked images are not able to retain the text seen in the original non-watermarked image (ex: last row of images with traffic text directions and text on the aircraft). Are there ways in which this problem can be resolved/handled?
- It seems like this method relies on the (approximate) invertibility property of DDIM. If so, please mention this in Section 2.2. Fig 1 mentions it, but it would be nice to also add it in 2.2 for clarity if this is the case.
- Is there any other diffusion model besides SD v2 that this method was tried with?
Thank you for your detailed review. We appreciate that you recognized our improvement over previous works, and found our manuscript well-written. We address each of your questions below:
“it is important to add other detection metrics such as TP/AUC”
We thank you for your suggestion. Please find additional metrics below (N = 10000 and M = 2048):
| AUC | TP@1% |
|---|---|
| 0.971 | 1.000 |
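For readers interested in how such metrics are obtained, AUC and TPR@1%FPR can be computed from detector scores roughly as follows (a minimal sketch with synthetic scores; the paper's actual scores come from its noise-matching distances):

```python
import numpy as np

# Synthetic detector scores: higher = more likely watermarked.
rng = np.random.default_rng(0)
pos = rng.normal(3.0, 1.0, 1000)  # scores for watermarked images
neg = rng.normal(0.0, 1.0, 1000)  # scores for non-watermarked images

# AUC via the rank (Mann-Whitney) statistic: P(score_pos > score_neg).
auc = float((pos[:, None] > neg[None, :]).mean())

# TPR at 1% FPR: threshold at the 99th percentile of the negative scores.
threshold = np.quantile(neg, 0.99)
tpr_at_1_fpr = float((pos > threshold).mean())
print(auc, tpr_at_1_fpr)
```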
“no ablation on perturbation strengths vs detection accuracy”
We thank the reviewer for this suggestion. We explored the regeneration attack strength across a range of values:
| Iteration | Cosine Similarity | Detection Rate |
|---|---|---|
| 10 | 0.493 | 100% |
| 20 | 0.342 | 100% |
| 30 | 0.243 | 100% |
| 40 | 0.170 | 100% |
| 50 | 0.121 | 100% |
Please see the revised manuscript (Fig. 6) to observe the effect of these attacks on the image quality.
We see that iterative regeneration indeed decreases the similarity between the original noise and the reconstructed one (Fig. 7 in the revised manuscript). This happens as the image becomes less and less correlated to the original generation.
Yet, the detection rate of our algorithm remains very high. We attribute this to the fact that even a slight remaining correlation between the attacked image and the initial noise remains significant with respect to the correlation expected from non-watermarked images. This happens because of the very low correlation between random (non-watermarked) noises (Fig. 2).
“Only evaluates FID, but does not touch upon more image similarity and quality metrics”
To further assess the effect of WIND watermark on image quality we report the CLIP score [6] before and after watermarking.
| CLIP Before Watermark | CLIP After Watermark |
|---|---|
| 0.366 | 0.360 |
Results indicate that adding the watermark has a negligible effect on the CLIP score.
“Lacks side by side comparison of watermarked images results from different model”
In addition to the results we already had (Fig. 4), we added additional side-by-side results per the reviewer's request (please see Fig. 12,13,14,15,16,17 in the revised manuscript).
“Experiments: Details of how the experiments were performed are lacking. Ex: how was the detection threshold picked?”
We thank the reviewer for pointing out that the details are missing from the manuscript.
In the first variant (WIND_fast) (Sec. 4.1) we use a threshold of min l2_norm > 160. This threshold is chosen to contain the empirical distribution of matched patterns (true positives).
In the second variant (WIND_full) we do not use a threshold; instead, we choose the noise pattern within the group with the lowest l2 as our candidate for the identified noise.
The list of used prompts for our evaluation is taken from here: https://huggingface.co/datasets/Gustavosta/Stable-Diffusion-Prompts
Additionally, the full implementation of the method is available in the anonymized repository.
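The two detection variants described above can be sketched in a toy form as follows (a simplified illustration with made-up shapes and plain additive group patterns standing in for the Fourier identifiers; this is not the paper's implementation, and the 160 threshold applies to the real latent scale, not this toy):

```python
import numpy as np

# Toy two-stage detection: N noises split into M groups, each noise carrying
# an additive group pattern (a stand-in for the Fourier group identifiers).
rng = np.random.default_rng(0)
N, M, d = 1024, 32, 256
base = rng.standard_normal((N, d))
patterns = rng.standard_normal((M, d))
group_of = np.arange(N) % M
stored = base + patterns[group_of]  # initial noises with embedded group id

def detect(x):
    # Stage 1: recover the group identifier (O(M) comparisons).
    g = int(np.argmax(patterns @ x))
    # Stage 2: lowest-l2 match within that group only (~N/M candidates),
    # mirroring the "lowest l2 within the group" rule described above.
    members = np.where(group_of == g)[0]
    j = members[int(np.argmin(np.linalg.norm(stored[members] - x, axis=1)))]
    return g, int(j)

# A noisy reconstruction of noise 123 is still matched correctly.
g, j = detect(stored[123] + 0.3 * rng.standard_normal(d))
print(g, j)
```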
“Would be good to see side-by-side comparisons of competing methods on runtime”
Please find below a runtime comparison for a single detection (seconds):
| WIND | Tree-Ring | RingID |
|---|---|---|
| 22 | 20 | 14 |
“the watermarked images are not able to retain the text seen in the original non-watermarked … Are there ways in which this problem can be resolved/handled?”
Yes. The distortion for the images is caused by adding our group identifier watermark. There are in fact two options to mitigate it:
(i) We can use other existing watermark techniques that do not affect the generation quality as group identifiers [1]. While the robustness of the fast (WIND_Fast) method might decrease in this case, the robustness of the full method will not be affected.
(ii) We can use the full method (WIND_Full) without group identifiers directly, and the watermark will be distortion-free (the image quality will be identical to that of non-watermarked images).
“... relies on the (approximate) invertibility property of DDIM… please mention this in section 2.2. Fig 1 … also add it in 2.2 for clarity if this is the case.”
As suggested, we now emphasize our reliance on the invertibility property of DDIM in the mentioned sections. Please see the revised manuscript.
“Is there any other diffusion model besides SD v2 that this method was tried with?”
Per the reviewer's request, we validated our method on images generated by another model (SD 1.4 [2]). Our watermarking method detects the correct watermark (initial noise) among 10000 noises at a rate of 97%.
Yet, in practice, our method can also be used to watermark non-synthetic images and images generated by other models using the inpainting technique (Sec 4.3). Therefore, it is generally applicable, even for models that do not have this inversion property.
Thank you very much, once again, for your excellent comments. We respectfully ask that if you feel more positive about our paper, please consider updating your score. If not, please let us know what can be further improved; we are happy to continue the discussion any time until the end of the discussion period. Thank you!
[1] Chadha, Ankit, and Neha Satam. "An efficient method for image and audio steganography using Least Significant Bit (LSB) substitution." (2013).
[2] Rombach, Robin, et al. "High-resolution image synthesis with latent diffusion models." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
Thank you to the authors for addressing the majority of the concerns and questions - I have raised the score to 6. The primary concern is that the image quality post watermark addition is not well retained, as evidenced by images containing text. There is not sufficient quantitative image quality evaluation to make a strong case, nor qualitative comparisons against other methods on examples clearly showing that this method is superior in maintaining image quality while keeping watermark detection performance high.
Thank you for acknowledging our rebuttal and for taking the time to review our work! We are glad that the majority of your concerns were addressed. We would like to address your remaining concern better.
Our paper explores two settings:
(i) Watermarking of images generated by the diffusion model.
The image quality generated using our full method is comparable to that of previous techniques. Users who wish to generate distortion-free images, without affecting image quality, can do so by using WIND_w/o, i.e., omitting the group identifier (at the cost of a slower detection phase for very large values of N).
We include a more comprehensive evaluation of image quality below:
| Method | SSIM | PSNR |
|---|---|---|
| WIND_w/o | 1.000 | - |
| WIND_full | 0.494 | 14.647 |
| RingID | 0.454 | 13.560 |
| Tree-Ring | 0.545 | 15.251 |
(For WIND_w/o, PSNR diverges, as it is the distortion-free case)
Both variants (WIND_w/o and WIND_full) are less vulnerable to forging and removal attacks such as [1].
(ii) Watermarking of non-synthetic images:
In this case, our method affects the image quality as it inpaints a part of the image (as noted by the reviewer regarding Fig. 5, 11 in the revised manuscript). Although other watermarking methods may preserve image quality better, our image quality remains high:
| Method | SSIM | PSNR |
|---|---|---|
| WIND_inpainting | 0.768 | 26.806 |
| DwtDctSvd | 0.983 | 39.381 |
| RivaGAN | 0.978 | 40.550 |
| SSL | 0.984 | 41.795 |
| StegaStamp | 0.911 | 28.503 |
Importantly, to the best of our knowledge, our approach is the only one capable of watermarking non-synthetic images while remaining robust against the regeneration attack [2]. Therefore, it is preferable when an adversary may try to remove the watermark.
In addition, the inpainting technique can be applied selectively to specific parts of the image if the copyright owner wishes to perfectly preserve fine details in certain areas.
We added these additional results to the revised manuscript.
Thank you once again for your suggestions for our work as well as your follow-up comment!
[1] Yang, Pei, et al. "Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious?." (2024).
[2] Zhao, Xuandong, et al. "Invisible image watermarks are provably removable using generative ai." (2023).
This paper investigates reverse vulnerabilities in diffusion watermarking methods to forgery and removal attacks. Specifically, the authors propose WIND, a two-stage watermarking framework that first embeds a unique group identifier within the initial noise of a diffusion model using Fourier patterns and then searches the reconstructed noise within the noise group to identify the match. Empirical results show that WIND can achieve better detection performance and robustness against various attacks.
Strengths
(1) The paper is organized by some theoretical analysis and empirical evaluations. Furthermore, the motivations and methodology are clearly stated overall.
(2) The work leverages the inherent initial noise of diffusion models as a watermark without external watermarking processes that might degrade image quality. It is innovative that grouped noise patterns are used with Fourier-based group identifiers, which is effective and robust.
(3) The search efficiency relies heavily on the number of watermarks N and the noise group size M, which is partially addressed by proposing a faster detection variant.
(4) The effectiveness of the proposed watermarking generalizes to the inpainting task.
Weaknesses
(1) The evaluations of the proposed approach are limited, where only one diffusion model is investigated. Moreover, some experimental settings are missing.
(2) Proof of Theorem 4.1 seems not entirely convincing. Although it is stated as a mathematical result, the actual evidence is more empirical rather than rigorous. For example, the paper’s results show low false positives, but the proof does not provide a probability bound for these occurrences.
More details can be seen in Questions.
Questions
(1) The authors only evaluate the method on Stable Diffusion v2. The generalization is unknown to other diffusion models, such as Stable Diffusion XL, consistency models and transformer-based diffusion architectures. Moreover, it is unclear how detection would perform at higher generation resolutions. And what is the prompt template for the generation?
(2) Can WIND be adapted to GAN-based generative models when DDIM inversion is replaced by other reconstruction approaches?
(3) The inversion step could affect both reconstruction speed and performance. The trade-off between the inversion step, watermarking overhead, and detection accuracy is not discussed.
(4) It could be better to include group index embedding in Algorithm 1.
Thank you for your detailed review. We appreciate that you found our work innovative, effective, and robust. We address each of your questions below:
“only one diffusion model is investigated”, “how detection would perform at higher generation resolutions”
We thank the reviewer for raising this point. We are happy to report that the diffusion model we investigated is already enough to watermark images from other sources successfully. We add robustness evaluation for our inpainting method, applicable to images from any source:
| Clean | Rotate | JPEG | C&S | Blur | Noise | Bright | Avg |
|---|---|---|---|---|---|---|---|
| 1.000 | 1.000 | 1.000 | 0.880 | 1.000 | 0.950 | 0.950 | 0.969 |
We also expect our watermark to be directly effective for any model for which approximate inversion to the original noise is possible. Namely, since the correlation between random noises in very high dimension concentrates around 0, even very slight success in the inversion process is enough to be distinguishable. At higher generation resolutions, the dimensionality of the noise is even higher, so the separation would be even better [1].
Per the reviewer's request, we also investigated an additional model (SD 1.4 [2]). Our watermarking method detects the correct watermark (initial noise) among 10000 noises with a success rate of 97%.
We add this discussion to the revised manuscript.
“some experimental settings are missing”, “what is the prompt template for the generation?”
We thank the reviewer for these comments.
We list below the experimental details we think the reviewer might find missing:
The list of used prompts for our evaluation is taken from here: https://huggingface.co/datasets/Gustavosta/Stable-Diffusion-Prompts
The threshold for detection: For the first variant (WIND_fast) (Sec. 4.1) we use a threshold of min l2_norm > 160.
The second variant (WIND_full) does not use a threshold; instead, we choose the noise pattern within the group with the lowest l2 as our candidate for the identified noise.
Apart from that we re-validated that all the experimental details are included and available in the project codebase.
We also added these details to the revised manuscript.
“Proof of Theorem 4.1 seems not entirely convincing.”
The WIND method is an approach for generating multiple watermarked images. Theorem 4.1 tells us that compromising one or more watermarked images does not give away any information about any other watermarked images. E.g., the adversary cannot "generate valid reconstructed noise for any other initial noise index ". (Theorem 4.1 follows from the use of a cryptographic hash function, as described in App. E.) That said, Theorem 4.1 does leave open the possibility that an adversary can take a watermarked image, reconstruct the initial noise only for that image, and use it to attack the method. We evaluate this option empirically (Fig. 2,3).
We added a clarification to the revised manuscript.
“Can WIND be adapted to GAN-based generative models when DDIM inversion is replaced by other reconstruction approaches?”
Yes. GAN-based methods also have a random input that controls the semantic properties of a generation. Therefore, one may use inversion to reconstruct this input and compare it to a key embedded during inference [3,4,5]. In practice, even for GAN-generated images, we recommend using the inpainting method we report, as it supplies robustness to various attacks, as we report below:
| Clean | Rotate | JPEG | C&S | Blur | Noise | Bright | Avg |
|---|---|---|---|---|---|---|---|
| 1.000 | 1.000 | 1.000 | 0.880 | 1.000 | 0.950 | 0.950 | 0.969 |
“...The trade-off between the inversion step, watermarking overhead, and detection accuracy is not discussed.”
First, we would like to emphasize that our inversion step, while costly, is only needed when one wants to detect whether an image is watermarked. There is virtually no overhead during generation.
The inversion step during detection does impose an overhead. It takes seconds on a single NVIDIA GeForce RTX 3090. We believe this time is reasonable for real watermarking use cases such as validating legal claims [6,7], and it is also incurred by prior methods [8,9].
Regarding the search part of our algorithm over the inverted noise, we allow the following tradeoff:
A. Detection of the group identifier alone. This operation takes a search of O(M) but is somewhat vulnerable to both removal and forgery attempts.
B. Detection of the Fourier pattern, followed by a validation of the exact initial noise within the group. This operation takes an O(N/M) search. It is vulnerable to removal attempts, but more resilient to forgery attempts. (please see Tab. 1)
C. Exhaustive search of the initial noise, also outside the identified group. This operation takes an O(N) search. It is more resilient to both removal and forgery attempts (See Tab. 1, Fig. 3).
Practically, the nearest neighbor search can be accelerated using many methods [10,11] and can be scaled to tens of millions without significantly affecting the detection time.
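As a back-of-the-envelope illustration (our own, with an assumed N), the grouped search of option B touches roughly M + N/M candidates instead of N, a sum minimized around M = sqrt(N):

```python
import math

N = 10_000_000  # assumed number of issued noises (not a figure from the paper)

def comparisons(M):
    # Stage 1 scans M group patterns; stage 2 scans ~N/M in-group noises.
    return M + N / M

M_opt = math.isqrt(N)  # M ~ sqrt(N) minimizes M + N/M
print(comparisons(M_opt), "vs flat scan of", N)
```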
We added clarification to the revised manuscript.
“It could be better to include group index embedding in Algorithm 1.”
We thank the reviewer for raising this clarifying point. We added it to the revised manuscript.
Thank you very much, once again, for your excellent comments. We respectfully ask that if you feel more positive about our paper, to please consider updating your score. If not, please let us know what can be further improved; we are happy to continue the discussion any time until the end of the discussion period. Thank you!
[1] El Karoui, Noureddine. "Concentration of measure and spectra of random matrices: Applications to correlation matrices, elliptical distributions and beyond." (2009).
[2] Rombach, Robin, et al. "High-resolution image synthesis with latent diffusion models." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
[3] Dinh, Tan M., et al. "Hyperinverter: Improving stylegan inversion via hypernetwork." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
[4] Wang, Tengfei, et al. "High-fidelity gan inversion for image attribute editing." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
[5] Hu, Xueqi, et al. "Style transformer for image inversion and editing." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
[6] Sandoval, Maria-Paz, et al. "Threat of deepfakes to the criminal justice system: a systematic review." (2024).
[7] Meskys, Edvinas, et al. "Regulating deep fakes: legal and ethical considerations." Journal of Intellectual Property Law & Practice (2020).
[8] Wen, Yuxin, et al. "Tree-ring watermarks: Fingerprints for diffusion images that are invisible and robust." (2023).
[9] Ci, Hai, et al. "Ringid: Rethinking tree-ring watermarking for enhanced multi-key identification." European Conference on Computer Vision. 2025.
[10] Chen, Qi, et al. "Spann: Highly-efficient billion-scale approximate nearest neighborhood search." Advances in Neural Information Processing Systems (2021).
[11] Andoni, Alexandr, Piotr Indyk, and Ilya Razenshteyn. "Approximate nearest neighbor search in high dimensions." Proceedings of the International Congress of Mathematicians. 2018.
Thanks to the authors for the rebuttal. The authors have largely addressed my concerns about the theoretical proof, training and evaluation efficiency, and generalization to other models. However, while additional experiments were conducted on SD-v1.4, it remains unknown how the proposed method performs on consistency models or transformer-based diffusion models. Moreover, it would be better if the authors showed some experimental results to support their claims on higher-resolution generations. After considering the contributions and the responses to other reviewers, I increase my rating to 6.
Dear Reviewer,
We sincerely appreciate your valuable feedback and the time you have dedicated to reviewing our submission. Your insights have been instrumental in shaping the final version of our submission.
We would like to kindly remind you that the discussion period is set to conclude on December 2nd. If there are any additional questions, concerns, or clarifications, we would be delighted to continue the discussion.
Thank you once again for your attention. We look forward to hearing from you!
Dear Reviewer,
As the deadline for uploading paper revisions is approaching, we wanted to check if our rebuttal has satisfactorily addressed your concerns. We have carefully considered and responded to all the points you raised in your review.
We would greatly appreciate it if you could provide your feedback on our rebuttal. This would allow us the opportunity to address any remaining questions you might have. Thus far, three out of six reviewers have engaged with us and updated their scores during the discussion period.
Thank you very much for your time and consideration.
Thank you for acknowledging our rebuttal and for taking the time to review our work. We are pleased that the reviewer found many of their concerns addressed.
We would like to emphasize that our method is already applicable to images generated by other sources, as demonstrated through our watermark inpainting technique (Sec. 4.3). Nevertheless, we agree that directly applying our method to additional models is interesting, and we will explore the results on such models.
Thank you once again for your valuable suggestions for our work!
We thank all the reviewers for their valuable feedback. Our work introduces WIND, a two-stage watermarking method for diffusion models, using the initial noise as a distortion-free watermark that is robust to removal and forgery attempts. We appreciate that the reviewers found our techniques innovative and novel (YVZd, hApF, jnGF), acknowledged the effectiveness and strong results of our method (YVZd, QcSJ, 7vcU, XGAj, jnGF, hApF), recognized our paper as clear and well written (7vcU, YVZd), and even foundational for future research in the area (QcSJ). Following your suggestions, we highlight further improvements:
(A) We extend the evaluation of our method's applicability to images generated by other models. We find that WIND can also supply strong robustness to images not generated by DDIMs while maintaining image quality.
(B) We validate the robustness of our method in additional settings, including new attack types and parameter regimes.
(C) Additional qualitative results on the image quality produced by our method and additional quantitative results.
Answers to individual reviewer concerns are detailed below. We would be very happy to keep the discussion going, addressing any points that remain unclear, or any new suggestions. Thanks again for your suggestions!
(A) Applicability to images generated by other models
Many reviewers asked about the applicability of our method to other types of models. While our technique relies on inversion to approximate the initial noise, we can still use our inpainting capability (please see Sec. 4.3) to watermark images generated by other models.
We report below the robustness of this watermark using a set of 100 noises:
| Clean | Rotate | JPEG | C&S | Blur | Noise | Bright | Avg |
|---|---|---|---|---|---|---|---|
| 1.000 | 1.000 | 1.000 | 0.880 | 1.000 | 0.950 | 0.950 | 0.969 |
And the image quality it gives:
| Method | FID |
|---|---|
| DwtDctSvd | 25.01 |
| RivaGAN | 24.51 |
| Tree-Ring | 25.93 |
| RingID | 26.13 |
| WIND | 24.33 |
This allows our method to watermark images generated by other model types, as well as natural images.
(B) New attack types and new parameter regimes
We extend our results to explore new attack types and other parameter regimes (all these experiments were done with N = 10000 and M = 2048), including:
1. We explore a different number of inference steps:
| Steps | Clean | Rotate | JPEG | C&S | Blur | Noise | Bright | Avg |
|---|---|---|---|---|---|---|---|---|
| 20 | 1.000 | 0.780 | 1.000 | 0.880 | 0.920 | 1.000 | 0.960 | 0.934 |
| 50 | 1.000 | 0.930 | 1.000 | 0.940 | 1.000 | 0.980 | 0.980 | 0.976 |
| 100 | 1.000 | 0.930 | 1.000 | 0.940 | 1.000 | 1.000 | 0.990 | 0.980 |
| 200 | 1.000 | 0.850 | 1.000 | 0.940 | 1.000 | 1.000 | 1.000 | 0.970 |
2. We report the success rate of additional attacks, including transfer-based, query-based, and white-box methods. Specifically, we employ the WeVade white-box attack [1], the transfer attack described in [3], a black-box attack utilizing NES queries [4], and a random-search approach discussed in [2], adapted to attempt watermark removal. The success rates of these attacks are detailed in the table below:
| WeVade | Random Search | Transfer Attack | NES Query |
|---|---|---|---|
| 1% | 2% | 3% | 2% |
3. We evaluate against a stronger (more iterations) regeneration attack, iteratively applying the attack we reported in the paper [5].
| Iteration | Cosine Similarity | Detection Rate |
|---|---|---|
| 10 | 0.493 | 100% |
| 20 | 0.342 | 100% |
| 30 | 0.243 | 100% |
| 40 | 0.170 | 100% |
| 50 | 0.121 | 100% |
(C) Additional quantitative and qualitative results on the image quality produced by our method
1. We add quantitative results for the preservation of image quality by our method.
To further assess the effect of the WIND watermark on image quality, we report the CLIP score [6] before and after watermarking (N = 10000 and M = 2048).
| CLIP Before Watermark | CLIP After Watermark |
|---|---|
| 0.366 | 0.360 |
Results indicate that adding the watermark has a negligible effect on the CLIP score.
Additional qualitative comparisons can be found in Fig. 12,13,14,15,16,17 in the revised manuscript.
2. We evaluate the AUC and True Positive Rate (TPR@1%FPR):
| AUC | TP@1% |
|---|---|
| 0.971 | 1.000 |
All the above results, as well as answers to specific reviewer concerns, are also added to the revised version of the manuscript.
[1] Jiang, Zhengyuan, Jinghuai Zhang, and Neil Zhenqiang Gong. "Evading watermark based detection of AI-generated content." Proceedings of the 2023 ACM SIGSAC. 2023.
[2] Andriushchenko, Maksym, Francesco Croce, and Nicolas Flammarion. "Jailbreaking leading safety-aligned llms with simple adaptive attacks." arXiv (2024).
[3] Hu, Yuepeng, et al. "A Transfer Attack to Image Watermarks." (2024).
[4] Ilyas, Andrew, et al. "Black-box adversarial attacks with limited queries and information." International conference on machine learning. PMLR, 2018.
[5] Zhao, Xuandong, et al. "Invisible image watermarks are provably removable using generative ai." arXiv (2023).
[6] Hessel, Jack, et al. "Clipscore: A reference-free evaluation metric for image captioning." arXiv (2021).
Hi Reviewers,
We are approaching the deadline for the author-reviewer discussion phase. The authors have already provided their rebuttal. In case you haven't checked it, please look at it ASAP. Thanks a million for your help!
We thank all reviewers for their time and effort invested in reviewing our paper. Below, we provide a summary of the main updates we made:
- Revised the introduction to give a more comprehensive overview of prior works and their relation to our method
- Added results demonstrating our robustness to additional attacks
- Extended the evaluation of our watermarking technique when applied to images not generated by a diffusion model
- Incorporated additional metrics to assess image quality
Once again, we sincerely thank the area chair and all reviewers!
This paper works on image watermarking. The authors proposed a distortion-free watermarking method for images based on a diffusion model's initial noise, along with a two-stage watermarking framework for efficient detection. The authors showed that the proposed method achieves SOTA robustness to forgery and removal against a large set of attacks.
This paper was reviewed by 6 reviewers and received mixed scores: five 6s and one 5.
Strengths and weaknesses given by reviewers before the rebuttal are as follows (note that different reviewers have different perspectives on the paper, so conflicts between strengths and weaknesses may occur):
Strength: 1) paper is well written; 2) proposed method is novel, effective and robust; 3) lays a valuable foundation for future research;
Weaknesses: 1) evaluation is limited; 2) Proof of Theorem 4.1 seems not entirely convincing; 3) research motivation is unclear; 4) The interpretation of the experimental results is insufficient; 5) lack of quantitative analysis of image quality; 6) lack of the performance across a broader range of inference steps; 7) important to add other detection metrics such as TP/AUC etc for a stronger evaluation and to be more convincing; 8) Details of how the experiments were performed are lacking; 9) The idea of picking the initial noise for diffusion is not new; 10) The evaluation of robustness if not comprehensive; 11) Potential vulnerability to more advanced attacks; 12) does not introduce a fundamentally new approach to noise-based watermarking; 12) The multi-stage approach, while innovative, introduces complexity that may hinder ease of adoption; 13) Some parts of the paper are overclaimed; 14) The problem of the previous method and the motivation of this work are not clearly delivered; 15) The effectiveness of the proposed method may be limited to specific types of diffusion models, and its performance on other diffusion models remains untested; 16) large number of initial noises could still pose challenges in real-time applications;
During the author-reviewer discussion phase:
Reviewer YVZd (rating 6) mentioned that their concerns on the theoretical proof, training and evaluation efficiency, and generalization to other models were addressed, though higher-resolution application and how the proposed method performs on consistency models or transformer-based diffusion models remain unknown. The reviewer nevertheless increased their rating to 6.
Reviewer QcSJ (rating 6) didn't reply during the rebuttal.
Reviewer 7vcU (rating 6) raised the score to 6, but still has concerns about image quality.
Reviewer XGAj (rating 6) said most of their concerns were partially addressed; while they find the evaluation of different watermarking methods not bad, it remains somewhat limited. They nevertheless increased their rating to 6.
Reviewer jnGF (rating 5) didn't reply in the discussion.
Reviewer hApF (rating 6) said that the current version is much clearer; they acknowledge the novelty and insights of this work and increased their rating to 6.
Reviewers didn't give any comments during the reviewer-AC discussion phase.
Given that five reviewers are positive about this paper, the only reviewer with a rating of 5 did not take part in the discussion, and the AC checked that their concerns were addressed, the AC decided to accept this paper.
Additional comments on reviewer discussion
Accept (Poster)