Residual Denoising Diffusion Models

Jiawei Liu, Qiang Wang, Huijie Fan, Yinong Wang, Yandong Tang, Liangqiong Qu

OpenReview PDF

Submitted: 2023-09-19, Updated: 2024-03-26

TL;DR

To unify image generation and restoration, we introduce residuals into diffusion models and propose a directional residual diffusion process with perturbations, allowing the target image to diffuse into a purely noisy or a noise-carrying input image.

Abstract

We propose residual denoising diffusion models (RDDM), a novel dual diffusion process that decouples the traditional single denoising diffusion process into residual diffusion and noise diffusion. This dual diffusion framework expands the denoising-based diffusion models, initially uninterpretable for image restoration, into a unified and interpretable model for both image generation and restoration by introducing residuals. Specifically, our residual diffusion represents directional diffusion from the target image to the degraded input image and explicitly guides the reverse generation process for image restoration, while noise diffusion represents random perturbations in the diffusion process. The residual prioritizes certainty, while the noise emphasizes diversity, enabling RDDM to effectively unify tasks with varying certainty or diversity requirements, such as image generation and restoration. We demonstrate that our sampling process is consistent with that of DDPM and DDIM through coefficient transformation, and propose a partially path-independent generation process to better understand the reverse process. Notably, our RDDM enables a generic UNet, trained with only an $\ell _1$ loss and a batch size of 1, to compete with state-of-the-art image restoration methods. We provide code and pre-trained models to encourage further exploration, application, and development of our innovative framework.
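The two diffusion terms described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a closed-form forward sample of the form $I_t = I_0 + \bar{\alpha}_t I_{res} + \bar{\beta}_t \epsilon$ with $I_{res} = I_{in} - I_0$, and a coefficient mapping $\bar{\alpha}_t = 1 - \sqrt{\bar{\alpha}_t^{DDPM}}$, $\bar{\beta}_t = \sqrt{1 - \bar{\alpha}_t^{DDPM}}$. This form, the function names `rddm_forward` and `ddpm_to_rddm`, and the numeric values are assumptions made for illustration, not the authors' released code. Under these assumptions, setting $I_{in} = 0$ (the generation case) reproduces the usual DDPM forward sample.

```python
import numpy as np

def rddm_forward(i0, i_in, alpha_bar_t, beta_bar_t, rng):
    """Sample I_t = I_0 + alpha_bar_t * I_res + beta_bar_t * eps (assumed form)."""
    i_res = i_in - i0                    # residual: degraded input minus target
    eps = rng.standard_normal(i0.shape)  # random perturbation (noise diffusion)
    return i0 + alpha_bar_t * i_res + beta_bar_t * eps, eps

def ddpm_to_rddm(alpha_bar_ddpm):
    """Map a DDPM cumulative product to the assumed residual/noise coefficients."""
    alpha_bar_ddpm = np.asarray(alpha_bar_ddpm, dtype=float)
    return 1.0 - np.sqrt(alpha_bar_ddpm), np.sqrt(1.0 - alpha_bar_ddpm)

rng = np.random.default_rng(0)
i0 = rng.random((4, 4))       # stand-in for a target image
i_in = np.zeros_like(i0)      # generation case: the input carries no image content

a_ddpm = 0.37                 # arbitrary illustrative DDPM cumulative product
a_bar, b_bar = ddpm_to_rddm(a_ddpm)
i_t, eps = rddm_forward(i0, i_in, a_bar, b_bar, rng)

# The same noise plugged into the standard DDPM forward sample.
ddpm_sample = np.sqrt(a_ddpm) * i0 + np.sqrt(1.0 - a_ddpm) * eps
print(np.allclose(i_t, ddpm_sample))
```

For restoration, the same function would be called with `i_in` set to the degraded image; with `alpha_bar_t = 1` and `beta_bar_t = 0` the sample equals `i_in` exactly, which is the noise-free endpoint of the residual branch.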

Keywords

Diffusion Models, Restoration, Generation, Dual Diffusion, Residual, Path-independent, Decoupled Diffusion

Reviews and Discussion

Review

Rating: 6, Confidence: 4, 2023-10-17

The paper introduces Residual Denoising Diffusion Models (RDDM), a dual diffusion process. The proposed RDDM decouples the diffusion process into residual diffusion and noise diffusion, which can unify the image restoration and generation process. The authors demonstrate the consistency of RDDM with the diffusion models DDPM and DDIM by transforming coefficient schedules. Additionally, they propose a partially path-independent generation process. Experiments demonstrate the effectiveness of RDDM.

Strengths

The authors propose RDDM as a unified framework for image restoration and generation. It offers a versatile approach to these related tasks.
The authors demonstrate the consistency of RDDM with DDPM/DDIM through coefficient schedule transformations.
The proposed partially path-independent generation process decouples residuals and noise, and reasonably explains the role of the two branches.
They provide the code, which supports the soundness of the work.

Weaknesses

The paper lacks perceptual metrics or a detailed comparison for some tasks, such as Low-light and Deraining, where RDDM performs better regarding PSNR and SSIM. It would be beneficial to provide more comprehensive comparisons, especially using perceptual metrics. Additionally, the paper evaluates the performance at 5 steps for shadow removal and deraining, and 2 steps for low-light and deblurring. Since step numbers affect performance, it is recommended to analyze the impact of different numbers of steps on performance.

Questions

In Table 1, RDDM performs better in different strategies for two restoration tasks, Low-light (LOL) and Deraining (RainDrop). It would be beneficial to explain this phenomenon.
Provide more comparisons in terms of different metrics and evaluate the model's performance with more steps in Low-light and Deraining.

Review

Rating: 5, Confidence: 4, 2023-10-17

This paper proposes residual denoising diffusion models (RDDM), which decouples the traditional single denoising diffusion process into residual diffusion and noise diffusion.

Strengths

This paper compares various tasks.

Weaknesses

* The data provided for inpainting tasks is limited. I would like the authors to conduct quantitative comparisons, especially on the CelebA and Places2 datasets, and compare with state-of-the-art diffusion models (2,3), including considerations of efficiency and parameter count.

* Residual diffusion has been explored and discussed extensively in the domain of image restoration (1,2). These works should be discussed in detail and compared against.

* Please provide a comparison of the computational complexity, runtime, and parameter count for the methods being compared.

* For the deraining task in Table 3, please compare with Restormer (4) on several standard datasets, including comparisons of runtime efficiency and parameter count.

(1) Srdiff: Single image super-resolution with diffusion probabilistic models
(2) Diffir: Efficient diffusion model for image restoration
(3) Repaint: Inpainting using denoising diffusion probabilistic models
(4) Restormer: Efficient transformer for high-resolution image restoration

Questions

See the weaknesses above.

Review

Rating: 5, Confidence: 5, 2023-10-28

This paper proposes a unified model for image generation and restoration based on the concept of residuals. The proposed method is compatible with different diffusion models and sampling strategies. It can also be extended to image restoration tasks.

Strengths

This paper unifies image generation and restoration in one framework.
This paper demonstrates its effectiveness on various image restoration tasks.
This paper is well written and easy to understand.

Weaknesses

The novelty is limited. The method seems more like a combination of two existing components. The concept of residuals is not rare in diffusion models: several previous works employ the same idea in image restoration tasks [1][2]. Method [1] already introduced the residual concept in restoration tasks and verified its effectiveness. Besides, this paper does not cite these methods.
This method combines image generation and restoration only mathematically. In the experiments, it employs pretrained diffusion models for image generation with only a coefficient transformation, whereas for image restoration the diffusion model must be retrained for each restoration task.

[1] Image restoration with mean-reverting stochastic differential equations
[2] Resshift: Efficient diffusion model for image super-resolution by residual shifting

Questions

Refer to the weaknesses and questions mentioned above.

Review

Rating: 5, Confidence: 4, 2023-11-02

The authors propose a novel framework for diffusion models named residual denoising diffusion models (RDDM), which is used for image restoration and image generation. RDDM decouples the diffusion framework into residual diffusion and noise diffusion. The former represents directed diffusion and the latter represents randomness in the diffusion process. Qualitative and quantitative experiments showed the superiority of the method.

Strengths

The results in the paper show significant improvement in image generation and image restoration.
The proposed architecture offers a new perspective on the interpretability of diffusion models, and more accurate results can be obtained with fewer sampling steps.

Weaknesses

Sampling speed in the reverse process is an important factor in the practicality of diffusion models. Compared with DDPM, DDIM, and other diffusion models, does RDDM have an advantage in sampling speed?
In Eq. 10 (reverse process), the authors set $\eta$ to 0, which corresponds to a deterministic generation process. What is the advantage of setting $\eta$ to 0?
The authors minimize the loss by minimizing an upper bound of Eq. 25, i.e., by minimizing Eq. 26. However, the derivation from Eq. 25 to Eq. 26 does not appear to explicitly account for the relationship between $\alpha_t$ and $\frac{\beta_t^2}{\overline{\beta}}$, and only $\lambda_{res}$ and $\lambda_{\varepsilon}$ are used as coefficients of the loss function. How can the authors ensure that Eq. 26 is indeed an upper bound for Eq. 25?
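As background for the $\eta$ question above, the following is a minimal sketch of one reverse step in the standard DDIM parameterization, not the paper's Eq. 10, whose exact coefficients are not reproduced on this page. The function name `ddim_step` and the numeric values are hypothetical. The sketch shows that with $\eta = 0$ the noise scale vanishes, so the update no longer depends on the random draw.

```python
import numpy as np

def ddim_step(x_t, eps_pred, a_t, a_prev, eta, rng):
    """One DDIM update; a_t and a_prev are cumulative alpha products (a_t < a_prev)."""
    # DDIM noise scale: eta = 0 removes the stochastic term entirely.
    sigma = eta * np.sqrt((1 - a_prev) / (1 - a_t)) * np.sqrt(1 - a_t / a_prev)
    # Clean-image estimate implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1 - a_t) * eps_pred) / np.sqrt(a_t)
    # Deterministic direction pointing back toward x_t.
    direction = np.sqrt(1 - a_prev - sigma**2) * eps_pred
    noise = sigma * rng.standard_normal(x_t.shape)
    return np.sqrt(a_prev) * x0_pred + direction + noise

x_t = np.linspace(-1.0, 1.0, 8)   # stand-in for a noisy sample
eps_pred = np.full(8, 0.1)        # stand-in for a network's noise prediction

# Same inputs, different random seeds.
det_a = ddim_step(x_t, eps_pred, 0.5, 0.8, 0.0, np.random.default_rng(1))
det_b = ddim_step(x_t, eps_pred, 0.5, 0.8, 0.0, np.random.default_rng(2))
sto_a = ddim_step(x_t, eps_pred, 0.5, 0.8, 1.0, np.random.default_rng(1))
sto_b = ddim_step(x_t, eps_pred, 0.5, 0.8, 1.0, np.random.default_rng(2))

print(np.allclose(det_a, det_b), np.allclose(sto_a, sto_b))
```

A deterministic step makes outputs reproducible for a fixed starting sample, which is one commonly cited reason for choosing $\eta = 0$; whether that is the authors' motivation is not stated in this review.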

Questions

Please refer to the weaknesses above.