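PaperHub
Average rating: 6.7/10
Poster · 3 reviewers
Lowest: 6 · Highest: 8 · Standard deviation: 0.9
Individual ratings: 6, 8, 6
Confidence: 4.0
Correctness: 3.3 · Contribution: 3.0 · Presentation: 3.7
NeurIPS 2024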

Quality-Improved and Property-Preserved Polarimetric Imaging via Complementarily Fusing

Submitted: 2024-04-25 · Updated: 2024-12-19
TL;DR

We propose a polarimetric imaging framework that can produce clean and clear polarized snapshots by complementarily fusing a degraded pair of noisy and blurry ones.

Keywords
Polarimetric Imaging, Exposure fusion, Deep Learning

Reviews and Discussion

Review
Rating: 6

This paper proposes a method to fuse a pair of short-exposure (noisy) and long-exposure (blurry) captures to produce clean and clear polarized snapshots. The proposed method consists of three phases to reconstruct the irradiance, texture, and polarization.

Strengths

The paper is well-written, the proposed method is well-described, and experiments using both synthetic and real-world data are conducted to evaluate the effectiveness of the proposed method.

Weaknesses

It is difficult to fully understand and evaluate the real-world experiments. Please refer to the corresponding questions for further details.

Questions

  1. When generating the synthetic dataset, what are the short and long exposure times? Have you considered different exposure time settings and varying motion blur intensities? Have you added sensor noise to the synthetic images?

  2. How is the reference image obtained when conducting real-world experiments (Fig. 6)? There are few details about the real-world experiments in the paper.

  3. In Fig. 7, is it possible to add comparison results from other methods? It is difficult to evaluate the final results solely based on the proposed framework. Also, how should one understand the difference between the “reflection-contaminated input” and the “reflection-removed output”? Again, there are few details about the real-world experiments.

  4. In line 56, it is not accurate to claim that this is “the first time applying a fusing strategy to polarimetric imaging.” There is existing literature on polarimetric image fusion, such as “Semantic-guided polarization image fusion method based on a dual-discriminator GAN” by Liu et al., and “Polarization-driven semantic segmentation via efficient attention-bridged fusion” by Xiang et al.

Limitations

Limitations of this paper are well discussed.

Author Response

Reviewer WQ4j

  • Issues about the synthetic dataset.

    • The information can be found in Lines 215-223. Our synthetic dataset is generated from the PLIE dataset [32] (a dataset for polarized image low-light enhancement), which provides short-exposure ($T_{short}$) polarized snapshots that suffer from low-light noise (serving as $\mathcal{L}$) along with the corresponding long-exposure ($T_{long} \approx 10T_{short}$) high-quality reference snapshots (serving as $\mathcal{I}$), captured by a Lucid Vision Phoenix polarization camera on a tripod. We generate only the blurry polarized snapshots ($\mathcal{B}$) ourselves.
    • As for the exposure time, since $T_{short}$ varies across the PLIE dataset [32], our synthetic dataset can cover a wide range of exposure time settings. As for the ratio between $T_{long}$ and $T_{short}$, since the PLIE dataset [32] fixes the ratio to about $10$, we cannot change it when generating our synthetic dataset. However, we believe our method can still generalize to different ratios, since the blurry polarized snapshots are part of the input and directly provide information about the ratio.
    • As for the motion blur, we have considered varying motion blur intensities. We adopt the approach proposed in [33] to generate the blurry polarized snapshots, which can produce different motion blur intensities and patterns. In addition, to generate more severe motion blur and increase diversity, we add impulsive variation [1] to the motion trajectories.
    • As for the sensor noise, since the images in the PLIE dataset [32] are captured by a real polarization camera, they already include real sensor noise. Therefore, we do not need to add synthetic noise to the images.
  • Issues about the reference images of real data.

    • See common issues 1.
  • Issues about Figure 7.

    • We provide comparison results from other methods in the attached PDF file (Figure B). The reflection-removed image obtained with the fusing process of our framework contains more detailed textures and less reflection contamination than those of the other methods.

    • In Figure 7, we feed a short-exposure noisy polarized snapshot, a long-exposure blurry one, and the fused one into a reflection removal network (NIPS19RSP [16]), and obtain the corresponding output image for each. The label "reflection-contaminated input" denotes the scenes fed into the reflection removal method [16], and the label "reflection-removed output" denotes the scenes output by the reflection removal method [16].

    • The capturing process of the scenes in Figure 7 is similar to that of the real data shown in Figure 6; please refer to common issues 1.

  • Issues about our claim.

    • Here we explain the differences between our claim and the papers mentioned in the reviews separately:
      • The claim of "the first time applying a fusing strategy to polarimetric imaging" in Line 56 of our paper is about polarimetric imaging. Note that the term "polarimetric imaging" means outputting high-quality polarized snapshots captured by polarization cameras. Since existing methods designed for this goal do not use a complementary fusing strategy, we believe our claim is accurate.
      • The term "polarization image fusion" in the paper "Semantic-guided polarization image fusion method based on a dual-discriminator GAN" refers to fusing an intensity image and a polarization parameter image, computed from the Stokes vector, into a more detailed image. It outputs unpolarized images with more details, so it is about increasing the quality of intensity images.
      • The term "attention-bridged fusion" in the paper "Polarization-driven semantic segmentation via efficient attention-bridged fusion" refers to fusing polarization information into the semantic segmentation procedure, which is a matter of network design.
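
To make the synthetic blurry snapshots $\mathcal{B}$ discussed in the response above more concrete, here is a minimal sketch of a generic motion-blur model: averaging copies of a sharp frame shifted along a camera trajectory, with the same trajectory applied to all four polarization channels. This is not the generation procedure of [33], whose details are not given in this thread; the function names, the integer-pixel trajectory, and the border clamping are all assumptions made for illustration.

```python
def blur_along_trajectory(img, trajectory):
    """Average copies of `img` shifted by each (dy, dx) offset in `trajectory`.

    `img` is a 2D list of floats; offsets are integer pixels (an assumption),
    and samples falling outside the image are clamped to the nearest border.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for dy, dx in trajectory:
        for y in range(h):
            sy = min(max(y - dy, 0), h - 1)
            for x in range(w):
                sx = min(max(x - dx, 0), w - 1)
                out[y][x] += img[sy][sx]
    n = len(trajectory)
    return [[v / n for v in row] for row in out]


def blur_polarized_snapshot(channels, trajectory):
    """Blur every polarization channel with the same trajectory.

    `channels` maps a polarizer angle in degrees (0, 45, 90, 135) to a 2D image.
    Sharing one trajectory reflects that a polarization camera records all
    four channels in a single exposure.
    """
    return {angle: blur_along_trajectory(img, trajectory)
            for angle, img in channels.items()}


# Hypothetical usage: a one-row impulse blurred by a 3-pixel horizontal sweep.
sharp = {a: [[0.0, 0.0, 1.0, 0.0, 0.0]] for a in (0, 45, 90, 135)}
blurry = blur_polarized_snapshot(sharp, [(0, -1), (0, 0), (0, 1)])
```

A longer or more erratic trajectory spreads each pixel over more positions, which is one simple way to vary blur intensity in such a model.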
Comment

Thank you for your rebuttal. The answers make sense!

Review
Rating: 8

This paper proposes a polarimetric imaging framework that can produce clean and clear polarized snapshots by complementarily fusing a degraded pair of noisy and blurry ones. It adopts a neural network-based three-phase fusing scheme with specially designed modules tailored to each phase, which can not only improve the image quality but also preserve the polarization properties.

Strengths

The idea of using complementarily fusing to achieve quality-improved and property-preserved polarimetric imaging is novel and reasonable. As obtaining high-quality polarized images is significant in polarization-based vision applications, while previous methods based on single modality (either noisy or blurry) tend to suffer from various artifacts, the proposed method could be a practical way to increase the performance of polarization-based vision applications.

The network module designs are also reasonable. All modules (Irradiance restoration, Polarization reconstruction, and Artifact suppression) are carefully and specially designed to solve the problems in the fusing process, which means the authors do spend efforts in observing and analyzing the properties in both the noisy and blurry polarized snapshots.

The idea is clearly presented, and the experiments are sufficient. The performance improvement shows that the proposed method is effective.

In addition to the experiments on synthetic and real data, the authors also show the results of reflection removal, which makes the paper convincing.

Weaknesses

The authors say that they adopt the PLIE dataset [32] as the source data to generate their own dataset. However, they do not explain why they chose the PLIE dataset [32]. For example, [25] also provides a dataset (the LLCP dataset) similar to the PLIE dataset [32], so why not choose the LLCP dataset as the source data?

Questions

Polarized images can also be captured with polarizers instead of polarization cameras, and in some cases a polarizer is the only option available. Can the proposed method be used to process data captured using polarizers?

Limitations

The authors have adequately addressed the limitations.

Author Response

Reviewer SSTW

  • Why not choose the LLCP dataset as the source data?
    • We chose the PLIE dataset [32] because its quality could be better than that of the LLCP dataset [25]. For example, overexposed regions often appear in the reference images of the LLCP dataset [25]. Training a network with such data would reduce its generalization ability.
  • Can the proposed method be used to process the data captured using polarizers?
    • Yes, it is theoretically feasible, as long as we can obtain four polarized images of the same scene with different polarizer angles ($0^{\circ}, 45^{\circ}, 90^{\circ}, 135^{\circ}$). However, a polarizer captures only one polarized image per shot, which is less convenient than using a polarization camera.
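
The four-angle requirement in the response above corresponds to the standard linear Stokes relations, from which the degree of polarization ($\mathbf{p}$) and angle of polarization ($\theta$) follow. A minimal sketch in plain Python, assuming an ideal linear polarizer and scalar per-pixel intensities; the function names are illustrative and are not taken from the paper:

```python
import math


def polarizer_intensity(s0, p, theta, alpha):
    """Intensity behind an ideal linear polarizer at angle `alpha` (radians)
    for light with total intensity `s0`, DoP `p`, and AoP `theta` (radians)."""
    return 0.5 * s0 * (1.0 + p * math.cos(2.0 * (alpha - theta)))


def stokes_from_angles(i0, i45, i90, i135):
    """Linear Stokes parameters from intensities at 0, 45, 90, 135 degrees."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90
    s2 = i45 - i135
    return s0, s1, s2


def dop_aop(s0, s1, s2, eps=1e-12):
    """Degree of polarization and angle of polarization in [0, pi)."""
    dop = math.hypot(s1, s2) / max(s0, eps)  # eps guards against s0 == 0
    aop = (0.5 * math.atan2(s2, s1)) % math.pi
    return dop, aop


# Hypothetical usage: simulate four polarizer shots of one pixel, then recover p and theta.
shots = [polarizer_intensity(1.0, 0.4, math.radians(30.0), math.radians(a))
         for a in (0, 45, 90, 135)]
p, theta = dop_aop(*stokes_from_angles(*shots))
```

With a rotating polarizer, the four shots are taken sequentially, so the scene and camera must stay still between them; a polarization camera records all four in one exposure.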
Comment

Thank you for the response, which has addressed my concerns. After reviewing other comments, I still believe this paper is technically solid and novel. Therefore, I maintain my score.

Review
Rating: 6

This paper proposes the first method for polarimetric image enhancement by fusing noisy and blurry pairs. While a short exposure polarimetric image produces sharp but noisy DoP and AoP, a long exposure makes them smooth but blurred. To effectively exploit the complementary advantages of these two images and satisfy the physics constraints of the polarimetric image, this paper proposes a three-phase fusing scheme. Experimental results show that the proposed method outperforms existing polarimetric image enhancement methods.

Strengths

  • The first method for polarimetric image enhancement by fusing noisy and blurry pairs.
  • Propose a novel fusing scheme to effectively use complementary polarimetric information of noisy and blurry pairs and retain polarimetric cues by directly processing DoP and AoP.
  • Experimentally validate the effectiveness of the fusion of noisy and blurry polarimetric image pairs and the proposed network. The accurate restoration of polarimetric cues is critical for downstream tasks.

Weaknesses

  • While the proposed method improves the PSNR of DoP and AoP, their SSIMs are almost the same as those of PLIE [32].
  • Requiring two shots is undesirable for some downstream tasks.

Questions

  • Why does the proposed method not significantly improve the SSIMs of DoP and AoP? How does the improvement in PSNR affect the downstream tasks?
  • How are reference images of real data obtained? Can they be used for quantitative evaluation?
  • The visual comparisons of the ablation study will help understand the effectiveness of each component.

Limitations

Yes.

Author Response

Reviewer 4C9V

  • Issues about the PSNR and SSIM values of the DoP and AoP.
    • PSNR is highly sensitive to small changes in pixel values, whereas SSIM considers structural information and spatial relationships within the image. Our method leverages the clean information from the blurry input, effectively reducing the noise level in the DoP and AoP, particularly in high-value regions, leading to higher PSNR scores compared to other methods. However, since the overall pixel values of the DoP and AoP are typically very small (much smaller than those of the images), structural details are usually significantly degraded (due to distortions in the low-value regions) and challenging to correct. In such situations, neither our method nor the compared methods can significantly improve the structural details, resulting in similar SSIM scores. Consequently, the improvement in SSIM is not as pronounced as in PSNR.
    • Generally, an improvement in PSNR can be positively correlated with the quality of downstream tasks. This is because higher PSNR indicates better preservation of polarization information, which can provide more accurate polarization cues for downstream tasks. However, the quantitative relationship between them cannot be precisely determined, as the quality of downstream tasks significantly depends on the specific methods designed for those tasks.
  • Issues about the reference images of real data.
    • See common issues 1.
  • Visual comparisons of the ablation study.
    • We take the scene in Figure 5 as an example, which can be found in the attached PDF file (Figure A). We can see that although there is not much difference compared to other ablation items (due to their similar quantitative scores), there is still a slight advantage in details such as cleaner background.
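
To make the PSNR/SSIM distinction in the response above concrete, here is a minimal sketch of how the two scores are computed. PSNR depends only on the mean squared pixel error, while SSIM compares means, variances, and covariance. The SSIM below is a simplified single-window (global) variant chosen for brevity; the standard metric averages the same expression over local windows, and the evaluation code used in the paper is not shown in this thread.

```python
import math


def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


def global_ssim(ref, est, peak=1.0, k1=0.01, k2=0.03):
    """SSIM evaluated once over the whole signal (a simplification:
    standard SSIM averages this expression over local windows)."""
    n = len(ref)
    mu_x, mu_y = sum(ref) / n, sum(est) / n
    var_x = sum((x - mu_x) ** 2 for x in ref) / n
    var_y = sum((y - mu_y) ** 2 for y in est) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(ref, est)) / n
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2  # stabilizing constants
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den


# Hypothetical usage on a tiny low-valued signal, similar in range to a DoP map.
reference = [0.02, 0.05, 0.03, 0.08]
estimate = [0.03, 0.04, 0.05, 0.07]
scores = (psnr(reference, estimate), global_ssim(reference, estimate))
```

Because the constants `c1` and `c2` are fixed relative to `peak`, they carry more weight when the signal values are small, which is one reason SSIM can respond differently from PSNR on low-valued maps.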
Comment

I appreciate your addressing my questions. The difference between PSNR and SSIM scores would help readers understand the experimental results. I keep my rating and recommend the acceptance of this paper.

Author Response

Common issues

We sincerely thank all reviewers for their valuable comments and suggestions. We feel encouraged that the novelty and performance of our method are acknowledged by the reviewers:

  • Propose a novel fusing scheme to effectively use complementary polarimetric information of noisy and blurry pairs and retain polarimetric cues by directly processing DoP and AoP. (Reviewer 4C9V)
  • The idea of using complementarily fusing to achieve quality-improved and property-preserved polarimetric imaging is novel and reasonable. (Reviewer SSTW)
  • The paper is well-written, the proposed method is well-described, and experiments using both synthetic and real-world data are conducted to evaluate the effectiveness of the proposed method. (Reviewer WQ4j)

We first address the common questions here as the common issues, and then answer each reviewer's specific questions in the corresponding comments.

  1. Issues about the reference images of real data. (Reviewer 4C9V, WQ4j)
    • The reference images of real data (in Figure 6) are captured as follows:
      1. Use a tripod to fix the polarization camera.
      2. Use a program (written with the SDK of the polarization camera) to capture a short-exposure ($T_{short}$) polarized snapshot (as the noisy input $\mathcal{L}$) and a long-exposure ($T_{long} \approx 10T_{short}$) blur-free one (as the reference $\mathcal{I}$) consecutively. Note that this step is similar to the way data are captured in [3].
      3. Remove the tripod, hold the polarization camera by hand (which introduces motion blur), and capture another long-exposure (also $T_{long}$) polarized snapshot (as the blurry input $\mathcal{B}$).
    • These real data can also be used for quantitative evaluation. Taking the scene in Figure 6 as an example, the quantitative scores are shown below:
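
| Method | PSNR-$\mathbf{p}$ | SSIM-$\mathbf{p}$ | PSNR-$\theta$ | SSIM-$\theta$ | PSNR-$\mathbf{I}$ | SSIM-$\mathbf{I}$ |
|---|---|---|---|---|---|---|
| Ours | 30.11 | 0.774 | 19.17 | 0.441 | 35.45 | 0.973 |
| PLIE [32] | 28.97 | 0.771 | 18.74 | 0.399 | 34.13 | 0.971 |
| PLIE+ | 28.73 | 0.765 | 18.69 | 0.407 | 35.24 | 0.973 |
| PolDeblur [33] | 27.19 | 0.730 | 18.53 | 0.426 | 35.12 | 0.958 |
| PolDeblur+ | 27.98 | 0.747 | 19.05 | 0.437 | 35.38 | 0.969 |
| LSD2 [17] | 25.76 | 0.563 | 13.45 | 0.368 | 16.58 | 0.906 |
| LSFNet [2] | 25.99 | 0.681 | 18.28 | 0.421 | 33.41 | 0.938 |
| SelfIR [27] | 22.21 | 0.720 | 17.26 | 0.376 | 35.37 | 0.940 |
| D2HNet [28] | 25.64 | 0.683 | 17.54 | 0.385 | 28.89 | 0.863 |
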
Final Decision

This paper receives one strong accept and two weak accepts. It proposes a complementary fusing method that takes a degraded pair of noisy and blurry polarized snapshots and produces high-quality polarimetric images. In the rebuttal, the authors have adequately addressed most of the reviewers' major concerns. The authors are encouraged to revise the paper by adding a comparison of the computational cost.

Public Comment

There are several errors in Figure 4. The angle between $\vec S$ and the positive half-axis of the x-axis should be $2\theta$, not $\theta$, i.e., $\langle \vec S, x\rangle = 2\theta$.

Public Comment

Thank you very much for pointing out this issue. We will update Figure 4 by replacing $\theta$ with $2\theta$ and revise the corresponding text once the camera-ready submission process reopens after the conference.
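
For readers following this exchange, the factor of two comes from the standard relation between the linear Stokes components and the angle of polarization. The sketch below assumes that $\vec S$ in Figure 4 denotes the vector $(S_1, S_2)$ in the Stokes plane with the x-axis along $S_1$, which is not stated explicitly in this thread; $p$ is the degree of polarization and $S_0$ the total intensity.

```latex
\begin{align}
S_1 &= S_0\, p \cos 2\theta, \\
S_2 &= S_0\, p \sin 2\theta, \\
\theta &= \tfrac{1}{2}\operatorname{atan2}(S_2, S_1).
\end{align}
```

Under that assumption, $(S_1, S_2)$ has length $S_0\, p$ and makes an angle of $2\theta$ with the positive $S_1$ axis, which matches the correction proposed above.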