PaperHub
Rating: 4.8 / 10 (Poster; 4 reviewers; min 3, max 6, std 1.1)
Individual ratings: 5, 3, 5, 6
Confidence: 4.3
Correctness: 3.0 · Contribution: 2.8 · Presentation: 2.8
NeurIPS 2024

You Only Look Around: Learning Illumination-Invariant Feature for Low-light Object Detection

Submitted: 2024-05-14 · Updated: 2024-11-06

Abstract

Keywords
object detection; low-light object detection; illumination-invariant feature

Reviews and Discussion

Review (Rating: 5)

The paper focuses on the Low-light Object Detection task from the perspective of feature learning. In detail, the paper proposes an Illumination-Invariant Module to extract illumination-invariant features and a learning illumination-invariant paradigm. Experiments verify the effectiveness of the proposed method.

Strengths

  1. The writing is good with beautiful illustrations.
  2. The idea is novel and interesting. The paper leverages illumination-invariant features to detect objects in low light, which is novel for the low-light object detection task. The proposed method is easy to follow and achieves good performance.

Weaknesses

More related methods need to be compared in Table 1 and Table 2.

[1] Ziteng Cui, Kunchang Li, Lin Gu, Shenghan Su, Peng Gao, Zhengkai Jiang, Yu Qiao, and Tatsuya Harada. You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction. In BMVC, page 238, 2022.
[2] Shangquan Sun, Wenqi Ren, Tao Wang, and Xiaochun Cao. Rethinking image restoration for object detection. Advances in Neural Information Processing Systems, 35:4461–4474, 2022.
[3] Wenyu Liu, Gaofeng Ren, Runsheng Yu, Shi Guo, Jianke Zhu, and Lei Zhang. Image-adaptive yolo for object detection in adverse weather conditions. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 1792–1800, 2022.
[4] Sanket Kalwar, Dhruv Patel, Aakash Aanegola, Krishna Reddy Konda, Sourav Garg, and K Madhava Krishna. Gdip: Gated differentiable image processing for object detection in adverse conditions. In 2023 IEEE International Conference on Robotics and Automation (ICRA), pages 7083–7089. IEEE, 2023.
[5] Qingpao Qin, Kan Chang, Mengyuan Huang, and Guiqing Li. Denet: Detection-driven enhancement network for object detection under adverse weather conditions. In Proceedings of the Asian Conference on Computer Vision, pages 2813–2829, 2022.
[6] Xiangchen Yin, Zhenda Yu, Zetao Fei, Wenjun Lv, and Xin Gao. Pe-yolo: Pyramid enhancement network for dark object detection. In International Conference on Artificial Neural Networks, pages 163–174. Springer, 2023.
[7] Khurram Azeem Hashmi, Goutham Kallempudi, Didier Stricker, and Muhammad Zeshan Afzal. Featenhancer: Enhancing hierarchical features for object detection and beyond under low-light vision. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6725–6735, 2023.

Questions

How does the method perform in comparison to recent related methods [1-7]? The proposed method assumes neighboring pixels exhibit high similarity of illumination. However, does this assumption still hold under uneven lighting at night?

Limitations

The proposed method assumes neighboring pixels exhibit high similarity of illumination. However, does this assumption still hold under uneven lighting at night?

Author Response

We sincerely thank you for your insightful and positive comments.


How does the method perform in comparison to recent related methods?

We provide more detailed comparison experiments in Table 1, including runtime, model size, and performance. Note that our YOLA still achieves the best performance and speed among the evaluated methods.

Besides, ReForDe [7] did not release their detailed training data, so we cannot reproduce it. However, compared to the improvements reported by ReForDe in their paper ($\leq$ 0.2 mAP on ExDark using YOLOv3), our YOLA demonstrates a more substantial gain, achieving improvements of 1.7 mAP and 2.7 mAP on YOLOv3 and TOOD, respectively.

Additionally, FeatEnHancer [1] did not release their code, so we follow FeatEnHancer's experimental setting to implement the RetinaNet-based detectors, as shown in Table 2. Even though our baseline implementation on the ExDark dataset is inferior to FeatEnHancer's, integrating YOLA enables our method to achieve the best performance (a significant improvement of 1.9 mAP over the baseline). On the DarkFace dataset, FeatEnHancer decreases the baseline performance by 0.1 mAP, which is attributed to hierarchical features failing to be captured by RetinaNet, as claimed in [1]. In contrast, our YOLA, motivated by a physics-based model without elaborate design, surpasses the baseline with a remarkable improvement of 2.5 mAP. This strongly suggests the generalizability and effectiveness of YOLA.


The proposed method assumes neighboring pixels exhibit high similarity of illumination. However, does this assumption still hold under uneven lighting at night?

In the main text, we assume that the illumination values of neighboring pixels are equal, which allows us to eliminate the influence of illumination when extracting features. In cases of uneven illumination, this assumption is weakened but still helps reduce the impact of illumination, as shown in teaser (b). Moreover, our method is detection-driven and can further mitigate the influence of such uneven illumination during the learning process, as illustrated in teaser (d). Additionally, as discussed in the appendix, when the actual distance between neighboring pixels is too large, the assumption may not hold. Therefore, we propose the II loss to constrain the extraction of illumination-invariant features to as close a region as possible, mitigating the impact of uneven lighting.
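For concreteness, a minimal sketch (our notation here, which may differ slightly from Equation 1 in the paper) of why differencing neighboring pixels suppresses illumination under the Lambertian model:

$$
I(p) = R(p)\,L(p) \;\Rightarrow\; \log I(p) - \log I(q) = \log\frac{R(p)}{R(q)} + \log\frac{L(p)}{L(q)} \approx \log\frac{R(p)}{R(q)} \quad \text{when } L(p) \approx L(q),
$$

where $I$, $R$, and $L$ denote intensity, reflectance, and illumination at neighboring pixels $p$ and $q$. Under uneven lighting the residual term $\log\frac{L(p)}{L(q)}$ does not vanish, but it stays small when $p$ and $q$ are close, which is precisely what the II loss encourages.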


| Method | mAP | Size (M) | FPS |
| --- | --- | --- | --- |
| Baseline | 72.5 | 32.044 | 57.7 |
| IAT [2] | 73.0 | 32.135 | 50.9 |
| IAYOLO [3] | 65.0 | 32.209 | 52.5 |
| GDIP [4] | 72.8 | 167.00 | 54.0 |
| DENet [5] | 73.5 | 32.089 | 55.7 |
| PEYOLO [6] | 67.8 | 32.135 | 38.8 |
| Ours | 75.2 | 32.052 | 56.6 |

Table 1: Quantitative comparisons on the ExDark dataset based on the TOOD detector.


| Dataset | Method | mAP$_{50}$ |
| --- | --- | --- |
| ExDark | Baseline | 72.1 |
| | w/ FeatEnHancer | 72.6 (+0.5) |
| | Baseline$^{\dagger}$ | 70.9 |
| | w/ YOLA | 72.8 (+1.9) |
| DarkFace | Baseline | 47.3 |
| | w/ FeatEnHancer | 47.2 (-0.1) |
| | Baseline$^{\dagger}$ | 50.2 |
| | w/ YOLA | 52.7 (+2.5) |

Table 2: Quantitative comparisons (YOLA vs. FeatEnHancer); $\dagger$ indicates our implemented baseline.


Reference:

[1] Hashmi et al. Featenhancer: Enhancing hierarchical features for object detection and beyond under low-light vision. ICCV 2023.

[2] Cui et al. You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction. BMVC 2022.

[3] Liu et al. Image-adaptive yolo for object detection in adverse weather conditions. AAAI 2022.

[4] Kalwar et al. Gdip: Gated differentiable image processing for object detection in adverse conditions. ICRA 2023.

[5] Qin et al. Denet: Detection-driven enhancement network for object detection under adverse weather conditions. ACCV 2022.

[6] Yin et al. Pe-yolo: Pyramid enhancement network for dark object detection. ICANN 2023.

[7] Sun et al. Rethinking image restoration for object detection. NeurIPS 2022.

[8] Cui et al. Multitask AET with orthogonal tangent regularity for dark object detection. ICCV 2021.

Comment

Thanks to the authors for their detailed answer, which has resolved my doubts. I am raising my score to 6: Weak Accept.

Comment

We are grateful for the feedback and thank you very much for the approval.

Comment

We sincerely thank you for your time and effort in reviewing our manuscript.

We have provided detailed quantitative results across different detectors, as shown in Tables 1 and 2. For a fair comparison, we reimplemented these methods using the MMDetection toolbox based on their open-source code, ensuring consistent training hyperparameters. Our YOLA still achieves the best performance across different detectors. We will include these experiments in the revised version.

Thank you once again for your valuable suggestions.

| Method | ExDark mAP | DarkFace mAP |
| --- | --- | --- |
| Baseline | 71.0 | 60.0 |
| IAT | 72.6 | 59.8 |
| IAYOLO | 68.1 | 59.9 |
| GDIP | 67.5 | 60.4 |
| DENet | 71.3 | 60.0 |
| PEYOLO | 68.8 | 53.9 |
| Ours | 72.7 | 61.5 |

Table 1: Quantitative comparisons based on the YOLOv3 detector.

| Method | ExDark mAP | DarkFace mAP |
| --- | --- | --- |
| Baseline | 72.5 | 62.1 |
| IAT | 73.0 | 62.0 |
| IAYOLO | 65.0 | 55.5 |
| GDIP | 72.8 | 62.9 |
| DENet | 73.5 | 66.2 |
| PEYOLO | 67.8 | 61.1 |
| Ours | 75.2 | 67.4 |

Table 2: Quantitative comparisons based on the TOOD detector.

Review (Rating: 3)

This paper proposes YOLA, a framework for object detection in low-light conditions that leverages illumination-invariant features. A novel Illumination-Invariant Module is proposed to extract illumination-invariant features for low-light image enhancement.

Strengths

Figures are helpful for understanding. The proposed method gives better performance than others.

Weaknesses

RG/RB/GB-chromaticity is widely used in computer vision; learning-based methods for intrinsic images or inverse rendering also adopt this constraint. It deals with light intensity but does not address image noise in low-light conditions. Does this paper consider noise reduction in the pipeline?

Illumination invariant features based on chromaticity have ambiguities between colors with the same chromaticity, e.g. dark red or light red. Such limitations are not discussed.

In the experiments, LLIE methods fail to achieve satisfactory performance due to the inconsistency between human visual perception and machine perception. The enhancement methodologies prioritize human preferences. However, it is important to note that optimizing for enhanced visual appeal may not align with optimized object detection performance. This is hard to understand without visual examples.

In the pipeline, the work is basically low-light enhancement + detector. In the evaluations, only object detection is evaluated; why not evaluate on low-light enhancement datasets/benchmarks?

The proposed features are claimed to fit many tasks. More tasks other than detection can be demonstrated.

Questions

See above.

Limitations

Overall, I think the idea of an illumination-invariant feature is simple, so the technical contribution is limited. The results of the proposed method in Figures 3-4 are strange; the images are very cloudy, which is very different from the example in the pipeline.

Author Response

We sincerely thank you for your constructive comments.


In the pipeline, the work is basically low-light enhancement + detector. In the evaluations, only object detection is evaluated; why not evaluate on low-light enhancement datasets/benchmarks?

First and foremost, we need to emphasize that our framework is designed to enhance the performance of object detectors in low-light scenarios, rather than for visual brightening or denoising.

Human visual alignment can improve detection performance, but it is not the only solution. From a machine vision perspective, detection does not depend on achieving image quality that aligns with human vision. The presence of noise and blurriness in low-light images does not hinder our method from improving detection performance.

Essentially, our method targets object detection rather than image enhancement, so comparing our results on image enhancement benchmarks would be unfair.

More specifically, our framework does not employ any extra low-light and normal-light image pairs or image restoration-related losses to enhance the input images. We only leverage the detection loss to guide the IIM in producing task-specific illumination-invariant features. Therefore, improving visual quality is not our goal. Interestingly, we found that the enhanced image subsequently produced by FuseNet tends to display increased brightness.


In the experiments, LLIE methods fail to achieve satisfactory performance due to the inconsistency between human visual perception and machine perception. The enhancement methodologies prioritize human preferences. However, it is important to note that optimizing for enhanced visual appeal may not align with optimized object detection performance. This is hard to understand without visual examples.

As we have shown, some of the images we present are cloudy, which exemplifies the discrepancy between machine vision and human vision. In LLIE (Low-Light Image Enhancement) methods, many loss functions are designed based on human prior assumptions, such as illumination smoothness and color consistency in Zero-DCE, or the TV loss in denoising. These losses often discard many details while preserving the details preferred by human perception. This difference may result in LLIE methods performing poorly on detection tasks, as discussed in related works [1, 2, 3].
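As one concrete illustration (a generic textbook form, not the exact loss of any particular method above), the anisotropic total-variation penalty on an enhanced image $I$ can be written as

$$
\mathcal{L}_{\mathrm{TV}}(I) = \sum_{x,y} \left( \lvert I_{x+1,y} - I_{x,y} \rvert + \lvert I_{x,y+1} - I_{x,y} \rvert \right),
$$

which rewards visually smooth outputs; minimizing it can also flatten fine textures and weak edges that a detector relies on, illustrating how enhancement objectives and detection objectives can diverge.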


RG/RB/GB-chromaticity is widely used in computer vision; learning-based methods for intrinsic images or inverse rendering also adopt this constraint. It deals with light intensity but does not address image noise in low-light conditions. Does this paper consider noise reduction in the pipeline?

Illumination invariant features based on chromaticity have ambiguities between colors with the same chromaticity, e.g. dark red or light red. Such limitations are not discussed.

Enhancing image quality, such as reducing noise and improving chromaticity distinction, is a practical strategy, but it is not the only one for improving detection performance. In this work, we are committed to finding an end-to-end approach that enhances downstream detection tasks, rather than focusing on image restoration.

Besides, improving both image quality and detection performance typically requires paired annotated datasets, which presents significant challenges for practical applications. Therefore, we strongly believe that developing simpler and more efficient end-to-end methods to reduce the burden of data annotation will greatly benefit the community.


The proposed features are claimed to fit many tasks. More tasks other than detection can be demonstrated.

We present our evaluation of YOLA on the instance segmentation task in Table 1, reporting quantitative comparisons of several advanced LLIE and low-light object detection methods using Mask R-CNN on the low-light instance segmentation (LIS [10]) dataset. Our YOLA achieves the best performance across all metrics, indicating that YOLA facilitates not only low-light object detection but also low-light instance segmentation. For more visual comparisons, please refer to our appendix.


| Method | AP$^{seg}$ | AP$^{box}$ |
| --- | --- | --- |
| Baseline | 34.2 | 41.3 |
| DENet [3] | 38.6 | 46.4 |
| PENet [4] | 36.1 | 43.6 |
| Zero-DCE [5] | 38.7 | 46.4 |
| EnlightenGAN [6] | 38.4 | 45.8 |
| RUAS [7] | 36.1 | 43.8 |
| SCI [8] | 36.5 | 44.3 |
| NeRCo [9] | 36.7 | 44.6 |
| Ours | 39.8 | 47.5 |

Table 1: Quantitative comparisons on the LIS dataset (Mask R-CNN).

Reference

[1] Cui et al. Multitask AET with orthogonal tangent regularity for dark object detection. ICCV 2021.

[2] Khurram et al. Featenhancer: Enhancing hierarchical features for object detection and beyond under low-light vision, ICCV 2023.

[3] Qin et al. Denet: Detection-driven enhancement network for object detection under adverse weather conditions. ACCV 2022.

[4] Yin et al. Pe-yolo: Pyramid enhancement network for dark object detection. ICANN 2023.

[5] Guo et al. Zero-reference deep curve estimation for low-light image enhancement. CVPR 2020.

[6] Jiang et al. Enlightengan: Deep light enhancement without paired supervision. TIP 2021.

[7] Liu et al. Retinex-inspired unrolling with cooperative prior architecture search for low-light image enhancement. CVPR 2021.

[8] Ma et al. Toward fast, flexible, and robust low-light image enhancement. CVPR 2022.

[9] Yang et al. Implicit neural representation for cooperative low-light image enhancement. CVPR 2023.

[10] Chen et al. Instance segmentation in the dark. IJCV 2023.

Comment

Dear Reviewer,

We would like to extend our sincere appreciation for the time and effort you have dedicated to reviewing our manuscript. Your valuable comments and insights are greatly appreciated. We eagerly await your feedback on the points we have addressed in our rebuttal. If you have any concerns or require further clarification, please do not hesitate to let us know. Thank you once again for your commitment to the review process.

Sincerely,

Authors

Review (Rating: 5)

In this paper, the authors propose an object detection method for low-light scenarios based on illumination-invariant feature learning. The extraction of illumination-invariant features from low-light images can be easily integrated into existing object detection frameworks. The results reveal significant improvements in low-light object detection tasks as well as promising results in both well-lit and over-lit scenarios.

Strengths

  1. This paper introduces a new object detection method for low-light conditions that leverages illumination-invariant features.
  2. The writing is well done and well organized.
  3. The Illumination-Invariant Module seems to be a plug-and-play module that is very useful.

Weaknesses

  1. As a plug-and-play module, the Illumination-Invariant Module should be integrated into more detectors to prove its effectiveness.
  2. The authors claim to learn illumination-invariant features, so models trained in low-light conditions should generalize directly to normal-light conditions, and vice versa. The authors should provide more experiments to demonstrate the generalization ability of these illumination-invariant features.
  3. The authors should provide evaluation details such as runtime and memory usage; it seems that their approach is more lightweight.

Questions

See the weaknesses

Limitations

See the weaknesses

Author Response

We sincerely thank you for your insightful and constructive comments.

As a plug-and-play module, the illumination-invariant module should be integrated into more detectors to prove its effectiveness.

We report results for more detectors in Table 1, covering anchor-based detectors (Faster R-CNN [1], RetinaNet [2], YOLOv3 [3]) and anchor-free ones (TOOD [4], Sparse R-CNN [5]). By integrating YOLA, the performance of all detectors is significantly improved, demonstrating the generalization capability of YOLA. Besides, as shown in the appendix, we also evaluate our YOLA on the instance segmentation task, demonstrating YOLA's effectiveness with the Mask R-CNN detector.
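To make the plug-and-play integration concrete, below is a hypothetical sketch (illustrative names only, not our actual implementation or any toolbox API): the module computes illumination-invariant features from the input image and hands them, together with the image, to an otherwise unmodified detector.

```python
import torch
import torch.nn as nn

class YOLAWrapper(nn.Module):
    """Hypothetical plug-and-play wrapper: IIM features are concatenated with the
    input image and fed to an unmodified detector. Names are illustrative."""
    def __init__(self, iim: nn.Module, detector: nn.Module):
        super().__init__()
        self.iim = iim            # illumination-invariant module
        self.detector = detector  # any detector whose stem accepts the extra channels

    def forward(self, images, targets=None):
        feats = self.iim(images)
        x = torch.cat([images, feats], dim=1)  # stack image and invariant features
        return self.detector(x, targets) if targets is not None else self.detector(x)
```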


The authors claim that they learned illumination-invariant features, so models trained in low-light conditions should be able to generalize directly to normal-light conditions, and vice versa. The authors should provide more experiments to demonstrate the generalization ability of their illumination-invariant features.

We provide additional generalization experiments in Table 2, where GAP denotes the mAP$_{50}$ gap between in-domain and cross-domain testing. We trained detectors separately in low-light (ExDark) and normal-light (Pascal VOC 2012 [6]) conditions and then tested them in the opposite conditions. Our YOLA achieves the best performance across different domains and shows a smaller performance drop when tested cross-domain.


The authors should provide evaluation details such as runtime and memory usage; it seems that their approach is more lightweight.

We have included a comparison of runtime, model size, and performance in Table 3. The inference speeds, evaluated on an RTX 2080Ti using the MMDetection toolbox, indicate that YOLA introduces the fewest additional parameters (0.008M) while achieving the highest FPS and outstanding performance among the evaluated methods, excluding the baseline.


| Detector | mAP$_{50}$ (ExDark) | mAP$_{50}$ (DarkFace) | Type |
| --- | --- | --- | --- |
| YOLOv3 | 71.0 | 60.0 | Single stage |
| +YOLA | 72.7 | 61.5 | |
| RetinaNet | 70.9 | 50.2 | Single stage |
| +YOLA | 72.8 | 52.7 | |
| Faster R-CNN | 71.4 | 43.0 | Two stage |
| +YOLA | 72.5 | 44.6 | |
| Sparse R-CNN | 63.5 | 43.5 | Two stage |
| +YOLA | 68.7 | 52.8 | |
| TOOD | 72.5 | 62.1 | Single stage |
| +YOLA | 75.2 | 67.4 | |

Table 1: YOLA with more detectors.


| Detector | Training Set | Test Set (in-domain) | mAP$_{50}$ | Test Set (cross-domain) | mAP$_{50}$ | GAP ↓ |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOv3 | ExDark_train | ExDark_test | 71.0 | VOC_val | 57.6 | 13.4 |
| +YOLA | ExDark_train | ExDark_test | 72.7 | VOC_val | 60.5 | 12.2 |
| YOLOv3 | VOC_train | VOC_val | 78.8 | ExDark_test | 57.4 | 21.4 |
| +YOLA | VOC_train | VOC_val | 78.9 | ExDark_test | 58.5 | 20.4 |

Table 2: Generalization comparison.


| Method | mAP | Size (M) | FPS |
| --- | --- | --- | --- |
| Baseline | 72.5 | 32.044 | 57.7 |
| IAT [7] | 73.0 | 32.135 | 50.9 |
| IAYOLO [8] | 65.0 | 32.209 | 52.5 |
| GDIP [9] | 72.8 | 167.00 | 54.0 |
| DENet [10] | 73.5 | 32.089 | 55.7 |
| PEYOLO [11] | 67.8 | 32.135 | 38.8 |
| Ours | 75.2 | 32.052 | 56.6 |

Table 3: Quantitative comparisons on the ExDark dataset based on the TOOD detector.


Reference:

[1] Ren et al. Faster r-cnn: Towards real-time object detection with region proposal networks. NeurIPS 2015.

[2] Lin et al. Focal loss for dense object detection. ICCV 2017.

[3] Redmon et al. Yolov3: An incremental improvement. arXiv 2018.

[4] Feng et al. Tood: Task-aligned one-stage object detection. ICCV 2021.

[5] Sun et al. Sparse r-cnn: End-to-end object detection with learnable proposals. CVPR 2021.

[6] Everingham et al. The PASCAL Visual Object Classes Challenge 2012 (VOC2012).

[7] Cui et al. You only need 90k parameters to adapt light: a light weight transformer for image enhancement and exposure correction. BMVC 2022.

[8] Liu et al. Image-adaptive yolo for object detection in adverse weather conditions. AAAI 2022.

[9] Sanket et al. Gdip: Gated differentiable image processing for object detection in adverse conditions. ICRA 2023.

[10] Qin et al. Denet: Detection-driven enhancement network for object detection under adverse weather conditions. ACCV 2022.

[11] Yin et al. Pe-yolo: Pyramid enhancement network for dark object detection. ICANN 2023.

Review (Rating: 6)

This paper proposes a plug-and-play module for extracting illumination-invariant features from low-light images. By integrating a zero-mean constraint within the module, a diverse set of kernels is effectively learned. These kernels excel at extracting illumination-invariant features, thereby enhancing detection accuracy. Experiments on object detection and semantic segmentation tasks demonstrate the effectiveness of the module.

Strengths

  1. The authors design an Illumination-Invariant Module to extract illumination-invariant features without requiring additional paired datasets, and it can be seamlessly integrated into existing object detection methods.
  2. The Lambertian assumption is introduced, which enhances the interpretability of the model.
  3. The authors claim that the proposed method achieves state-of-the-art performance on several benchmark datasets for object detection.

Weaknesses

  1. The authors assume uniform illumination between neighboring pixels to eliminate the influence of the positional term m in Equation 1. However, images captured in real-world scenes often exhibit uneven lighting. I question the validity of this assumption.
  2. The paper does not provide definitions for the symbols used in Equation 2.
  3. What does "Baseline" refer to in Tables 1 and 2, and why is the object detection performance based on low-light enhancement methods worse than the Baseline?
  4. Many end-to-end low-light face detection algorithms have been proposed. The authors should compare their method not only with general object detection algorithms but also with these specialized low-light face detection algorithms on the DarkFace dataset.

Questions

There are some questions regarding the training loss. The paper does not provide the training loss of the model. Are the components for learning illumination-invariant features and the object detection network trained jointly, or is the illumination-invariant feature learning component trained separately? If trained separately, considering that the ExDark and DarkFace datasets consist only of annotated low-light images without corresponding normal-light images, is this training unsupervised? The authors should provide detailed information about the training process in the paper.

Limitations

See the weaknesses.

Author Response

We would like to thank the reviewer for carefully reading our submission and providing many insightful comments.

The authors assume uniform illumination between neighboring pixels to eliminate the influence of the positional term m in Equation 1. However, images captured in real-world scenes often exhibit uneven lighting. I question the validity of this assumption.

In the main text, we assume that the illumination values of neighboring pixels are equal, which allows us to eliminate the influence of illumination when extracting features. In cases of uneven illumination, this assumption is weakened but still helps reduce the impact of illumination, as shown in teaser (b). Moreover, our method is detection-driven and can further mitigate the influence of such uneven illumination during the learning process, as illustrated in teaser (d). Additionally, as discussed in the appendix, when the actual distance between neighboring pixels is too large, the assumption may not hold. Therefore, we propose the II loss to constrain the extraction of illumination-invariant features to as close a region as possible, mitigating the impact of uneven lighting.


The paper does not provide definitions for the symbols used in Equation 2.

We apologize for the oversight. In the revised version, we will include the corresponding explanations: B denotes the blue channel, whereas R represents the red channel.


What does "Baseline" refer to in Tables 1 and 2, and why is the object detection performance based on low-light enhancement methods worse than the Baseline?.

'Baseline' refers to the detector that our method is built upon; the difference between 'Baseline' and 'Ours' is that we add the IIM (Illumination-Invariant Module). Low-light image enhancement methods may fail to achieve satisfactory detection performance because they primarily target human visual perception rather than machine perception. This phenomenon has also been reported in related works [1, 2, 3].


Many end-to-end low-light face detection algorithms have been proposed. The authors should compare their method not only with general object detection algorithms but also with these specialized low-light face detection algorithms on the DarkFace dataset.

We explored several specialized low-light face detection algorithms [4, 5, 6]. However, most of these algorithms are evaluated on the DarkFace test set, which does not provide ground truth, or lack open-source code. Consequently, we selected some representative face detection algorithms adopted by these methods as our baselines, including DSFD [7] and PyramidBox [8]. We implemented our YOLA using their default settings for a fair comparison, as shown in Table 1. Our YOLA outperforms the baselines, achieving 0.6 and 0.8 higher mAP on PyramidBox and DSFD, respectively, demonstrating its generalization ability in the face detection task.


There are some questions regarding the training loss. The paper does not provide the training loss of the model. Are the components for learning illumination-invariant features and the object detection network trained jointly, or is the illumination-invariant feature learning component trained separately? If trained separately, considering that the ExDark and DarkFace datasets consist only of annotated low-light images without corresponding normal-light images, is this training unsupervised? The authors should provide detailed information about the training process in the paper.

Our YOLA model is trained end-to-end in a joint fashion, without any additional image-pair annotations. Specifically, the features produced by the IIM are inherently illumination-invariant at initialization, based on the Lambertian assumption. Thus, we do not require any additional normal-light images or extra losses such as a brightness loss to guide its learning; we only employ the detection loss to guide the IIM in producing task-specific illumination-invariant features for downstream tasks.
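To illustrate both points, the sketch below is hypothetical code (not our released implementation; module and function names are placeholders). It shows a zero-mean convolution, whose response on a log-image is unaffected by a locally constant illumination term, together with a joint training step driven solely by the detection loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZeroMeanConv2d(nn.Conv2d):
    """Convolution whose kernels are re-centered to zero mean on every forward pass,
    so that (on log-images) a locally constant illumination offset cancels out."""
    def forward(self, x):
        w = self.weight - self.weight.mean(dim=(1, 2, 3), keepdim=True)
        return F.conv2d(x, w, self.bias, self.stride, self.padding,
                        self.dilation, self.groups)

def train_step(iim, detector, images, targets, optimizer):
    """One end-to-end update: the detection loss alone drives both the IIM and the
    detector; no paired normal-light images or enhancement losses are involved."""
    feats = iim(torch.log(images.clamp_min(1e-4)))  # log space makes illumination additive
    losses = detector(feats, targets)               # assumed to return a dict of detection losses
    loss = sum(losses.values())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```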


| Detector | Baseline | Ours |
| --- | --- | --- |
| PyramidBox | 47.7 | 48.3 |
| DSFD | 44.9 | 45.7 |

Table 1: Face detection algorithms on the DarkFace dataset.


Reference:

[1] Cui et al. Multitask AET with orthogonal tangent regularity for dark object detection. ICCV 2021.

[2] Hashmi et al. Featenhancer: Enhancing hierarchical features for object detection and beyond under low-light vision. ICCV 2023.

[3] Qin et al. Denet: Detection-driven enhancement network for object detection under adverse weather conditions. ACCV 2022.

[4] Wang et al. Unsupervised face detection in the dark. T-PAMI 2022.

[5] Wang et al. Hla-face: Joint high-low adaptation for low light face detection. CVPR 2021.

[6] Yu et al. Single-stage face detection under extremely low-light conditions. ICCVW 2021.

[7] Li et al. Dsfd: dual shot face detector. CVPR 2019.

[8] Tang et al. Pyramidbox: A context-assisted single shot face detector. ECCV 2018.

Comment

Thanks to the authors' detailed response; I choose to raise my score to 7.

Comment

Thank you for your feedback! We appreciate your support!

Final Decision

This paper proposed an end-to-end method for object detection in low-light scenes using several constraints. Initially, three reviewers were positive, and several of them raised their scores after reviewing the authors' feedback. Although the negative reviewer remains concerned about the lack of experiments on low-light enhancement, the AC reviewed the comments and the authors' feedback and agreed with the authors' perspective. The paper can be accepted for publication at NeurIPS.