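Average rating: 5.5/10

Status: Rejected (4 reviewers)

Lowest rating: 5, highest: 6, standard deviation: 0.5

Average confidence: 4.0

Soundness: 2.8

Contribution: 2.8

Presentation: 3.3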

ICLR 2025

Balanced Learning for Domain Adaptive Semantic Segmentation

Wangkai Li,Rui Sun,Bohao Liao,Zhaoyang Li,Tianzhu Zhang


Submitted: 2024-09-27, updated: 2025-02-05


Keywords

Semantic segmentation

Reviews and Discussion

Review

Rating: 6, Confidence: 5, 2024-10-26

This paper introduces BLDA, a method that addresses the class-imbalance problem in unsupervised domain adaptive semantic segmentation. BLDA analyzes the distribution of predicted logits to assess class prediction bias and proposes an online logits adjustment mechanism to balance class learning in both the source and target domains. The method uses Gaussian Mixture Models (GMMs) to estimate the logits distributions and aligns them with anchor distributions using cumulative density functions. Extensive experiments on standard UDA semantic segmentation benchmarks demonstrate significant performance improvements.
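As a rough illustration of what aligning a logits distribution with an anchor distribution through cumulative density functions could look like, the sketch below maps a logit through the CDF of a 1-D Gaussian mixture and then through the inverse CDF of an anchor. It is not the authors' implementation: the function names, the mixture parameters, and the choice of a single Gaussian as the anchor are all assumptions made for the example.

```python
from statistics import NormalDist


def mixture_cdf(z, weights, means, stds):
    """CDF of a 1-D Gaussian mixture evaluated at logit z."""
    return sum(w * NormalDist(m, s).cdf(z) for w, m, s in zip(weights, means, stds))


def align_to_anchor(z, weights, means, stds, anchor, eps=1e-6):
    """Map logit z onto the anchor distribution by matching CDF values."""
    u = mixture_cdf(z, weights, means, stds)
    u = min(max(u, eps), 1.0 - eps)  # inv_cdf is undefined at exactly 0 and 1
    return anchor.inv_cdf(u)


# Hypothetical numbers: the positive logits of one class sit lower than the anchor.
class_gmm = dict(weights=[0.7, 0.3], means=[-1.0, 1.5], stds=[1.0, 0.8])
anchor_pos = NormalDist(mu=2.0, sigma=1.0)  # assumed anchor, chosen for illustration
adjusted = [align_to_anchor(z, anchor=anchor_pos, **class_gmm) for z in (-2.0, 0.0, 2.0)]
```

Because both the mixture CDF and the anchor's inverse CDF are monotonically increasing, the mapping preserves the ordering of logits within a class while changing their scale and location.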

Strengths

Class imbalance is an important issue in DASS, and this paper provides a novel method to tackle it by aligning the logits distributions of all classes with anchor distributions to achieve balanced prediction.
Extensive experiments demonstrate the effectiveness of the proposed method.

Weaknesses

The paper claims a key contribution in proposing a post-hoc class balancing technique to adjust the network's predictions by establishing two anchor distributions, $P_p$ for positive predictions and $P_n$ for negative predictions. However, the paper lacks sufficient explanation regarding the selection criteria for these anchor distributions, which raises questions about the method's validity and soundness.
The current approach in this paper aligns the positive and negative distributions to anchor distributions as part of the post-hoc class balancing strategy. However, based on my understanding, this alignment may not effectively address label noise—a crucial aspect of self-training where pseudo label denoising is often central to performance improvement. Instead, recent studies [1,2] have demonstrated the utility of negative pseudo labeling, showing that leveraging negative information more directly can enhance model robustness and reduce noise. Clarification on the rationale for this alignment-based approach, especially in comparison to existing negative pseudo-labeling methods, would help to justify the method’s efficacy and theoretical basis in the context of label noise mitigation.

[1]. Domain Adaptive Semantic Segmentation without Source Data

[2]. A Curriculum-style Self-training Approach for Source-Free Semantic Segmentation

Questions

Some questions about Figure 3:

Figure 3 presents the logit distributions for positive and negative samples; however, the lack of labeled x- and y-axes in the figure makes it challenging to interpret these distributions effectively.
There is no clear explanation of the direction of reweighting and resampling applied to the logit distributions. This omission makes it difficult to understand the intended insights from Figure 3, as well as the overall method’s mechanism and impact on balancing.
There are a few grammatical errors, such as the "Discusiion" in L307.

Review

Rating: 5, Confidence: 4, 2024-11-04

This paper discusses the unsupervised domain adaptation problem in semantic segmentation tasks. The method first identifies unbalanced classes by analyzing the predicted logits. Then, it aligns the distributions using a preset anchor distribution. Finally, it also adopts a Gaussian mixture model to estimate logits online to generate unbiased pseudo-labels for self-training. Experiments are conducted on the classic GTAv/SYNTHIA to Cityscapes benchmark for evaluation.

Strengths

The paper is well-written and easy to follow. The figures clearly show the distribution trends to help understand the core idea.
The proposed method is described precisely, with extensive mathematical formulation.
The experiments on the GTAv/SYNTHIA/Cityscapes benchmark show clear improvements over baseline methods.

Weaknesses

The novelty is limited. The data distribution problem is not newly recognized, and the proposed combination of anchor distributions for alignment and GMMs for unbiased pseudo-label generation has been explored by previous methods. For example, papers [a-d] also adopt anchors and/or GMM methods for cross-domain alignment. Please consider providing more discussion of these related works.
The method is only verified on a relatively small-scale benchmark. The compared works are from two years ago, which cannot establish this work's value for today's more advanced semantic segmentation approaches. Please consider providing more analysis on other datasets to demonstrate the generalization ability of the method. Candidate datasets include Vistas, IDDA, BDD100k, and VIPER.

[a] Multi-Anchor Active Domain Adaptation for Semantic Segmentation

[b] Category Anchor-Guided Unsupervised Domain Adaptation for Semantic Segmentation

[c] ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation

[d] Uncertainty-aware Pseudo Label Refinery for Domain Adaptive Semantic Segmentation

Questions

Please refer to the weaknesses for details. Due to concerns about the novelty and potential impact, the reviewer is inclined to rate the paper a borderline reject.

Review

Rating: 6, Confidence: 3, 2024-11-06

This paper addresses the challenge of class imbalance in unsupervised domain adaptation (UDA) for semantic segmentation, where labeled source data is used to improve the model’s performance on an unlabeled target dataset. The authors propose a Balanced Learning for Domain Adaptation (BLDA) technique that aligns class predictions by analyzing and adjusting predicted logit distributions, even without prior knowledge of distribution shifts. BLDA enhances UDA model performance by mitigating class bias, particularly for under-represented classes, leading to more accurate segmentation.

Strengths

The motivation is clear, with a thorough statistical analysis of the class bias issue in unsupervised domain adaptation (UDA) for semantic segmentation (Figures 1 and 2).
The paper is generally well-written, well-structured, and easy to follow.
The proposed method comprises four modules. Although each module is simple and widely used in the machine learning field (e.g., GMM and alignment with anchor distributions), these techniques are effective in addressing issues found in this task.
The experiments are comprehensive, covering three transfer tasks for segmentation, an additional image classification task (included in the supplementary materials), and extensive qualitative analyses.

Weaknesses

The proposed method is computationally heavy, as it includes an additional regression head with extra training objectives and requires GMM updates via EM algorithms. Consequently, this approach may incur significantly more computation time and memory usage than baseline methods.
In Tables 1, 2, and 4, all existing methods equipped with BLDA are outdated. It remains questionable whether current SOTA methods (in 2023 and 2024) are sufficient to address prediction bias issues.

Questions

For weakness 1, could you conduct a theoretical complexity analysis comparing the proposed BLDA with the baseline? Additionally, please report and analyze the actual inference time, training time, and memory usage, along with a comparison to baseline methods (without adding BLDA).
For weakness 2, could you integrate BLDA into recent UDA segmentation methods [A], [B], [C], and [D]?
The mentioned works are highly relevant but lack citations in this paper. Could you update Section 2.1 (Related Work) to include all necessary references?

[A] Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation

[B] CDAC: Cross-domain Attention Consistency in Transformer for Domain Adaptive Semantic Segmentation

[C] Diffusion-based Image Translation with Label Guidance for Domain Adaptive Semantic Segmentation

[D] Learning Pseudo-Relations for Cross-domain Semantic Segmentation

2024-11-26

Thank you for your detailed response. Most of my concerns have been addressed, and I will therefore maintain my current positive rating.

Review

Rating: 5, Confidence: 4, 2024-11-07

This paper proposes a novel approach called BLDA to address class bias in domain adaptation for semantic segmentation. It first evaluates prediction bias across classes by analyzing the network's logits distribution. Then, a post-hoc method is designed to adjust the logits distributions after training. As the logits change, a real-time logits adjustment module is proposed that uses GMMs to estimate the logits distribution parameters online. The authors then introduce cumulative density estimation as shared structural knowledge to connect the source and target domains. An additional regression head in the network predicts the cumulative distribution value of samples, which represents class-discriminative capability, further enhancing adaptation performance on semantic segmentation. The experimental results show its effectiveness as an add-on module for selected existing domain-adaptive segmentation baselines.
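To make the idea of estimating GMM parameters online more concrete, here is a minimal sketch of a 1-D mixture tracked with stepwise EM, where per-component sufficient statistics are blended across mini-batches by an exponential moving average. This is an assumed update rule chosen for illustration; the review does not state the paper's exact scheme, and the class name `OnlineGMM1D`, the rate `lr`, and the example data are hypothetical.

```python
import math


class OnlineGMM1D:
    """1-D Gaussian mixture tracked with stepwise (moving-average) EM updates."""

    def __init__(self, weights, means, stds, lr=0.05):
        self.lr = lr  # hypothetical smoothing rate, not taken from the paper
        # Running sufficient statistics per component: E[r], E[r*x], E[r*x^2].
        self.s0 = list(weights)
        self.s1 = [w * m for w, m in zip(weights, means)]
        self.s2 = [w * (s * s + m * m) for w, m, s in zip(weights, means, stds)]

    def params(self):
        """Recover (weights, means, stds) from the running statistics."""
        total = sum(self.s0)
        weights = [a / total for a in self.s0]
        means = [b / a for a, b in zip(self.s0, self.s1)]
        stds = [math.sqrt(max(c / a - m * m, 1e-6))
                for a, c, m in zip(self.s0, self.s2, means)]
        return weights, means, stds

    def update(self, batch):
        """Blend one mini-batch of logits into the running statistics."""
        if not batch:
            return
        weights, means, stds = self.params()
        k = len(weights)
        b0, b1, b2 = [0.0] * k, [0.0] * k, [0.0] * k
        for x in batch:
            # E-step: responsibility of each component for x.
            dens = [w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
                    for w, m, s in zip(weights, means, stds)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            for j, d in enumerate(dens):
                r = d / total
                b0[j] += r
                b1[j] += r * x
                b2[j] += r * x * x
        n = len(batch)
        for j in range(k):
            # Moving average of the statistics; s0 is floored to avoid division by zero.
            self.s0[j] = max((1 - self.lr) * self.s0[j] + self.lr * b0[j] / n, 1e-8)
            self.s1[j] = (1 - self.lr) * self.s1[j] + self.lr * b1[j] / n
            self.s2[j] = (1 - self.lr) * self.s2[j] + self.lr * b2[j] / n


# Made-up example: a single-component model tracking logits centred on 3.0.
gmm = OnlineGMM1D(weights=[1.0], means=[0.0], stds=[1.0])
for _ in range(200):
    gmm.update([2.0, 3.0, 4.0])
weights, means, stds = gmm.params()
```

Each update costs time linear in the batch size times the number of components and keeps only three numbers per component, which is one way such an estimator could stay cheap during training.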

Strengths

This paper provides a new way to measure class distribution changes in semantic segmentation through the logits distribution.
The proposed module could easily be applied to existing UDA methods for semantic segmentation and potentially has broad use in this area.
The proposed module is generally effective on most of the classes in the two benchmark tasks.
The visual aids are good: they provide intuition for the motivation and demonstrate the effectiveness of the proposed module.

Weaknesses

The proposed method relies on the logits distribution. However, this distribution can be affected by data quality and model architecture, which may in turn reduce the accuracy of the bias assessment.
For a DA segmentation task, efficiency is a serious concern. The adaptation process already costs considerable time and computational resources, and the proposed method seems to exacerbate this with multiple GMMs. An efficiency study including wall-clock time or other efficiency measurements would help to discuss the trade-off between class-balanced performance and the actual cost.
If the anchor distribution is far from the true distribution of the target domain, logits alignment may be suboptimal; that is, this component may not work when the domain gap is large.
Since what is proposed is a module rather than a complete algorithm, its effectiveness should be confirmed on a considerably larger number of baseline methods. However, only a few are studied, and comparisons are made only for Transformer-based methods. I would recommend evaluating on more baselines such as [1][2][3] and backbones (such as DeepLab v2 and DeepLab v3+, for methods such as ProDA) to confirm its effectiveness, especially those with more severe class-imbalance issues.
There exist a large number of methods and loss functions targeting class imbalance (whether or not designed for semantic segmentation). Some should be discussed in the related work and some require experimental comparison, but only a few are listed and discussed.
Since the classes have been categorized as over- or under-predicted, grouping them in the experiments and analysis would make it easier to understand the module's effectiveness on classes with different characteristics.

I will adjust my score up or down based on the authors' reply.

[1]. Domain adaptive semantic segmentation by optimal transport

[2]. DiGA: Distil to Generalize and then Adapt for Domain Adaptive Semantic Segmentation

[3]. Prototypical contrast adaptation for domain adaptive semantic segmentation

Questions

See the weakness section.

AC Meta-Review

2024-12-11

The work proposes a novel approach, BLDA, to tackle class bias in unsupervised domain adaptation for semantic segmentation. The method analyzes logits distributions to assess class imbalance, employs Gaussian Mixture Models (GMMs) to adjust logits online, and utilizes cumulative density estimation to align source and target domains. Extensive experiments demonstrate its effectiveness as a plug-and-play module, with improvements in segmentation performance across diverse datasets and baselines. Strengths of the paper include its clear motivation and comprehensive experimentation. However, the novelty of the proposed approach is somewhat limited due to similarities with prior works that use GMMs or anchor-based approaches. Reviewers also questioned its computational efficiency and noted a lack of validation on larger or more diverse benchmarks. The authors have proactively addressed most concerns about the experiments, yet the core contributions remain too marginal to reach the publication bar of ICLR.

Additional Comments on Reviewer Discussion

During the discussion, reviewers raised concerns about novelty, computational cost, and generalizability. Specific issues included the similarity to prior GMM-based methods, insufficient evaluation on recent baselines, and limited benchmarks. The authors responded with detailed explanations, providing theoretical complexity analysis, efficiency improvements, and validation on additional datasets such as VIPER and BDD. These efforts demonstrated the method’s practical applicability and clarified its unique contributions to addressing class bias in UDA.

However, reviewers such as 9gg7 and Z47K remained unconvinced, noting the incremental novelty and suboptimal choice of benchmarks. Despite thorough rebuttals and additional experiments, the reviewers kept their initial ratings due to doubts about the paper's broader impact and relevance to current SOTA methods. These considerations ultimately led to the decision to reject, while acknowledging the potential of the work with further development and validation.

Final Decision: Reject

2025-01-22

Reject