Density Ratio Estimation-based Bayesian Optimization with Semi-Supervised Learning

Jungtaek Kim


Submitted: 2024-09-26 | Updated: 2025-02-05

Abstract

Keywords

Bayesian optimization; Density ratio estimation-based Bayesian optimization; Bayesian optimization with semi-supervised learning

Reviews and Discussion

Review

Rating: 5 | Confidence: 4 | 2024-10-22

The paper proposes a novel Bayesian optimization (BO) method called DRE-BO-SSL, which combines semi-supervised learning (SSL) with Density Ratio Estimation-based BO (DRE-BO). It addresses the common issue of overconfidence in classifiers used in DRE-based BO methods like BORE and LFBO, particularly when early-stage data is limited. By incorporating SSL techniques—specifically estimating pseudo-labels through label propagation and label spreading—the method refines class label predictions and improves classifier accuracy. This approach enhances the exploration-exploitation trade-off, mitigating the tendency to over-exploit due to overconfident classifiers. Instead of fitting a regressor to observed data, DRE-BO-SSL uses a classifier to guide the search, computing the acquisition function in terms of class probability ratios. Comparative experiments on synthetic and real problems demonstrate that DRE-BO-SSL outperforms traditional Gaussian Process-based BO and recent density ratio-based approaches across a wide range of tasks. The empirical results show that integrating SSL techniques into DRE-based BO effectively improves performance by leveraging unsupervised data sampling to address overconfidence issues.
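The mechanism summarized above (threshold the observations into two classes, propagate labels to unlabeled points, then score candidates by class probability) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the quantile-based threshold, the RBF affinity, and the kernel-weighted out-of-sample step are all assumptions made for illustration, and plain label propagation with hard clamping is used as a stand-in for the paper's SSL component.

```python
import numpy as np


def rbf(A, B, gamma):
    # Pairwise RBF affinities exp(-gamma * ||a - b||^2).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)


def class_one_probability(X_lab, y_lab, X_unlab, X_cand,
                          zeta=0.33, gamma=20.0, n_iter=200):
    """Estimate p(z=1 | x) at candidate points (hypothetical helper)."""
    # Assumption: z = 1 marks the best zeta-fraction of observations
    # (minimization), using the zeta-quantile of y as the threshold.
    threshold = np.quantile(y_lab, zeta)
    z = (y_lab <= threshold).astype(float)

    n_lab = len(X_lab)
    X_all = np.vstack([X_lab, X_unlab])

    # Label distributions over classes (0, 1); unlabeled rows start uniform.
    clamp = np.column_stack([1.0 - z, z])
    F = np.full((len(X_all), 2), 0.5)
    F[:n_lab] = clamp

    # Row-normalized affinity matrix acts as the propagation operator.
    W = rbf(X_all, X_all, gamma)
    T = W / W.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        F = T @ F
        F[:n_lab] = clamp  # hard clamping of the observed labels

    # Assumption: candidates are scored by a kernel-weighted average
    # of the propagated label distributions.
    S = rbf(X_cand, X_all, gamma) @ F
    return S[:, 1] / S.sum(axis=1)
```

Under these assumptions, the next query would be the candidate with the largest returned probability, since in DRE-based BO the acquisition value is driven by the class probability.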

Strengths

The proposed method is novel and represents a good contribution to Density Ratio Estimation-based Bayesian Optimization (DRE-based BO). By using semi-supervised learning in DRE-based BO to address the overconfidence issue, the authors offer a promising approach with nice empirical results. The concept of employing semi-supervised learning is intriguing and shows potential in solving the general overconfidence problem in DRE.

Weaknesses

While the proposed method is novel, several significant issues need to be addressed:

There is no clear explanation of how the overconfidence problem is resolved through the use of semi-supervised learning. Although Appendix C attempts to elucidate the reasoning or mechanism by which semi-supervised learning mitigates overconfidence, the explanation lacks clarity and fails to provide a compelling rationale. In fact, Figure 12 appears to indicate that unlabeled data may not be necessary, which contradicts the premise that semi-supervised learning is essential for addressing the overconfidence issue.
As the authors themselves acknowledge, the proposed method does not offer theoretical guarantees within the context of Bayesian Optimization. This omission raises concerns about the method's validity and applicability.

These shortcomings suggest that the paper does not adequately justify the need for the proposed method, nor does it thoroughly explain how the method effectively addresses the identified problem. Addressing these issues would strengthen the contribution and impact of the work.

Another issue, rather minor compared to the above motivation issue, is that the presentation of the paper should be improved. For example, \zeta is used in the Introduction before it is properly defined in Section 3.

Questions

\Sigma and the upper/lower bounds u, l of the truncated normal are not properly defined. How should these values be set or estimated?
How is this particular way of sampling related to the cluster assumption?
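To make the first question concrete, the sketch below shows one way such unlabeled points could be drawn. It assumes an isotropic covariance \Sigma = sigma^2 I and bounds l, u equal to the box of the search space, and it uses simple rejection sampling; none of these choices is confirmed by the paper, and the function name is hypothetical.

```python
import numpy as np


def sample_truncated_normal(n, lower, upper, mean, sigma, seed=None):
    """Draw n points from N(mean, sigma^2 I) restricted to [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower, upper, mean = (np.asarray(v, dtype=float)
                          for v in (lower, upper, mean))
    kept, count = [], 0
    while count < n:
        # Propose from the untruncated normal, keep draws inside the box.
        draws = rng.normal(mean, sigma, size=(n, mean.size))
        inside = np.all((draws >= lower) & (draws <= upper), axis=1)
        kept.append(draws[inside])
        count += int(inside.sum())
    return np.vstack(kept)[:n]
```

Rejection sampling is exact here but becomes slow when the box covers little of the normal's mass, which is one reason the choice of \Sigma, u, and l matters in practice.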

Review

Rating: 3 | Confidence: 3 | 2024-11-02

The paper proposes semi-supervised learning (SSL)-based Bayesian optimization (BO). The basic idea is to use label propagation (or label spreading) to exploit unlabeled points. The authors claim that SSL yields a better classifier, which in turn should lead to a better solution.

Strengths

To my knowledge, a combination of the density ratio based BO and SSL has not been widely studied.
The paper is well-organized and easy to follow.

Weaknesses

The fundamental assumption of SSL is that p(X) provides some information about p(Y|X) (since the unlabeled data are sampled from p(X)), which has been discussed in classical texts such as [1]. However, there is no discussion or fundamental justification of the situations in BO where this hypothesis holds. In other words, the paper fails to give a rationale for why SSL is effective for BO.

[1] Chapelle, et al., Semi-Supervised Learning, MIT Press, 2006.

In fact, the relation between p(X) and p(Y|X) is highly unclear in the experimental settings, and it is therefore unclear why SSL is effective. In the synthetic problems, X is drawn from a uniform distribution that is unrelated to Y. In the tabular benchmarks and NATS-Bench, the candidate points form a uniform grid, according to the tables in the appendix. It is therefore quite unclear why unlabeled points are informative for SSL. In 64D MNIST, samples showing the same three digits may lie on a low-dimensional manifold. On the other hand, the three-digit number, which is the target to be optimized, would not be continuous in pixel space (e.g., 000 and 100 would be relatively close in pixel space, but their values of y differ greatly; in other words, proximity in X space is not fully informative about proximity in Y space). To me, the validity of 64D MNIST as a BO benchmark problem is not clear.
The SSL methods used in the paper are the most classical ones (proposed in 2002 and 2003); no more recent methods are discussed in detail.
How SSL avoids the overconfidence issue is not explained; the detailed mathematical mechanism is not shown.
It is unclear how the cluster assumption and f(z) are related. Since f(z) is a (truncated) standard normal, the cluster structure should be quite simple and unrelated to Y. Here again, I do not currently think that f(z) satisfies the requirements of SSL.

Questions

How is \zeta^{-1} p(z=1|x) / (p(z=1|x) + p(z=0|x)) derived from (2)?
How is the threshold for y determined from \zeta?
In 64D MNIST, the authors mention that the problem is to find the 'minimum' multi-digit number. Does this mean that there are multiple global minima (i.e., are all images showing '000' global minima)?
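For context on the first question above, the expression follows from Bayes' rule if equation (2) is the \zeta-relative density ratio used in earlier DRE-based BO work such as BORE. That reading of equation (2), and the notation p(z=1) = \zeta, are assumptions here, since the paper's equation is not reproduced on this page.

```latex
% Assumed form of (2): the zeta-relative density ratio, with p(z=1) = \zeta.
r_\zeta(x) = \frac{p(x \mid z=1)}{\zeta\, p(x \mid z=1) + (1-\zeta)\, p(x \mid z=0)}

% Bayes' rule: p(x \mid z) = p(z \mid x)\, p(x) / p(z), hence
\zeta\, p(x \mid z=1) = p(z=1 \mid x)\, p(x), \qquad
(1-\zeta)\, p(x \mid z=0) = p(z=0 \mid x)\, p(x).

% Substituting and cancelling p(x):
r_\zeta(x)
  = \frac{\zeta^{-1}\, p(z=1 \mid x)\, p(x)}{p(z=1 \mid x)\, p(x) + p(z=0 \mid x)\, p(x)}
  = \zeta^{-1}\, \frac{p(z=1 \mid x)}{p(z=1 \mid x) + p(z=0 \mid x)}
  = \zeta^{-1}\, p(z=1 \mid x).
```

The last equality uses the fact that the two class probabilities sum to one, so under this reading the ratio is simply the class-1 probability scaled by \zeta^{-1}.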

Review

Rating: 3 | Confidence: 5 | 2024-11-03

This paper proposes the DRE-BO-SSL method, which introduces the concept of semi-supervised learning to address the over-confidence (over-exploitation) issue observed in density-ratio-based Bayesian optimization problems. In general Bayesian optimization, Bayesian regression models such as Gaussian Process models are often used; however, this study focuses on the density-ratio-based approach discussed in works such as [Bergstra et al. (2011)], [Tiao et al. (2021)], and [Song et al. (2022)]. The main contribution of the proposed method is the incorporation of traditional semi-supervised learning approaches, such as label propagation and label spreading, into density-ratio-based Bayesian optimization. According to the authors, this allows for resolving the over-confidence (over-exploitation) problem in density-ratio-based BO.

Strengths

Bayesian optimization based on a standard Gaussian Process model incurs high computational costs when the sample size is large. The density-ratio-based approach, as discussed in works such as [Bergstra et al. (2011)], [Tiao et al. (2021)], and [Song et al. (2022)], could serve as an effective alternative. In this sense, the topic addressed in this paper is important.
The authors' approach of identifying issues based on experimental observations, such as those in Figure 1, and formulating the problem generally to address these issues is practical, as it reflects real-world challenges in data analysis.
The paper provides a clear and concise overview of existing methods such as density-ratio-based Bayesian optimization and label propagation, making it accessible even for readers who are not well-versed in this field.

Weaknesses

This study aims to tackle the over-confidence and over-exploitation problem in density-ratio-based Bayesian optimization. However, it remains unclear why a semi-supervised learning approach based on label propagation would effectively address this issue. Similarly, the rationale behind using the sampling method in Equation (10) for unlabeled point sampling was not clearly explained, making the motivation difficult to grasp. If the authors rely solely on experimental evaluations without clear explanations, presenting these findings at a top-tier conference like ICLR may be premature; more thorough discussions and theoretical analyses are needed.
In my opinion, the proposed DRE-BO-SSL method lacks originality for top-tier conferences like ICLR, as it simply applies label propagation, a traditional semi-supervised learning technique, to existing density-ratio-based Bayesian optimization. Deeper considerations are necessary, such as whether other semi-supervised learning methods might prove ineffective or what outcomes might arise if a semi-supervised learning approach were applied to Gaussian Process-based Bayesian optimization.

Questions

None.

AC Meta-Review

2024-12-18

This paper presents a Bayesian Optimization (BO) method that integrates Semi-Supervised Learning (SSL) with Density Ratio Estimation (DRE)-based BO to address over-confidence issues in classifiers. By using SSL techniques like label propagation to refine class predictions, the method aims to improve the exploration-exploitation balance. The paper has some strengths, such as a novel combination of DRE-based BO and SSL to address the problem of over-confident classifiers. Furthermore, it shows competitive empirical performance against traditional BO methods. However, it also has notable weaknesses, such as the lack of a theoretical justification. The paper does not convincingly explain why SSL resolves the over-confidence issue. Moreover, it has empirical limitations, such as the questionable relevance of some experimental benchmarks, and some results suggest that unlabeled data may not always be beneficial. The SSL methods used are outdated, with limited exploration of alternatives or modern approaches. The paper also has some presentation issues: important details, such as thresholds and parameter definitions, are unclear. While the concept of integrating SSL into DRE-based BO is novel and shows some promise, the paper needs significant revisions to address theoretical gaps, improve clarity, and provide stronger empirical and methodological justifications.

Additional Comments on Reviewer Discussion

The authors did not provide a response.

Final Decision: Reject

2025-01-22
