PaperHub

Withdrawn · 3 reviewers · Average rating: 3.7/10 (min 3, max 5, std 0.9)
Ratings: 3, 5, 3 · Correctness: 2.3 · Contribution: 1.7 · Presentation: 2.0

ICLR 2025

Exploring the Causal Mechanisms: Towards Robust and Explainable Algorithm Selection

OpenReview · PDF
Submitted: 2024-09-27 · Updated: 2024-11-22
TL;DR

We introduce causality to describe the underlying mechanisms of the algorithm selection task, and design a robust and explainable algorithm selection model.

Abstract

Keywords
Algorithm Selection · Automated Machine Learning · Robustness · Explainability

Reviews and Discussion

Review 1 (Rating: 3)

This paper presents a new method for robust and explainable algorithm selection using causality. However, both the clarity and the novelty are lacking.

Strengths

  1. Robust and explainable algorithm selection is important and of interest.
  2. Incorporating causality into this area is appealing.

Weaknesses

  1. Main issue: clarity. The overall presentation is poor, as the authors do not clearly highlight their contribution or distinguish their work from existing work. For example, counterfactual explanation is well motivated, and its causal versions are also very popular in previous work. However, the authors claim that "we measure the minimal intervention from the perspectives of explanation complexity and explanation strength," which is odd, as a framework of minimal intervention for counterfactual explanations was already established in NeurIPS 2021.
  • (a) What is the physical meaning of AF and PF? Can you provide a more detailed clarification?
  • (b) What is the core task of this paper?
  • (c) A clearer version of the paper is required, with a definition of algorithm selection, the objective of the task, surrounding definitions, and running examples.
  2. Lack of novelty.
  • (a) Why is causality required? I am not convinced by your illustration. Is it introduced just to deal with distribution shift? If so, please provide a formal characterization of the shift. If not, please justify this point in detail.
  • (b) Searching for DAGs in continuous spaces is a popular approach, and counterfactual explanation is well studied. Please justify your contribution.
  3. Minor issues. You cannot define the conditional distribution simply as P(AF | PF), since PF is itself a random vector. Please use stricter and more formal notation in the theoretical analysis.
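As an illustration of the stricter notation this point asks for (a minimal sketch; the realization symbols pf and af are assumptions here, not taken from the paper):

```latex
% PF and AF are random vectors; pf and af denote their realizations.
P\big(\mathbf{AF} = \mathbf{af} \,\big|\, \mathbf{PF} = \mathbf{pf}\big)
  = \frac{P\big(\mathbf{AF} = \mathbf{af},\, \mathbf{PF} = \mathbf{pf}\big)}
         {P\big(\mathbf{PF} = \mathbf{pf}\big)}
```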

Questions

See Weaknesses.

Review 2 (Rating: 5)

This paper investigates the problem of selecting the optimal algorithm for a particular problem instance. Traditionally, algorithm features are predicted from problem features, and correlation-based machine learning methods can be applied to this end. However, such methods are vulnerable to data bias and distribution shift. To address these issues and improve transparency, this paper introduces causal structure learning to explore the underlying mechanism of algorithm selection. The experimental results show that the proposed CausalAS method achieves robustness to distribution shift and provides explainability through a causal graph and counterfactual explanations.

Strengths

The proposed method achieves robustness to distribution shift, which is an important quality for the practical application of such methods. Empirical results show the effectiveness of CausalAS in different scenarios of distribution shift, and the improvement margin is significant. Based on the intermediate product (i.e., the causal graph), the CausalAS method provides two kinds of explanations, which is critical to the transparency of the method. Together, these two properties contribute to the trustworthy application of the method.

Weaknesses

1. I think the novelty of the proposed method is limited. The proposed method is similar to that in [1]: the design of the loss function and the given assumptions are similar to their counterparts in [1], and the problem formulation seems to be directly transferred from recommendation (i.e., a problem corresponds to a user and an algorithm corresponds to an item). Moreover, the authors did not cite this important reference.

[1] Yue He, Zimu Wang, Peng Cui, Hao Zou, Yafeng Zhang, Qiang Cui, Yong Jiang. CausPref: Causal Preference Learning for Out-of-Distribution Recommendation. In Proceedings of the ACM Web Conference (WWW), 2022.

Questions

  1. The authors claim to find the optimal algorithm. However, the candidate algorithms are only divided into selected (S=1) and not selected (S=0). I want to know whether there is only one selected algorithm; if not, which one is optimal?

  2. In the section "Demonstration of Explainability", only the feature indices are shown in Figure 4. Their semantic meaning is not given, which limits the explainability of the demonstration.

Review 3 (Rating: 3)

This work introduces causality to explore the underlying mechanisms of the algorithm selection problem. Based on Pearl's causal framework, it proposes a structural equation model (SEM) built on a causal DAG over problem features and algorithm features. A neural network-based method is proposed to fit the model by minimizing a mixture of reconstruction, sparsity, acyclicity, and selection losses. As demonstrated both in the text and in the experiments, the method is characterized by its robustness under distribution shift and its explainability towards understanding the mechanism between problem and algorithm features, and it outperforms other methods in most instances, especially when constructing dense causal graphs.
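For concreteness, the following is a minimal sketch of what such a composite objective could look like, assuming a NOTEARS-style acyclicity penalty and a learned adjacency matrix A; the function name, weights, and exact form of each term are illustrative assumptions rather than the paper's implementation.

```python
import torch
import torch.nn.functional as F

def composite_loss(x, x_hat, A, sel_logits, sel_labels,
                   w_sparse=0.1, w_acyclic=1.0, w_select=1.0):
    """Weighted sum of reconstruction, sparsity, acyclicity, and selection
    losses. Illustrative sketch only, not the paper's exact formulation."""
    d = A.shape[0]
    # Reconstruction: how well the structural equations reproduce the features.
    recon = F.mse_loss(x_hat, x)
    # Sparsity: L1 penalty encouraging a sparse causal graph.
    sparse = A.abs().sum()
    # Acyclicity: NOTEARS-style penalty h(A) = tr(exp(A * A)) - d,
    # which is zero iff the weighted graph is a DAG.
    acyclic = torch.trace(torch.matrix_exp(A * A)) - d
    # Selection: binary cross-entropy on selected (S=1) vs. not selected (S=0).
    select = F.binary_cross_entropy_with_logits(sel_logits, sel_labels)
    return recon + w_sparse * sparse + w_acyclic * acyclic + w_select * select
```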

Strengths

  • This paper proposes a novel way to treat the algorithm selection problem from a causal perspective and makes us aware of the bias caused by distribution shift in algorithm selection. It also builds an adequate formulation of the problem within Pearl's causal framework.

  • Experimental results endorse the superior performance of this method in terms of accuracy, robustness, etc., compared with previous algorithm selection methods.

Weaknesses

Some parts of the paper are hard to follow, e.g.:

  • Causal structure learning: In Section 2.2, the paper incorporates the graph information of the DAG by designing the first layer of the NN as an adjacency matrix (see the sketch after this list). It is argued that this leads to a consistent model, but there seems to be no theory or reference to prove this consistency.

  • Loss function: The loss function is designed as a weighted sum of four different losses. As these four losses measure completely different aspects, it is suggested to discuss the weights so as to make them comparable. Besides, if the graph is pre-specified, what is the use of the sparsity and acyclicity losses? If the graph is to be discovered, there should be an illustration of how the DAG is constructed so as to ensure that there is only a directed flow from problem features to algorithm features.

  • Do-calculus: The do-calculus notation in Section 3.2 needs to be clarified, e.g. in $do(\mathbf{PF} = \mathbf{PF} + \delta_{\mathbf{PF}})$ it should be clarified which PF stands for the random variable and which stands for its specific value.
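To make the first point above concrete, here is a minimal sketch of a first linear layer whose weight matrix plays the role of a (soft) adjacency matrix over the features; the class name and the masking choice are assumptions for illustration, not the paper's actual construction.

```python
import torch
import torch.nn as nn

class AdjacencyMaskedLayer(nn.Module):
    """First layer whose weight matrix doubles as a (soft) adjacency matrix.
    Illustrative sketch only; the paper's construction may differ."""
    def __init__(self, num_features: int):
        super().__init__()
        # A[i, j] is read as the edge weight from feature i to feature j.
        self.A = nn.Parameter(torch.zeros(num_features, num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Zero the diagonal so that no feature is its own parent.
        mask = 1.0 - torch.eye(self.A.shape[0], device=x.device)
        # Each output coordinate aggregates only the values of its parents.
        return x @ (self.A * mask)
```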

Overall, the paper is well motivated by incorporating causal frameworks and methods (do-calculus, SEM, causal structure learning) to deal with algorithm selection. However, it seems that more effort should be spent on filling in the details of the method and explaining its rationale.

Questions

The questions are sufficiently described in the weaknesses section.

Withdrawal Notice

We deeply regret the significant mismatch in expertise caused by the reviewer assignment process. However, we are grateful for the guidance and support provided by Area Chair EjdX. During our discussions with the AC, we were informed that "The reviewers have been selected based on their knowledge of causality and distribution shift." Unfortunately, it is evident that our paper focuses on algorithm selection, which differs significantly from the reviewers' expertise.

The lack of the necessary expertise has led to the reviewers struggling to understand the core contributions of our work; in some cases, it seems they do not even know what task the paper focuses on. It is neither practical nor appropriate to dedicate substantial space in the main body of a research paper to a tutorial on such a foundational topic, nor is it within the scope of an algorithm selection study to propose novel methods in causal learning.

Regrettably, the review comments we received have little relevance to the actual research focus of our paper. Given these circumstances, providing a rebuttal would be unproductive. Therefore, we have decided to withdraw our submission.