PaperHub
4.4
/10
Rejected5 位审稿人
最低1最高6标准差1.7
1
5
6
5
5
3.0
置信度
正确性2.4
贡献度2.0
表达3.0
ICLR 2025

Principle Counterfactual Fairness

OpenReviewPDF
提交: 2024-09-28更新: 2025-02-05

摘要

关键词
Counterfactual Fairness

评审与讨论

审稿意见
1

The authors propose principal counterfactual fairness, i.e. various forms of counterfactual fairness that are only considered for attributes that have no individual causal effect on the outcome of interest. They also propose a postprocessing algorithm to achieve principal counterfactual fairness, and derive statistical bounds.

优点

The authors relate several different definitions for counterfactual fairness in a common notation, and show their principal stratification can be seen as an extension of many prior definitions.

They tackle an interesting problem: the formalisation of which attributes ought to be protected, and which shouldn't.

缺点

W1. Unfortunately, I believe their problem formulation is flawed by design. The authors only consider the principal strata, i.e. partitions of the data according to a sensitive attribute, for sensitive attributes that causally don't affect the actual outcome. However, why would we ever care about enforcing fairness when that is not the case? Taking the example of the authors: if the task is to estimate whether someone is a good runner, why would we enforce any equality with respect to people with a leg disability, if we already assume that they will not actually be equally good runners? The authors make the same observation in the paper, but fail to discuss the simplest solution: just don't enforce any fairness definition for groups that have no causal effect on the outcomes. In fact, they show in Corollary 1 that their definition of principal counterfactual fairness simplifies to 'standard' counterfactual fairness if we assume that protected attributes indeed have no individual causal effect on all outcomes. However, wouldn't I always either assume the protected attributes have no individual causal effect, or just ignore the group? In fact, making this assumption is a core part of any causal model for fairness.

W2. "This criterion demands that any alterations made to the values of protected attributes do not result in changes to individual decision-making" -> This is highly unclear. Counterfactual fairness demands that changes in protected attributes do not causally result in different decisions, because counterfactual fairness allows changes in the features if the protected attributes change.

W3. Table 1 is also unclear, especially without reading the preliminaries first. How is D(0) different from (D = 1)? Is D(0) the 'do'-operator? I looked it up in Mitchell et al. 2021, and they use D under the do operation of setting A to either 0 or 1. Also, it could be clarified that the definitions of counterfactual parity and cond. counterfactual parity are taken from Kusner et al. 2017, but simply named differently by Mitchell et al. 2021.

W4. The definition of counterfactual equalized odds appears inaccurate. Here, the authors only condition on the product of Y and D = 0, which suggests that the condition is always Y = 1 and D = 0. However, this clearly only implements a form of equalized opportunity. In the actual paper by Mishler et al. 2021, they consider any Y as the condition, in addition to the constraint that D = 0.

W5. Some clarity issues: there are no line numbers, and the authors confuse the spelling of "principle" vs "principal".

问题

Q1. Why is definition 6 implied to be stronger than definition 5? It would seem neither generally implies the other.

Q2. It's unclear what the relation is between the definition by Mishler et al. 2021 and the definition by Imai and Jiang 2020. It would seem that the former is a special case of the latter?

Q3. Is the ignorability assumption realistic? A ind. of D(1) and D(0) is something we typically impose as a condition, not as an assumption.

审稿意见
5

In this paper, the authors propose a new fairness notion called principle Counterfactual Fairness (PCF). The motivation behind this notion is that algorithms only need to be fair to individuals whose protected attribute has no individual effect on the outcome of interest. The authors derive necessary conditions to assess whether an algorithm satisfies principle CF and propose a corresponding optimization-based evaluation method. They also introduce a post-processing algorithm to adjust unfair decisions. The effectiveness of the algorithm is validated on synthetic datasets and one real-world dataset.

Claim: My research experience in causality and counterfactual fairness is primarily based on the Structural Causal Model (SCM) framework. While I have some basic knowledge of the Potential Outcome (PO) framework, it may not be sufficient to fully assess the novelty or significance of the proposed methods, and my evaluation might be biased.

优点

  1. The writing is clear and the paper is easy to follow.
  2. Relaxing the original CF constraint so that it does not need to apply to all individuals offers an interesting perspective, and the motivation for this approach is well-presented.

缺点

Relationship to CF

Overall, I find it challenging to compare the proposed PCF with the original CF in [1]. Is it possible to define principle CF within the SCM framework? More specifically, what would the causal relationships between AA, DD, and YY look like in a causal graph?

Experiment

  1. My primary concern is about what the authors aim to justify through the empirical study and whether they achieve that goal. Regarding the first question, would it be useful to test previous CF methods to assess if they are indeed too restrictive for PCF? Regarding the second question, in the current draft, the authors show that their post-processing algorithm can improve CF and PCF. How can we be sure that existing fairness or CF methods wouldn’t perform better than the proposed approach?
  2. Could the authors provide justification for their choice of dataset? For instance, why not use datasets like Law or UCI Adult, which are more commonly used in CF literature [1][2][3][4]?
  3. What is the motivation behind using a causal discovery algorithm and creating subgroups based on its results? This part is unclear to me, particularly since the definition of PCF is not based on an SCM.

Significance

I cannot fully acknowledge the significance of the proposed framework due to the following reasons:

  1. As mentioned above, there is confusion about the comparison between CF and PCF.
  2. Concerns about the experimental design and its ability to justify the proposed framework’s or method’s contributions.
  3. It appears that the authors only provide a necessary condition for an algorithm to satisfy PCF, which relies on the minimum and maximum values of the solution to an optimization problem. I’m uncertain about the effectiveness of this criterion—what is the likelihood that an algorithm could pass this test without truly satisfying PCF? This aspect is not discussed, either theoretically or empirically.

[1] Kusner, M. J., Loftus, J., Russell, C., & Silva, R. (2017). Counterfactual fairness. Advances in neural information processing systems, 30.

[2] Zuo, Z., Khalili, M., & Zhang, X. (2023). Counterfactually fair representation. Advances in Neural Information Processing Systems, 36, 12124-12140.

[3] Kim, H., Shin, S., Jang, J., Song, K., Joo, W., Kang, W., & Moon, I. C. (2021, May). Counterfactual fairness with disentangled causal effect variational autoencoder. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 35, No. 9, pp. 8128-8136).

[4] Rosenblatt, L., & Witter, R. T. (2023, June). Counterfactual fairness is basically demographic parity. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 12, pp. 14461-14469).

问题

  1. What is the relationship between the proposed notion and Individual Fairness?
  2. In the conclusion section, the authors suggest that causal discovery might be helpful for PCF. Could the authors elaborate a bit more on this?
  3. Could the authors provide some interpretation of Theorem 2, which seems to be missing in the current draft?

伦理问题详情

N/A

审稿意见
6

This paper introduces principal counterfactual fairness, a new fairness concept designed for scenarios where only certain individuals' decisions should remain unaffected by changes to their protected attributes. While traditional counterfactual fairness requires that decisions remain consistent across all individuals regardless of protected attributes (such as race or disability), this paper argues that fairness should only apply when a protected attribute has no causal effect on the outcome. For example, leg disability should not influence college admission but could justifiably affect athlete selection. Using principal stratification from causal inference, the authors define fairness conditions for cases where the protected attribute does not causally impact the outcome of interest. They derive statistical bounds to assess fairness and present an optimization-based test to verify if an algorithm meets these criteria. Additionally, a post-processing technique is introduced to adjust decisions minimally, ensuring principal counterfactual fairness. The approach's effectiveness is demonstrated through experiments on both synthetic and real-world datasets, validating the proposed fairness notion and methods.

优点

The paper is very well written and demonstrates several significant strengths: First, the research question is both clearly articulated and compellingly motivated, addressing an important gap in the current fairness literature by focusing on when and for whom fairness constraints should apply in decision-making algorithms. Second, the paper conducts a thorough literature review, effectively situating the research within the broader context and clearly delineating how this work builds on and extends existing fairness concepts, particularly in relation to counterfactual fairness and causal inference.Third, the paper is a nice combination of theory and practical applications. The authors established some theoretical guarantees for the proposed fairness criterion, and proposed feasible approach to implement the method in practice. Additionally, the paper excels in balancing theoretical rigor with practical relevance. The authors not only introduce a novel fairness criterion—principal counterfactual fairness—but also provide robust theoretical guarantees for its implementation. They derive statistical bounds that make the fairness criterion both evaluable and actionable, and develop a practical, optimization-based evaluation method. Furthermore, the proposed post-processing approach offers a feasible solution for applying the fairness criterion with minimal changes to individual decisions. Through experiments on both synthetic and real-world datasets, the authors demonstrate the method's practical effectiveness and applicability, showcasing the real-world impact of the theoretical insights.

缺点

Despite its strengths, the paper has a few limitations. First, while the proposed principal counterfactual fairness criterion is an interesting extension of counterfactual fairness, the contribution feels somewhat incremental. The extension applies primarily to certain narrowly defined scenarios, which may limit its broader applicability and introduce additional challenges in estimation, particularly when defining principal strata for diverse datasets. Second, a notable limitation is the partial identification of principal counterfactual fairness. Since the fairness criterion is only partially identifiable, it is impossible to obtain fully unbiased point estimates, which raises concerns about the potential implications of estimation bias. However, the impact and magnitude of this bias on decision-making outcomes remain unclear, warranting further exploration. Finally, the generalizability of this method to a wide range of applications is uncertain. While the motivating examples provide some context, they are quite specific, and it is not clear how effectively this approach could be applied to varied real-world fairness challenges outside of these examples. Further empirical work and diverse application contexts would help clarify the scope and utility of this method.

问题

(1) A recurring error in the manuscript is the misspelling of "principal fairness" as "principle fairness." I recommend the authors conduct a thorough review to correct these instances.

(2) Given the partial identification of principal counterfactual fairness, it would be valuable to understand the impact of bias in parameter estimates. I suggest that the authors conduct additional numerical analysis to provide insights into how this bias might affect the interpretation and robustness of results.

(3) The paper frames the problem with separate decision and outcome variables. However, in many scenarios, the decision variable itself is the outcome (e.g., pass/fail outcomes based on admissions decisions). A discussion on how the framework applies in such cases could enhance the paper's generalizability.

(4) Several new fairness-related concepts are introduced and summarized in Table 1, but the theoretical results and empirical analysis are limited to one of these concepts. Clarifying this focus in the discussion would help manage reader expectations and frame the contributions more transparently.

(5) In the data analysis section, the paper shows percentage increases in counterfactual fairness (CF) and principal counterfactual fairness (PCF). A discussion of the magnitude and practical significance of these increases, especially in light of potential estimation biases, would be beneficial for readers.

审稿意见
5

The paper studies the question of which individuals and attributes should be protected. To do so, the paper proposes a principle counterfactual fairness notion and provides sufficient conditions for its existence. The authors then propose a post-processing approach to achieve principal counterfactual fairness. The paper concludes with experiments on both synthetic and real-world datasets.

优点

-- The paper studies an interesting and important problem.

-- The wiring is clear for the most part and the high-level ideas are easy to follow.

缺点

-- The theoretical results are not convincing. For example, the independence from the true label in Assumption 1 is strong. Why should the true label be independent of the sensitive attribute? Theorems 2 and 3's results require close estimates of π^\hat{\pi}, μ^\hat{\mu} but the paper does not discuss how these quantities can be estimated using samples.

-- In addition, the experimental results raise additional questions. What is the base value of CF/PCF before post-processing? The paper only shows the increment. Furthermore, if I am not mistaken, only exact fairness is considered. What if we allow approximation? Moreover, what is the effect of imposing fairness on accuracy?

问题

-- Please see above for questions about Assumption 1 and computing π^\hat{\pi}, μ^\hat{\mu}.

-- Please see above for suggestions on the experiments. To strengthen the paper, I suggest the authors include the base CF/PCF values, explore approximate fairness, and analyze the accuracy-fairness tradeoff in their experiments.

审稿意见
5

This paper combines principal fairness and counterfactual fairness, using principal stratification together with the individual level modeling of counterfactuals. It leverages fairly standard causal modeling assumptions (conditional ignorability) and estimation methods (doubly robust, outcome regression, inverse propensity weighting) to propose a constrained optimization problem for training a fair model, and also to propose a post-processing strategy to reduce the unfairness of a previously trained model.

优点

Rigorous treatment of an important problem.

缺点

  • The contribution is somewhat incremental since it combines two existing fairness definitions.
  • Assumptions like ignorability may limit the applications. I think it is more likely this fails in fairness applications specifically. An empirical observation from many social/health sciences is that sensitive attributes are often causally entangled with many/most other relevant variables, so as long as there are any unobserved variables those variables are likely to be confounders (hence ignorability would not hold).
  • Overall not important that there are typos of "principle" instead of "principal," but the one in the title of the paper might be important.

问题

  • Can something be done to relax the assumption of ignorability?
  • How can we know if the given application is one where principal counterfactual fairness should be used instead of, say, counterfactual fairness?
AC 元评审

This paper introduces "Principal Counterfactual Fairness" (PCF), a novel fairness concept that restricts counterfactual fairness to individuals whose protected attributes have no causal effect on the outcome. The paper develops theoretical guarantees, proposes statistical bounds for fairness verification, and introduces a fairness intervention for PCF.

The reviewers recognized that the paper addresses an important aspect of fair machine learning. Still, they noted several limitations with the current submission, ranging from limitations in the paper's fundamental assumptions and practical impact (H1Gw), issues with the experimental results (UL6f, AVJH), and raised questions regarding the theoretical contributions. The authors did not submit a rebuttal, so these questions remained unaddressed.

审稿人讨论附加意见

The authors did not submit a response to the reviewer's comments, so reviewers stood by their initial assessment.

最终决定

Reject