DAG-SHAP: Feature Attribution in DAG based on Edge Intervention
Abstract
Reviews and Discussion
The paper proposes a novel feature attribution method called DAG-SHAP, which enhances existing Shapley value-based methods by focusing on edge interventions in directed acyclic graphs (DAGs). It aims to address the challenge of capturing exogenous contributions and externalities when features interact through causal relationships. The authors also propose an efficient approximation algorithm to compute DAG-SHAP values, validated by experiments on synthetic and real datasets. The contributions include advancing causal feature attribution methods, particularly by addressing limitations in existing Shapley-based models.
Strengths
The introduction of edge intervention in DAGs for feature attribution is a creative approach, differentiating DAG-SHAP from existing Shapley-based methods. The paper provides a thorough theoretical analysis, including proofs of the properties DAG-SHAP satisfies, such as causality and externality. The proposed approximation method for computing DAG-SHAP is practical for real-world applications, as demonstrated by experiments on both synthetic and real datasets. The experiments are well-designed and provide clear comparisons with existing methods, highlighting the advantages of DAG-SHAP.
Weaknesses
The paper proposes an approximation algorithm, but it lacks sufficient discussion on computational complexity and performance optimization, especially for large datasets or more complex DAG structures. The scalability and efficiency of the algorithm in such scenarios remain unclear. Additionally, while the paper validates DAG-SHAP’s effectiveness on simpler datasets, it does not include experiments on more complex models, such as deep neural networks with deeper DAG structures. Demonstrating its performance on high-dimensional or deeper DAGs would strengthen the research. Moreover, the computational cost of the algorithm for large-scale datasets is not fully addressed. As the number of edges and vertices increases, the complexity of calculating Shapley values grows significantly, and the paper would benefit from a deeper exploration of the trade-offs between efficiency and accuracy, especially in real-time or large-scale applications.
Questions
- Can the authors clarify the computational trade-offs of the approximation algorithm in large-scale datasets? Would it be practical for deep neural networks with highly complex DAG structures?
- The paper claims that DAG-SHAP captures exogenous contributions more accurately than previous methods. Could the authors provide additional examples or scenarios where existing methods fail in capturing exogeneity, aside from those already included in the synthetic dataset?
We are very grateful to the reviewer for the insightful comments. To address the concerns, we provide detailed point-by-point responses as follows.
Response to Weakness and Question 1. In Theorem 4, we show that the sampling error grows super-linearly as the number of edges in the DAG increases, so more samples are required to reach a given error threshold. Since the total sampling time equals the number of samples multiplied by the model's computation time per sample, the sampling time scales linearly with the model's computational complexity, and the primary factor affecting it is the number of edges in the DAG. DAGs in feature attribution are unlikely to appear in extremely high-dimensional data such as images; they are typically found in tabular data. In such feature attribution scenarios, the goal is to help users understand the contributions of and relationships between features. Users are typically interested in the contributions of a small number of key features, as these are most relevant to the core of the problem and help avoid being overwhelmed by irrelevant or less significant information. Users therefore tend to focus on a limited set of features to clearly understand their impact on the output. This is also why the experiments in related works such as the causal Shapley value and asymmetric Shapley value did not adopt large datasets.
When dealing with large datasets, parallel computing can effectively accelerate the feature attribution calculations. Feature attribution for different data points can be executed in parallel, and within the attribution of a single data point, different permutations can also be sampled in parallel. We extended synthetic dataset (a) so that each data point has 100 features and the DAG contains 200 edges. Specifically, we replicate the four features of synthetic dataset (a) 25 times, resulting in 100 features in total. Each replicated feature maintains its original causal relationship with the target, ensuring that the collective contribution of all features to the output remains consistent with that in the original structure. We use the same neural network structure as in the experiments in Section 5.1 and a server equipped with two AMD EPYC 9754 128-Core Processors, providing a total of 512 logical processors. For each data point, we conduct two independent rounds of sampling and compare the mean absolute error (MAE) between the sampling results. The experimental results show that with 128 sampling permutations per data point, the MAE is 5.17%, which is significantly smaller than the errors caused by the different feature attribution algorithms; for instance, the absolute error between off-SHAP, which has the smallest MAE among the baselines, and the benchmark is 17.24%. Using 128 threads to sample 128 permutations in parallel, the computation time was only 57.5 seconds. Additionally, we conducted experiments with 256 and 384 permutations, resulting in mean absolute errors of 4.23% and 3.71%, respectively. We added this experimental analysis in Appendix Section D.3.
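To make the parallel permutation sampling concrete, below is a minimal, illustrative sketch (our own, not the paper's actual implementation; the function and variable names are placeholders) of how permutations could be sampled in parallel worker processes and averaged into Monte Carlo attribution estimates. For brevity it samples unrestricted permutations; DAG-SHAP would additionally restrict sampling to valid edge orderings.

```python
# Illustrative sketch only (names are ours, not the paper's implementation):
# Monte Carlo estimation of per-edge attributions by sampling edge permutations
# in parallel worker processes.
from concurrent.futures import ProcessPoolExecutor
import numpy as np

def one_permutation(args):
    seed, n_edges, utility = args          # utility must be a picklable (top-level) function
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_edges)
    contrib = np.zeros(n_edges)
    active, prev = set(), utility(frozenset())
    for e in order:
        active.add(int(e))
        cur = utility(frozenset(active))   # utility of the current edge subset
        contrib[e] = cur - prev            # marginal contribution of edge e
        prev = cur
    return contrib

def parallel_edge_attributions(n_edges, utility, n_perms=128, n_workers=8):
    args = [(seed, n_edges, utility) for seed in range(n_perms)]
    with ProcessPoolExecutor(max_workers=n_workers) as ex:  # run under a __main__ guard
        results = list(ex.map(one_permutation, args))
    return np.mean(results, axis=0)        # average marginal contribution per edge
```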
We conclude that the computational cost of DAG-SHAP is acceptable for most real-world applications.
Response to Q2. A real-world example can be found in medical diagnosis, where X1 represents a patient’s genetic mutation (e.g., a BRCA1 mutation), X2 represents the level of a protein biomarker detected in blood tests (e.g., the overexpression of a specific protein), and Y represents the predicted cancer risk. In this causal structure, X1 directly affects X2, and X2 acts as a mediator that transmits the effect of X1 to Y. Here, X2 has no independent exogenous contribution, and its impact on Y is entirely derived from X1. Traditional feature attribution methods may fail to capture this structure correctly, as they often ignore dependencies in causal pathways. These methods might assign a non-zero independent contribution to X2 or fail to distinguish between the indirect and direct contributions of X2, leading to incorrect attributions. In contrast, DAG-SHAP leverages the causal graph to accurately identify X2 as a variable that solely transmits the effect of X1. It assigns zero exogenous contribution to X2 and attributes the effect appropriately to X1. This precise attribution is critical in fields such as medical diagnosis and genetic research, where misjudging the importance of mediator variables can have significant implications.
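As a purely illustrative complement (our own toy simulation, not taken from the paper), the snippet below encodes such a chain X1 -> X2 -> Y in which the mediator X2 has no exogenous noise of its own; intervening on the edge X1 -> X2 removes all variation in Y, which is the sense in which X2 has zero exogenous contribution. The functional forms are assumptions made for this sketch only.

```python
# Toy illustration (assumed functional forms, not from the paper): a chain
# X1 -> X2 -> Y where X2 is a pure mediator with no exogenous noise of its own.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x1 = rng.binomial(1, 0.1, size=n)           # e.g., presence of a mutation
x2 = 2.0 * x1                                # biomarker fully determined by x1
y = 1.0 / (1.0 + np.exp(-(x2 - 1.0)))        # predicted risk driven only by x2

# Intervening on the edge X1 -> X2 (feeding X2 a baseline value of X1) removes
# all variation in Y, confirming that X2 carries no exogenous signal.
x2_do = 2.0 * np.zeros(n)
y_do = 1.0 / (1.0 + np.exp(-(x2_do - 1.0)))
print(y.var(), y_do.var())                   # the second variance is exactly 0
```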
Thank you so much for reviewing our work and sharing your valuable feedback—we truly appreciate the time and effort you’ve dedicated. We’d love to know if our earlier responses have addressed your concerns. If there’s anything that remains unclear or requires further discussion, please don’t hesitate to let us know.
We appreciate the time and effort you have put into preparing this rebuttal. Your detailed responses have been very helpful in understanding the nuances of the research. I decide to stick with my initial rating.
This paper proposes a novel feature attribution method that utilizes the Directed Acyclic Graph (DAG) among features and the target variable Y to enhance Shapley value-based feature attribution methods by capturing the exogenous influence of a feature. The novelty lies in calculating Shapley values for the edges in the DAG and using aggregated edge-based Shapley values for feature attribution.
Strengths
• The paper is well-written, with an example that effectively illustrates the core ideas and solutions. The authors have put in significant effort to make the paper easily readable, which is greatly appreciated.
• The analysis of limitations in existing Shapley value-based feature attribution methods is comprehensive.
• The idea of estimating Shapley values for edges in a DAG is interesting.
Weaknesses
The primary limitation is that the proposed method relies on a known DAG. In most real-world scenarios, the DAG is unknown, and it generally cannot be learned directly from data. If a DAG is available, it is possible to estimate the causal effect of features on Y. Why not use a well-defined causal effect for feature attribution, such as the average causal effect of a feature on Y, instead of introducing a new DAG-SHAP value?
Additional Concerns Regarding Clarity
• The notation do(variable=v) in a DAG is well-defined in causal inference, but do(edge=e) is not. A formal definition of do(edge=e) is required, along with justification that this intervention yields a valid causal measure. Specifically, what causal measure does the DAG-SHAP value represent?
• The edge intervention causal Shapley value of a variable is defined as the sum of attribution values for its outgoing edges in Equation (4). My questions regarding this definition are as follows:
- Do the edges in paths that do not terminate at Y contribute to explaining Y? If not, how should these edges be excluded?
- In a general causal graph, edges can also originate from Y. How should the edge intervention causal Shapley value be calculated for such edges? Should we include Y when calculating the outgoing edges from Y?
• In the experiments, vertex splitting for causal structure is considered a desirable benchmark for attribution. I assume that this is one of a few desirable benchmarks. My point is that using the term Mean Absolute Error may be misleading, as there is no objective ground truth in feature attribution.
Minor:
• After Equation (1), the terms exogenous contribution of the feature and exogenous contribution of another feature should be explained to clarify the problem definition.
Questions
See weaknesses.
We are very grateful to the reviewer for the insightful comments. To address the concerns, we provide detailed point-by-point responses as follows.
Response to W1. We acknowledge that in certain applications, the DAG cannot be directly identified. However, many approaches in the field of causal inference, such as the Additive Noise Model, Post Nonlinear Model, Peter-Clark Algorithm, and Inductive Causality, can assist in constructing the DAG. DAGs are particularly suitable for modeling feature relationships in tabular data, and since we only require the causal directions between features, the task becomes significantly simpler. Moreover, DAGs are a fundamental and widely used structure applicable to numerous scenarios, making it crucial to ensure reasonable attributions within this framework.
Response to the Difference to Average Causal Effect. DAG-SHAP values are inherently local, meaning they explain feature contributions for individual predictions rather than providing a global average effect. This locality refers to the specific interaction of a feature with other features in a single data point, capturing how the combination of features contributes to the model’s output for that specific instance. In contrast, measures like Average Causal Effect (ACE) are global and reflect the average impact of a feature across the entire population, without considering specific feature combinations or the behavior of the model on individual predictions. By integrating causal knowledge into SHAP, DAG-SHAP retains this local perspective, making it particularly suitable for analyzing feature interactions and their contributions to a model’s prediction for individual data points.
Suppose we have a model that predicts the probability of a heart attack (Y) based on two features: X1 (BMI) and X2 (average sleep duration). Consider Sample A as the baseline (BMI = 25, Sleep Duration = 7 hours) and Sample B (BMI = 30, Sleep Duration = 5 hours) as the target case. Using ACE, we determine that relative to the baseline, a 1-unit increase in BMI raises the probability of a heart attack by 1%, while reducing sleep duration by 1 hour raises it by 2%. Based on these average effects, the total increase in probability for Sample B would be estimated at 9% (5 BMI units × 1% + 2 hours less sleep × 2%). However, because BMI and sleep duration may have synergistic effects, the true probability increase for Sample B could be higher due to their interaction. For instance, their combined influence might lead to an 11% increase in probability, rather than the additive 9% suggested by ACE. DAG-SHAP captures this interaction by distributing the contributions among the features based on their individual and joint effects. In this case, DAG-SHAP might allocate 6.5% to BMI and 4.5% to sleep duration, accurately reflecting the combined and interactive effects of these features on the prediction. This ability to provide local explanations for individual predictions is a key advantage of DAG-SHAP. It allows the method to account for interactions between features, distinguishing it from ACE, which provides only global average effects. This makes DAG-SHAP particularly effective for interpreting machine learning models with complex feature interactions.
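For concreteness, here is a small numeric check of this argument with an assumed toy model (the coefficients and the interaction term are hypothetical, and the resulting split differs slightly from the illustrative 6.5%/4.5% figures above): summing ACE-style per-feature effects gives 9%, the full change from baseline is 11%, and a Shapley-style split allocates the entire 11% across the two features.

```python
# Toy check with an assumed model containing a BMI x sleep-loss interaction.
def risk(bmi, sleep):
    return 0.10 + 0.01 * (bmi - 25) + 0.02 * (7 - sleep) + 0.002 * (bmi - 25) * (7 - sleep)

base, target = risk(25, 7), risk(30, 5)
additive_estimate = 5 * 0.01 + 2 * 0.02       # 0.09, the 9% from the text
true_increase = target - base                  # 0.11, because of the interaction

# Two-feature Shapley split of the full difference (average over both orders)
phi_bmi = 0.5 * ((risk(30, 7) - risk(25, 7)) + (risk(30, 5) - risk(25, 5)))
phi_sleep = 0.5 * ((risk(25, 5) - risk(25, 7)) + (risk(30, 5) - risk(30, 7)))
print(round(additive_estimate, 3), round(true_increase, 3),
      round(phi_bmi, 3), round(phi_sleep, 3))  # 0.09 0.11 0.06 0.05
```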
Response to Additional Concern 1. In causal inference, a traditional node intervention applies an external intervention to a variable, severing its causal relationships with all of its parent nodes and forcing its value to be fixed at the intervened value. In contrast, edge intervention takes a different approach: its goal is to intervene on a specific edge in a DAG, affecting only the causal transmission from the parent node to the specified child node, without altering the parent's influence on its other child nodes. We clarify the definition of edge intervention in Section 3.2. DAG-SHAP focuses on explaining the causal attribution for individual samples, evaluating how each feature contributes to the model output while considering interactions and dependencies among features from a cooperative perspective. Its uniqueness lies in the way it allocates causal effects, which can be referenced in the response to the differences with the Average Causal Effect (ACE).
Response to Additional Concern 2. Edges in paths that do not terminate at Y do not contribute to the explanation. We consider two scenarios. If such edges are included in the DAG, interventions on them do not affect Y, so their marginal contribution remains zero; since the attribution value is the average of marginal contributions, it also evaluates to zero. If such edges are excluded from the DAG, they are not involved in the computation, and their attribution value likewise remains zero. If we need to exclude these edges explicitly, we can preprocess the DAG before running the DAG-SHAP calculation by performing a depth-first search (DFS) to construct a new DAG that excludes them, as sketched below. DAG-SHAP assumes the availability of a directed acyclic graph. Exploring interventions and feature attributions on graphs with cycles is an interesting problem but falls beyond the scope of this paper; in fact, a clear solution is currently unavailable, making it a promising direction for future research.
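A minimal sketch of this preprocessing step (our own illustration; the helper name and data structures are not from the paper) is a backward search from Y over the reversed graph, keeping only edges whose head can still reach Y:

```python
# Keep only edges that lie on some directed path ending at the target Y.
def prune_edges(edges, target="Y"):
    """edges: iterable of (parent, child) pairs; returns edges on a path to target."""
    rev = {}
    for u, v in edges:
        rev.setdefault(v, []).append(u)
    reach = set()                 # vertices from which the target is reachable
    stack = [target]
    while stack:
        node = stack.pop()
        if node in reach:
            continue
        reach.add(node)
        stack.extend(rev.get(node, []))
    # an edge (u, v) can influence Y only if its head v can still reach Y
    return [(u, v) for (u, v) in edges if v in reach]

edges = [("X1", "X2"), ("X1", "Y"), ("X2", "Y"), ("X2", "X3")]  # X2 -> X3 is a dead end
print(prune_edges(edges))   # [('X1', 'X2'), ('X1', 'Y'), ('X2', 'Y')]
```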
Response to Additional Concern 3. In the revised paper, we clarified that the term "Mean Absolute Error" specifically refers to the mean absolute error calculated between the attribution results of different methods and the ASV attributions derived from the vertex-split graph.
Response to Minor Question. We have clarified this point in the revised paper to address potential ambiguities. The attribution of each feature to the model output must capture only the exogenous contribution of the feature, where the exogenous contribution refers to the portion of the feature's impact on the output that originates from itself. It ensures that the attribution reflects the intrinsic effect of the feature and its downstream causal influence on the outcome.
Thank you so much for reviewing our work and sharing your valuable feedback—we truly appreciate the time and effort you’ve dedicated. We’d love to know if our earlier responses have addressed your concerns. If there’s anything that remains unclear or requires further discussion, please don’t hesitate to let us know.
Thank the authors for answering my questions. While the explanations provided are appreciated, some of my concerns remain. For example, a DAG is not identifiable from the data, which highlights a significant limitation of the proposed method in practical applications. I will maintain my original rating.
Thank you for your time and valuable feedback. Regarding your concern that DAGs are not identifiable from data, we would like to address your concerns with the following points:
Identifying DAG from data. In the field of causal inference, many methods have been proposed to identify and estimate DAG structures from data. For example, the Peter-Clark algorithm and the Inductive Causality algorithm build DAGs by testing conditional independence relationships in the data. Other methods, such as the Greedy Inference from Graph-based models (GIE) algorithm, use a likelihood maximization approach to infer causal structures. These methods have been successfully applied in various practical scenarios, demonstrating high accuracy and reliability. Additionally, there are causal function-based models, such as the Additive Noise Model (ANM) and the Post-Nonlinear Model (PNL). The core principle of these models is to assume a functional relationship between the cause (X) and the effect (Y): if regressing Y on X yields residuals that are independent of X, while regressing X on Y yields residuals that are not independent of Y, then X is inferred to be the cause of Y. Furthermore, prior knowledge and hybrid methods are also commonly used for DAG identification from data.
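As a rough illustration of the ANM idea (our own sketch with assumed data; a proper implementation would use an HSIC or distance-correlation independence test rather than the crude score below), one can fit both regression directions and compare how strongly the residuals depend on the input:

```python
# Crude ANM-style direction check: the direction with residuals (nearly)
# independent of the input is taken as causal.
import numpy as np

def residual_dependence(x, y, degree=4):
    coeffs = np.polyfit(x, y, degree)          # flexible regression y ~ f(x)
    resid = y - np.polyval(coeffs, x)
    # crude dependence proxy: correlation between |residuals| and the input
    return abs(np.corrcoef(np.abs(resid), x)[0, 1])

rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=2000)
y = x ** 3 + rng.uniform(-0.5, 0.5, size=2000)  # assumed ground truth: x -> y

forward = residual_dependence(x, y)   # typically small: residuals ~ independent of x
backward = residual_dependence(y, x)  # typically larger: residuals depend on y
print(f"x->y score {forward:.3f}, y->x score {backward:.3f}")
```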
DAG in Feature Attribution. We would like to point out that there has been some existing research on DAGs applied to feature attribution, such as Shapley Flow[1], Recursive Shapley Value[2], Shapley ICC[3], and PWSHAP[4]. We believe that while these methods address the basic structure of a DAG between data features, existing feature attribution techniques still face challenges related to exogeneity, externalities, and causality. In response to these issues, we propose a DAG-SHAP method based on edge interventions. As a fundamental causal structure, addressing attribution issues in DAGs is meaningful. Moreover, our method does not require a specific causal structural equation, but only the directional relationships between features, making it a practical solution.
Challenges in DAG Identification. We understand your concerns about the difficulties in identifying DAGs, as there are cases where identifying a DAG from data may be challenging or even infeasible. However, we do not think that research based on DAGs is therefore impractical. Denying the possibility of identifying feature relationships in a DAG would, in fact, negate much of the progress made in the causal inference field, even though identifying DAGs falls outside the scope of our paper.
We hope that our response can alleviate some of your concerns regarding the prior knowledge of our paper. We are happy to continue the discussion if you have any further questions.
[1] Wang J., Wiens J., Lundberg S. Shapley Flow: A graph-based approach to interpreting model predictions. International Conference on Artificial Intelligence and Statistics, 2021.
[2] Singal R., Michailidis G., Ng H. Flow-based attribution in graphical models: A recursive Shapley approach. International Conference on Machine Learning, 2021.
[3] Janzing D., Blöbaum P., Mastakouri A. A., et al. Quantifying intrinsic causal contributions via structure preserving interventions. International Conference on Artificial Intelligence and Statistics, 2024.
[4] Ter-Minassian L., Clivio O., Diaz-Ordaz K., et al. PWSHAP: A path-wise explanation model for targeted variables. International Conference on Machine Learning, 2023.
We sincerely appreciate your feedback on our paper and have carefully addressed your concern regarding the identifiability of DAGs from data. Specifically, we have provided a detailed response covering the following aspects: existing methods for identifying DAGs from data, prior work on feature attribution using DAGs, and the challenges in DAG identification.
As the discussion deadline is approaching, we kindly ask if our response addresses your concerns or if any additional clarification is needed. Thank you for your time and support. We look forward to hearing from you.
This paper introduces a Shapley attribution method that is based on utilizing a given causal DAG by using edge interventions rather than interventions on nodes/variables directly. By this, the method aims at capturing both exogenous contributions and effects between features. The authors show in different artificial and real-world data sets that their method is better at capturing the relevance of features than related works when there are complex causal interactions between them.
Strengths
- Well-motivated solution and a mostly fair comparison with related work
- Identifies a key issue with existing methods
- Clearly lists desired properties, although the authors lack mathematical proofs that their method fulfills these
- Fair and comprehensive empirical validation on both synthetic and real datasets
Weaknesses
- Lack of theoretical guarantees and proofs that the proposed methods fulfill the desired properties
- Limited discussion of computational complexity and scalability for large DAGs, especially when they go beyond the rather small examples in the experiments
- A rather minor concern: The assumption of known causal structure may be unrealistic in many real applications
- While many important related papers are discussed, some fairly relevant papers are missing (see Questions section for more details)
Questions
The paper is generally well written and addresses an important problem. My two main concerns are, however, a lack of theoretical guarantees that the method truly fulfills the desired properties and some comparison with related work in a similar direction. For the theoretical aspects, I appreciate the list in the appendix, but these are mostly argumentative rather than mathematical proofs, which is the most important aspect when providing a novel attribution technique. Regarding the related work, you have a great comparison, but including a discussion about the following papers can be helpful:
- "Feature relevance quantification in explainable AI: A causal problem" by Janzing et al.
- "Quantifying intrinsic causal contributions via structure preserving interventions" by Janzing et al.
Both papers argue similarly for attributing causal influences. While the first work argues via hard-interventions, something you explicitly want to avoid, the second work has a similar approach towards attribution of influences. Your approach still seems to differ from these works, but a comparison can be insightful.
Generally, the simple example to illustrate the idea is helpful and maybe you could use this to explicitly show that the most related works would not capture the right contributions.
A few remarks/questions:
- The e_s notation with and without bold S is often hard to distinguish, maybe consider using a different notation that points out the difference more clearly.
- How does the computational complexity of DAG-SHAP scale with the number of vertices and edges in the graph? What are the practical limitations for large-scale applications?
- The work lacks some discussion about the practical modeling assumptions of your functions f. Some remarks on the impact of inaccurate modeling assumptions could be insightful.
After rebuttal
It seems the main contribution rather lies in using the MDN model to compute the interventional quantities, rather than in the Shapley definitions. Strictly speaking, using the formalism in the referenced works, one could use an MDN there as well to compute these quantities. For instance, while the intrinsic causal influence paper defines functional causal models of the form X_j = f_j(PA_j, N_j), this can be represented via an MDN as well, since explicit modeling of the structural function and noise is not required as long as the model represents a generative model, as in the MDN case. That being said, the novelty here stems particularly from using an MDN, rather than from the Shapley definitions.
I have carefully considered all of the authors responses and truly appreciate the time and effort in addressing my concerns. Nevertheless, I will stick to my initial rating as I still have concerns that the novelty lies in the estimation method rather than the Shapley part. That being said, I am not strongly advocating for rejection, and if the other Reviewers see significant novelty here, I am also okay with this. To reflect this, I have decreased my confidence score.
Details of Ethics Concerns
N/A
We are very grateful to the reviewer for the insightful comments. To address the concerns, we provide detailed point-by-point responses as follows.
Response to W1. We provide the proof that DAG-SHAP fulfills the desired properties in Appendix Section B.2, please refer to lines 786-917.
Response to W2. The computational complexity of DAG-SHAP is acceptable for most practical feature attribution tasks involving DAGs in real-world applications. DAGs in feature attribution are unlikely to appear in extremely high-dimensional data such as images; they are typically found in tabular data. In such feature attribution scenarios, the goal is to help users understand the contributions of and relationships between features. Users are typically interested in the contributions of a small number of key features, as these are most relevant to the core of the problem and help avoid being overwhelmed by irrelevant or less significant information. Users therefore tend to focus on a limited set of features to clearly understand their impact on the output. This is also why the experiments in related works such as the causal Shapley value and asymmetric Shapley value did not adopt large datasets. To illustrate that DAG-SHAP supports larger datasets, we conduct feature attribution on a synthetic dataset with 100 features and 200 edges; the detailed experimental results, experimental setups, models used, and computational hardware are all provided in Appendix Section D.3.
Response to W3. We acknowledge that there are applications where the DAG cannot be directly identified. However, many approaches in the field of causal inference, such as the Additive Noise Model, Post Nonlinear Model, Peter-Clark algorithm, and Inductive Causality, can assist in constructing the DAG. DAGs are particularly suitable for modeling feature relationships in tabular data, and since we only require the causal directions between features, the task becomes significantly simpler. Moreover, as a fundamental and widely used structure, DAGs are applicable to numerous scenarios, making it important to ensure reasonable attributions within this framework.
Response to W4. Please see detailed response to Q1.
Response to Q1. The paper "Feature Relevance Quantification in Explainable AI: A Causal Problem" primarily focuses on the distinction between calculating Shapley values using observational versus interventional conditional distributions. This distinction corresponds to our discussions on the On-manifold Shapley value and the Causal Shapley value in our work. Note that the interventional Shapley value discussed in their paper differs slightly from the Causal Shapley value. They do not explicitly highlight direct versus indirect contributions and the role of symmetric versus asymmetric sampling in Causal Shapley value. However, the total attribution values for each feature remain consistent between the interventional Shapley value and the symmetric causal Shapley value. As a result, the comparisons in our work with the Causal Shapley value inherently encompass the interventional Shapley value approach outlined in the referenced paper. Nevertheless, it is one of the first to highlight the role of causality in feature attribution.
Regarding the paper "Quantifying Intrinsic Causal Contributions via Structure Preserving Interventions", we appreciate its relevance and foundational contributions to the field. Below, we outline the similarities and differences between our work and theirs, which we have addressed in the revised manuscript. Our work focuses on attributing the contributions of features to a model's output, similar to the Causal Shapley value and On-manifold Shapley value. In contrast, their work aims to attribute uncertainty (e.g., variance or Shannon entropy) in the target to its influencing features. This distinction makes the objectives of the two approaches fundamentally different. Both approaches emphasize isolating the contributions of features independently of others. They describe these as "intrinsic" contributions, while we frame them as "exogenous contributions." However, our approach uses value-based interventions (manipulating feature values directly), while theirs employs structure-preserving interventions aimed at measuring uncertainty reduction. From our view, their approach can be seen as focusing on feature interventions that assess changes in uncertainty metrics. Their definition of Shapley-based ICC does not account for the sequence of interventions (e.g., parent-to-child order), which we discuss as potentially causing causal inversion in Structural Causal Models (SCMs). Thank you for highlighting these two important related papers. We have added discussions of them in the revised manuscript.
Response to Q2. All S should be bold, and we have fixed this issue, thanks. For the computational complexity, please refer to the response to W2. For the practical application of DAG-SHAP, an extremely large number of nodes and edges poses a limitation, as it not only increases computational costs but also makes it more challenging to obtain an accurate DAG. However, this is not solely a limitation of DAG-SHAP; it is a common challenge in both causal inference and SHAP-based feature attribution methods. At the same time, in feature attribution scenarios, the primary goal is to help users understand the contributions of and relationships between features. As a result, users typically focus on important features and their interactions, meaning that the structure of a DAG is usually not excessively complex. Consequently, the computational cost of DAG-SHAP is practical for most real-world applications. Regarding the model f, the impact of inaccurate modeling assumptions depends on the purpose of feature attribution. If the goal is to provide explanations that are faithful to the model's output, then inaccuracies in f do not introduce errors into the attribution results. However, if we aim to use DAG-SHAP for explanations that are faithful to the data—i.e., to interpret the contributions of features in the data-generating process—then f serves as a tool to approximate the unknown true data-generating function. In this case, f may introduce fitting errors. We have added a discussion on this point in the revised version of the paper.
Thank you so much for reviewing our work and sharing your valuable feedback—we truly appreciate the time and effort you’ve dedicated. We’d love to know if our earlier responses have addressed your concerns. If there’s anything that remains unclear or requires further discussion, please don’t hesitate to let us know.
I want to thank the authors for their response and addressing some of my concerns. I appreciate the discussion on the DAG complexity, but this was just a minor concern as I am aware of the practical limitations of Shapley based approaches.
However, I still don't see a clear novelty over these works:
- "Quantifying intrinsic causal contributions via structure preserving interventions" by Janzing et al.
- "On Measuring Causal Contributions via do-interventions" by Jung et al.
- "Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models" by Heskes et al.
If one defines an intervention on the node, then this is covered by "On Measuring Causal Contributions via do-interventions" or "Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models". If one considers structure-preserving interventions, then this is covered by "Quantifying intrinsic causal contributions via structure preserving interventions". The paper could benefit from a very clear distinction from these works and, seeing their close relatedness, some clear examples of why these methods would capture the "wrong" quantities (based on the paper's definition).
For edge interventions in DAG-SHAP, we enumerate the edges of the instance, and the marginal contributions of each edge in each permutation are as follows:
| Edge in permutation | Marginal contribution |
|---|---|
| in | 3/2 - 0 = 3/2 |
| in | 3/2 - 0 = 3/2 |
| in | 3/2 - 5/4 = 1/4 |
| in | 3/2 - 3/2 = 0 |
| in | 2 - 2 = 0 |
| in | 5/4 - 0 = 5/4 |
| in | 2 - 3/2 = 1/2 |
| in | 2 - 3/2 = 1/2 |
| in | 2 - 3/2 = 1/2 |
Summing the attributions of their outgoing edges gives the DAG-SHAP attribution values for X1 and X2. Since X1 influences X2 and also directly influences Y, it is clear that X1 is more important than X2, as half of the value of X2 is dependent on X1. Therefore, the attribution value provided by symmetric node interventions is misleading, because it includes the contribution of X1 in the marginal contribution of X2 evaluated against the empty set.
Structure-preserving interventions are based on the assumption that the distribution of the exogenous variables of each feature is known, and then intervene on the exogenous variable of each feature accordingly. However, in reality, the distribution of exogenous variables is unknown. The authors use an additive structural causal model to fit the data generation process: by subtracting the distribution of the parent node from the distribution of the child node, they obtain the exogenous variable distribution for the child node. However, much real-world data cannot be represented by an additive structural causal model, for example when the exogenous variable acts multiplicatively in the generation of the child feature. Fitting such data with an additive causal model cannot recover the right causal structure, so the attribution values calculated with this method would not be accurate. Consider, for instance, a case where the parent feature value is such that the child feature no longer varies with its exogenous variable (e.g., a zero parent value under multiplicative generation): the recovered exogenous variable is then forced to zero, implying that it has no effect on the data generation process, which clearly contradicts the true data generation process. DAG-SHAP does not require calculating the distribution of exogenous variables, nor does it assume that the influence of exogenous variables in the data generation process is linear. It only needs the causal directions between features. We think this is a significant distinction.
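The following toy simulation (our own, with assumed functional forms rather than the exact example above) illustrates the point: under multiplicative generation, the "exogenous" part recovered by an additive fit still depends on the parent and collapses to zero when the parent is zero.

```python
# Illustrative simulation: an additive fit cannot recover a multiplicative
# exogenous variable; the recovered noise depends on the parent.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(0, 1, size=5000)
e2 = rng.uniform(0, 1, size=5000)         # true exogenous variable of X2
x2 = x1 * e2                               # assumed multiplicative generation

# Additive-SCM style recovery: subtract a regression estimate of E[X2 | X1]
coeffs = np.polyfit(x1, x2, deg=3)
e2_hat = x2 - np.polyval(coeffs, x1)       # estimated "exogenous" part of X2

print(np.corrcoef(np.abs(e2_hat), x1)[0, 1])  # clearly non-zero: e2_hat depends on x1
print(e2_hat[x1 < 0.01][:5])                  # near x1 = 0, e2_hat is forced towards 0
```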
Thank you very much for your time and valuable feedback. We truly appreciate your thoughtful comments. If you have any further questions or need additional clarification, please don't hesitate to reach out.
We sincerely appreciate your feedback on our paper and have carefully addressed your concern regarding the lack of comparisons with other works. We included a detailed comparison section with a relevant example to highlight the distinctions and support our findings.
As the discussion deadline is approaching, we kindly ask if our response addresses your concerns or if any additional clarification is needed. Thank you for your time and support. We look forward to hearing from you.
I want to thank the authors for their detailed examples and explanations. Can the authors comment on the following concern as well: Regarding the referenced works, you mention:
their fundamental distinction lies in whether the goal is to explain the impact of features on the data generation process of Y or on a given predictive model f.
However, couldn't one simply replace the Y in the utility with the model f, i.e., obtain the model-based utility? In other words, if we want explanations with respect to a prediction model, we could simply add the model as another node in the graph (connected to all features), also in your work. And the same argument can be made the other way around; in the referenced works, one could simply replace f with Y and focus on the 'true' data generation process for explanation. So, this argument would only hold if the works particularly modify/require a modified DAG structure to explain the prediction models. This is e.g. the case in 'Feature relevance quantification in explainable AI: A causal problem', where causal relationships between features are simply represented as common confounders, because one only focuses on explaining the model rather than the data generation process. In the other works, however, the causal links between features remain. I might be missing an important difference here and hope the authors can clarify this.
Once again, thank you very much for the discussion.
Thank you for your response. We completely agree that using examples to clarify the distinction between our paper and several key related works will help highlight the novelty. To this end, we have used a few simple examples to explain why their methods may fail while DAG-SHAP still provides reasonable attributions. We have discussed this in the revised paper, specifically in the remark on lines 249-256. Due to space limitations, the detailed comparison is provided in Appendix Section D, lines 938-1036. Below is our specific comparison.
Regarding do-Shapley ("On Measuring Causal Contributions via do-interventions") and Causal Shapley ("Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models"), both use node interventions, and their fundamental distinction lies in whether the goal is to explain the impact of features on the data generation process of Y or on a given predictive model f; their utilities are defined accordingly. Shapley-ICC ("Quantifying intrinsic causal contributions via structure preserving interventions") employs structure-preserving interventions to measure the contribution of nodes to the reduction of uncertainty in the generation of Y. Structure-preserving interventions separate the exogenous variables within a feature from the influences of its parent nodes; interventions are then applied to each node's exogenous variables, so the intervention is still node-based. A key prior for this method is that the distribution of the exogenous variables must be known. We think it is important to highlight that the difference lies in the techniques of node intervention, structure-preserving intervention, and the edge intervention proposed in our paper, rather than in the attribution targets of these works (since these targets themselves already differ). Thus, we simplify the comparison by assuming that the model has perfectly learned the data generation relationship, i.e., the function of the model f is consistent with the generation function of Y. This allows us to avoid considering the distinction between the attribution targets of do-Shapley and Causal Shapley, treating both as node interventions. We also apply structure-preserving interventions in the context of explaining the model f's output or the generation of Y, making the goals of all methods consistent and facilitating comparison. For node interventions, we consider two cases: symmetric sampling node interventions and asymmetric sampling node interventions. Although do-Shapley does not discuss sampling order, Causal Shapley does.
Asymmetric sampling node intervention. We use the following example to explain why it may fail. X1 is generated from an exogenous random variable that is uniformly distributed; X2 is generated from X1 together with its own uniformly distributed exogenous variable; and Y is then generated from X1 and X2. In summary, X1 directly influences Y and indirectly influences Y through X2, while X2 influences Y with its own exogenous influence and transmits the indirect influence of X1. We aim to attribute values to each feature of a specific explained input with respect to the baseline [0, 0].
For asymmetric sampling node interventions, the only valid sampling permutation is the one in which X1 precedes X2. The marginal contributions of X1 and X2 in this permutation are shown in the table below:
| Feature in permutation | Marginal contribution |
|---|---|
| in | |
| in | |
Thus, the attribution value assigned to X1 by the asymmetric sampling node intervention is the same as the attribution value assigned to X2.
For edge intervention, the DAG of this example contains three edges: X1 → X2, X1 → Y, and X2 → Y. According to the definition of DAG-SHAP, there are three valid edge permutations, and the marginal contributions of each edge in each permutation are as follows:
| Edge in permutation | Marginal contribution |
|---|---|
| in | 0 - 0 = 0 |
| in | 0 - 0 = 0 |
| in | 1/2 - 0 = 1/2 |
| in | 1/2 - 0 = 1/2 |
| in | 1 - 0 = 1 |
| in | 0 - 0 = 0 |
| in | 1 - 1/2 = 1/2 |
| in | 0 - 0 = 0 |
| in | 1 - 1/2 = 1/2 |
Thus, DAG-SHAP assigns each feature the sum of the attribution values of its outgoing edges. As X1 directly influences Y and also indirectly influences Y through X2, it is clear that X1 is more important than X2. Therefore, the asymmetric sampling node intervention provides an unreasonable attribution, as it fails to account for the external contribution of X1.
Symmetric sampling node intervention.
We use a second example with the same causal structure among X1, X2, and Y: X1 is generated from a uniformly distributed exogenous variable, X2 from X1 and its own uniformly distributed exogenous variable, and Y from X1 and X2. We want to attribute a specific explained input with respect to the baseline set as [0, 0].
For symmetric node intervention, both permutations (X1, X2) and (X2, X1) are valid, and the marginal contributions of the features in the two permutations are shown in the following table:
| Feature in permutation | Marginal contribution |
|---|---|
| in | 3/2 - 0 = 3/2 |
| in | 2 - 2 = 0 |
| in | 2 - 0 = 2 |
| in | 2 - 3/2 = 1/2 |
Thus, each feature's attribution value under symmetric node intervention is the average of its marginal contributions over the two permutations.
Thank you for your timely and insightful response.
We think that feature attribution methods which require DAGs as prior knowledge are generally designed, from a technical perspective, to support explanations for both the data generation process and a given predictive model. For example, in [1], it is assumed that the data generation function cannot be obtained explicitly, and thus the data generation process is modeled by training a machine learning model that fits the generation function. The utility is then approximated using this model, which is trained specifically to fit the data generation process rather than optimized purely for prediction accuracy or similar metrics. However, the attribution calculation process in the two cases does not differ significantly. In [2], the calculation of the utility does not rely on an accessible model, but the authors also point out that if the generation of Y is a deterministic function of the features, their method reduces to measuring the contribution of features to that function. Therefore, if an attribution method accounts for the causal relationships between features, the primary difference between explaining the data generation process and explaining a given predictive model lies in how the utility is computed. However, if a feature attribution method does not rely on the causal graph structure between features, it cannot accurately determine the impact of removing certain features or their contribution to the label Y, because it cannot compute the utility according to the causal data generation process. Therefore, it cannot attribute from the perspective of data generation. This is why methods that support both types of explanations require an understanding of the causal relationships between features.
Once again, thank you for your response. If our clarification has not addressed your concern, please let us know, and we will clarify further.
[1] Sun Q., Xia H., Liu J. Data-Faithful Feature Attribution: Mitigating Unobservable Confounders via Instrumental Variables. The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024.
[2] Jung Y., Kasiviswanathan S., Tian J., et al. On measuring causal contributions via do-interventions. International Conference on Machine Learning, PMLR, 2022: 10476-10501.
Thank you for your time and valuable feedback, especially for helping us improve the proof of the properties satisfied by DAG-SHAP and for pointing out related papers we hadn't compared with.
As the discussion period is nearing its end, we want to check if our responses have addressed your concerns. Would you be willing to adjust the score based on the clarifications and additional comparisons provided?
Once again, thank you very much for your valuable insights and feedback.
Thank you for your reply.
if a feature attribution method does not rely on the causal graph structure between features, it cannot accurately determine the impact of removing certain features
I generally agree with this statement; however, the referenced works mentioned above would use the actual/correct causal structure and claim to compute the same (or at least closely related) interventional quantities (except for the intrinsic causal contribution paper, as this focuses on the noise). As you mention that these works require certain assumptions/restrictions to be able to compute the interventional quantities, can the authors briefly reiterate how these quantities are computed in their approach, as the paper (e.g., Algorithm 1) does not give detailed insights into the technical approach here? For example, how does line 11 in Algorithm 1 look exactly, seeing that features influence each other and one could not just take marginal distributions as in the "Feature relevance quantification in explainable AI" paper? I would have expected that either some parametric assumptions are required to model it as a structural causal model (as in the intrinsic causal influence paper) or some re-weighting is used (as in the do-Shapley paper).
Thank you for your response. Regarding how line 11 in Algorithm 1 is handled, we did not assume a structural causal model or use re-weighting methods. Instead, we employed a Mixture Density Network (MDN) to model the distribution of child-node feature values after intervening on parent-node feature values, and we then sample from the trained model, which requires fewer assumptions about the generative relationships between features. As such, our DAG-SHAP method only needs to know the causal directions between features in the DAG. In the experimental section, specifically between lines 377-393, we explain the technique used in the practical computation: "We utilize a Mixture Density Network to predict the distribution of child vertices after intervention on parent vertices within the input features, consistent with the approach used in causal Shapley value (Heskes et al., 2020)." Once we have the estimated values of all features after intervening on a subset of edges, we can input them into the explained model to estimate the utility, supporting the use of Monte Carlo methods to estimate attribution values for all edges. In future versions of the paper, we will add this specific technique to the description of Algorithm 1 for further clarification.
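For readers unfamiliar with MDNs, a minimal sketch of this kind of conditional model is given below (our own illustration in PyTorch; layer sizes, the number of mixture components, and training details are placeholders, not the configuration used in the paper). It predicts a Gaussian-mixture distribution of a child feature given (possibly intervened) parent values, is trained by minimizing the mixture negative log-likelihood, and supports sampling child values once the parents have been set.

```python
# Minimal Mixture Density Network sketch: p(child | parents) as a Gaussian mixture.
import torch
import torch.nn as nn

class MDN(nn.Module):
    def __init__(self, in_dim, n_components=5, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.Tanh())
        self.pi = nn.Linear(hidden, n_components)          # mixture logits
        self.mu = nn.Linear(hidden, n_components)          # component means
        self.log_sigma = nn.Linear(hidden, n_components)   # component log-stds

    def forward(self, x):
        h = self.backbone(x)
        return self.pi(h), self.mu(h), self.log_sigma(h)

    def nll(self, x, y):
        # negative log-likelihood of child values y under the predicted mixture
        logits, mu, log_sigma = self(x)
        log_pi = torch.log_softmax(logits, dim=-1)
        comp = torch.distributions.Normal(mu, log_sigma.exp())
        log_prob = comp.log_prob(y.unsqueeze(-1))           # (batch, n_components)
        return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()

    @torch.no_grad()
    def sample(self, x):
        # draw one child value per row of (intervened) parent values x
        logits, mu, log_sigma = self(x)
        k = torch.distributions.Categorical(logits=logits).sample()
        mu_k = mu.gather(-1, k.unsqueeze(-1)).squeeze(-1)
        sigma_k = log_sigma.exp().gather(-1, k.unsqueeze(-1)).squeeze(-1)
        return torch.normal(mu_k, sigma_k)

# Usage sketch: mdn = MDN(in_dim=2); minimize mdn.nll(parents, child) with Adam;
# then child_samples = mdn.sample(intervened_parents).
```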
Thank you for reviewing our paper. As the discussion period is drawing to a close, we would like to confirm whether our responses have addressed your concerns. Would you be open to adjusting your score based on the clarifications and additional comparisons we've provided?
Thank you once again for your valuable insights and feedback.
Thank you for reviewing our paper. We have made several revisions in response to your comments, but we noticed that the score has not changed. We are unsure whether this is because you may not have seen the updated responses or if there are other unresolved concerns. If you think the issues have been addressed, we kindly request that the score be updated. However, if there are still unresolved concerns, it is unfortunate that we are unable to know the reasons for the rejection, as we can no longer receive further responses from you. Thank you again for the time and effort you have dedicated to reviewing our paper.
This paper modifies the popular Shapley-based framework for feature importance to satisfy certain properties related to causal inference. These include: "causality", such that attributions detect features' causal interactions towards the response, "externality", such that attributions also consider causal paths to the response through other variables, and "exogeneity", such that attributions do not confound other variables' influence towards the response. It is shown that previous Shapley based approaches, including on/off-manifold Shapley, causal Shapley, and Shapley flow, fail one or more of these criteria due to their treatment of other variables and/or their strict reliance on topological ordering. Given the underlying causal graph, the proposed solution, DAG-SHAP, obeys the desired properties, by considering interventions at the level of edges, rather than vertices. An algorithm is presented for practical evaluation, which is empirically tested on one simulated data example and two real datasets, where DAG-SHAP performs better than the other Shapley-based alternatives.
Strengths
The development of causality-based feature importance methods is an important field, since many researchers want to gain insights about how variables interact, while most feature importance methods remain model-based, in order to optimize model performance rather than gain insights about relationships in the data.
Externality and exogeneity are interesting and relatively well-founded causal axioms. Axioms for feature importance, particularly with respect to "explaining the data", are quite early in their development and seem to be subject to frequent debate. Although I disagree that every causal feature importance method must obey externality to coherently explain how features explain the response, I see the appeal in guaranteeing it, in order to identify more layers of the causal structure.
Edge-based interventions are an interesting approach for obtaining coherent causal interpretations. The derivations for proving that DAG-SHAP obeys the desired properties (given that the causal graph is already known) appears to be sound.
Code is provided.
Weaknesses
It seems that like Causal Shapley, DAG-SHAP also relies on prior causal knowledge. In particular, it seems to rely on knowledge of the causal graph, since only valid topological orderings are considered when averaging edge permutations. This is an extremely strong assumption, as the entire motivation behind many causal based feature importance methods is to learn information about the unknown causal structure of the data. Indeed, it is not required by either MCI or UMFI. If this is the case, then this strong assumption should be emphasized in the paper, as DAG-SHAP would only be applicable if the causal structure is already known, and serve to provide more detailed causal interpretations, in this setting.
By restricting discussion and comparison just to other Shapley-based methods, this does not consider many approaches which seek to explain relationships in the data, beyond just Shapley-based approaches. For example, MCI (ICML 2021) and UMFI (AISTATS 2023) were developed to learn aspects like causal interactions from the data, rather than just the model itself.
The discussion of externality and exogeneity should be clarified earlier (for example, externality is only defined in Section 3, while there is ample discussion beforehand). I also think that the motivation behind these axioms can be strengthened. An intuitive toy example is provided, but there should be more insight into why these properties are important.
On a related note, the discussion of the empirical evaluation should relate more explicitly to the proposed theory. The bulk of the paper discusses externality and exogeneity and why the proposed method works, while others fail. Although these properties are clearly relevant in the provided empirical experiments, the discussion can be more focused to tie back to these ideas in depth. In its current form, I also find the explanation in the desirable attribution section (L429) to be lacking (more in questions section).
Nits:
I think that there is too much focus on previous Shapley methods in the main text (Section 2.2). I think that this can be moved to the appendix.
The use of \mathcal{N} for the main features/vertices {1, ..., N} does not seem standard, as this notation typically represents the set of natural numbers. I would recommend [n] or V.
Questions
Can the authors confirm the requirement of knowing the causal graph before running DAG-SHAP in order to satisfy causality, externality, and exogeneity? Do they think that this condition can be weakened while providing the same guarantees? If not, how do the authors see the motivation/practical benefit of the method, given the causal graph is already known?
Given externality, why is it that vertices with multiple paths to the response (ex. X1, X2 in the synthetic data example) are not given higher mean attribution score, but rather higher variance? This seems to suggest that in practice, analysts should look at the variance of attribution scores to rank features rather than just the means, and this seems a bit counter-intuitive. In practice, high variance can be indicative of many other factors, such as the quality of the data. As such, I remain unconvinced by the desirable attribution.
We are very grateful to the reviewer for the insightful comments. To address the concerns, we provide detailed point-by-point responses as follows.
Response to Assumptions and Comparisons in DAG-SHAP. We agree with the reviewer's observation that there exist methods that do not rely on knowledge of the causal graph, such as MCI or UMFI. However, due to the "no free lunch" theorem, these methods lack certain properties inherent to our approach. For example, MCI measures feature contribution based on the maximum marginal contribution a feature can bring, which cannot capture the right causal contribution. Consider the example in [1] with two dependent binary features where the function depends on only one of them. In this case, the other feature is irrelevant since it does not influence the function, and any meaningful measure of feature contribution should reflect this irrelevance. Methods like MCI, which evaluate a feature by its maximum marginal contribution over subsets, would nonetheless assign the irrelevant feature a non-zero attribution, so MCI cannot capture such dependencies. In fact, if the irrelevant feature is an intermediate node between the relevant feature and Y, MCI would yield the same result. Additionally, as stated in the properties provided in the MCI paper, MCI satisfies Super-efficiency and Sub-additivity but does not satisfy Efficiency and Additivity in the context of feature attribution. This indicates that the attribution values assigned by MCI to all features do not sum to the total effect contributed by all features collaboratively. Furthermore, when a feature is involved in multiple attribution tasks, its attribution value is not equal to the sum of its attribution values in each task.
The key difference between UMFI and MCI lies in the way marginal contributions are computed. UMFI removes the dependencies of the feature being attributed from other features before calculating its marginal contribution. As a result, it requires a causal graph to identify these dependencies. This is because, without such a graph, determining the dependencies would not be possible. Additionally, since UMFI relies on an approximately optimally preprocessed feature set for its computations, it does not satisfy properties like efficiency and additivity, which are inherent to the original Shapley value.
Our method can be used for explaining data, and whether a feature attribution method is employed to explain the data or the model output depends on the user's objective[2]. When our method is used for data explanation, the model is trained to approximate the unknown data-generating function from reality, serving as a tool for interpreting the data generation process. In attributing feature contributions to the generation of data labels, our method incorporates edge-based interventions. This ensures properties such as causality, exogeneity, and the efficiency and additivity axioms we discussed earlier. These properties, which are crucial for robust feature attribution, are not satisfied by methods like MCI and UMFI. We have added the discussion in the revised paper.
Response to Discussions on Externality and Exogeneity. We move the definitions to the beginning of section 3 for clarity.
Response to Empirical Experiments. We have added discussions on the relationship between the experimental results and the concepts of externality and exogeneity in the revised paper. Regarding the connection between externality and the experimental results, externality directly increases the variance in the attribution distributions of X1 and X2, which cannot be captured by the attribution mean. This is because the box-plot illustrates the distribution of attributions for each feature across all data points in the synthetic dataset, and our baseline input represents the overall distribution of the synthetic dataset. For a single data point, if a feature's value exceeds the baseline mean, it contributes positively; if it is below the baseline, it contributes negatively; and if it equals the baseline, it contributes zero. Feature attributions are therefore not always positive, and externality can also amplify the negative contribution of a feature when its value is small. The larger box range (variance) in the attributions of X1 and X2 reflects the greater influence of these two features: because they have more causal paths, their attributions increase or decrease more strongly depending on the scenario. The reason the mean cannot fully reveal this is that our generated data distribution approximates a uniform distribution, resulting in nearly equal numbers of data points with positive and negative contributions, which drives the overall mean toward zero. Additionally, the presence of externality is further evidenced by the smaller box range of X1 and X2 in the results of ASV. Since ASV does not account for externality, it underestimates the influence of these two features. This highlights the importance of externality in accurately capturing feature contributions.
Response to Nits. Following your suggestion, we have moved the discussion of previous Shapley methods to the Appendix and updated the notation to use V to represent the set of vertices.
Response to Q1. We confirm that knowing the DAG is required before running DAG-SHAP, and this condition cannot be weakened. However, the DAG we require specifies only the causal directions among features, which can be discovered with various causal discovery methods, such as the Additive Noise Model, the Post-Nonlinear Model, the Peter-Clark (PC) algorithm, and Inductive Causation. Our contribution lies in providing more reasonable explanations of feature contributions given the DAG, and the advantages of our method over other feature attribution approaches are discussed extensively in the paper. Here we discuss why DAG-SHAP is still necessary when causal knowledge can already be obtained from a known causal graph, rather than using purely causal methods such as computing the Average Causal Effect (ACE). The reason is that DAG-SHAP captures how the combination of features contributes to the model's output for a specific instance, which purely causal methods cannot achieve. For example, suppose we have a model that predicts the probability of a heart attack (Y) based on two features: X1 (BMI) and X2 (average sleep duration). Consider Sample A as the baseline (BMI = 25, Sleep Duration = 7 hours) and Sample B (BMI = 30, Sleep Duration = 5 hours) as the target case. Using ACE, we can determine that, relative to the baseline, increasing BMI by 1 unit raises the probability of a heart attack by 1%, while reducing sleep duration by 1 hour increases the probability by 2%. Based on these average effects, the total increase in probability for Sample B would be estimated as 9% (5 BMI units × 1% + 2 hours less sleep × 2%). However, because BMI and sleep duration may have synergistic effects, the true probability increase for Sample B could be higher due to their interaction; their combined influence might, for instance, lead to an 11% increase rather than the additive 9% suggested by ACE. DAG-SHAP captures this interaction by distributing the contributions among the features based on their individual and joint effects. In this case, DAG-SHAP might allocate 6.5% to BMI and 4.5% to sleep duration, accurately reflecting the combined and interactive effects of these features on the prediction. This ability to provide local explanations for individual predictions is a key advantage of DAG-SHAP: it accounts for interactions between features, distinguishing it from ACE, which provides only global average effects. This makes DAG-SHAP particularly effective for interpreting machine learning models with complex feature interactions.
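As a rough numerical companion to this example, the sketch below uses a hypothetical risk function (the synergy coefficient is our own assumption, so the resulting split differs slightly from the 6.5%/4.5% figures quoted above) to contrast the additive ACE-style estimate with a two-feature Shapley allocation under marginal intervention.

```python
def risk(bmi, sleep):
    """Hypothetical heart-attack probability: linear effects plus a small synergy term."""
    return (0.10                                   # assumed baseline risk at BMI=25, 7h sleep
            + 0.01 * (bmi - 25)                    # +1% per BMI unit (the ACE of BMI)
            + 0.02 * (7 - sleep)                   # +2% per lost hour of sleep (the ACE of sleep)
            + 0.002 * (bmi - 25) * (7 - sleep))    # interaction that ACE ignores

baseline, target = (25, 7), (30, 5)

# Additive ACE-style estimate: change each feature alone, then sum the effects.
ace_bmi   = risk(target[0], baseline[1]) - risk(*baseline)       # 0.05
ace_sleep = risk(baseline[0], target[1]) - risk(*baseline)       # 0.04
print("ACE additive estimate:", round(ace_bmi + ace_sleep, 3))   # 0.09, below the true change

# Two-feature Shapley attribution: average each marginal contribution over both orderings.
phi_bmi = 0.5 * ((risk(target[0], baseline[1]) - risk(*baseline))
                 + (risk(*target) - risk(baseline[0], target[1])))
phi_sleep = 0.5 * ((risk(baseline[0], target[1]) - risk(*baseline))
                   + (risk(*target) - risk(target[0], baseline[1])))
true_change = risk(*target) - risk(*baseline)                         # 0.11
print(round(phi_bmi, 3), round(phi_sleep, 3), round(true_change, 3))  # 0.06 + 0.05 = 0.11
```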
Response to Q2. Please refer to the Response to Empirical Experiments above.
[1] Janzing D, Minorics L, Blöbaum P. Feature relevance quantification in explainable AI: A causal problem[C]//International Conference on Artificial Intelligence and Statistics. PMLR, 2020: 2907-2916.
[2] Chen H, Janizek J D, Lundberg S, et al. True to the model or true to the data?[J]. arXiv preprint arXiv:2006.16234, 2020.
Thank you for the response, it has addressed some of my concerns. The provided example demonstrates a pitfall of MCI for causality-based feature importance. Although UMFI does not fail in this example (the dependency removal step enables the blood relation axiom), the authors correctly point out that the causal structure remains unknown without prior knowledge. So the requirement for prior knowledge of the causal graph is well motivated. I think that this is a good discussion to include in the paper in order to clarify the need for a strong prior.
I still struggle a bit to see the merit of the axioms. For externality, the argument is that we also want to model indirect effects to the response. But if the causal graph is assumed to be known, why not just have a method that models direct effects, and if one wishes to study the compounded causal effects along a causal chain, just re-define the response along the chain? I agree with exogeneity as an axiom. In terms of causal interpretation, I don't see why efficiency and additivity are necessary or even desirable properties.
I still have concerns about causal interpretations under externality. Since the mean feature importance score does not distinguish external influences (it is rather the variance), interpreting the variance of scores is impractical for numerous reasons. What if the numerical scales of the variables are inherently different? What if the quality of the data is compromised, for example due to small sample sizes?
We greatly appreciate your timely feedback and the critical questions you have raised. Below, we provide detailed responses to address your concerns. If you have any further questions or suggestions, please do not hesitate to reach out to us.
Response to related papers. We have updated the paper to include a discussion on MCI and UMFI, highlighting the distinctions between our study and theirs, as well as clarifying the necessity of prior knowledge.
Response to the Merit of the Axioms. Regarding the necessity of DAG-SHAP even when causal effects are available, we believe the core reason lies in the fundamental difference between causal effects and feature attribution, as the primary focus of our paper is on feature attribution. Causal effects and feature attribution are fundamentally different research directions: the goal of causal effect estimation is to quantify the causal influence of one variable on another, while feature attribution aims to analyze the contribution of all input features to a model's prediction. The most critical aspect of feature attribution is that it provides a way to understand, relative to a baseline input (where each feature has a given value or follows a specified distribution), why a machine learning model produces a specific output for a given input. Feature attribution needs to account for feature interactions, which are ignored by causal effects (i.e., the direct or indirect causal effects of each individual feature). To illustrate why this distinction matters, consider the following simple example. Suppose there are only two features, X1 and X2, and the model's output is f(x1, x2) = x1 * x2. For a given baseline input (x1', x2') and an input to be explained (x1, x2), causal effects are typically defined through interventions of the form do(Xi = xi). If we intervene on X1 while holding X2 fixed at its baseline value, the causal effect is f(x1, x2') - f(x1', x2') = x2'(x1 - x1'). Similarly, intervening on X2 gives the causal effect f(x1', x2) - f(x1', x2') = x1'(x2 - x2'). The causal effect of each feature is thus independent of the other feature's explained value, which is clearly unreasonable when used to explain the model's output, as it fails to account for their synergistic effect in the product. Moreover, the total causal effect x2'(x1 - x1') + x1'(x2 - x2') is not equal to the total output change f(x1, x2) - f(x1', x2') = x1 x2 - x1' x2'. If we use feature attribution instead, the Shapley attribution of X1 is (1/2)[(f(x1, x2') - f(x1', x2')) + (f(x1, x2) - f(x1', x2))] = (1/2)(x1 - x1')(x2 + x2'), and the attribution of X2 is (1/2)(x2 - x2')(x1 + x1'); the two attributions sum exactly to the total change. The marginal contributions of each feature under the two orderings guide how the synergistic effect of the product is allocated between X1 and X2. Therefore, causal effects cannot substitute for feature attribution, even when a DAG is available. We incorporate DAGs to address the misattribution caused by causal relationships among features. For externality, our goal is to appropriately attribute the indirect effects of features to the model's output. The results of feature attribution not only ensure that the sum of attribution values equals the total output change but also allocate each feature's share of the collaborative contribution. Efficiency and additivity are widely adopted, fundamental properties of SHAP-based feature attribution methods [3]. The efficiency property is important because we aim to attribute the model's output change to the features exactly. While causal effects can be normalized to satisfy this property through sampling-based methods, such post-hoc mappings fail to address the core issue: they inherently overlook the collaboration between features when allocating contributions. Regarding additivity, attribution methods that satisfy this property allow us to assess a feature's importance across multiple tasks by simply summing its attribution values from the subtasks, providing a straightforward and consistent measure of its overall contribution.
[3] Lundberg S M, Lee S I. A unified approach to interpreting model predictions[J]. Advances in Neural Information Processing Systems, 2017, 30: 4765-4774.
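To make the product example above concrete, here is a short self-contained sketch; the numeric baseline and target values are our own illustrative choices.

```python
def f(x1, x2):
    return x1 * x2  # toy model whose effect is purely synergistic

x1_b, x2_b = 1.0, 1.0   # assumed baseline (illustrative values)
x1_t, x2_t = 3.0, 2.0   # assumed input to explain

# One-at-a-time causal effects, holding the other feature at its baseline value.
ce_x1 = f(x1_t, x2_b) - f(x1_b, x2_b)   # x2_b * (x1_t - x1_b) = 2.0
ce_x2 = f(x1_b, x2_t) - f(x1_b, x2_b)   # x1_b * (x2_t - x2_b) = 1.0
total = f(x1_t, x2_t) - f(x1_b, x2_b)   # 5.0
print(ce_x1 + ce_x2, total)             # 3.0 vs 5.0: the causal effects miss the synergy

# Shapley attributions average each feature's marginal contribution over both orderings,
# splitting the synergy while summing exactly to the total change (efficiency).
phi_x1 = 0.5 * ((f(x1_t, x2_b) - f(x1_b, x2_b)) + (f(x1_t, x2_t) - f(x1_b, x2_t)))
phi_x2 = 0.5 * ((f(x1_b, x2_t) - f(x1_b, x2_b)) + (f(x1_t, x2_t) - f(x1_t, x2_b)))
print(phi_x1, phi_x2, phi_x1 + phi_x2)  # 3.0, 2.0, 5.0
```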
Response to Experimental results. We would like to clarify a few points. First, we think that the mean absolute feature attribution score can distinguish external influences. Regarding our previous statement that "externality directly increases the variance in the attribution distributions of X1 and X2," the term variance does not refer to increased uncertainty in the attribution value for a single data point. Instead, it refers to the broader distribution of attributions for X1 and X2 across all data points, compared to methods that do not consider externality. We believe this aligns with your observation that X1 and X2 should be ranked as more important than the remaining features.
Regarding why the mean absolute feature attribution score can distinguish external influences while the plain mean may fail to do so: the mean averages all attributions, so positive and negative contributions may cancel each other out. In the boxplots from our experiments, each box encompasses the attribution values of a given feature across all data points. Taking X1 as an example, our method accounts for externality, meaning that X1 values with a positive contribution receive larger attribution scores, while those with negative contributions receive smaller attribution scores (albeit with larger absolute values). If we simply average the attribution scores across data points and compare the means, the positive and negative values cancel each other out, making the comparison meaningless. However, the mean absolute feature attribution scores of X1 and X2 are larger than those of the remaining features. This demonstrates that X1 and X2 have a greater impact, consistent with their having more paths in the DAG and aligned with the concept of externality.
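The cancellation effect can be illustrated with simulated attribution scores (hypothetical numbers of our own, not the experimental results reported in the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-instance attribution scores for 1,000 data points: a high-externality
# feature whose contributions swing widely in both directions, and a low-impact feature
# whose contributions stay small. Both are centred near zero by construction.
attr_high = rng.uniform(-3.0, 3.0, size=1000)
attr_low  = rng.uniform(-0.3, 0.3, size=1000)

for name, a in [("high-externality", attr_high), ("low-impact", attr_low)]:
    print(f"{name:16s} mean={a.mean():+.3f}  mean|.|={np.abs(a).mean():.3f}")
# Both plain means sit near zero because positive and negative contributions cancel,
# while the mean absolute scores differ by an order of magnitude and recover the ranking.
```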
Additionally, while the numerical scales of the feature values themselves may differ, their attribution scores are derived from their effects on the model output, so the attribution scores of different features share a consistent numerical scale. When the data size is small, using variance to validate the algorithm's properties can indeed be problematic due to higher uncertainty. Therefore, in our experiments we attributed 1,000 data points to mitigate this uncertainty and reported the sampling error of the DAG-SHAP computation in Appendix D.2. Moreover, we believe that if the aim is to validate the properties of the algorithm or to evaluate the average importance of features, rather than merely interpreting a small number of data points, a sufficient amount of data is essential to ensure reliable results.
Thank you for the response. I think that the axioms are sufficiently motivated and although I still have some misgivings about using the variance of the attribution scores to interpret importance with respect to externality, especially given small sample complexity, some of these aspects have been clarified for me. I believe that this discussion can be further explored and enhanced. For example, can the authors produce examples where the mean feature importance scores do distinguish the effect of external influences, rather than solely the variance, and do we still expect a greater range of scores in these cases? If so, can a threshold for the range of scores be used in practice to interpret the presence of external influences?
I have updated my score accordingly to marginal accept.
Thank you for your response. We are glad to have addressed some of your concerns. We think there are situations where the mean feature importance scores can distinguish the effect of external influences while the variance may not. For example, consider the following case: X1 = U1, where U1 is a random variable uniformly distributed on a non-negative interval, representing the exogenous influence of X1; X2 is generated from X1 together with an independent noise term U2, also uniformly distributed on a non-negative interval, representing the exogenous influence of X2; and Y is generated from X1 and X2. Thus X1 directly influences Y and indirectly influences Y through X2, while X2 influences Y both through its own exogenous influence and by transmitting the indirect influence of X1. We aim to attribute values to each feature of a given input with respect to the baseline [0, 0]. The X1 and X2 values of the input to be explained are both greater than zero; since we set the baseline to the smallest possible values, the feature contributions of the input will always be positive. In this case, the mean feature importance score of X1 can be used to distinguish the effect of external influences: because the externality effect is positive and the contribution from the part outside the externality effect is also positive, a method that accurately measures the externality effect produces a larger overall attribution value. However, if the inputs we are explaining are very similar, such as [85.0, 77], [85.1, 77.2], and [84.9, 76.9], then their attribution values will be close, and analyzing externality through variance becomes unreasonable. Here, our method will show that the attribution values are large due to external influences, which distinguishes it from other methods. In our original experiment, the larger variance resulted from sampling a large amount of data from the entire distribution (some of the explained feature values were greater than the baseline, while others were smaller), which made it possible to demonstrate the presence of externality through the variance. We believe that using a range threshold to detect the externality effect can be feasible under certain conditions. First, the range obtained from our method should be normalized against the range from other baseline methods to address potential differences in numerical scale across attribution tasks. Second, we need to assess whether the data being explained is overly concentrated; if it is, the range may lose significance, as the sampling-based SHAP value calculation introduces some error. Therefore, we think that whether to use the range or the mean feature scores should be decided based on the specific context. If you have any further thoughts, we would be very happy to continue the discussion with you.
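A minimal sketch of such a setting follows, with hypothetical generating equations and coefficients of our own choosing; it illustrates the externality argument on a linear toy model and is not an implementation of DAG-SHAP.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the generating process sketched above; the intervals,
# coefficients, and response are illustrative assumptions, and the attributions
# below are simple path-based credits on a linear toy model, not DAG-SHAP.
u1 = rng.uniform(0, 100, size=1000)   # exogenous influence of X1
u2 = rng.uniform(0, 50, size=1000)    # exogenous influence of X2
x1 = u1
x2 = 0.5 * x1 + u2                    # X2 passes part of X1's influence on to Y
y  = x1 + x2                          # toy response; the baseline input is [0, 0]

# Credit X1 only for its direct edge X1 -> Y ...
direct_x1 = 1.0 * x1
# ... versus also crediting its indirect path X1 -> X2 -> Y (the externality).
with_externality_x1 = 1.0 * x1 + 0.5 * x1

print(round(direct_x1.mean(), 1), round(with_externality_x1.mean(), 1))
# With a minimal baseline every contribution is positive, so the externality shows up
# in a larger *mean* attribution for X1, not only in a wider spread of its scores.
```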
We sincerely appreciate the time and effort invested by the reviewers and the AC in reviewing our work and providing valuable feedback. Your suggestions, such as incorporating related works, adding theoretical analysis, and clarifying certain parts, have been very helpful.
Given the extensive discussions we have had with the reviewers, we would like to ensure there is no misunderstanding regarding the novelty of our work, even though the rebuttal period has ended. Specifically, we aim to clarify concerns about the novelty relative to the intrinsic causal influence paper. If the SCM framework is abandoned, as in our case with the use of Mixture Density Networks (MDN), the approach reduces to node intervention, since in SCMs the expression of a child node inherently depends on all of its parent nodes. Depending on whether symmetric or asymmetric sampling is used, this can be interpreted as symmetric causal Shapley or asymmetric causal Shapley; we have already discussed these distinctions in our paper. Moreover, we do not claim that the use of Mixture Density Networks constitutes our contribution. In both our paper and our responses to the reviewers, we explicitly state: "We utilize a Mixture Density Network to predict the distribution of child vertices after intervention on parent vertices within the input features, consistent with the approach used in causal Shapley value (Heskes et al., 2020)." The use of MDNs is a well-established way of modeling feature distributions in feature attribution tasks. Our primary contribution lies in the definition of edge intervention, which differs fundamentally from existing approaches: it simultaneously ensures externality and exogeneity, with the exogeneity property being similar to the concept of intrinsic contributions. While symmetric causal Shapley ensures externality and asymmetric causal Shapley ensures exogeneity, neither can guarantee both properties simultaneously, as our method does.
We are not discouraged by the rejection decision, as we have received many insightful comments that have helped us improve our work. Once again, we thank the reviewers and the AC for their time, effort, and suggestions.