Disentangled and Self-Explainable Node Representation Learning
A GNN-based method for learning node embeddings from scratch with interpretable features, based on topological disentanglement of latent-space components.
Abstract
Reviews and Discussion
This paper introduces a framework that generates self-explainable embeddings in an unsupervised manner. The method employs disentangled representation learning to produce dimension-wise interpretable embeddings, where each dimension is aligned with a distinct topological structure of the graph. The paper derives new objective functions and optimizes simultaneously for both interpretability and disentanglement. Additionally, the paper proposes new metrics to evaluate representation quality and human interpretability.
Strengths
- The paper formalizes new and essential criteria for achieving disentangled and explainable node representations
- The paper introduces novel evaluation metrics to help quantify the quality of node representation learning
Weaknesses
- The paper introduces comprehensibility as a metric for evaluating explanations. However, how does the method perform on widely-used explanation metrics like AUC, Fidelity-, and Fidelity+? I strongly recommend that the authors also consider including these popular metrics to thoroughly assess the method's performance.
- The paper proposes using comprehensibility and sparsity to enhance human interpretability. This raises a natural question: do these metrics truly support human understandability? I strongly recommend that the authors conduct human evaluations to validate that these metrics align with human comprehension.
- The baselines are too weak. The authors compare with the node embedding methods DeepWalk, Deep AE, DISE-GAE, and DISE-FACE. I suggest the authors consider more modern node embedding methods, such as normalizing flows [R1], diffusion models [R2], and self-supervised methods [R3].
[R1] RicciNet: Deep Clustering via A Riemannian Generative Model, WWW 2024
[R2] Directional diffusion models for graph representation learning, NeurIPS 2024
[R3] GraphMAE: Self-Supervised Masked Graph Autoencoders, KDD 2022.
- The method is self-explainable, but how effective is the visualization of its explanations? I suggest that the authors include a case study to show the visualization.
- The paper claims utility for downstream tasks like link prediction and node classification. However, the baselines used for evaluating these tasks are relatively weak. I recommend that the authors consider incorporating more widely-used contrastive learning methods as baselines, such as [R4], to assess whether the proposed node embeddings achieve superior performance in these downstream tasks.
[R4]. Provable Training for Graph Contrastive Learning, NeurIPS 2024.
Questions
See the weaknesses.
We gratefully thank the reviewer for the insightful comments provided in the review. For clarifications on the proposed evaluation metrics and key desiderata, please refer to the first global response. Here we provide detailed answers to the specific concerns.
- Evaluation metrics: We hope that our global response on clarification of metrics has already resolved your concerns. Specifically, the AUC that you mentioned is the AUC score used by GNNExplainer for computing the alignment of explanations with human-perceived ground-truth explanations. Our Comprehensibility metric already serves a similar purpose, using another classification metric, F1, instead of AUC as the evaluation criterion. F1 is stricter in the sense that AUC can provide exaggerated scores for small-sized explanations (i.e., the task of detecting small subgraphs is inherently highly imbalanced).
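To illustrate why F1 is the stricter choice here, below is a minimal, self-contained sketch (not from the paper; the edge scores and ground-truth mask are synthetic assumptions) showing how AUC can look respectable on a highly imbalanced subgraph-detection task while F1 remains low:

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

rng = np.random.default_rng(0)

# Synthetic setup: 1000 candidate edges, only 10 belong to the
# ground-truth explanation subgraph (a highly imbalanced detection task).
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1

# Hypothetical soft importance scores: true edges score higher on average,
# but many negative edges also receive non-trivial scores.
scores = rng.uniform(0.0, 0.6, size=1000)
scores[:10] += 0.3

auc = roc_auc_score(y_true, scores)                # threshold-free, tends to look good
f1 = f1_score(y_true, (scores > 0.5).astype(int))  # penalizes false positives directly

print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
```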
- Enhancing human-interpretability: As detailed in our global response, we use plausibility to assess how well our explanations align with human reasoning and also point to its difference from Comprehensibility. Similar to GNNExplainer, we evaluate on synthetic datasets with planted subgraphs, which serve as ground-truth human rationales for explaining specific node labels. Our results in Appendix A.6.3 demonstrate that the most important embedding dimension identified by the downstream classifier consistently aligns with the true explanation subgraph. Additionally, we add visualizations of the found explanations in Figure A7 of the Appendix. While this marks a significant step towards human-centric evaluation, conducting studies with actual human participants is beyond the scope of this work due to the substantial resources required. Moreover, designing effective human evaluation protocols remains a challenging research problem in itself. We view this as an important direction for future research and are optimistic that our proposed metrics provide a strong foundation for advancing human-in-the-loop evaluations.
- Stronger baselines: We would like to clarify that the methods mentioned by the reviewer are not inherently interpretable and, therefore, do not serve as direct competitors to DiSeNE. Instead, DiSeNE is designed to complement existing unsupervised learning approaches by making their embeddings interpretable. To further highlight the versatility and effectiveness of DiSeNE, we are actively incorporating additional unsupervised learning methods into our framework—not as a comparison for competition, but to demonstrate the broad applicability and impact of the interpretability enhancements it provides. While these enhancements will be continuously added to our GitHub repository as part of an ongoing effort, they are beyond the scope of the current submission and will not affect the core contributions presented here.
- Visualization of explanations: We appreciate the reviewer’s suggestion to enhance the clarity of the exposition through visualizations. We have included in Appendix A.5.1 a visualization of the explanation subgraphs in a simple scenario to highlight how the disentanglement helps extract more interpretable graph structures. Specifically, Figure A2 illustrates how various embedding methods can be interpreted once trained on the synthetic BA-Cliques dataset, where each dimension-specific subgraph is highlighted in a different panel. From the figure, it is evident that for DeepWalk and GAE the subgraphs exhibit significant overlap, which can be attributed to non-zero correlations between the latent features. In contrast, the uncorrelated features of DiSeNE produce distinct, non-overlapping explanations. Moreover, in Figure A7 (and related Appendix section) we report local explanations for binary node classification, comparing DiSeNE with well-established graph explainers, showing again competitive performance in highlighting the important sub-structures.
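As a rough illustration of how such dimension-specific subgraphs can be rendered, the sketch below highlights, for each embedding dimension, the edges with the largest per-dimension scores; the random graph, the embeddings, and the product-based edge score are placeholders for this sketch, not the procedure used to produce Figure A2.

```python
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
G = nx.barabasi_albert_graph(60, 2, seed=0)   # stand-in for a BA-style scaffold
Z = rng.random((G.number_of_nodes(), 4))      # hypothetical node embeddings, 4 dimensions

pos = nx.spring_layout(G, seed=0)
fig, axes = plt.subplots(1, 4, figsize=(16, 4))
for i, ax in enumerate(axes):
    # Illustrative per-dimension edge score: product of endpoint activations.
    edge_scores = {(u, v): Z[u, i] * Z[v, i] for u, v in G.edges()}
    top_edges = sorted(edge_scores, key=edge_scores.get, reverse=True)[:15]
    nx.draw_networkx_nodes(G, pos, ax=ax, node_size=20)
    nx.draw_networkx_edges(G, pos, ax=ax, alpha=0.1)
    nx.draw_networkx_edges(G, pos, edgelist=top_edges, ax=ax, edge_color="red", width=2)
    ax.set_title(f"dimension {i}")
    ax.axis("off")
plt.tight_layout()
plt.show()
```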
- Please refer to Point 3.
In this paper, the authors propose a method for interpreting node embeddings. Their method is based on several desiderata in terms of graph structures. In addition, they also propose several new criteria to evaluate the explainability. Experimental results seem to validate their method on these criteria.
Strengths
- The authors propose novel and interesting perspectives to consider node embedding interpretability. New metrics are also introduced to measure these perspectives.
- Experimental results seem promising.
Weaknesses
- The proposed method for node interpretability is based on graph structures, while semantic information is omitted, as reflected in the experiments, where node features for the GNN methods are initialized with the identity matrix. However, in real-world applications this may not hold. For example, in social networks, if each node represents a user, it does not make sense that the user information is not considered in the initial features. Whether or not the proposed method can be applied to semantic-rich graph scenarios is not discussed in the paper.
- The three key desiderata are not well explained. In the corresponding paragraphs, the authors directly derive how they achieve these goals without explaining them. The definition of Equation 1 does not make sense to me either. Say a particular dimension attains a high value in Equation 1 for some edge; this only means that the dimension is more important for the embeddings of the two endpoint nodes (for computing the edge likelihood as in the paper; what is the physical meaning of edge likelihood, by the way?) than for other node embeddings with edges. Why could we use it to assign edges to dimensions? The dimension is shared across all node embeddings, but the edge is not.
- While the authors propose several metrics for evaluation, I am confused as to why they do not compare their method with previous metrics and representative node classification explanation methods such as GNNExplainer. If there is a discrepancy, please specify.
Ying, Zhitao, et al. "Gnnexplainer: Generating explanations for graph neural networks." Advances in neural information processing systems 32 (2019).
Questions
See the weaknesses above
We gratefully thank the reviewer for the insightful comments provided in the review. For clarifications on the proposed evaluation metrics and key desiderata, please refer to the first global response. Here we provide detailed answers to the specific concerns.
- Application to semantic-rich scenarios: We acknowledge the reviewer’s concern regarding the applicability of our approach in semantic-rich scenarios. While GNNs are naturally designed to incorporate semantic input through node attributes, we intentionally took a different direction in our work. Our approach focuses exclusively on learning from the graph’s topological structure, as node attributes are not always aligned with structural information (e.g., in cases of heterophily [1]), which can degrade learning performance. Although semantic features can be integrated into the learning process in various ways [2], we chose a more straightforward approach that is broadly applicable in typical settings. Specifically, since the ultimate goal of node embeddings is to solve downstream tasks, our method produces interpretable structural features that can easily be combined with semantic attributes by concatenation. This results in a fully transparent feature set, enabling the use of well-established explanation techniques for tabular data to explain the downstream tasks effectively.
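A minimal sketch of the concatenation strategy described above, with synthetic arrays and an off-the-shelf logistic regression standing in for the actual downstream pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_nodes, d_struct, d_attr = 500, 8, 16

# Hypothetical inputs: structural embeddings learned from topology only,
# and raw semantic node attributes (both synthetic in this sketch).
Z_struct = rng.normal(size=(n_nodes, d_struct))  # one column per interpretable dimension
X_attr = rng.normal(size=(n_nodes, d_attr))      # user/content features
y = rng.integers(0, 2, size=n_nodes)             # downstream node labels

# Concatenate into a fully transparent tabular feature set.
X = np.hstack([Z_struct, X_attr])
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Each coefficient now refers either to a structural dimension (with its
# associated explanation subgraph) or to a named semantic attribute.
struct_importance = np.abs(clf.coef_[0][:d_struct])
attr_importance = np.abs(clf.coef_[0][d_struct:])
print("most important structural dimension:", struct_importance.argmax())
```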
- Clarifications on key desiderata: Please refer to our global response which mainly focuses on this clarification. We are happy to answer any further questions.
- Comparisons with other interpretability-focused methods: We emphasize that our approach focuses on explaining model encodings, unlike methods such as GNNExplainer and PGExplainer, which explain model decisions. This makes direct comparisons unsuitable. However, we can use our interpretable embeddings to train downstream models. Explanations for model decisions are then extracted by using feature interpretability techniques. In that sense, existing feature interpretability techniques are complementary to our approach. Since the input features are interpretable, we can associate subgraphs with the most important features as explanations. Comparisons with GNNExplainer and PGExplainer are now provided in Appendix A.6.3, where we also included two additional synthetic datasets for node classification, Tree-Cliques and Tree-Grids. Reported results show that our method is capable of producing graph explanations with comparable, or even better, Plausibility metrics than GNNExplainer and PGExplainer.
[1] Zhu, Jiong, et al. "On the Impact of Feature Heterophily on Link Prediction with Graph Neural Networks." arXiv preprint arXiv:2409.17475 (2024).
[2] Tan, Qiaoyu, et al. "Collaborative graph neural networks for attributed network embedding." IEEE Transactions on Knowledge and Data Engineering (2023).
The paper introduces DISENE, a novel approach to self-explanatory node embedding, addressing the increasing need for interpretability in graph representation learning. DISENE’s design emphasizes disentangled representations, allowing each embedding dimension to independently capture specific, non-overlapping structural features of the graph. This self-explanatory capability is achieved through a combination of disentangled representation learning and the embedding of interpretability directly within the model architecture, moving beyond traditional post-hoc interpretability approaches.
Strengths
- Innovative Self-Explanatory Node Embedding Method: This paper introduces the DISENE model, which leverages disentanglement and self-explanatory design to enable each embedding dimension to correlate directly with specific structural features within the graph.
- Disentangled Feature Representation: DISENE ensures that each embedding dimension independently captures a unique graph structure feature, minimizing overlap across dimensions. This disentangled representation improves both the interpretability and robustness of the embeddings, providing more granular structural information for downstream applications that require feature independence.
Weaknesses
- Reliance on Interpretability Metrics with Ambiguous Definitions: The paper relies heavily on various interpretability metrics, such as consistency and sparsity, to evaluate model performance. However, these metrics lack clear, systematic definitions in the text. For instance, Section 4.2: Evaluation Metrics introduces these terms conceptually but fails to provide precise mathematical definitions or specific calculation methodologies. This lack of clarity may lead to inconsistencies in metric implementation across studies, thereby impacting the reproducibility of the results and the fairness of comparisons. For interpretability metrics to serve as robust evaluation tools, a more rigorous definition and operationalization are essential.
- Ambiguity in Interpretability Metric Definitions: While the paper presents novel interpretability metrics, it leaves substantial gaps in detailing how these metrics are quantified. Section 3.4: Explanation and Evaluation discusses the interpretability aspects of the model but lacks the necessary formalism and computation process, making it challenging to assess the reliability and generalizability of these interpretability metrics. This ambiguity undermines the objective measurement of the model’s self-explanatory capabilities, calling into question the effectiveness of the proposed interpretability criteria.
- Lack of Competitive Edge in Link Prediction and Node Classification Tasks: DISENE exhibits only moderate performance in standard tasks like link prediction and node classification, without showcasing significant improvements. As reported in Table 3: Link Prediction Results and Table 4: Node Classification Results, DISENE’s accuracy and AUC scores do not exhibit a marked advantage over other established models. This highlights a trade-off between interpretability and predictive power, suggesting that while DISENE makes strides in interpretability, it may fall short in delivering competitive performance in traditional graph embedding tasks, limiting its overall efficacy in practical applications.
- Limited Dataset Scope, Focused on Small-Scale Citation Networks: The paper’s empirical analysis relies heavily on established citation networks such as Cora, CiteSeer, and PubMed (Section 4.1: Datasets). While these datasets are standard in graph embedding research, they are relatively small and structurally simple, which raises concerns about the model’s generalizability to larger, more complex graph structures. The limited dataset scope restricts the evaluation of DISENE’s performance on diverse, real-world graphs and challenges the model’s claims of broad applicability.
- Outdated Baseline Comparisons: The baseline models selected for comparison—primarily traditional graph embedding methods such as DeepWalk, node2vec, and GraphSAGE (Section 4.3: Baselines)—do not reflect the recent advancements in graph representation learning. Given the rapid evolution of graph neural networks, the lack of comparisons with modern methods, including GAT, Graph Transformers, and Graph Autoencoders, as well as other interpretable models, undermines the robustness of the empirical validation. Comparisons with more recent and sophisticated models would provide a stronger benchmark, allowing for a more accurate assessment of DISENE’s strengths and weaknesses in both interpretability and performance.
Questions
See Weaknesses
In their response, the authors address the lack of definition and operationalization of explanatory metrics by providing conceptual explanations of the metrics and comparisons with existing methods, but do not detail specific mathematical formulas or pseudo-codes. In response to concerns about inadequate performance on traditional tasks, the authors acknowledged that the main contribution of the model is in explanatory enhancements and plan to supplement the revised version with trade-off analysis and optimization results. In response to concerns about the limited scope of the dataset, the authors illustrate the applicability of the model to complex graph structures through theory, but make no commitment to extend the scope of the evaluation. Finally, in response to the issue of an outdated baseline model, the authors do not respond directly to how to introduce more modern comparison methods, and the overall response fails to fully address all reviewer concerns.
Also considering the concerns raised by reviewer Y413, I have decided to drop my score.
We gratefully thank the reviewer for the insightful comments provided in the review. We hope the point-by-point responses that we are providing help to solve the remaining issues. Here we list detailed answers to your specific concerns.
- Clarifications on the metrics: Please refer to our global response which mainly focuses on this clarification. We are happy to answer any further questions. We humbly disagree with the reviewer that mathematical definitions are not provided. We are happy to explain any provided mathematical definition which is unclear.
- Ambiguity in Interpretability Metric Definitions: We hope that our global response helps clarify the metrics. We would request the reviewer to concretely point out what exactly seems ambiguous.
- Lack of competitive edge in link prediction and node classification tasks: We respectfully disagree with the reviewer’s concern regarding the downstream task performance of our method. As demonstrated in Appendix Tables A3-A4, our approach achieves competitive results, with only minor performance losses. These losses are within the expected range of the already mentioned trade-off between interpretability and performance. It is important to highlight that this trade-off is a fundamental consideration in interpretable machine learning. While purely performance-oriented methods may marginally outperform ours, they often sacrifice transparency and human interpretability, which are critical for applications requiring explainability. Our method prioritizes interpretability without compromising task performance to an extent that would diminish its practical utility.
- Datasets: We respectfully disagree with the reviewer's analysis regarding the datasets used in our study. Our empirical evaluation does not rely heavily on established citation networks, as we include only Cora from this domain. We acknowledge that exclusively relying on citation networks such as Cora, CiteSeer, and PubMed has been questioned in prior research [1]. To address this concern, we have deliberately incorporated datasets from a diverse range of network domains, including web pages (WIKI), social networks (FB), and biological networks (PPI). This broader selection ensures that our findings are not biased toward a specific type of network.
- Outdated baseline comparisons: We highlight that some of the methods mentioned by the reviewer as modern competitors are already used in our experiments, i.e., GAEs, and others such as GATs are supervised models for node classification, while our focus is on unsupervised/self-supervised models. That said, we are happy to conduct experiments on additional baselines, even if we do not expect to identify more competitive methods, as most existing self-supervised models are designed primarily to optimize performance rather than prioritize explainability. We emphasize that our framework is more general and can be adapted to be used together with other unsupervised learning techniques. We complement other methods by making the learned embeddings interpretable. Nevertheless, in Appendix A.6.3 of the revised manuscript we now report a comparison for evaluating local explanations in node classification which includes GAT among the GNN methods.
[1] Salha, G., Hennequin, R. and Vazirgiannis, M., 2021. Simple and effective graph autoencoders with one-hop linear models. In Machine Learning and Knowledge Discovery in Databases: European Conference, ECML PKDD 2020.
We would like to emphasize that our Initial Response was just a clarification of the main concerns pointed out by several reviewers. We have now posted a complete rebuttal, where we have addressed each issue and included also the revised paper.
We would also like to point out that all our metrics were accompanied by mathematical formulas already in the initial version, and that pseudo-codes have been added in the Appendix. In the box above, we have also answered the reviewer’s concerns about the limited scope of the datasets and baseline models.
In our rebuttal, we clarified any further issues raised by reviewers. If you have concrete questions about any mathematical equations, or about other answers, please let us know.
This paper introduces DISENE (Disentangled and Self-Explainable Node Embedding), a novel framework designed to generate self-explainable and disentangled node embeddings in graph-based learning tasks. The method leverages disentangled representation learning to ensure that each dimension of the learned embeddings corresponds to a unique topological substructure in the input graph, aiming to improve the interpretability of learned representations. The authors propose new evaluation metrics, such as overlap consistency and dimensional sparsity, to quantify the degree of disentanglement and interpretability of the embeddings.
Strengths
The paper presents a well-structured approach with clearly defined disentanglement objectives and proposed evaluation metrics. The experiments cover multiple datasets and tasks, providing a comprehensive evaluation of the proposed method.
Weaknesses
- While the paper claims to generate "self-explainable" embeddings, the interpretability largely relies on proposed metrics (e.g., overlap consistency) without providing direct, human-interpretable explanations for individual embedding dimensions. The method lacks intuitive explanations for individual nodes or substructures, limiting the practical utility of the "self-explainable" claim, particularly for non-technical users or domain experts who might expect more direct insights from the embeddings.
- The paper lacks comprehensive comparisons with other interpretability-focused methods in graph learning, such as GNNExplainer or PGExplainer, which provide more fine-grained explanations at node and edge levels. A detailed comparison with these methods in terms of interpretability (not just performance) would strengthen the paper's claims. Additionally, comparisons with simpler feature-based interpretability approaches would help highlight the unique advantages of the proposed method.
- Evaluation metrics may be subjective: While novel, the new evaluation metrics are somewhat subjective and not widely used in the graph learning community. Calibrating these metrics against more established interpretability evaluation standards or providing more detailed justification for why these metrics are appropriate would be helpful. For instance, how does overlap consistency relate to human ability to interpret the learned embeddings? More human-centric evaluations, such as user studies or expert assessments, could help validate the practical interpretability of the embeddings.
Questions
- How does DISENE compare to graph interpretability methods like GNNExplainer in terms of explainability? A direct comparison with methods like GNNExplainer or PGExplainer would be helpful, not just in terms of performance but also in the quality of explanations provided. Can DISENE provide interpretable explanations at node or edge level similar to these methods?
- How does DISENE scale to larger graphs? Could the authors provide more details about the computational complexity of the method? Specifically, how does the disentanglement objective affect training time and memory usage, and how does the method scale to larger graphs?
- How do the proposed metrics relate to human interpretability? Since these metrics are novel, it would be useful to understand how well they align with human judgments of interpretability. While the proposed metrics help quantify disentanglement, it would be helpful to provide visualizations or examples where specific embedding dimensions correlate with interpretable graph structures (e.g., communities, topics). This would help demonstrate the practical interpretability of the method.
We gratefully thank the reviewer for the insightful comments provided in the review. For clarifications on the proposed evaluation metrics and key desiderata, please refer to the first global response. Here we provide detailed answers to the specific concerns.
W1. Practical utility of self-explainability: We appreciate the reviewer’s observation regarding the need for further clarification of the term "self-explainable." In our context, "self-explainable" refers to the ability to derive global explanations for the embedding space in the form of subgraphs (one per dimension), as detailed in Section 3.1. However, this feature would hold little value if the resulting subgraphs were not interpretable—i.e., if they could not convey meaningful insights to a human observer, despite being straightforward to extract. Therefore, the capability to generate dimensional subgraphs is intrinsically tied to identifying human-comprehensible functional components within the input graph, such as structural communities. In essence, the embeddings should align with significant substructures of the graph. To illustrate this, Figure A2 in the Appendix provides a practical visualization of the global explanations. Moreover, the corresponding subsection A.5.1 highlights why our method is particularly effective for extracting them.
W2. Comparisons with other interpretability-focused methods: We emphasize that our approach focuses on explaining model encodings, unlike methods such as GNNExplainer and PGExplainer, which explain model decisions. This makes direct comparisons unsuitable. However, we can use our interpretable embeddings to train downstream models. Explanations for model decisions are then extracted by using feature interpretability techniques. In that sense, existing feature interpretability techniques are complementary to our approach. Since the input features are interpretable, we can associate subgraphs with the most important features as explanations. Comparisons with GNNExplainer and PGExplainer are now provided in Appendix A.6.3, where we also included two additional synthetic datasets for node classification, Tree-Cliques and Tree-Grids. Reported results show that our method is capable of producing graph explanations with comparable, or even better, Plausibility metrics than GNNExplainer/PGExplainer.
W3. Calibration of proposed metrics against established metrics: We have addressed this point in our global response under clarification on metrics. Please let us know if you have further questions. We would like to emphasize that we use plausibility to assess how well our explanations align with human reasoning. Similar to GNNExplainer, we evaluate on synthetic datasets with planted subgraphs, which serve as ground-truth human rationales for explaining specific node labels. Our results in Appendix A.6.3 demonstrate that the most important embedding dimension identified by the downstream classifier consistently aligns with the true explanation subgraph. Additionally we add visualizations of the found explanations in Figure A7 of the Appendix. While this marks a significant step towards human-centric evaluation, conducting studies with actual human participants is beyond the scope of this work due to the substantial resources required. Moreover, designing effective human evaluation protocols remains a challenging research problem in itself. We view this as an important direction for future research and are optimistic that our proposed metrics provide a strong foundation for advancing human-in-the-loop evaluations.
Q1. Please refer to the response W2.
Q2. Scalability: We appreciate the reviewer’s question about the scalability of our approach. Appendix A.3 contains an analysis of the complexity of the algorithm, reporting the runtime and space complexity of our method in terms of the window size and walk length. Although proposing an approach that scales to very large graphs is not among the purposes of this work, we emphasize that these quantities are in line with well-established techniques for node embeddings [1].
Q3. Visualizations of interpretable graph structures: We appreciate the reviewer’s suggestion to enhance the clarity of our metrics through visualizations. Figure A2 illustrates how various embedding methods can be interpreted once trained on the synthetic BA-Cliques dataset, where each dimension-specific subgraph is highlighted in a different panel. From the figure, it is evident that for DeepWalk and GAE the subgraphs exhibit significant overlap, which can be attributed to non-zero correlations between the latent features. In contrast, the uncorrelated features of DiSeNE produce distinct, non-overlapping explanations.
[1] Tsitsulin, Anton, et al. "FREDE: anytime graph embeddings." Proceedings of the VLDB Endowment 14.6 (2021).
Dear Reviewers,
We sincerely appreciate the time and effort you have invested in providing valuable insights and suggestions. We acknowledge that many of your comments focus on clarifying our desiderata and evaluation metrics, which form the core of our paper. To facilitate further discussions, we will start by outlining key clarifications in these areas. In the coming days, we will provide a detailed rebuttal addressing each of your concerns individually and upload a revised version of the paper for your review. In the meantime, we kindly request you to review the following key clarifications and share any further questions or feedback they may raise. We are confident that addressing these critical aspects will effectively resolve many of the concerns you have highlighted.
Clarification on metrics and a perspective on how they relate to existing metrics
Based on the reviewers' suggestions, we provide a detailed explanation of how our evaluation metrics relate to and extend existing metrics in the context of explainability. Unlike prior works, which primarily focus on explaining model decisions, our approach is tailored to explain model encodings—a novel perspective. Here's how our metrics align with and build upon previous approaches:
- Comprehensibility measures how well the explanations of embedding dimensions align with human-interpretable structures, such as community structures in the graph. These are often the most intuitive units for understanding a graph.
- We evaluate comprehensibility using the F1-score, quantifying how closely the explanations match human-understandable ground truth.
- This approach is analogous to the accuracy metric used in GNNExplainer and subsequent works, where explanations for model decisions are compared to planted substructures in the graph, perceived as human-readable justifications for a node’s label.
- Sparsity quantifies the explanation size. Since we produce soft masks, we use entropy for this quantification: the smaller the entropy, the shorter the explanation. The motivation for using entropy to quantify explanation size can be found in an existing work, Zorro [1].
- Overlap Consistency and Positional Coherence serve as metrics to evaluate the faithfulness of explanations. In our context, faithfulness reflects how well the embeddings align with the explanation structures they are intended to represent. We measure this alignment as follows:
- Overlap Consistency: This metric assesses whether the correlation between two embedding dimensions is mirrored in their corresponding explanations. For example, if two embedding dimensions are correlated, their explanation substructures should also overlap. This ensures a direct alignment between the relationships in the embedding space and the structural explanations derived from them.
- Positional Coherence: This metric evaluates whether the feature value of a node in a specific embedding dimension corresponds to its spatial relationship with the explanation substructure for that dimension. Simply put, if a node has a high value in embedding dimension $i$, it should be positioned closer to the substructure that explains dimension $i$. This ensures that the spatial arrangement of nodes in the graph reflects the embeddings' latent features, reinforcing consistency between the embeddings and their explanations.
- Plausibility evaluates how closely the explanations align with human reasoning, a concept also referred to as "Plausibility" in existing works like Bagel [2]. Unlike Comprehensibility, which measures the alignment of explanation substructures with community structures in the graph (independent of task-specific information), plausibility focuses on explanations for decisions made in downstream tasks.
The process involves two key steps (a minimal sketch of this pipeline follows the references below):
- Training on Downstream Tasks: Using the interpretable embeddings, we train a model to perform a downstream task. This model can either leverage feature interpretability techniques, such as SHAP, or use inherently interpretable models, such as logistic regression.
- Extracting and Evaluating Explanations: The substructures corresponding to the most important features (identified by the interpretability techniques) or the explanations directly returned by the interpretable model serve as the final explanations. These explanations are then compared with the available human rationale behind the decisions, and the alignment is quantified to compute the Plausibility Score.
[1] Funke, Thorben, et al. "Zorro: Valid, sparse, and stable explanations in graph neural networks." IEEE Transactions on Knowledge and Data Engineering 35.8 (2022): 8687-8698.
[2] Rathee, Mandeep, et al. "Bagel: A benchmark for assessing graph neural network explanations." arXiv preprint arXiv:2206.13983 (2022).
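To make the two-step Plausibility procedure above concrete, here is a minimal sketch; the synthetic embeddings and planted motif, the use of logistic-regression coefficients as the feature-importance step, and the thresholding that turns a dimension into a node mask are all simplifying assumptions of this sketch rather than the paper's exact implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_nodes, n_dims = 300, 6

# Synthetic stand-ins: interpretable embeddings, a planted motif, and labels
# that depend on membership in that motif (all assumptions of this sketch).
motif_nodes = np.zeros(n_nodes, dtype=int)
motif_nodes[:30] = 1
Z = rng.normal(size=(n_nodes, n_dims))
Z[:, 2] += 2.0 * motif_nodes                  # dimension 2 "encodes" the motif
y = motif_nodes.copy()                        # label = motif membership

# Step 1: train an inherently interpretable downstream model.
clf = LogisticRegression(max_iter=1000).fit(Z, y)
top_dim = int(np.abs(clf.coef_[0]).argmax())  # most important embedding dimension

# Step 2: take the substructure explaining the most important dimension
# (here approximated by thresholding that dimension's node activations)
# and compare it with the planted human rationale.
explanation = (Z[:, top_dim] > 1.0).astype(int)
plausibility = f1_score(motif_nodes, explanation)
print(f"top dimension = {top_dim}, plausibility (F1) = {plausibility:.2f}")
```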
Clarification on key desiderata
One of the reviewers explicitly asked about dimensional interpretability which we first explain in detail below. We also provide intuitive explanations about the other two and are happy to provide more details if required.
- Dimensional interpretability. Our framework assigns "responsibility" for reconstructing a local relationship (an edge) to specific dimensions, based on their contribution to the edge likelihood or the probability of reconstructing that edge through the embeddings. Equation 1 quantifies this contribution. We appreciate the reviewer’s insight regarding the relationship between the locality of edges and the global nature of embeddings. Our framework bridges this gap as follows:
- Decomposing Relationships by Dimensions: Embeddings are multidimensional, with each dimension representing specific latent features (e.g., social similarity, geographic proximity, shared interests). A given edge between nodes $u$ and $v$ may rely more heavily on certain dimensions. For instance, if dimension $i$ strongly influences edges where proximity is critical, it can be interpreted as capturing the property of "geographic proximity." This perspective shows how local relationships (edges) are directly informed by the global dimensions of the embeddings, with each dimension contributing uniquely to reconstructing particular relationships.
- Global Explanation composed of Local Roles: The global graph structure is reconstructed by aggregating these local relationships. Each dimension contributes to reconstructing specific edges or substructures, and together, these contributions create a unified representation of the entire graph—analogous to assembling a puzzle, where every piece is essential to completing the overall picture. Thus, the global nature of embeddings emerges as a direct consequence of their contributions to local relationships, resolving the apparent conflict between locality and globality.
- Structural faithfulness mainly requires that the obtained embeddings be faithful to the input graph structure, which we enforce using a graph reconstruction-based loss function.
- Finally, we have structural disentanglement, which seeks to ensure that the dimensions of the embeddings are disentangled, meaning each dimension is responsible for reconstructing a distinct set of edges. This is achieved by minimizing the loss function in Equation 4, which penalizes correlation between the edge importance attribution scores of different dimensions. By reducing this correlation, we ensure that each dimension contributes uniquely to the reconstruction of edges, avoiding redundancy and promoting interpretability (a minimal sketch of both mechanisms is given below).
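The sketch below illustrates the two mechanisms just described, dimension-wise attribution of edge likelihood and a penalty on correlated attributions; the inner-product decoder, the softmax normalization, and the function names are assumptions of this sketch, and the exact forms of Equations 1 and 4 in the paper may differ.

```python
import torch

def edge_attributions(z, edges):
    """Per-dimension contribution to each edge's likelihood, assuming the
    score of edge (u, v) decomposes as a sum over z_u[i] * z_v[i]."""
    src, dst = edges                              # each of shape (num_edges,)
    contrib = z[src] * z[dst]                     # (num_edges, num_dims)
    return torch.softmax(contrib, dim=1)          # per-edge responsibility over dimensions

def disentanglement_penalty(attr):
    """Penalize correlation between the attribution profiles of different
    dimensions, so each dimension explains a distinct set of edges."""
    a = attr - attr.mean(dim=0, keepdim=True)     # center over edges
    cov = a.T @ a / max(attr.shape[0] - 1, 1)     # (num_dims, num_dims)
    std = cov.diag().clamp_min(1e-8).sqrt()
    corr = cov / (std[:, None] * std[None, :])
    off_diag = corr - torch.diag(corr.diag())     # keep only cross-dimension terms
    return (off_diag ** 2).mean()

# Toy usage with random embeddings and a handful of edges.
z = torch.randn(20, 4, requires_grad=True)
edges = (torch.tensor([0, 1, 2, 3, 5]), torch.tensor([1, 2, 3, 4, 6]))
attr = edge_attributions(z, edges)
loss = disentanglement_penalty(attr)
loss.backward()
```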
We sincerely thank the reviewers for their insightful comments, which have significantly contributed to improving our work. In response, we have submitted a revised version of the paper, and addressed all concerns through detailed point-by-point responses.
In the revised manuscript, we have substantially updated the Appendix to incorporate new experiments on node classification explanations with graph-based explainers, additional images for visualizing the explanations, and technical discussions as requested by the reviewers. We strongly believe that these additions better support the proposed methodology, extending the initial submission with new empirical evidence.
In the new experiments, we observed that computing reliable Plausibility scores in Table A5 requires focusing on node instances with correctly classified labels. This is because accurate explanations are contingent upon the correctness of the model's decisions. Based on this observation, we also updated the results presented in Table 3 of the main paper. However, we note that these updates resulted in only minor changes to the previously reported scores, because of the high overall accuracy of the classifiers as illustrated in Figure A6.
To further strengthen the new experiments, we have introduced two additional synthetic datasets to analyze node classification explanations, Tree-Grids and Tree-Cliques. Compared to the previously used BA-Cliques and ER-Cliques, Tree-Cliques introduces a tree structure as scaffolding, while Tree-Grids also presents a different ground-truth motif. These datasets have also been incorporated into the experiments of the main paper. This addition impacts Table 3 and the corresponding discussion on the same page, providing additional evidence to support our findings. In addition, our initial global response provides the required clarifications on key desiderata and evaluation metrics.
We are confident that these revisions address the reviewers’ concerns and further enhance the quality of our work. We kindly ask reviewers to increase their score if our rebuttal has addressed their main concerns.
Dear Reviewers,
We believe that we have addressed all of your concerns. As the discussion period is coming to an end, please let us know any of your remaining concerns so that we can provide timely answers.
Best, Authors
This paper proposes a new framework named DISENE (Disentangled and Self-Explainable Node Embedding) for generating self-explainable and disentangled node embeddings in graph-based learning tasks. Reviewers agreed that this paper brings an interesting perspective by considering node embedding interpretability and also presents new evaluation metrics. However, reviewers raised many concerns regarding the newly proposed metrics, the selection of baselines, limited datasets, etc. Overall, the current version of this work is not ready for publication at ICLR.
Additional Comments from the Reviewer Discussion
Reviewers raised many concerns regarding the newly proposed metrics, the selection of baselines, limited datasets, missing human evaluations, etc. Unfortunately, the authors' responses did not fully convince the AC and the reviewers.
Reject