Disentangled Graph Spectral Domain Adaptation
Abstract
Reviews and Discussion
To break away from the entanglement of attribute and topology in Unsupervised Graph Domain Adaptation (UGDA), this paper introduces a novel method, DGSDA, which directly aligns complicated graph spectral filters. The paper conducts experiments on various types of graph datasets to demonstrate the effectiveness of DGSDA.
Questions for Authors
Refer to Weaknesses.
Claims and Evidence
Yes, the claims made in the submission are supported by clear and convincing evidence from both experimental and theoretical aspects.
Methods and Evaluation Criteria
The proposed DGSDA is well aligned with the problem of UGDA. The disentanglement of attribute and topology alignments, the use of spectral filter alignment, and the comprehensive experiments on diverse datasets collectively demonstrate its effectiveness.
Theoretical Claims
Yes, I have checked the correctness of the proofs for the theoretical claims presented in the paper. Specifically, I have verified the proofs for Theorems 4.3, 4.4, and 4.5, which are central to the theoretical analysis of DGSDA.
Experimental Design and Analysis
Yes, I have checked the soundness of the experimental designs (including the compared methods and experimental setups) and analyses.
Supplementary Material
Yes, I reviewed the supplementary material, which includes the proofs for the theorems, detailed dataset statistics, and additional experimental results. The supplementary material provides comprehensive and detailed support for the claims and results presented in the main paper.
Relation to Prior Literature
The key contributions are directly related to the broader literature by proposing disentanglement techniques to solve the problems in traditional UGDA. It leverages recent advancements in spectral GNNs and builds on theoretical foundations of Lipschitz continuity and model alignment.
Missing Essential References
No, there are no essential related works missing in the paper that need further discussion.
Other Strengths and Weaknesses
Strengths
- The paper is well-written and easy to follow.
- Experimental results on benchmark datasets are provided.
Weaknesses
The authors fail to clearly attribute the source of performance improvement in their method. First, DGSDA employs Bernstein polynomials, which are not commonly used in comparison methods. This choice alone may contribute to the performance gains, making it unclear how much of the improvement is due to the disentanglement strategy itself. Further empirical results are needed to isolate the specific role of disentanglement. Additionally, given the lack of labels in the target domain, it is unclear how the model ensures that the target domain parameters accurately capture the topological patterns.
Other Comments or Suggestions
- Some equations, such as Eq. (4) and Eq. (9), as well as Theorem 4.4, appear to be slightly misaligned. It is recommended to adjust the line breaks for better formatting.
- On line 182: it should read "the following three advantages" instead of "two".
- The name "PairAlign" is misspelled as "PariAlign" in Tables 1 and 2.
Q1. The authors fail to clearly attribute the source of performance improvement in their method. Further empirical results are needed to isolate the specific role of disentanglement.
R1. To address your concerns, we have conducted an additional experiment to clearly identify the source of performance improvement. This experiment introduces a variant of DGSDA that uses Bernstein polynomials but directly aligns node representations instead of separately aligning attributes and topology. The variant model's performance is consistently worse than that of our full DGSDA, as shown in the following table. This indicates that the disentanglement plays a crucial role in enhancing the performance of our model.
| | A→C | C→A | A→D | D→A | C→D | D→C |
|---|---|---|---|---|---|---|
| DGSDA | 83.57±0.22 | 75.54±0.28 | 76.90±0.51 | 74.07±0.56 | 78.38±0.28 | 82.92±0.15 |
| Variant model | 81.01±0.32 | 73.25±0.22 | 73.15±0.19 | 72.03±0.24 | 76.32±0.17 | 80.25±0.21 |
Q2. Given the lack of labels in the target domain, it is unclear how the model ensures that the target domain parameters accurately capture the topological patterns.
R2. We have conducted two additional experiments to demonstrate that the unsupervised losses can provide effects similar to those of a supervised loss in terms of model parameter optimization. In the experiment, the supervised variant of DGSDA employs a supervised loss computed on 10% labeled data in the target domain, replacing the unsupervised losses (the spectral alignment loss and the entropy loss). The results are shown below.
| | A→C | C→A | A→D | D→A | C→D | D→C |
|---|---|---|---|---|---|---|
| DGSDA | 83.57±0.22 | 75.54±0.28 | 76.90±0.51 | 74.07±0.56 | 78.38±0.28 | 82.92±0.15 |
| DGSDA (supervised) | 83.20±0.52 | 76.37±2.75 | 79.49±0.56 | 77.10±0.83 | 80.16±0.65 | 83.03±0.48 |
The results indicate that the unsupervised DGSDA achieves comparable performance to the supervised version, highlighting the effectiveness of the unsupervised losses. This can be attributed to two key factors. First, by regularizing the coefficients of Bernstein polynomials (in Eq. 6), the method explicitly aligns the spectral filters across different domains. This alignment enables the target filters to inherit topology-aware patterns from the source domain, even without labels. Second, the entropy loss sharpens cluster assignments, which implicitly encourages the model to learn more discriminative topological features.
In addition, we have compared the learned filter curves in both labeled and unlabeled target domains. The results can be found at https://anonymous.4open.science/r/DGSDA/figure/DC.png. In the supervised learning case, the model parameters capture the homophily topological pattern, characterized by increasing low-frequency information and suppressing high-frequency information. Similarly, in the unsupervised learning case, the target domain parameters can capture the same topological pattern.
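For concreteness, here is a minimal sketch of the two unsupervised losses discussed above. This is not the authors' implementation: the squared-L2 form of the spectral alignment term and all tensor names are our assumptions, since Eq. 6 is not reproduced in this thread.

```python
import torch
import torch.nn.functional as F

def spectral_alignment_loss(theta_src: torch.Tensor, theta_tgt: torch.Tensor) -> torch.Tensor:
    """Penalize the gap between source and target Bernstein coefficients,
    so the target filter inherits topology-aware patterns without labels.
    (Squared-L2 penalty is an assumption, not necessarily the paper's Eq. 6.)"""
    return torch.sum((theta_src - theta_tgt) ** 2)

def entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    """Mean prediction entropy on target nodes; minimizing it sharpens
    cluster assignments toward more discriminative features."""
    p = F.softmax(logits, dim=-1)
    return -(p * torch.log(p + 1e-12)).sum(dim=-1).mean()

# Toy usage: K+1 = 11 coefficients for an order-10 Bernstein filter.
theta_s = torch.randn(11, requires_grad=True)
theta_t = torch.randn(11, requires_grad=True)
tgt_logits = torch.randn(50, 5, requires_grad=True)
(spectral_alignment_loss(theta_s, theta_t) + entropy_loss(tgt_logits)).backward()
```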
Q3. Some equations, such as Eq. (4) and Eq. (9), as well as Theorem 4.4, appear to be slightly misaligned. It is recommended to adjust the line breaks for better formatting.
R3. We will adjust the line breaks in Eq. (4), Eq. (9), and Theorem 4.4 to ensure proper alignment and improve the overall formatting.
Q4. Expression error on line 182 and spelling mistake of the model name "PairAlign".
R4. We will perform a thorough review of the manuscript to correct any expression errors and spelling mistakes.
The authors' rebuttal addresses all my concerns. After checking the comments from other reviewers, I raise my score.
This study addresses the challenge of unsupervised graph domain adaptation in scenarios involving distribution shifts and missing labels by proposing a novel solution that disentangles the distribution shift. Specifically, the proposed method, DGSDA, refines topology alignment into GNN alignment and incorporates a spectral filter alignment loss.
Questions for Authors
See Weaknesses.
Claims and Evidence
This paper provides a comprehensive evaluation of the DGSDA model, supported by both theoretical analysis and extensive experiments that demonstrate its efficacy. Thus, the claims made in the paper are well-substantiated by clear and convincing evidence.
Methods and Evaluation Criteria
The proposed method involves (1) disentangling embedding alignment into topology and attribute alignments and (2) exploiting alignments of filter parameters to flexibly implement topology alignment, both of which make sense for the graph-domain adaptation problem.
Theoretical Claims
After a detailed examination of the proof, I have essentially confirmed its correctness.
Experimental Design and Analysis
I examined all the experimental designs and analyses in Section 5 and Section B, and believe they effectively demonstrate the properties of the proposed model.
Supplementary Material
Driven by my interest in this topic, I have thoroughly reviewed the supplementary material, including the theoretical proofs and additional experimental details.
Relation to Prior Literature
Current graph domain adaptation works focus on proposing topology alignment strategies, including aligning the edge distributions of two domains using the CSBM. DGSDA utilizes a new filter alignment to improve flexibility.
Missing Essential References
No other related works that are essential to understanding the (context for) key contributions need to be discussed or cited.
Other Strengths and Weaknesses
Strengths
1) The idea of GNN alignment is interesting. 2) The method is simple yet has solid theoretical support.
Weaknesses
- The title appears somewhat ambiguous: it does not reflect the focus on the unsupervised problem in graph domain adaptation, which is a key aspect of the study.
- The notation used in the paper is not clear. For example, X^T represents the node attribute matrix of the target domain, but it can also be interpreted as the transpose of the node attribute matrix.
- The description of the proposed method lacks clarity. While and are described in words, they are not accompanied by clear, formal formulations.
Other Comments or Suggestions
- In Section 4.1, Distribution Shift Disentanglement, under "Topology alignment", the statement "the graph data shift can be simplified from to" seems to be a misstatement. The correct statement should be .
Q1. The title appears to be somewhat ambiguous. The title does not reflect the focus on the unsupervised problem in graph domain adaptation, which is a key aspect of the study.
R1. Thank you for pointing this out. The primary focus of this paper is indeed on the unsupervised problem in graph domain adaptation. While our architecture can also accommodate target labels when available, the unsupervised scenario remains our main emphasis. We will consider revising the title to better reflect this focus.
Q2. The notation T used in the paper is not clear. For example, X^T represents the node attribute matrix of the target domain, but it can also be interpreted as the transpose of the node attribute matrix.
R2. Thanks for your careful check. We will revise the notation so that the transpose of a matrix is denoted unambiguously.
Q3. The description of the proposed method lacks clarity. While and are described in words, they are not accompanied by clear, formal formulations.
R3. The formal formulations of and are presented as follows:
where represents the kernel function.
We will add them to the appendix to enhance the clarity of the manuscript.
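Since the formal formulations were elided in the extraction above, the following is one plausible kernel-based instantiation of an attribute-alignment term: a generic RBF-kernel MMD loss. The function names and the kernel choice are our assumptions, not necessarily the paper's exact definitions.

```python
import torch

def rbf_kernel(a: torch.Tensor, b: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    # k(a_i, b_j) = exp(-gamma * ||a_i - b_j||^2)
    return torch.exp(-gamma * torch.cdist(a, b) ** 2)

def mmd_attr_loss(h_src: torch.Tensor, h_tgt: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    """Squared MMD between source and target node representations
    (a generic stand-in for a kernel-based attribute-alignment loss)."""
    return (rbf_kernel(h_src, h_src, gamma).mean()
            + rbf_kernel(h_tgt, h_tgt, gamma).mean()
            - 2.0 * rbf_kernel(h_src, h_tgt, gamma).mean())

# Toy usage with random embeddings of different sample sizes.
loss = mmd_attr_loss(torch.randn(100, 16), torch.randn(80, 16))
```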
Q4. In Section 4.1, Distribution Shift Disentanglement, under "Topology alignment", the statement "the graph data shift can be simplified from to" seems to be a misstatement. The correct statement should be .
R4. Thanks for pointing out this important detail. We will correct this formula in the revised manuscript and ensure that all related discussions are consistent with this accurate representation.
This paper introduces a novel pipeline for unsupervised graph domain adaptation by disentangling attribute and topology alignments by considering that attribute alignment has been widely investigated. Based on the aligned node attribute, the topology alignment is converted to the model alignment by taking into consideration the widely developed GNN models. Then, the Bernstein polynomial is employed as the backbone for its approximation property and spectral perspective. Theoretical analysis and experimental evaluations justify the pipeline and proposed models.
Questions for Authors
Although the authors claim that model alignment avoids the requirement of pseudo-labels, I wonder whether pseudo-labels could also benefit model alignment, since the polynomial coefficients can likewise be learned from labels as supervision. Can the predicted pseudo-labels on the target domain be employed?
The theoretical analysis focuses on the Bernstein spectral GNN alignment. Does the decomposition strategy possess solid theoretical findings?
Unsupervised graph domain adaptation objectives are often composed of multiple terms, and thus it is difficult to balance the impacts of the terms during training. Can these terms be unified to facilitate the training?
Claims and Evidence
The correctness of the introduced pipeline is verified by the derivation from Bayes' theorem. The replacement of topology alignment with model alignment makes sense due to the connection between topology and GNN models. Theoretical analysis and experiments demonstrate the statements.
Methods and Evaluation Criteria
As shown in the previous section, I think both the pipeline and the proposed GNN alignment make sense. Besides, the employment of these two strategies reduces the reliance on pseudo-labels in topology alignment, which often depends on accurate estimation of node membership.
Theoretical Claims
I cursorily examined the proof of the theorems in the appendix and believe they are correct.
Experimental Design and Analysis
The experiments are extensive, including quantitative and qualitative analyses. The setting is widely used in this field, and the baselines are recently proposed competitive ones. Thus, the experimental evaluations are convincing.
Supplementary Material
I've had a general look at what's in the appendix, especially the proofs.
Relation to Prior Literature
Unsupervised graph domain adaptation is a critical topic in graph learning for graph foundation model design. Although there exist GNN-based methods, as reviewed in the related work section, this paper offers a novel methodology through both decomposition and model parameter alignment. This is more powerful and efficient than existing approaches, considering the spectral perspective.
Missing Essential References
Sufficient. It covers most existing competitive SOTA and important milestones.
Other Strengths and Weaknesses
This paper possesses high originality, which may inspire follow-up variants on model alignment. The claims are justified with a rigorous theoretical investigation and extensive experiments. The main weakness is the lack of source code. Since the proposed method is flexible and complicated, with four terms in the objective function, it is necessary to provide source code to make it easy for readers to grasp the implementation details.
Other Comments or Suggestions
None
Q1. The main weakness is the lack of source code.
R1. The source code has been made available at (https://anonymous.4open.science/r/DGSDA) for verification purposes. We promise to make the code public once this paper is accepted.
Q2. Whether the predicted pseudo-labels on the target domain can be employed?
R2. To answer your valuable question, we have conducted experiments to verify the feasibility of using predicted pseudo-labels on the target domain. This experiment introduces a variant model named DGSDA+PL, which incorporates pseudo-labels of the target domain. The comparison reveals that pseudo-labels consistently lead to performance degradation in all domain adaptation scenarios, demonstrating the infeasibility of this scheme.
| | A→C | C→A | A→D | D→A | C→D | D→C |
|---|---|---|---|---|---|---|
| DGSDA | 83.57±0.22 | 75.54±0.28 | 76.90±0.51 | 74.07±0.56 | 78.38±0.28 | 82.92±0.15 |
| DGSDA+PL | 81.23±2.52 | 74.40±2.22 | 75.36±2.37 | 71.16±1.33 | 77.03±1.04 | 79.45±1.49 |
This is primarily due to the low reliability of the pseudo-labels generated in the early stages of training, which can cause error accumulation in the learning process, and to the noise amplification effect in graph neural networks, where erroneous pseudo-labels propagate through the message-passing mechanism. This is also why the proposed method outperforms topology alignment with pseudo-labels.
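For reference, here is a minimal sketch of the kind of confidence-thresholded pseudo-labeling a DGSDA+PL-style variant could use; the threshold value and function name are hypothetical, as the exact scheme is not specified in this thread.

```python
import torch
import torch.nn.functional as F

def confident_pseudo_labels(logits: torch.Tensor, threshold: float = 0.9):
    """Keep only high-confidence target predictions as pseudo-labels."""
    probs = F.softmax(logits, dim=-1)
    conf, labels = probs.max(dim=-1)
    mask = conf >= threshold  # early in training, few nodes pass this bar,
    return labels, mask       # and wrong labels that do pass still propagate
```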
Q3. Does the decomposition strategy possess solid theoretical findings?
R3. We acknowledge that the current analysis focuses on demonstrating the feasibility of disentanglement without providing a precise error bound between the entangled and disentangled representations. This is a common limitation in the graph disentanglement field, where theoretical guarantees are still lacking. We will strive to address this in future work.
Q4. Unsupervised graph domain adaptation objectives are often composed of multiple terms, making it difficult to balance their impacts during training. Can these terms be unified to facilitate the training?
R4. We understand your concern about training stability. Unfortunately, these terms cannot be unified, as each loss term serves a different objective, as in other GDA methods. Specifically, the source classification loss targets minimizing prediction error in the source domain, ensuring effective training on labeled source data; the spectral alignment loss focuses on aligning spectral coefficients between the source and target domains; the attribute alignment loss aims to align feature representations to reduce distribution differences; and the entropy loss promotes model adaptation to the target domain through unsupervised learning. Moreover, the hyper-parameter analysis demonstrated that our model is relatively robust to the weights of these terms. We will explore integrating these loss terms in future work to facilitate easier balancing.
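A minimal sketch of how such a four-term objective could be combined (the `lam_*` weight names are hypothetical, not the paper's notation):

```python
import torch

def total_loss(loss_cls: torch.Tensor, loss_spec: torch.Tensor,
               loss_attr: torch.Tensor, loss_ent: torch.Tensor,
               lam_spec: float = 1.0, lam_attr: float = 1.0,
               lam_ent: float = 1.0) -> torch.Tensor:
    # Source classification loss plus three weighted adaptation terms;
    # the reported robustness to the weights eases the balancing concern.
    return loss_cls + lam_spec * loss_spec + lam_attr * loss_attr + lam_ent * loss_ent
```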
This paper proposes Disentangled Graph Spectral Domain Adaptation (DGSDA) to alleviate the inaccuracies of pseudo-labels and the limited expressive ability of graph encoders in capturing rich topology information. It decomposes the attribute and topology alignments and replaces topology alignment with the more powerful model alignment. To harness the parameter efficiency of spectral GNNs, the Bernstein polynomial is employed, and the polynomial coefficients are aligned. Theoretical analysis shows its rationality and superiority compared to existing methods. Extensive experiments also justify the claims.
Questions for Authors
- Why is the performance of the proposed method lower than that of JHGDA on the traffic network dataset?
- The legend in Figure 2 is not clear. What are the differences between source/target and A/C?
- Why are the results in Figure 2 continuous curves? I think they should be discrete values.
Claims and Evidence
The rationality and superiority of the proposed DGSDA are supported by both theoretical and experimental evidence. It is clear and convincing.
Methods and Evaluation Criteria
The proposed DGSDA makes sense and is novel in decomposing attribute and topology alignments. The experimental evaluations are reasonable, with widely used criteria.
Theoretical Claims
The correctness of theorems is checked, as well as their proofs in the appendix. However, the symbols are very complex.
Experimental Design and Analysis
The soundness of the experiments is checked. The design is based on widely-employed datasets and criteria. The performances are verified on varying datasets. The ablation study and hyper-parameter analysis are conducted.
Supplementary Material
The proof and experimental details in the appendix are checked.
Relation to Prior Literature
UGDA is an important topic in the graph machine learning field. Previous work focuses on applying DA methods designed for i.i.d. data. This paper is along the line of topology alignment. It alleviates the issue of pseudo-label inaccuracy by adopting spectral model alignment beyond the topology one. Therefore, it is novel. Besides, the decomposition of topology and attribute alignment is also novel and interesting.
Missing Essential References
The references are sufficient.
Other Strengths and Weaknesses
Strengths
The motivations are interesting and make sense.
The proposed method is novel and solid.
The theoretical justification is rigorous.
The experimental evaluations are convincing.
Weakness
The symbols, especially the theory part, are too complex to read.
Other Comments or Suggestions
- Some explanations should clarify the theoretical differences from [You et al., 2023].
- It is better to provide the source code to enhance reproducibility.
Q1. The symbols, especially the theory part, are too complex to read.
R1. Thanks for your feedback. We will thoroughly review and modify all the symbols to make them easier to read.
Q2. Some explanations should clarify the theoretical differences from [You et al., 2023].
R2. The key theoretical difference between this paper and [You et al., 2023] lies in the polynomial choice and Lipschitz properties: the proposed DGSDA adopts Bernstein polynomials, whose Lipschitz constant is determined by the ground-truth function (Theorem 4.3), rather than being restricted by the basis polynomial coefficients as in [You et al., 2023]. This allows more flexible and accurate spectral domain adaptation.
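As background intuition (a classical property of Bernstein operators, stated here as a known fact rather than the paper's Theorem 4.3 verbatim), Bernstein approximation preserves the Lipschitz constant of the underlying function:

```latex
% Classical Lipschitz preservation of Bernstein operators:
% if f is L-Lipschitz on [0,1], so is its Bernstein approximation.
\[
B_K f(x) = \sum_{k=0}^{K} f\!\left(\tfrac{k}{K}\right) \binom{K}{k} x^k (1-x)^{K-k},
\qquad
|f(x)-f(y)| \le L|x-y| \;\Rightarrow\; |B_K f(x) - B_K f(y)| \le L|x-y|.
\]
```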
Q3. It is better to provide the source code to enhance reproducibility.
R3. According to your suggestion, the source code has been made available at (https://anonymous.4open.science/r/DGSDA) for verification purposes.
Q4. Why is the performance of the proposed method lower than that of JHGDA on the traffic network dataset?
R4. The proposed method generally outperforms JHGDA on most tasks of the traffic network dataset and is less effective than JHGDA on the and tasks. The performance weakness may be attributed to overfitting due to limited training data. The Brazil and Europe datasets contain a relatively small number of nodes, which makes the proposed model, with its multiple constraints, more prone to over-capturing the patterns of individual hub nodes. This, in turn, makes it difficult to generalize effectively to the overall structure of the target domain. Nonetheless, the extensive results illustrate the effectiveness of the proposed method.
Q5. The legend in Figure 2 is not clear. What are the differences between source/target and A/C.
R5. The solid lines represent the filter curves trained on the domain adaptation tasks, while the dashed lines represent the filter curves obtained by training BernNet only on the corresponding datasets. Thus, "source" and "target" in the legend denote the filter curves of the source-domain and target-domain encoders trained on the domain adaptation tasks; "A" and "C" denote the filter curves of BernNet trained on the A and C datasets, respectively. We will change them to "Training on A" and "Training on C" in the revised manuscript to enhance readability.
Q6. Why are the results in Figure 2 continuous curves?
R6. Figure 2 shows the learned polynomial filters instead of their coefficients, and thus contains continuous curves. The x-axis in Figure 2 denotes the normalized graph signal frequency $\lambda \in [0, 2]$, and the y-axis represents the filter gain $h(\lambda)$. They are independent of the polynomial order $K$ and coefficients $\theta_k$. The Bernstein polynomial is defined as $h(\lambda) = \sum_{k=0}^{K} \theta_k \binom{K}{k} \left(\tfrac{\lambda}{2}\right)^{k} \left(1-\tfrac{\lambda}{2}\right)^{K-k}$, where $\binom{K}{k} x^{k} (1-x)^{K-k}$ is the Bernstein basis function. For any given $\lambda$, it is first mapped to the $[0, 1]$ interval (i.e., $x = \lambda/2$), and $h(\lambda)$ is calculated using the learned coefficients $\theta_k$ and the corresponding Bernstein basis functions. Thus, by sampling sufficiently many $\lambda$ values, we can plot the continuous curves shown in Figure 2.
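To illustrate, a short script can reproduce such continuous curves. This is a sketch assuming the standard BernNet parameterization; the `theta` vector below is illustrative, not learned values from the paper.

```python
import numpy as np
from math import comb
import matplotlib.pyplot as plt

def bernstein_filter(lams: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """h(lam) = sum_k theta_k * C(K,k) * x^k * (1-x)^(K-k), with x = lam / 2."""
    K = len(theta) - 1
    x = lams / 2.0  # map eigenvalues in [0, 2] to [0, 1]
    basis = np.stack([comb(K, k) * x**k * (1 - x)**(K - k) for k in range(K + 1)])
    return theta @ basis  # (K+1,) @ (K+1, n) -> (n,)

lams = np.linspace(0.0, 2.0, 500)   # dense sampling yields a smooth curve
theta = np.linspace(1.0, 0.0, 11)   # illustrative low-pass (homophily-style) filter
plt.plot(lams, bernstein_filter(lams, theta))
plt.xlabel("normalized frequency")
plt.ylabel("filter gain")
plt.show()
```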
Summary: This paper proposes DGSDA, a new method for unsupervised graph domain adaptation (UGDA) that disentangles attribute and topology alignment and introduces spectral filter alignment via Bernstein polynomials. The approach avoids reliance on pseudo-labels for topology modeling and instead aligns model parameters directly in the spectral domain. The paper presents theoretical insights (including a bound and analysis of Lipschitz continuity) and extensive experiments across multiple benchmark datasets, demonstrating the effectiveness of DGSDA. The rebuttal included further clarifications, new ablation results, and additional supervised comparisons.
Decision: All reviewers gave accept recommendations and highlighted the novelty, technical soundness, and practical value of the proposed approach. The method offers a fresh perspective by replacing traditional topology alignment with spectral model alignment, and the use of Bernstein polynomials is well-motivated both theoretically and empirically. While some concerns were raised—such as equation formatting, title clarity, and disentanglement attribution—the authors addressed them comprehensively in the rebuttal. In particular, they provided empirical comparisons to isolate the role of disentanglement and demonstrated that the unsupervised objective is competitive with a supervised variant. Overall, the paper is well-written, the contributions are meaningful, and the empirical validation is thorough. Therefore, I recommend acceptance.