PaperHub
Score: 7.8/10 · Poster · 4 reviewers (min 4, max 5, std 0.4)
Individual ratings: 5, 5, 5, 4 · Confidence: 3.0
Novelty: 3.3 · Quality: 3.0 · Clarity: 2.5 · Significance: 3.0
NeurIPS 2025

Let Brain Rhythm Shape Machine Intelligence for Connecting Dots on Graphs

OpenReview · PDF
Submitted: 2025-05-12 · Updated: 2025-10-29

Abstract

Keywords
Neuroimage, Brain dynamics, Kuramoto model, Graph Learning

Reviews and Discussion

Review (Rating: 5)

This paper proposes a new learning framework called BRICK. It develops more effective machine learning algorithm design principles by simulating brain rhythm mechanisms, thereby connecting neuroscience and artificial intelligence. The goal of BRICK is to identify brain rhythms and to develop an oscillator representation and control mechanism closely integrated with fundamental neuroscience principles. The representations produced by the model closely reflect neuroscientific principles (i.e., brain rhythms). Furthermore, building on this, the paper extends the concept to the graph domain, introducing a new graph neural network architecture called BIG-NOS. This model improves on existing graph learning models and is able to avoid the over-smoothing issue.

Strengths and Weaknesses

Strengths:

  1. This paper presents novel and significant work. It introduces a new perspective to the AI field, particularly in graph neural networks, by proposing a biologically inspired mechanism of brain rhythms for machine learning.
  2. This paper experimentally demonstrates the model's ability to identify brain rhythms. For example, BRICK can decode dynamic brain states based on neural synchronisation patterns and achieves significantly superior performance compared to existing GNN models on human brain datasets such as HCP-A, HCP-YA, and HCP-WM.
  3. This paper effectively avoids the over-smoothing issue in GNNs. The proposed BIG-NOS explicitly avoids over-smoothing by evolving graph features within a latent oscillatory phase space, rather than through traditional heat diffusion methods.

Weaknesses:

  1. The paper does not compare with popular models like graph diffusion and graph transformer, thus lacking a comprehensive comparison with the latest state-of-the-art graph learning models [1-3].
  2. The interpretability (biological plausibility) analysis is insufficient. In brain rhythm identification, the authors demonstrate visualisation results of brain region synchronisation, but do not provide further analysis on whether the output representations are consistent with biological characteristics of the brain. For instance, Line 305 states that the proposed model uncovers functional communities that not only align with known subnetworks but also capture novel patterns in brain organisation, yet a more detailed biological interpretation is lacking.
  3. The paper lacks theoretical proof for its core mechanism (e.g., how oscillatory synchronisation mitigates over-smoothing). Although an energy function $E$ is proposed, the paper does not provide complete evidence to support its theoretical results. While $E$ is claimed to be a "natural" Lyapunov function, no detailed mathematical derivations or proofs are given.

[1] Yang, H., Wang, B., & Jia, J. (2024). Gnncert: Deterministic certification of graph neural networks against adversarial perturbations. In The Twelfth International Conference on Learning Representations.

[2] Chamberlain, B., Rowbottom, J., Gorinova, M. I., Bronstein, M., Webb, S., & Rossi, E. (2021, July). Grand: Graph neural diffusion. In International conference on machine learning (pp. 1407-1418). PMLR.

[3] Yun, S., Jeong, M., Kim, R., Kang, J., & Kim, H. J. (2019). Graph transformer networks. Advances in neural information processing systems, 32.

Questions

See weaknesses. Overall, I believe this paper is good work. I’m happy to raise my score if the authors address my concerns well.

Limitations

See weaknesses.

Final Justification

The authors have addressed my concerns, so I'd like to raise my score to 5.

Paper Formatting Concerns

N/A

Author Response

We sincerely thank the reviewer for recognizing the novelty and importance of our work. We also greatly appreciate the constructive feedback regarding empirical performance, neuroscientific interpretability, and theoretical grounding of the proposed model. Below, we provide detailed responses to each of the reviewer’s concerns.

1. Lacking a comprehensive comparison with the latest state-of-the-art graph learning models [1-3].

[Comparison with additional graph learning methods]
We have added two baseline models as the reviewer suggested: GRAND and GTN. In addition, we also included KuramotoGNN (KGNN) [2] and GraphCON [1], which are both recent graph models built on oscillatory dynamics. While we initially considered GNNCert, it focuses on adversarial robustness, and its reported clean accuracy is based on the existing backbones GCN, GAT, and GIN, which are already part of our benchmark. Nonetheless, we will include GNNCert in the related work to acknowledge its relevance to the broader graph learning literature. The updated results are presented in the tables below; they show that our models still achieve significant improvements on brain datasets and retain strong performance on general graph benchmarks.

Brain state identification:

| Acc (%) | HCP-A | HCP-YA | HCP-WM |
| --- | --- | --- | --- |
| GRAND | 86.18±1.79 | 47.36±0.67 | 33.27±2.63 |
| GTN | 82.56±0.87 | 53.42±1.21 | 30.68±0.36 |
| GraphCON | 87.87±2.29 | 66.29±1.04 | 44.12±2.1 |
| KGNN | 85.53±1.68 | 45.68±1.12 | 35.79±1.54 |
| BRICK | 95.55±0.77 | 84.20±1.60 | 89.22±1.71 |

Node classification:

| Acc (%) | Texas | Wisconsin | Actor | Squirrel | Chameleon | Cornell | Citeseer | Pubmed | Cora |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRAND | 67.42±7.81 | 73.09±4.83 | 33.34±1.16 | 34.45±1.84 | 38.43±3.33 | 56.55±9.15 | 72.1±1.07 | 79.12±0.3 | 83.73±0.58 |
| GTN | 74.52±10.2 | 68.89±11.9 | 35.42±1.5 | 36.45±1.72 | 55.84±4.08 | 58.81±15.09 | 68.8±1.69 | 76.90±0.77 | 78.60±1.72 |
| GraphCON | 82.43±4.72 | 83.72±4.48 | 35.13±1.38 | 26.9±2.17 | 33.75±3.77 | 74.59±2.48 | 62.9±5.95 | 71.21±7.36 | 62.73±5.79 |
| KGNN | 71.75±5.57 | 69.43±6.18 | 31.5±0.82 | 35.02±1.67 | 38.5±2.78 | 61.16±9.16 | 72.06±1.53 | 67.24±0.36 | 76.57±1.4 |
| BIG-NOS | 81.35±3.51 | 82.16±3.56 | 35.8±0.6 | 68.06±1.65 | 74.5±1.13 | 73.24±4.75 | 70.63±0.36 | 77.8±0.23 | 81.86±0.25 |

Graph classification:

| Acc (%) | ENZYMES | PROTEINS |
| --- | --- | --- |
| GRAND | 28±6.67 | 71.97±4.26 |
| GTN | 20.15±2.27 | 75.48±4.27 |
| GraphCON | 44.83±3.53 | 69.82±6.12 |
| KGNN | 29.26±2.84 | 63.76±5.17 |
| BIG-NOS | 60±4.28 | 75.02±2.61 |

2. The interpretability (biological plausibility) analysis is insufficient.

[More detailed interpretability (biological plausibility) analysis] We now elaborate on how the discovered patterns in Figure 3 support neurobiological plausibility:

  • Alignment with functional subnetworks (Figure 3a):
    BRICK captures task-specific synchronization across functionally defined brain networks in HCP-A. For instance, Visual network regions (red) cluster clearly during the VISMOTOR task in the deeper representation $X^{(L)}$, indicating strong within-network coordination. These results suggest that our model respects functional boundaries in a biologically meaningful manner.
  • Intersubject consistency and task-relevant encoding (Figure 3b):
    This figure shows that individuals performing the same task (e.g., “SOCIAL” in HCP-YA) form well-separated synchronization patterns in the deeper layers $X^{(L)}$. This indicates that BRICK captures latent dynamics consistent across subjects, supporting the notion that neural phase alignment underlies cognitive state representation.
  • Unsupervised Functional Differentiation (Figure 3c): Here, we assess whether BRICK can produce meaningful cortical parcellations without supervision. Compared to spectral clustering, BRICK yields phase-based partitions that are more spatially localized and better aligned with known functional networks [3]. This supports the claim that BRICK captures biologically coherent dynamics.

In this way, BRICK offers a mechanistic perspective grounded in neural oscillation theory, which allows us to move beyond black-box representations toward interpretable latent dynamics.

3. The paper lacks theoretical proof for its core mechanism (e.g., how oscillatory synchronisation mitigates over-smoothing). Although E is claimed to be a "natural" Lyapunov function, no detailed mathematical derivations or proofs are given.

[Why is $E$ a natural Lyapunov function?]

  1. Physical intuition

The dynamical system (Eq. (4)),

$$\frac{d\hat{x}_i}{dt} = \omega_i + \gamma\,\phi_i\Big(y_i + \sum_{j=1}^{N} w_{ij}\hat{x}_j\Big),$$

is a network of weakly coupled oscillators driven by an external signal $y_i$. This suggests the following energy function (Eq. (5)):

$$E(\hat{\mathbf{x}}) = -\sum_{i,j} \hat{x}_i^\top w_{ij}\,\hat{x}_j - \sum_i y_i^\top \hat{x}_i,$$

where the first sum represents the coupling term and the second sum is the driving term. Because the update $\frac{d\hat{x}_i}{dt}$ explicitly attempts to align $\hat{x}_i$ with its neighbors $w_{ij}\hat{x}_j$ and with $y_i$, the quantity measuring how well aligned they are, namely $E$, is the physically most natural Lyapunov candidate.

  2. Lyapunov property [4]

Let $\hat{\mathbf{x}} = [\hat{x}_1, \ldots, \hat{x}_N] \in \mathbb{R}^{N \times d}$ and recall the dynamical system Eq. (4) and the energy $E$ in Eq. (5). Because $W = [w_{ij}]$ is symmetric (undirected),

$$\nabla_{\hat{\mathbf{x}}} E = -(W\hat{\mathbf{x}} + \mathbf{y}), \qquad \mathbf{y} = [y_1 \ldots y_N].$$

Define $u_i := y_i + \sum_j w_{ij}\hat{x}_j$. Then Eq. (4) may be rewritten as

$$\dot{\hat{x}}_i = \omega_i + \gamma\phi_i(u_i) = \omega_i - \gamma\phi_i(-u_i) = \omega_i - \gamma\phi_i\big([\nabla_{\hat{\mathbf{x}}} E]_i\big),$$

using that $\phi_i$ is odd. The time derivative of $E$ is then

$$\dot{E} = \big\langle \nabla_{\hat{\mathbf{x}}} E, \dot{\hat{\mathbf{x}}} \big\rangle = \sum_i \underbrace{\big\langle [\nabla_{\hat{\mathbf{x}}} E]_i,\, \omega_i \big\rangle}_{\text{(i)}} - \gamma \underbrace{\big\langle [\nabla_{\hat{\mathbf{x}}} E]_i,\, \phi_i\big([\nabla_{\hat{\mathbf{x}}} E]_i\big) \big\rangle}_{\text{(ii)}}.$$

The $\omega_i$ induces a pure phase rotation; it is orthogonal to the gradient direction, so $\langle [\nabla_{\hat{\mathbf{x}}} E]_i, \omega_i \rangle = 0$ and term (i) vanishes.

Moreover, $\phi_i(\cdot)$ is odd, $\phi_i(-z) = -\phi_i(z)$, and monotone increasing, $\langle \phi_i(z_1) - \phi_i(z_2),\, z_1 - z_2 \rangle \geq 0$. With $z = [\nabla_{\hat{\mathbf{x}}} E]_i$, this gives $\langle z, \phi_i(z) \rangle \geq 0$, so term (ii) is nonnegative and its contribution $-\gamma\,(\text{ii})$ is $\leq 0$. Combining (i) and (ii),

$$\dot{E} = 0 - \gamma\sum_i \langle z, \phi_i(z) \rangle \leq 0, \qquad z = [\nabla_{\hat{\mathbf{x}}} E]_i.$$

If $W \succeq 0$ and both $w_{ij}$ and $y_i$ are finite, then Eq. (5) is at most quadratic with a finite lower bound:

$$E(\hat{\mathbf{x}}) \geq -\lambda_{\max}(W)\,\|\hat{\mathbf{x}}\|^2 - \|\mathbf{y}\|\,\|\hat{\mathbf{x}}\| \geq c_1\|\hat{\mathbf{x}}\|^2 - c_2,$$

for suitable constants $c_1 > 0$, $c_2 \geq 0$.

Eq. (5) and the bound above provide the classical Lyapunov conditions:

$$E(\hat{\mathbf{x}}) \geq c_1\|\hat{\mathbf{x}}\|^2 - c_2, \qquad \dot{E}(\hat{\mathbf{x}}) \leq 0.$$

Hence $E$ is a Lyapunov function for Eq. (4): it is lower-bounded and monotonically non-increasing along every trajectory, guaranteeing stability.
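To complement the derivation, here is a minimal numerical sketch (our illustration, not the paper's released code) that simulates the flow with $\phi = \tanh$ (odd and monotone), omits the rotation term $\omega$, and places a 1/2 on the quadratic term of $E$ so that $\nabla E = -(W\hat{\mathbf{x}} + \mathbf{y})$ holds exactly; the energy is checked to be non-increasing along the trajectory:

```python
# Minimal numerical check (illustrative, not the paper's code) that
# E(x) = -(1/2) sum_ij w_ij <x_i, x_j> - sum_i <y_i, x_i> is non-increasing
# under dx_i/dt = gamma * phi(y_i + sum_j w_ij x_j), with phi = tanh
# (odd, monotone) and the rotation term omega omitted for simplicity.
import numpy as np

rng = np.random.default_rng(0)
N, d, gamma, dt = 20, 2, 0.5, 1e-2

A = rng.random((N, N))
W = 0.5 * (A + A.T)                # symmetric coupling, as the proof assumes
np.fill_diagonal(W, 0.0)
y = rng.standard_normal((N, d))    # external driving signal y_i
x = rng.standard_normal((N, d))    # oscillator states x_i

def energy(x):
    # 1/2 on the quadratic term so that grad E = -(W x + y) exactly
    return -0.5 * np.einsum("ij,ik,jk->", W, x, x) - np.sum(y * x)

E_prev = energy(x)
for _ in range(2000):
    u = y + W @ x                          # u_i = -[grad E]_i
    x = x + dt * gamma * np.tanh(u)        # forward-Euler step of the flow
    E_now = energy(x)
    assert E_now <= E_prev + 1e-8, "energy increased"
    E_prev = E_now
print("final energy:", E_prev)             # monotone non-increasing, as proved
```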

[Heuristic Analysis: Why BRICK alleviates over-smoothing]

We analyze why BRICK can theoretically alleviate over-smoothing by examining its simplified dynamics, steady-state solution, and spectral response.

Simplified BRICK Dynamics. To derive an interpretable steady-state solution, we consider a simplified version of the BRICK dynamics. We assume a constant natural frequency $\omega_i$ across all nodes and linearize the nonlinear projection $\phi$:

$$\frac{d\hat{\mathbf{x}}}{dt} = -\hat{\mathbf{x}} + W\hat{\mathbf{x}} + \mathbf{y}.$$

This corresponds to a linear consensus-like system, where the $-\hat{\mathbf{x}}$ term mimics a dissipative force, ensuring stability.

Equilibrium Solution. At steady state ($\frac{d\hat{\mathbf{x}}}{dt} = 0$), we obtain:

$$(I - W)\hat{\mathbf{x}}^* = \mathbf{y} \quad\Longrightarrow\quad \hat{\mathbf{x}}^* = (I - W)^{-1}\mathbf{y}.$$

The inverse exists provided the spectral radius of $W$ is strictly smaller than 1 (i.e., $\lambda_k \neq 1\ \forall k$), a condition that is typically satisfied for the normalized adjacency or attention matrices used in practice.

Spectral Interpretation. Let $W$ be symmetric and decomposed as $W = U\Lambda U^\top$, where $U$ is the orthonormal eigenvector matrix and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Projecting into the spectral domain:

$$\hat{\mathbf{x}}^* = U(I - \Lambda)^{-1}U^\top\mathbf{y}.$$

Let $\tilde{\mathbf{y}} = U^\top\mathbf{y}$ and $\tilde{\hat{\mathbf{x}}}^* = U^\top\hat{\mathbf{x}}^*$; then

$$\tilde{\hat{x}}^*_k = \frac{1}{1 - \lambda_k}\,\tilde{y}_k.$$

Thus, each spectral mode is scaled by the transfer function $h(\lambda_k) = \frac{1}{1 - \lambda_k}$.

Comparison with Diffusion-based GNNs. In standard diffusion-type GNNs (e.g., GCN), applying $L$ layers is equivalent to applying the transfer function $e^{-L\lambda_k}$ in the spectral domain. This leads to exponential suppression of high-frequency signals (large $\lambda_k$), causing over-smoothing. In contrast, BRICK uses a transfer function that decays much more slowly, $h(\lambda_k) = \frac{1}{1 - \lambda_k}$, which corresponds to $1/\lambda$-level suppression and allows high-frequency, discriminative signals to persist.
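As a sanity check on this spectral picture, the following self-contained sketch (the random graph and the 0.5 scaling are demo assumptions, not the paper's setup) verifies the closed-form steady state against its spectral reconstruction and contrasts the two transfer functions on the most oscillatory mode:

```python
# Illustrative sketch: compare exponential diffusion damping with the
# 1/(1 - lambda) steady-state response of the linearized BRICK dynamics.
import numpy as np

rng = np.random.default_rng(0)
N, L = 50, 16                              # nodes, diffusion depth ("layers")
A = (rng.random((N, N)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T             # random undirected graph
deg = np.maximum(A.sum(1), 1.0)
d_is = 1.0 / np.sqrt(deg)
W = 0.5 * (d_is[:, None] * A * d_is[None, :])   # symmetric, rho(W) <= 0.5

lam, U = np.linalg.eigh(W)                 # W = U diag(lam) U^T
y = rng.standard_normal(N)

# Steady state of dx/dt = -x + Wx + y, and its spectral reconstruction
x_star = np.linalg.solve(np.eye(N) - W, y)
x_spec = U @ (U.T @ y / (1.0 - lam))
print("closed form == spectral form:", np.allclose(x_star, x_spec))

# High-frequency mode of W (most negative eigenvalue): heat diffusion at
# Laplacian frequency (1 - lam) damps it as e^{-L(1 - lam)}, while the
# BRICK-type response only attenuates it as 1/(1 - lam).
lam_hi = lam.min()
print("diffusion gain :", np.exp(-L * (1.0 - lam_hi)))
print("BRICK-type gain:", 1.0 / (1.0 - lam_hi))
```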

[1] Rusch, T. K. et al. (2022). Graph-coupled oscillator networks.
[2] Nguyen, T. et al. (2024). From coupled oscillators to graph neural networks: Reducing over-smoothing via a Kuramoto model-based approach.
[3] Yeo, B. T. T. et al. (2011). The organization of the human cerebral cortex estimated by intrinsic functional connectivity.
[4] Miyato, T. et al. (2024). Artificial Kuramoto oscillatory neurons.

Thanks again for your feedback. If the reviewer has additional questions, we would be glad to provide further clarification.

Comment

Thank you for your response. I have updated my score.

Comment

Thank you very much for updating the score. We truly appreciate your time and thoughtful feedback throughout the review process.

Review (Rating: 5)

This paper presents a physics- and neuroscience-informed deep learning framework that integrates a synchronization mechanism of neural oscillations into graph representation learning. Motivated by the over-smoothing issue of conventional GNNs, it proposes to utilize brain rhythms in an artificial dynamical system. The method integrates a Kuramoto model with attending memory for modeling oscillatory synchronization among brain regions. It has been applied to two settings, brain rhythm identification and conventional tasks on graph data, demonstrating superior performance compared to existing baselines.

Strengths and Weaknesses

Strengths

  1. This work presents a novel solution to address the over-smoothing issue of conventional GNNs: it incorporates a physics-informed structure, a Kuramoto model with attending memory, inside a graph representation framework. This system has novel designs in its governing equation, learning mechanism, and attention mechanism compared to conventional GNN models.
  2. The model demonstrates superior performance on two tasks (brain rhythm identification and graph tasks) compared to existing baselines.
  3. The model has good interpretability and demonstrates interesting visualizations of synchronization patterns in Fig. 3.
  4. Multiple ablation studies are conducted.

Weaknesses

  1. It would be more impactful to link the synchronization mechanisms to neuroscience domains, explain why the method might be superior to conventional approaches in the brain rhythm identification task, and discuss how the discoveries in Fig. 3 might relate to existing studies.
  2. Missing comparison of the computational cost of the proposed method against existing methods.

Questions

  1. What might be the challenges for the optimization solver in the proposed graph learning procedure? Is convergence guaranteed, and how sensitive is it to initialization and hyperparameters?
  2. Would this be more computationally expensive than conventional graph learning methods, and how would it scale to larger graphs?

Limitations

  1. More discussion of the neuroscience literature and mechanisms would be more impactful.
  2. More discussion and evaluation of computational cost and scalability should be included.

Final Justification

Thanks for the authors' detailed responses. I am especially impressed by the scientific insights and motivations clarified by the authors. I also appreciate the mechanistic understanding of neuronal synchronization that this work could bring. The responses on baseline comparisons, initialization, and convergence also mostly addressed my concerns. I would like to increase my score to accept (5). Thanks for your efforts.

Paper Formatting Concerns

Not applicable.

Author Response

We sincerely thank the reviewer for recognizing the novelty and comprehensive nature of our work. We also greatly appreciate the reviewer’s interest and thoughtful feedback regarding the biological interpretability, computational efficiency and scalability, as well as the optimization dynamics underlying our framework. Below, we address each of the reviewer’s concerns in detail.

1. It would be more impactful to link the synchronization mechanisms to neuroscience domains, explain why the method might be superior to conventional approaches in the brain rhythm identification task, and discuss how the discoveries in Fig. 3 might relate to existing studies.

[Link to neuroscience domains]
In principle, the learning behavior of BRICK is shaped by the same governing equations (i.e., Kuramoto model) that describe the neural oscillatory dynamics responsible for generating cognition and behavior. We further introduce structural-functional coupling, implemented through the geometric scattering transform (Line 159), to emulate the cross-frequency coupling observed in cognitive neuroscience. In addition, we incorporate an adaptive control term to enhance task-relevant modulation. Drawing inspiration from neural oscillatory synchronization, our BRICK model not only improves predictive accuracy but also enhances interpretability, thereby offering novel insights into cognitive processes (Section 3.1).

[Advances compared to conventional approaches in the brain rhythm identification task]
While traditional models, including CNNs, RNNs, Transformers, and GNNs, have demonstrated strong empirical performance in brain rhythm identification, they largely remain data-driven and agnostic to the underlying neurobiological mechanisms; they often treat the brain as a generic data source without sufficient domain knowledge. In contrast, BRICK explicitly incorporates neural synchronization dynamics, a well-established principle in cognitive neuroscience, into the learning architecture. This alignment with neuroscience allows us to interpret learned patterns (e.g., coupling strengths or synchronization clusters) in biologically meaningful terms (as shown in Figure 3), such as inter-regional communication or pathological desynchronization. Therefore, compared to conventional approaches, BRICK is not just a deep model for feature representation learning; it shows the potential to establish a biologically inspired reasoning system. By embedding core principles of brain dynamics into a learnable, differentiable architecture, our proposed deep model enables more interpretable, temporally aware, and robust brain state identification.

[Relevance of Figure 3 to existing studies]
Below is a detailed explanation of how the discovered patterns in Figure 3 align with known neurobiological structures and functions:

  • Alignment with functional subnetworks (Figure 3a):
    BRICK captures task-specific synchronization across functionally defined brain networks in HCP-A. For instance, Visual network regions (red) cluster clearly during the VISMOTOR task in the deeper representation $X^{(L)}$, indicating strong within-network coordination. Similarly, tighter phase clustering emerges in the Sensorimotor and Dorsal Attention networks under tasks such as CART and FACENAME. These results suggest that our model respects functional boundaries in a biologically meaningful manner.
  • Intersubject consistency and task-relevant encoding (Figure 3b):
    This figure shows that individuals performing the same task (e.g., “SOCIAL” or “EMOTION” in HCP-YA) form well-separated synchronization patterns in the deeper layers (e.g., $X^{(L)}$). This indicates that BRICK captures latent dynamics consistent across subjects and datasets, supporting the notion that neural phase alignment underlies cognitive state representation.
  • Unsupervised Functional Differentiation (Figure 3c): Here, we assess whether BRICK can produce meaningful cortical parcellations without supervision. Compared to traditional spectral clustering, BRICK yields phase-based partitions that are more spatially localized and better aligned with known functional networks [1]. This supports the claim that BRICK captures biologically coherent dynamics.

By mapping learned feature vectors onto a phase manifold, we can interpret inter-regional phase alignment as a proxy for functional coordination. In this way, BRICK offers a mechanistic perspective grounded in neural oscillation theory, which allows us to move beyond black-box representations toward interpretable latent dynamics.

2. Missing comparisons of the computational cost of the proposed method compared to existing methods. Would this be more computationally expensive than conventional graph learning methods, and how would it scale to larger graphs?

[Comparisons of the computational cost]
We have provided a detailed comparison of the inference time per subject for all models across three brain datasets in Appendix A.2 Table 7. In addition, we have now included a summary table that reports the average inference time of each model on all brain datasets. As shown in the results, our proposed model BRICK has slightly higher inference time compared to lightweight baselines such as GCN, GIN, and GraphSAGE. However, it remains more efficient than several other models, while also delivering significantly better performance on brain datasets.

| Model | GCN | GIN | GAT | GCNII | GraphSAGE | SAN | BRICK |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Avg inference time (ms/subject) | 0.9 | 0.74 | 1.31 | 1.09 | 0.72 | 1.62 | 1.25 |

[How to scale to larger graphs]
To ensure the scalability of our method to large-scale graphs, we implement all matrix operations using sparse representations. This design allows our model to scale comparably to lightweight models such as GCN, and enables full-batch training on large graphs like ogbn-arxiv without exceeding memory limits.

Moreover, for even larger graphs, e.g., fine-grained applications such as vertex-level cortical modeling in neuroscience, our framework can readily incorporate subgraph sampling strategies [2] to enable smooth training.
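For illustration, the sketch below (shapes and names are ours, not the released implementation) shows how one oscillator update can be computed through a sparse adjacency so that cost scales with the number of edges rather than $N^2$:

```python
# Illustrative sparse oscillator update (assumed shapes/names, not the
# released implementation): the coupling term uses a sparse adjacency,
# so one step costs O(|E| * d) instead of O(N^2 * d).
import torch

N, d, gamma, dt = 100_000, 64, 0.5, 1e-2
edge_index = torch.randint(0, N, (2, 1_000_000))   # random COO edges (demo)
edge_weight = torch.rand(edge_index.size(1))
W = torch.sparse_coo_tensor(edge_index, edge_weight, (N, N)).coalesce()

x = torch.randn(N, d)   # oscillator states
y = torch.randn(N, d)   # control / driving signal

u = y + torch.sparse.mm(W, x)        # coupling via sparse matmul
x = x + dt * gamma * torch.tanh(u)   # one Euler step of the dynamics
```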

3. What might be the challenges for the optimization solver in the proposed graph learning procedure? Is convergence guaranteed, and how sensitive is it to initialization and hyperparameters?

[Initialization and hyperparameter ablations]
Sensitivity to initialization. To assess the sensitivity of our model to random initialization, we conducted experiments on the Cora, Citeseer, and Pubmed datasets using 5 different random seeds. As reported in Table 2, our model achieves highly consistent results, with the maximum standard deviation across these runs being only 0.36%. This indicates that our method is stable and not significantly affected by initialization variance.

Sensitivity to hyperparameters. We conducted several ablation studies on the most critical hyperparameters and reported them in the appendix.

  • In Appendix A.2, we ablate two parameters specific to the BRICK architecture: L (outer iterations for control update) and Q (inner time steps for oscillator dynamics). Increasing both parameters increases inference time, while having only a marginal effect on accuracy. This suggests the model is not overly sensitive to these parameters within a reasonable range.

  • In Appendix A.3, we further analyze the sensitivity of geometric scattering transform (GST)-related hyperparameters: level and order. Our results show that increasing either improves performance by capturing more frequency components, but at the cost of slightly increased computation.

[Optimization challenges and convergence guarantee]
We appreciate the reviewer’s insightful question regarding the optimization behavior of our proposed method. From a theoretical perspective, the dynamics of our model are governed by the gradient flow

$$\frac{d\hat{x}_i}{dt} = \omega_i + \gamma\,\phi_i\Big(y_i + \sum_{j=1}^{N} w_{ij}\hat{x}_j\Big),$$

which corresponds to gradient descent on the following energy landscape:

$$E = -\sum_{i,j} w_{ij}\,\hat{x}_i^\top \hat{x}_j - \sum_i y_i^\top \hat{x}_i.$$

This energy function $E$ acts as a natural Lyapunov function, satisfying $\frac{dE}{dt} \leq 0$, thus ensuring that the system always evolves toward a (local) steady state. Therefore, the entire system remains a valid gradient flow on a constrained manifold, which theoretically guarantees convergence.

In practice, we also observe stable optimization behavior during training. To support this claim, we will include the training loss trajectories of our model on representative datasets in the Appendix.

[1] Yeo, B. T. T. et al (2011). The organization of the human cerebral cortex estimated by intrinsic functional connectivity. Journal of Neurophysiology, 106(3), 1125–1165.
[2] Zeng, H. et al (2019). Graphsaint: Graph sampling based inductive learning method. arXiv preprint arXiv:1907.04931.

Thanks again for your feedback. If anything is still unclear or the reviewer would like additional clarification, we would gladly continue the discussion and address any further concerns.

Review (Rating: 5)

The paper presents BRICK, a novel deep learning-based framework designed to decode cognitive states with strong biological inspiration. Unlike traditional artificial neural networks (such as the perceptron or multilayer perceptron), which drew only loose inspiration from neuroscience and abstracted away much of the brain’s complexity, BRICK integrates neuroscientific principles, such as fluctuating brain activity, neural oscillatory synchronization, and attentional memory to more closely emulate neurobiological processes.

Specifically, BRICK extends these neuroscience principles into the graph domain, employing a physics-informed model for brain rhythm identification. This model utilizes a Kuramoto model-inspired control mechanism to capture memory-guided neural synchronization. This approach enables the development of a neurobiologically inspired GNN, called BIG-NOS, which specifically addresses main limitations like over-smoothing while offering scalability across node and graph classification tasks.

Extensive evaluation on publicly available neuroimaging and graph datasets demonstrates that BRICK effectively synchronizes brain regions, yielding synchronization patterns closely aligned with specific cognitive states. BIG-NOS achieves state-of-the-art (SOTA) performance on a range of GNN benchmarks.

Strengths and Weaknesses

Strengths

  • Originality. The main strength of this paper is introducing a new design paradigm for developing GNNs through the lens of coupled neural oscillators. The authors highlight a link between neuroscience and machine learning, where the latter is used to learn the parameters of the Kuramoto model, into which a new term is included to incorporate the concept of attending memory.

  • Empirical validation. The claims related to BRICK and BIG-NOS are well supported by the experimental results, which show:

    • BRICK’s ability to synchronize brain regions. In particular, even the experiments without ground truth show BRICK’s capability to identify clusters of functional subnetworks.
    • BIG-NOS achieves SOTA performance on node and graph classification, showing robustness to over-smoothing and scalability issues.

Weaknesses:

Clarity. The main weakness is related to the clarity of the paper. Specifically, the following two points limit its readability.

  • Related work.

    • It is not clear what the differences are between the authors' modified Kuramoto model in BRICK and the ones proposed in [9,10].
    • It is not clear what the differences are between the Kuramoto model in BRICK and the work in [44]. In that regard, the authors' contribution is to put the spotlight on bridging the reciprocal relationship between neuroscience and AI through the lens of governing equations in dynamical systems.
  • Methods. Readability would improve by incorporating more information about the BIG-NOS network architecture.

[9] Joana Cabral, Etienne Hugues, Olaf Sporns, and Gustavo Deco. Role of local network oscillations in resting-state functional connectivity. Neuroimage, 57(1):130–139, 2011.

[10] Katerina Capouskova, Morten L Kringelbach, and Gustavo Deco. Modes of cognition: Evidence from metastable brain dynamics. NeuroImage, 260:119489, 2022.

[44] Takeru Miyato, Sindy Löwe, Andreas Geiger, and Max Welling. Artificial kuramoto oscillatory neurons. arXiv preprint arXiv:2410.13821, 2024.

问题

  • What are the pros of the BRICK model w.r.t. related work [9,10], which used the Kuramoto model to describe the coupling between brain structure and function? What are the advantages of BRICK compared to the work in [44]?

  • Can you clarify why equation 4 can be interpreted as a gradient flow? I guess that in line 185 the authors want to provide intuition behind Eq. 5, not Eq. 4.

  • What kind of neural network architecture did the authors use to parameterize the natural frequency Ω (lines 201-202)?

Limitations

Yes

Final Justification

Dear authors, thank you for the rebuttal. After carefully reviewing all the comments and your response, I am confirming my original rating.

Paper Formatting Concerns

No Paper Formatting Concerns.

Author Response

We sincerely thank the reviewer for recognizing the originality of our work and the comprehensiveness of our experimental evaluation. We also appreciate the thoughtful feedback regarding related work and methodology details. Below, we carefully address each concern and provide clarifications or revisions where needed.

1. It is not clear what the differences are between the authors' modified Kuramoto model in BRICK and the ones proposed in [9,10]. What are the advantages of BRICK compared to the work in [44]?

[Differences between our model and [9], [10], [44]]
BRICK and [9]:
While both BRICK and Cabral et al. [9] adopt the Kuramoto model to explore brain dynamics, their roles and applications diverge significantly. In their work, the Kuramoto model is used as a forward simulator to reproduce biologically observed functional connectivity (FC) patterns from empirical structural connectivity (SC), with the goal of explaining emergent brain phenomena. In contrast, our BRICK repurposes the Kuramoto model as a computational module within a machine learning framework. Rather than passively simulating dynamics, BRICK actively learns to perform representation learning, reasoning, and predictive tasks on graphs using dynamic synchronization mechanisms.

BRICK and [10]:
Regarding the methodological paradigm, Capouskova et al. [10] follow a data-driven analytical approach, using autoencoders and clustering to uncover latent cognitive states from empirical fMRI data; they encode BOLD phase coherence data using modern machine learning tools, without a Kuramoto model. Their focus lies in neuroscientific pattern discovery and interpretation. In contrast, BRICK introduces a physics-informed learning framework that embeds the principles of neural synchronization, specifically via the Kuramoto model, directly into the learning process.

The outcomes of these two approaches are also distinct: Capouskova et al. generate empirical insights about cognitive brain states, whereas BRICK yields new machine learning models that not only excel in brain-related tasks but also generalize to broader graph-based AI problems.

BRICK and [44]:
Compared with [44], our BRICK framework offers two key advantages.

  • First, neuroscientific grounding: starting from fMRI-derived brain rhythms, we augment the Kuramoto core with a task-driven feedback controller $y_i$ that mimics cognitive control, thereby preserving biological interpretability.

  • Second, domain-specific impact: on real HCP fMRI data BRICK achieves the best task decoding and unsupervised parcellation, while on graph benchmarks it delivers competitive accuracy and remains resistant to oversmoothing even at 128 layers.

Taken together, these properties make BRICK a biologically inspired but practically efficient module that bridges brain dynamics and general-purpose AI.

2. Methods. Readability would improve by incorporating more information about the BIG-NOS network architecture.

[Details of BIG-NOS]
Conceptually, BIG-NOS can be viewed as an extension of BRICK to graph data. By interpreting graph nodes as analogs of brain regions and the adjacency matrix as a proxy for coupling strength among regions, we seamlessly adapt the physics-informed architecture of BRICK to arbitrary graphs. Both models share the same governing equations rooted in Kuramoto synchronization, but differ in their data domains and learning objectives.

In BRICK, the inputs consist of BOLD signals over brain regions along with structural or functional connectivity; in BIG-NOS, the inputs are general node features and graph topology. Likewise, the outputs diverge in form: BRICK predicts cognitive or disease-related brain states (analogous to graph-level classification), while BIG-NOS predicts node/graph labels in standard graph learning settings.

We are committed to including the formal network details of BIG-NOS in Section 3.2 as part of the overall architecture description.

3. Can you clarify why equation 4 can be interpreted as gradient flow?

[Gradient flow interpretation of Eq.(4)]

Gradient flow refers to a class of dynamical systems that evolve along the steepest-descent direction of an energy functional $E$. The evolution equation takes the form

$$\dot{\hat{x}}_i = -\frac{\partial E}{\partial \hat{x}_i},$$

which guarantees that the energy decreases over time, i.e., $\dot{E} = \frac{dE}{dt} \leq 0$, leading the system toward a local minimum of $E$. The energy function defined in Eq. (5),

$$E = -\sum_{i,j} w_{ij}\,\hat{x}_i^\top \hat{x}_j - \sum_i y_i^\top \hat{x}_i,$$

encodes two key objectives:

  • The first term promotes pairwise alignment (synchronization) among oscillators $\hat{x}_i$ and $\hat{x}_j$ with strong coupling $w_{ij}$.

  • The second term aligns each oscillator with its task-specific control pattern $y_i$ (controlled synchronization).

Taking the gradient of $E$ with respect to $\hat{x}_i$ yields

$$\frac{\partial E}{\partial \hat{x}_i} = -\sum_j w_{ij}\hat{x}_j - y_i,$$

so that the negative gradient becomes

$$-\frac{\partial E}{\partial \hat{x}_i} = \sum_j w_{ij}\hat{x}_j + y_i.$$

Substituting this into Eq. (4), we obtain:

$$\dot{\hat{x}}_i = \omega_i + \gamma\,\phi_i\!\left(-\frac{\partial E}{\partial \hat{x}_i}\right).$$

Thus, Eq. (4) is considered a gradient flow because its dynamics are equivalent to evolving along the negative gradient direction of an energy functional $E$. Here, the operator $\phi$ does not destroy the gradient-flow structure, but instead imposes structural constraints on the descent direction (e.g., ensuring motion remains on the unit sphere).
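This identity is easy to verify numerically; the following sketch (ours, not the authors' code) recovers the coupling term from $E$ by automatic differentiation, with a 1/2 placed on the quadratic term so the constants match exactly when differentiating the symmetric double sum:

```python
# Sanity sketch (ours): autograd on E recovers -dE/dx_i = sum_j w_ij x_j + y_i,
# i.e. Eq.(4) moves along the negative energy gradient (up to omega and phi).
# A 1/2 is placed on the quadratic term so the constants match exactly.
import torch

torch.manual_seed(0)
N, d = 6, 2
W = torch.rand(N, N)
W = 0.5 * (W + W.T)            # symmetric coupling
W.fill_diagonal_(0)
y = torch.randn(N, d)
x = torch.randn(N, d, requires_grad=True)

E = -0.5 * torch.einsum("ij,ik,jk->", W, x, x) - (y * x).sum()
E.backward()
print(torch.allclose(-x.grad, (W @ x + y).detach()))   # True
```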

4. What kind of neural network architecture did the authors use to parameterize the natural frequency Ω (lines 201-202)?

[Parameterization of natural frequency Ω]
The parameterization of the natural frequency Ω does not rely on a deep neural network. Instead, Ω is parameterized using a set of learnable vectors whose norms determine the rotation speed (i.e., natural frequency) of the oscillators. In the forward pass, these frequency values are used to rotate the 2D oscillator states (like turning a point on a circle), simulating how each node evolves over time due to its intrinsic dynamics.
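A minimal sketch of this parameterization (the module name, `freq_dim`, and `dt` are illustrative assumptions, not taken from the paper):

```python
# Hedged sketch of the described parameterization: each node owns a learnable
# vector whose norm sets its natural frequency; the forward pass rotates the
# node's 2D oscillator state by omega * dt, like turning a point on a circle.
import torch
import torch.nn as nn

class NaturalFrequency(nn.Module):
    def __init__(self, num_nodes: int, freq_dim: int = 4):
        super().__init__()
        self.freq_vec = nn.Parameter(torch.randn(num_nodes, freq_dim))

    def forward(self, x: torch.Tensor, dt: float = 0.1) -> torch.Tensor:
        # x: (num_nodes, 2) oscillator states on the plane
        omega = self.freq_vec.norm(dim=-1)           # rotation speed per node
        c, s = torch.cos(omega * dt), torch.sin(omega * dt)
        x0, x1 = x[:, 0], x[:, 1]
        return torch.stack([c * x0 - s * x1, s * x0 + c * x1], dim=-1)

rot = NaturalFrequency(num_nodes=8)
x = torch.nn.functional.normalize(torch.randn(8, 2), dim=-1)
x_next = rot(x)   # each state rotated by its learned natural frequency
```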

[9] Joana Cabral, Etienne Hugues, Olaf Sporns, and Gustavo Deco. Role of local network oscillations in resting-state functional connectivity. Neuroimage, 57(1):130–139, 2011.
[10] Katerina Capouskova, Morten L Kringelbach, and Gustavo Deco. Modes of cognition: Evidence from metastable brain dynamics. NeuroImage, 260:119489, 2022.
[44] Takeru Miyato, Sindy Löwe, Andreas Geiger, and Max Welling. Artificial kuramoto oscillatory neurons. arXiv preprint arXiv:2410.13821, 2024

Should there remain any ambiguities or if the reviewer has further questions, we would be more than happy to engage in continued discussion and resolve any remaining issues.

Comment

Dear authors, thank you for your rebuttal, which addresses my concerns. I confirm my rating to 5.

Comment

Thank you very much for your kind follow-up. We truly appreciate your constructive feedback and your positive assessment of our work.

Review (Rating: 4)

Inspired by the neural oscillation mechanism, the authors propose novel deep learning methods that utilize neural oscillation: BRICK and BIG-NOS. BRICK was applied to brain fMRI datasets. Specifically, BRICK achieved the best performance in brain analysis tasks compared to simple GNN baselines. BIG-NOS was applied to graph benchmark datasets, and the authors report its superior node and graph classification accuracies compared to simple GNN baselines. The authors argue that the proposed approaches represent a new graph learning mechanism with SOTA performance.

Strengths and Weaknesses

The key strengths include:

  • [S1 novelty]. The proposed approaches seem somewhat novel.
  • [S2 visualization]. The visualizations are among the most beautiful ones I have seen in AI conferences.

However, I note two critical limitations to the present work.

  • [W1 poor experimental setting]. Many of the reported baseline performances in Table 3 are substantially lower than those reported in earlier studies. For instance, GCNII performance on Cora is reported as 79.92, whereas other works generally report up to 86 on the same train/val/test split. In fact, I have run many experiments with these baselines before, and the reported performances are definitely not optimal. I assume this is due to a poor choice of hyperparameters. According to Appendix A.2, the authors did not tune some of the key hyperparameters for all baseline methods, and some other key hyperparameters, such as the learning rate, are not reported. With such a poor experimental setting, I find it hard to trust the experimental outcomes.
  • [W2 lack of method justification]. I am not convinced of the justification for joining neural oscillation with graph learning. Due to the issue in W1, empirical justification is hardly achieved. No formal theories were developed to justify it. The motivation illustrated in the introduction section (e.g., a node in a graph as a coupled oscillator) is only speculative and has not been analyzed or supported by prior works. Thus, I am not convinced why the proposed method is necessary or important.

Besides, I also find some notable limitations.

  • [W3 missing details]. Some important details are missing. For instance, a formal description of the proposed method BIG-NOS seems to be missing. Is the Governing Equation in Table 1 the exact functional form of BIG-NOS? If so, please clarify. Also, the Figure 3 results should be buttressed with numeric outcomes. With only the plots, it is hard to objectively discern how they support the authors’ claims.
  • [W4 missing prior works]. There have been some previous works that connect neural oscillation with graph neural networks [1,2]. However, the authors cited none of them. The authors should clarify how the proposed method compares to those earlier works.
    • [1] Graph-Coupled Oscillator Networks, ICML 2022
    • [2] From Coupled Oscillators to Graph Neural Networks: Reducing Over-smoothing via a Kuramoto Model-based Approach, ICML 2024
  • [W5 unsupported claims]. The authors claim SOTA performance. However, all the baselines are outdated, with the most recent one published in 2021. Even if the issue raised in W1 is somewhat justified, there is no evidence to demonstrate that the proposed method is SOTA.

Questions

See weaknesses

Limitations

Yes, but the authors only discussed marginal limitations.

Final Justification

I raised concerns about the paper's unfair experimental setting for graph learning benchmarks. During rebuttal, the authors reported results with improved experimental settings and revised their argument accordingly. Additionally, the authors reported SOTA performance of the proposed method in more brain application benchmarks. Considering all those factors, I consider a weak accept to be a proper final score.

Paper Formatting Concerns

N/A

Author Response

We sincerely thank the reviewer for acknowledging the novelty and the value of our work. Below, we provide detailed responses to all concerns.

[W1 poor experimental setting], [W5 unsupported claims]

[Experimental setting and model selection]

We acknowledge that some of the reported accuracies for baseline models are lower than those in prior studies that specifically optimized each model/dataset pair. However, our intention was to ensure a fair and controlled comparison across all models under uniform experimental settings, a particularly important consideration in neuroscience, where the focus lies in uncovering mechanistic and functional insights.

For all models, we fix the major hyperparameters as: hidden dim 256; #layers 4 (2 for GCN/GAT); #epochs 1500; lr 5e-4–1e-3; weight decay 5e-4. We will report all details in the final draft.

While these settings may not be optimal for every model (e.g., GCNII performs best with 32–64 layers on Cora, a depth that is not applicable to GCN or GIN due to over-smoothing or instability), our choice ensures comparability under identical architectural constraints. We would also like to emphasize that the accuracy of GCN on Cora in our setting (81.66%) is highly consistent with prior reports (e.g., 81.5% in GRAND), which validates our implementation.

In addition, since our main focus on neuroscience-related applications aligns with the Neuroscience and Cognitive Science track, we prioritize baseline models that have been validated in computational neuroscience, such as GCN, GAT, and GIN [1]. Nevertheless, we have extended our comparisons by including GTN (2019), GRAND (2021), GraphCON (2022), and KuramotoGNN (KGNN) (2024). The results show that our models achieve significant improvements on brain datasets and retain strong performance on general graph benchmarks.

Brain state identification:

| Acc (%) | HCP-A | HCP-YA | HCP-WM |
| --- | --- | --- | --- |
| GRAND | 86.18±1.79 | 47.36±0.67 | 33.27±2.63 |
| GTN | 82.56±0.87 | 53.42±1.21 | 30.68±0.36 |
| GraphCON | 87.87±2.29 | 66.29±1.04 | 44.12±2.1 |
| KGNN | 85.53±1.68 | 45.68±1.12 | 35.79±1.54 |
| BRICK | 95.55±0.77 | 84.20±1.60 | 89.22±1.71 |

Node classification:

| Acc (%) | Texas | Wisconsin | Actor | Squirrel | Chameleon | Cornell | Citeseer | Pubmed | Cora |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GRAND | 67.42±7.81 | 73.09±4.83 | 33.34±1.16 | 34.45±1.84 | 38.43±3.33 | 56.55±9.15 | 72.1±1.07 | 79.12±0.3 | 83.73±0.58 |
| GTN | 74.52±10.2 | 68.89±11.9 | 35.42±1.5 | 36.45±1.72 | 55.84±4.08 | 58.81±15.09 | 68.8±1.69 | 76.90±0.77 | 78.60±1.72 |
| GraphCON | 82.43±4.72 | 83.72±4.48 | 35.13±1.38 | 26.9±2.17 | 33.75±3.77 | 74.59±2.48 | 62.9±5.95 | 71.21±7.36 | 62.73±5.79 |
| KGNN | 71.75±5.57 | 69.43±6.18 | 31.5±0.82 | 35.02±1.67 | 38.5±2.78 | 61.16±9.16 | 72.06±1.53 | 67.24±0.36 | 76.57±1.4 |
| BIG-NOS | 81.35±3.51 | 82.16±3.56 | 35.8±0.6 | 68.06±1.65 | 74.5±1.13 | 73.24±4.75 | 70.63±0.36 | 77.8±0.23 | 81.86±0.25 |

Graph classification:

| Acc (%) | ENZYMES | PROTEINS |
| --- | --- | --- |
| GRAND | 28±6.67 | 71.97±4.26 |
| GTN | 20.15±2.27 | 75.48±4.27 |
| GraphCON | 44.83±3.53 | 69.82±6.12 |
| KGNN | 29.26±2.84 | 63.76±5.17 |
| BIG-NOS | 60±4.28 | 75.02±2.61 |

[W2 lack of method justification]

We now clarify the motivation, justification, and necessity of our method. Our primary focus is brain rhythm identification from neural oscillations, which are fundamental in coordinating communication and cognitive functions [2]. Moreover, the Kuramoto model has long been used to study coupled neural oscillators [3]. Building on these foundations, BRICK explicitly models the oscillatory dynamics of brain function through Kuramoto dynamics, which not only achieves strong performance in brain state decoding (Tab. 2), but also yields interpretable, biologically meaningful synchronization patterns (Fig. 3). We believe this justifies both the necessity and the novelty of our approach from the perspective of computational neuroscience.

Encouraged by the promising results on brain data, we sought to generalize neural oscillation principles to general graph domains in AI/ML, which share the same notion of coupled dynamical systems as brain networks. We hypothesize that learning distinctive graph representations is analogous to coordinating functional roles (graph-level) across brain regions (node-level).

We also provide the following methodological justifications. If there are specific aspects of the theoretical formulation that the reviewer would like further clarification on, we would be more than happy to provide detailed explanations as needed.

Proof of Lyapunov conditions.
Let $\hat{\mathbf{x}} = [\hat{x}_1, \ldots, \hat{x}_N] \in \mathbb{R}^{N \times d}$ and recall the dynamical system Eq. (4) and the energy $E$ in Eq. (5). Because $W = [w_{ij}]$ is symmetric (undirected),

$$\nabla_{\hat{\mathbf{x}}} E = -(W\hat{\mathbf{x}} + \mathbf{y}), \qquad \mathbf{y} = [y_1 \ldots y_N].$$

Define $u_i := y_i + \sum_j w_{ij}\hat{x}_j$. Then Eq. (4) may be rewritten as

$$\dot{\hat{x}}_i = \omega_i + \gamma\phi_i(u_i) = \omega_i - \gamma\phi_i(-u_i) = \omega_i - \gamma\phi_i\big([\nabla_{\hat{\mathbf{x}}} E]_i\big),$$

using that $\phi_i$ is odd. The time derivative of $E$ is then

$$\dot{E} = \big\langle \nabla_{\hat{\mathbf{x}}} E, \dot{\hat{\mathbf{x}}} \big\rangle = \sum_i \underbrace{\big\langle [\nabla_{\hat{\mathbf{x}}} E]_i,\, \omega_i \big\rangle}_{\text{(i)}} - \gamma \underbrace{\big\langle [\nabla_{\hat{\mathbf{x}}} E]_i,\, \phi_i\big([\nabla_{\hat{\mathbf{x}}} E]_i\big) \big\rangle}_{\text{(ii)}}.$$

The $\omega_i$ induces a pure phase rotation; it is orthogonal to the gradient direction, so $\langle [\nabla_{\hat{\mathbf{x}}} E]_i, \omega_i \rangle = 0$ and term (i) vanishes.

Moreover, $\phi_i(\cdot)$ is odd, $\phi_i(-z) = -\phi_i(z)$, and monotone increasing, $\langle \phi_i(z_1) - \phi_i(z_2),\, z_1 - z_2 \rangle \geq 0$. With $z = [\nabla_{\hat{\mathbf{x}}} E]_i$, this gives $\langle z, \phi_i(z) \rangle \geq 0$, so term (ii) is nonnegative and its contribution $-\gamma\,(\text{ii})$ is $\leq 0$. Combining (i) and (ii),

$$\dot{E} = 0 - \gamma\sum_i \langle z, \phi_i(z) \rangle \leq 0, \qquad z = [\nabla_{\hat{\mathbf{x}}} E]_i.$$

If $W \succeq 0$ and both $w_{ij}$ and $y_i$ are finite, then Eq. (5) is at most quadratic with a finite lower bound:

$$E(\hat{\mathbf{x}}) \geq -\lambda_{\max}(W)\,\|\hat{\mathbf{x}}\|^2 - \|\mathbf{y}\|\,\|\hat{\mathbf{x}}\| \geq c_1\|\hat{\mathbf{x}}\|^2 - c_2,$$

for suitable constants $c_1 > 0$, $c_2 \geq 0$.

Eq. (5) and the bound above provide the classical Lyapunov conditions:

$$E(\hat{\mathbf{x}}) \geq c_1\|\hat{\mathbf{x}}\|^2 - c_2, \qquad \dot{E}(\hat{\mathbf{x}}) \leq 0.$$

Hence $E$ is a Lyapunov function for Eq. (4): it is lower-bounded and monotonically non-increasing along every trajectory, guaranteeing stability.

[W3 missing details]

[Details of BRICK and BIG-NOS]

The governing equation presented in Tab. 1 for BIG-NOS is shared with BRICK. This alignment is intentional and central to our design, as it enables a seamless adaptation of BRICK from the human brain to general graph learning tasks. We will give more details in the final version.

[Interpretation of Fig.3]

We now elaborate on how the discovered patterns in Fig. 3 support neurobiological plausibility:

  • Alignment with functional subnetworks (Fig. 3a):
    BRICK captures task-specific synchronization across functional brain networks in HCP-A (e.g., Visual network regions (red) cluster clearly during the VISMOTOR task in the deeper feature $X^{(L)}$, indicating strong within-network coordination). This suggests that BRICK respects functional boundaries.
  • Inter-subject consistency and task-relevant encoding (Fig. 3b):
    Subjects performing the same task (e.g., “SOCIAL” in HCP-YA) form well-separated synchronization patterns in the deeper $X^{(L)}$. This indicates that BRICK captures consistent latent dynamics across subjects, supporting that neural phase alignment underlies cognitive state representation.
  • Unsupervised functional differentiation (Fig. 3c): BRICK can produce meaningful cortical parcellations without supervision. Compared to spectral clustering, BRICK yields phase-based partitions that are more spatially aligned with known functional networks [4].

To provide stronger quantitative evidence, we have already included numeric results in Tab. 2 and Appendix A.4. However, we will also include the following numerical results to support Fig. 3:

  • Within-network vs. between-network phase variance for Fig. 3a, to quantify the degree of synchronization.
  • Purity and NMI, to measure how well the unsupervised parcels in Fig. 3c align with the existing functional networks (a minimal sketch of both metrics follows below).
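For reference, both metrics can be computed as in the following sketch (toy label arrays only; purity uses the standard majority-vote definition, and NMI comes from scikit-learn):

```python
# Sketch of the promised metrics on toy labels (illustrative arrays only).
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

def purity(y_true, y_pred):
    # fraction of nodes whose parcel's majority reference label matches theirs
    total = 0
    for c in np.unique(y_pred):
        _, counts = np.unique(y_true[y_pred == c], return_counts=True)
        total += counts.max()
    return total / len(y_true)

y_true = np.array([0, 0, 1, 1, 2, 2])   # e.g., reference functional networks
y_pred = np.array([0, 0, 1, 2, 2, 2])   # e.g., BRICK phase-based parcels
print("purity:", purity(y_true, y_pred))
print("NMI   :", normalized_mutual_info_score(y_true, y_pred))
```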

[W4 missing prior works]

We have carefully reviewed both papers and will include them in the paper with a discussion of differences.

  1. Compare with GraphCON: GraphCON is based on second-order ODEs representing damped nonlinear oscillators. Its formulation is inspired by mechanical oscillators to simulate oscillatory dynamics. BRICK is explicitly informed by neural oscillation, which is first-order and phase-based, aiming to model functional coordination in brain dynamics and translate those principles into learnable graph architectures.

  2. Compare with KuramotoGNN: Their work demonstrates the use of synchronization dynamics for improving message passing in GNNs, thus mitigating over-smoothing. However, BRICK incorporates vector-valued oscillators and adaptive control mechanisms. We ground our model in neuroscientific applications and demonstrate its performance and interpretability on neural data, beyond generic graph benchmarks.

[1] Dan, T. et al. (2024). Exploring the enigma of neural dynamics through a scattering-transform mixer landscape for riemannian manifold.
[2] Fries, P. (2005). A mechanism for cognitive dynamics: Neuronal communication through neuronal coherence.
[3] Cabral, J.et al. (2011). Role of local network oscillations in resting-state functional connectivity.
[4] Yeo, B.T. et al. (2011). The organization of the human cerebral cortex estimated by intrinsic functional connectivity.

We are glad to provide further clarification or discuss in any format that would be helpful.

Comment

Dear authors,

Thank you for your rebuttal. I have carefully read your rebuttal and would like to raise few more points and questions.

[On W1 poor experimental setting]. First, I am still not convinced about the fairness and rigor of the experimental setting.

  • I agree that some baseline performance (e.g., GCN) seem reasonable.
  • on fairness: However, the authors used unreasonably detrimental hyperparameters for GCNII (specifically, a small number of layers), and thus, I do not consider it a fair comparison. If the authors want to focus on a specific hyperparameter space, I strongly encourage the authors to remove the baselines that were not intended to be used with those hyperparameters. Likewise, I think the updated experiment with the added baselines is misleading as well, as both GraphCon and Grand have been reported to perform best at layers deeper than the ones the authors used.
  • on the focus of the paper: While the authors claimed that their focus is on “neuroscience-related applications aligns with the Neuroscience and cognitive science track”, they reported more results on graph learning (2 tables and 1 figure; 12 benchmark datasets) than on neuroscience applications (1 table and 1 figure; 3 benchmark datasets). Moreover, in lines 314-315, the authors claim that their method “achieves SOTA performance on both heterophilic and homophilic graphs”. With such a heavy focus on graph learning, I must insist that the authors should not sidestep the issue I raised (W1 poor experimental setting) by claiming it is not the focus. Besides, the reported performance of the proposed method is far from SOTA. I encourage the authors to avoid using misleading claims.

[On W2 lack of method justification].

  • I understand that the authors designed their method by leveraging the Kuramoto model and neural oscillation, which understandably may improve “brain learning”.
  • However, the authors’ motivation goes beyond brain learning. Specifically, they aimed to “rethink graph learning through the lens of coupled neural oscillators” (line 58) and design “a new graph learning mechanism via neural oscillatory synchronization” (line 90).
  • Why is leveraging neural oscillation and the Kuramoto model important for “graph learning”? As I stated earlier, the only motivation that the authors provide is speculative (“each node acts as a coupled oscillator, evolving through interactions governed by the graph topology”; lines 78-79). In summary, the link between the proposed method’s design principles and graph learning is unclear and unjustified.
  • Thus, contrary to the authors’ claim, I am not convinced this method is important for graph learning.

In the future version of the paper, I encourage the authors to either (1) substantially reduce their focus on graph learning or (2) implement more rigorous and standard experimental settings accepted in graph learning domain, with a clear demonstration of the superiorities introduced by the proposed method over existing graph learning methods. In the current version, with such a heavy focus on graph learning and its poor justification/experiments, I cannot recommend acceptance.

Sincerely,

Comment

We sincerely thank the reviewer for the thoughtful and detailed comments regarding our experimental setup and the focus of the manuscript.

[On W1 poor experimental setting]

[On experimental fairness]

We fully agree that the choice of hyperparameters plays a critical role in determining the performance of these models, and we appreciate the opportunity to clarify our decisions and revisions.

To ensure faithful and fair comparisons, we closely followed the official implementations and default configurations provided by the original authors.

  • For all models whose original codebases include support for specific datasets along with recommended hyperparameters, we simply re-ran the code on these datasets and directly report the performance without any modification (for Citeseer, Pubmed, and Cora, we re-ran the code 5 times to obtain mean±std). These include:

    • GCNII: Texas, Wisconsin, Squirrel (not officially supported, but we followed the Chameleon setting as both are from WikipediaNetwork), Chameleon, Cornell, Citeseer, Pubmed, Cora
    • GRAND: Citeseer, Pubmed, Cora
    • GraphCON: Texas, Wisconsin, Cornell, Citeseer, Pubmed, Cora
    • KGNN: Texas, Wisconsin, Cornell, Citeseer, Pubmed, Cora

    (We mark results obtained using official hyperparameters in the table below: the GCNII* row reports GCNII re-run with its official configurations, shown in parentheses.)

  • For datasets that are not covered by the official implementation, we report results as described in our submission.

| Model | Texas | Wisconsin | Actor | Squirrel | Chameleon | Cornell | Citeseer | Pubmed | Cora |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GCN | 58.65±3.64 | 52.75±6.35 | 28.38±0.96 | 28.87±1.5 | 39.36±1.93 | 45.14±4.84 | 70.62±0.84 | 77.76±0.50 | 81.66±0.50 |
| GIN | 57.03±5.98 | 47.84±5.20 | 25.91±1.07 | 25.16±2.17 | 32.17±1.75 | 48.65±10.61 | 58.28±3.09 | 71.40±2.10 | 70.76±2.66 |
| GAT | 56.49±5.85 | 53.53±7.60 | 29.05±0.80 | 30.08±1.03 | 43.16±1.56 | 51.62±4.75 | 70.38±0.84 | 76.80±0.75 | 80.14±1.13 |
| GCNII | 74.86±6.29 | 73.73±4.66 | 33.80±1.41 | 38.74±1.20 | 59.14±3.49 | 74.86±6.29 | 73.14±0.23 | 80.12±0.36 | 84.34±0.23 |
| GCNII* | (76.76±6.30) | (81.37±4.58) | – | (41.07±1.33) | (62.28±2.68) | (76.76±6.07) | – | – | – |
| GraphSAGE | 78.92±5.51 | 79.61±7.55 | 34.88±1.19 | 36.90±1.03 | 48.03±2.22 | 70.81±3.15 | 70.44±0.23 | 76.36±0.29 | 79.58±0.37 |
| SAN | 73.24±7.40 | 78.63±6.17 | 32.94±0.92 | 37.00±1.28 | 51.80±1.95 | 69.46±6.05 | 66.30±0.88 | 74.68±0.81 | 77.98±1.31 |
| GRAND | 67.42±7.81 | 73.09±4.83 | 33.34±1.16 | 34.45±1.84 | 38.43±3.33 | 56.55±9.15 | 72.10±1.07 | 79.12±0.30 | 83.73±0.58 |
| GTN | 74.52±10.20 | 68.89±11.90 | 35.42±1.50 | 36.45±1.72 | 55.84±4.08 | 58.81±15.09 | 68.80±1.69 | 76.90±0.77 | 78.60±1.72 |
| GraphCON | 82.43±4.72 | 84.90±3.51 | 35.13±1.38 | 26.90±2.17 | 33.75±3.77 | 74.59±2.48 | 73.94±1.63 | 78.44±0.34 | 83.45±0.56 |
| KGNN | 71.75±5.57 | 69.43±6.18 | 31.50±0.82 | 35.02±1.67 | 38.50±2.78 | 61.16±9.16 | 72.06±1.53 | 67.24±0.36 | 76.57±1.40 |
| BIG-NOS | 81.35±3.51 | 82.16±3.56 | 35.80±0.60 | 68.06±1.65 | 74.50±1.13 | 73.24±4.75 | 70.63±0.36 | 77.80±0.32 | 81.86±0.25 |

We have confirmed that our implementation for GraphCon on Cornell uses the optimal hyperparameter setting: 'cornell': {'model': 'GraphCON_GCN','lr': 0.00721,'nhid': 256,'alpha': 0,'gamma': 0,'nlayers': 1,'dropout': 0.15,'weight_decay': 0.0012708787092020595,'res_version': 1}. However, we observed that the performance is lower than the result reported in the original paper. We remain open to discussion in case you have any insights or suggestions regarding this discrepancy. A similar discrepancy is also observed for KGNN, despite using the optimal configuration in the code, and we also observed that reproducibility issues have been reported by others on the official GitHub.

To offer a more balanced view, we have also computed the average accuracy across homophilic, heterophilic, and all datasets, as shown in the updated results. While the BIG-NOS model delivers strong performance on heterophilic datasets, we find that it remains competitive on homophilic datasets, supporting its general effectiveness.

| Model | GCN | GIN | GAT | GCNII | GCNII* | GraphSAGE | SAN | GRAND | GTN | GraphCON | KGNN | BIG-NOS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hete. Avg | 42.19 | 39.46 | 43.99 | 59.19 | 62.01 | 58.19 | 57.18 | 50.55 | 54.99 | 56.28 | 51.23 | 69.18 |
| Homo. Avg | 76.68 | 66.81 | 75.77 | 79.20 | 79.20 | 75.46 | 72.99 | 78.32 | 74.77 | 78.61 | 71.96 | 76.76 |
| Total Avg | 53.69 | 48.58 | 54.58 | 65.86 | 67.74 | 63.95 | 62.45 | 59.80 | 61.58 | 63.73 | 58.14 | 71.70 |

All of our re-implemented models will be released in our codebase to ensure reproducibility and to foster transparent discussion around fairness in benchmarking. We hope this clarifies our intentions and reinforces the fairness and integrity of our comparisons. We remain open to further suggestions and discussions from the community.

Comment

[On the focus and scope of the manuscript]

To ensure transparency and avoid any potentially misleading claims, we will revise the manuscript to remove generalized references to "state-of-the-art" (SOTA) performance. Instead, we now characterize our findings as "promising," with a focus on demonstrating the feasibility and potential of our approach.

That said, we would like to clarify that on heterophilic benchmarks such as Actor, Squirrel, and Chameleon, our method achieves the best performance among all compared approaches. In the revised manuscript, we will retain precise statements reflecting these leading results, while taking care to highlight the exploratory and proof-of-concept nature of our work.

Comment

[On W2 lack of method justification]

We sincerely thank the reviewer for raising this important and thought-provoking question regarding the justification of our design principles in the context of graph learning. We agree that clarifying the conceptual link between neural oscillations, the Kuramoto model, and graph learning is essential for strengthening the manuscript. Your comments have provided valuable guidance, and we are grateful for the opportunity to improve both the clarity and the broader impact of our work.

  • Motivation and Rationale:

The core motivation for leveraging neural oscillation, specifically, the Kuramoto model, in graph learning is to introduce a dynamic, interaction-based mechanism for information propagation and integration on graphs. In the brain, phase synchronization among coupled oscillators is thought to underpin robust, flexible communication between distributed regions. Following this notion, treating nodes as oscillators in graph learning allows us to model information exchange as a process of achieving local/global synchrony, naturally capturing both local structure and long-range dependencies in the graph.

Conventional GNNs, based on message passing, can struggle with issues like over-smoothing and limited expressive power. By modeling node interactions as synchronization dynamics, we provide an alternative mechanism for feature propagation and aggregation, potentially leading to richer, more interpretable, and dynamically adaptive node/graph representations. Recent literature in network science and computational neuroscience suggests that synchronization-based mechanisms can capture critical aspects of network function, motivating their exploration within machine learning on graphs.
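To make this intuition concrete, below is a minimal, self-contained simulation sketch of classical Kuramoto dynamics on a small graph (illustrative only and separate from our model; the ring topology, coupling strength `K`, step size, and frequency spread are arbitrary choices). The order parameter approaching 1 is the signature of the phase synchrony discussed above:

```python
import numpy as np

def kuramoto_step(theta, omega, W, K, dt):
    # One Euler step of d theta_i/dt = omega_i + K * sum_j W_ij * sin(theta_j - theta_i)
    phase_diff = theta[None, :] - theta[:, None]      # entry (i, j) holds theta_j - theta_i
    coupling = (W * np.sin(phase_diff)).sum(axis=1)   # per-node aggregated sinusoidal coupling
    return theta + dt * (omega + K * coupling)

rng = np.random.default_rng(0)
N = 10
W = np.zeros((N, N))                                  # illustrative ring topology
for i in range(N):
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 1.0

theta = rng.uniform(0.0, 2.0 * np.pi, N)              # random initial phases
omega = rng.normal(0.0, 0.1, N)                       # heterogeneous natural frequencies

for t in range(2001):
    if t % 500 == 0:
        r = np.abs(np.exp(1j * theta).mean())         # Kuramoto order parameter in [0, 1]
        print(f"step {t:4d}: order parameter r = {r:.3f}")
    theta = kuramoto_step(theta, omega, W, K=1.0, dt=0.01)
```

Starting from incoherent phases, `r` rises toward 1 as the coupled nodes lock their phases; this synchronization-as-communication behavior is the mechanism we import into graph learning.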

  • Proof-of-Concept Scope:

We position this submission as a proof-of-concept study, intended to demonstrate the feasibility and promise of incorporating biologically inspired principles, specifically phase synchronization, into graph-based learning models. The larger results table is intended to provide a more comprehensive empirical view of graph learning and to more thoroughly validate the core concept and its generalizability across domains.

  • Revision Action:

In light of your comments, we will revise the manuscript to clarify these points and to avoid any misleading statements. We now more explicitly emphasize the exploratory nature of our work.

If any part remains unclear or if the reviewer has further questions, we are more than happy to provide additional clarifications, explanations, or engage in further discussion in any form that may help.

Comment

Dear Reviewer,

We would like to sincerely thank you for the time, effort, and expertise you devoted to reviewing our submission. Your thoughtful feedback, particularly regarding the experimental settings, has helped us significantly improve the quality of our work. We are truly grateful for the opportunity to improve our work through this revision process.

We hope that our latest revision has adequately addressed your concerns. If so, we would be most grateful if you would consider updating your score accordingly.

Of course, if you have any remaining concerns or further suggestions, please do not hesitate to let us know, we would be more than happy to address them to the best of our ability.

With sincere appreciation,
23570 Authors

Comment

Dear authors,

I appreciate the clarification. I also appreciate the authors' attempts to address my concerns.

I agree that the current empirical results on graph learning are promising. However, I generally do not consider 'promising results' to be good enough for a top-tier publication, especially with only thin experiments on brain data. Can the authors convince me why graph learning might need the proposed method? I would prefer a formal or empirical demonstration, but informal and intuitive explanations are welcome, too. The earlier explanations that the authors provided are unconvincing. For instance, the authors mentioned oversmoothing and limited expressivity of GNNs, but no evidence is provided that the proposed method may address those limitations. The authors also mentioned that the proposed method may capture 'both local structure and long-range dependencies', but there are countless GNNs that can achieve it.

I am willing to raise my score, only if I end up agreeing that the proposed method may add value to the graph learning literature, despite its mediocre node classification performance.

Best,

Comment

We thank the reviewer for the constructive feedback and for considering a score adjustment based on the conceptual value of our work. Below, we provide a more focused response to your concerns.

1. However, I generally do not consider 'promising results' to be good enough for a top-tier publication, especially with only thin experiments on brain data.

We appreciate the opportunity to clarify the breadth and depth of our experimental validation on brain data.

In the current submission, we included three brain task datasets (HCP-A, HCP-YA, and HCP-WM) where we quantitatively demonstrate our model’s performance (Table 2), along with visualizations of synchronized neural dynamics in feature space (Figure 3a and 3b). We further explored the model’s ability to uncover biologically meaningful structures without supervision for functional parcellation (Figure 3c). Together, these results aim to show both quantitative and qualitative merits of our method in brain applications.

Beyond the main manuscript, we have in fact conducted more extensive evaluations on neurological disease datasets, which were omitted due to space constraints since the main text focused on task-evoked fMRI datasets. We now summarize the results for three disease-related datasets (ADNI, PPMI, NIFD) to provide a broader perspective:

  • Alzheimer’s Disease Neuroimaging Initiative (ADNI): includes 135 resting-state fMRI samples from subjects diagnosed with Alzheimer’s disease (AD) or cognitively normal (CN) controls.

  • Parkinson’s Progression Markers Initiative (PPMI): includes 173 samples spanning Parkinson’s disease (PD), SWEDD (scans without evidence of dopaminergic deficit), prodromal, and healthy controls.

  • Neuroimaging Initiative for Frontotemporal Lobar Degeneration (NIFD): includes 1010 samples across a spectrum of frontotemporal dementia subtypes, including the logopenic variant of primary progressive aphasia (LPA), the behavioral variant (BV), progressive non-fluent aphasia (PNFA), the semantic variant (SV), and cognitively normal controls (CON).

| Model | ADNI (Acc) | PPMI (Acc) | NIFD (Acc) |
| --- | --- | --- | --- |
| GCN | 81.48 ± 7.77 | 57.14 ± 6.72 | 48.81 ± 1.55 |
| GIN | 79.26 ± 6.46 | 62.39 ± 5.71 | 49.90 ± 1.97 |
| GAT | 81.48 ± 7.87 | 58.96 ± 2.80 | 48.91 ± 2.06 |
| GCNII | 81.48 ± 7.77 | 59.46 ± 7.88 | 49.21 ± 1.70 |
| GraphSAGE | 82.22 ± 6.37 | 61.83 ± 3.50 | 49.21 ± 1.49 |
| SAN | 82.96 ± 3.78 | 62.99 ± 7.16 | 49.91 ± 1.99 |
| GRAND | 81.48 ± 6.20 | 62.41 ± 7.14 | 43.46 ± 2.06 |
| GraphCON | 82.96 ± 5.54 | 60.12 ± 2.04 | 48.31 ± 1.35 |
| BRICK | 83.05 ± 8.28 | 63.12 ± 4.59* | 82.27 ± 6.42* |

On all three datasets, our model maintains strong classification performance. In particular, the improvements on PPMI and NIFD are statistically significant (*, $p < 0.01$, paired t-test), confirming the robustness of the observed gains. This highlights the model’s potential in handling complex, heterogeneous brain disorders.

We believe this is because BRICK captures abnormal coordination patterns between brain regions, which is often a hallmark of neurodegeneration. The latent phase representation enables the model to detect subtle disruptions in functional connectivity that might be missed by conventional GNNs.

We hope this addresses the reviewer’s concern by demonstrating that BRICK is consistently effective across a wide spectrum of brain datasets.

Comment

2. Can the authors convince me why graph learning might need the proposed method? I would prefer a formal or empirical demonstration, but informal and intuitive explanations are welcome, too.

Below, we provide both empirical evidence and intuitive explanations to support the motivation and usefulness of our model.

[Strong performance on structurally heterophilic graphs and balanced performance across all the datasets]

As shown in the updated table above, our model achieves the best performance on the structurally heterophilic datasets Squirrel and Chameleon. In the table summarizing average performance below, our model demonstrates consistently good results across both homophilic and heterophilic graphs, without major weaknesses.

| Model | GCN | GIN | GAT | GCNII | GCNII* | GraphSAGE | SAN | GRAND | GTN | GraphCON | KGNN | BIG-NOS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hete. Avg | 42.19 | 39.46 | 43.99 | 59.19 | 62.01 | 58.19 | 57.18 | 50.55 | 54.99 | 56.28 | 51.23 | 69.18 |
| Homo. Avg | 76.68 | 66.81 | 75.77 | 79.20 | 79.20 | 75.46 | 72.99 | 78.32 | 74.77 | 78.61 | 71.96 | 76.76 |
| Total Avg | 53.69 | 48.58 | 54.58 | 65.86 | 67.74 | 63.95 | 62.45 | 59.80 | 61.58 | 63.73 | 58.14 | 71.70 |

This balance is an encouraging signal: while many existing models tend to specialize in one regime (e.g., GCN performs well on homophilic graphs), our model generalizes well to both homophilic and heterophilic graphs. We believe this robustness suggests that the underlying mechanism, neural oscillation, is more generalizable and structure-agnostic.

Comment

[Resistance to over-smoothing: empirical and intuitive justifications]

In Figure 4 (we also provide the table below), we empirically demonstrate that BIG-NOS remains stable up to 128 layers with negligible degradation in performance, indicating strong resistance to oversmoothing.

| Layers | 4 | 8 | 16 | 32 | 64 | 128 |
| --- | --- | --- | --- | --- | --- | --- |
| Acc (%) | 81.0 | 80.4 | 81.4 | 81.9 | 81.9 | 81.9 |
| Pre (%) | 81.95 | 81.59 | 82.45 | 82.80 | 82.82 | 82.82 |
| F1 (%) | 81.10 | 80.50 | 81.56 | 82.04 | 82.04 | 82.05 |

Intuitively, this robustness against oversmoothing stems from the oscillatory synchronization mechanism, which enables global coherence to emerge without collapsing all node features. This stands in contrast to traditional diffusive message passing, which tends to homogenize features as the network depth increases.
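As a point of reference, the toy sketch below (a synthetic random graph, illustrative only and unrelated to our benchmarks) reproduces the collapse that plain diffusive propagation suffers with depth: the variance of node features shrinks toward zero as layers are stacked, which is precisely the degradation that the layer-depth results above show BIG-NOS avoids:

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 50, 8
A = (rng.random((N, N)) < 0.1).astype(float)
A = np.maximum(A, A.T)                          # symmetric adjacency
np.fill_diagonal(A, 1.0)                        # self-loops, GCN-style
W = A / A.sum(axis=1, keepdims=True)            # row-normalized propagation matrix

X = rng.normal(size=(N, d))                     # random node features
for depth in [4, 8, 16, 32, 64, 128]:
    H = np.linalg.matrix_power(W, depth) @ X
    # Variance across nodes -> 0 means all node representations have collapsed together.
    print(f"{depth:3d} diffusion layers: node-feature variance = {H.var(axis=0).mean():.2e}")
```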

Then, we provide a heuristic analysis illustrating how the coupling dynamics and feedback control in our model jointly mitigate excessive feature smoothing, by examining its simplified dynamics, steady-state solution, and spectral response.

Simplified BRICK Dynamics. To derive an interpretable steady-state solution and better understand the behavior of our model, we consider a linearized simplification of the BRICK dynamics. We start with the original formulation:

$$\frac{d\hat{x}_i}{dt} = \omega_i + \gamma\,\phi_i\Big( y_i + \sum_{j=1}^{N} w_{ij}\, \hat{x}_j \Big)$$

To simplify the analysis, we make two assumptions:

  1. Constant natural frequency across all nodes: $\omega_i = \omega$, which can be absorbed into a baseline or ignored under steady-state assumptions.
  2. Linearization of the nonlinear projection: we approximate the nonlinear function $\phi_i(\cdot)$ with the identity, i.e., $\phi_i(z) \approx z$. This allows us to isolate the effect of the network coupling and control terms.

The resulting dynamics become:

$$\frac{d\hat{x}_i}{dt} = \gamma\Big( y_i + \sum_{j=1}^{N} w_{ij}\, \hat{x}_j \Big)$$

For stability, we introduce a dissipative force to prevent unbounded growth, leading to:

$$\frac{d\hat{x}_i}{dt} = -\hat{x}_i + \gamma\Big( y_i + \sum_{j=1}^{N} w_{ij}\, \hat{x}_j \Big)$$

In matrix form, this can be compactly written as:

$$\frac{d\hat{\mathbf{x}}}{dt} = -\hat{\mathbf{x}} + \gamma W \hat{\mathbf{x}} + \gamma \mathbf{y}$$

This is a linear consensus-like system, where the $-\hat{\mathbf{x}}$ term acts as a stabilizing decay and the $\gamma W \hat{\mathbf{x}} + \gamma \mathbf{y}$ term models network feedback and task-driven control.

Equilibrium Solution. At steady state $\left(\frac{d\hat{\mathbf{x}}}{dt} = 0\right)$, we obtain:

$$\hat{\mathbf{x}}^* = \gamma (I - \gamma W)^{-1} \mathbf{y}$$

The inverse exists provided the spectral radius satisfies $\rho(\gamma W) < 1$ (i.e., $|\gamma \lambda_k| < 1$ for all $k$), where $\rho(\cdot)$ denotes the spectral radius, i.e., the largest absolute eigenvalue. This condition is typically satisfied in practice when $W$ is normalized.
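As a quick numerical sanity check (a standalone sketch; the random graph, $\gamma = 0.5$, and integration horizon are arbitrary choices satisfying $\rho(\gamma W) < 1$), integrating the linearized dynamics forward converges to the closed-form fixed point:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20
A = (rng.random((N, N)) < 0.2).astype(float)
A = np.maximum(A, A.T)                                  # symmetric adjacency
W = A / A.sum(axis=1, keepdims=True).clip(1.0)          # row-normalized, so rho(W) <= 1
y = rng.normal(size=N)
gamma = 0.5                                             # ensures rho(gamma * W) < 1

x, dt = np.zeros(N), 0.01                               # forward-Euler integration of
for _ in range(20000):                                  # dx/dt = -x + gamma * (W x + y)
    x = x + dt * (-x + gamma * (W @ x + y))

x_star = gamma * np.linalg.solve(np.eye(N) - gamma * W, y)   # closed-form equilibrium
print("max |x - x*| =", np.abs(x - x_star).max())            # near machine precision
```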

Spectral Interpretation. Let $W$ be symmetric and decomposed as $W = U \Lambda U^\top$, where $U$ is the orthonormal eigenvector matrix and $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. Projecting into the spectral domain (when $\gamma = 1$):

$$\hat{\mathbf{x}}^* = U (I - \Lambda)^{-1} U^\top \mathbf{y}$$

Let $\tilde{\mathbf{y}} = U^\top \mathbf{y}$ and $\hat{\tilde{\mathbf{x}}}^* = U^\top \hat{\mathbf{x}}^*$; then:

$$\hat{\tilde{x}}_k^* = \frac{1}{1 - \lambda_k}\, \tilde{y}_k$$

Thus, each spectral mode is scaled by the transfer function $h(\lambda_k) = \frac{1}{1 - \lambda_k}$.

Comparison with Diffusion-based GNNs. In standard diffusion-type GNNs (e.g., GCN), applying $M$ layers is equivalent to using a transfer function $e^{-M \lambda_k}$ in spectral space. This leads to exponential suppression of high-frequency signals (large $\lambda_k$), causing oversmoothing. In contrast, BRICK uses a transfer function that decays much more slowly, $h(\lambda_k) = \frac{1}{1 - \lambda_k}$, which corresponds to $1/\lambda$-level suppression and allows high-frequency, discriminative signals to persist.
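To put the two responses on a single axis, the short sketch below (illustrative numbers only) rewrites both in terms of the normalized-Laplacian frequency $\mu = 1 - \lambda$, so that large $\mu$ corresponds to high-frequency signals, and tabulates the suppression each mechanism applies:

```python
import numpy as np

mu = np.array([0.1, 0.5, 1.0, 1.5, 1.9])   # Laplacian frequencies mu = 1 - lambda
M = 16                                      # depth of a diffusion-type GNN

h_brick = 1.0 / mu                          # BRICK steady state: 1/(1 - lambda) = 1/mu
h_diffusion = np.exp(-M * mu)               # M diffusion layers: exp(-M * mu)

for m, hb, hd in zip(mu, h_brick, h_diffusion):
    print(f"mu = {m:.1f}   BRICK 1/mu = {hb:6.2f}   diffusion exp(-M*mu) = {hd:.2e}")
```

Even the highest-frequency mode retains roughly half its weight under the $1/\mu$ response, while 16 diffusion layers suppress it by about fourteen orders of magnitude.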

Comment

[Conceptual novelty and broader potential impact]

BRICK is a proof-of-concept model that integrates neural oscillator theory into brain network analysis and graph learning. We are motivated by two goals:

  • Explore new learning paradigms inspired by biological synchronization
  • Enhance interpretability in both neuroscience and general graph domains

On brain datasets, BRICK already achieves strong predictive performance and insightful phase-based visualizations. On graph benchmarks, it delivers promising results with seamless adaptation. We believe this success can inspire future work on leveraging neural resonance and dynamic coordination mechanisms to design more interpretable and robust GNNs.

In summary, our model contributes to brain network analysis by providing a biologically grounded framework for modeling neural dynamics, and to graph learning by introducing a new perspective on addressing challenges such as heterophily and oversmoothing. We hope this convinces the reviewer of its relevance and potential.

We welcome any further questions and will do our best to respond with additional details or clarifications as needed.

Comment

Dear authors,

I appreciate the thorough response. I have carefully read it, and many of my concerns have been adequately addressed. However, I am slightly concerned about the NIFD dataset result. The performance gap between the proposed method and baseline GNNs (~50%) seems unreasonably high to me. For comparison, I searched for papers that used GNNs on the same dataset, but I couldn't find any papers. For the camera-ready version, please make sure that the reported results are fair and reproducible.

Considering all factors, I decide to raise my score.

Best,

Comment

We sincerely thank the reviewer for the constructive feedback and, in particular, for the willingness to raise the score. We deeply respect the reviewer’s rigorous reading and thoughtful concerns, and we truly appreciate the time and effort invested in evaluating our work.

Regarding the NIFD dataset performance: we understand that the improvement over standard GNN baselines may appear large. We have already noticed this gap and have carefully re-examined our experiments and confirmed that the reported results are accurate and reproducible. In fact, we observe similar behavior on another task-based dataset (HCP-WM), where BRICK also outperforms GNN baselines by a notable margin. Below, we provide a more detailed explanation for this phenomenon:

  • Topology Dependency of GNNs: In all brain experiments, we use BOLD signals as node features and structural connectivity (SC) as the graph topology. In the HCP-WM dataset, different tasks from the same subject share the same SC. Since GNNs heavily rely on the topology structure for message passing, they tend to weaken distinctions between different cognitive states of the same subject, leading to reduced performance [1].

  • Neural Oscillation vs. Static Propagation: Real neural coordination is not solely governed by static, instantaneous topology, but rather by temporal dynamics, such as synchronization and resonance [2, 3]. Standard GNNs are not designed to capture such time-evolving processes, while BRICK explicitly models oscillatory dynamics with feedback control. This enables it to simulate stable coordination across time and regions, particularly critical for tasks involving sustained activity like working memory [4].

  • Functional Network-Level Coordination: Working memory engages multiple regions in sustained co-activation (e.g., the prefrontal, parietal, and temporal lobes). In NIFD, structural heterogeneity is strong across a wide spectrum of frontotemporal dementia subtypes:

    • LPA: affects the left temporoparietal junction
    • bvFTD: involves frontal cortex, insula, and amygdala
    • PNFA: involves the inferior frontal gyrus and anterior insula
    • SV: mainly atrophies in temporal poles
    • CON: healthy controls

    These subtypes often involve broad disruptions in the frontotemporal network, which are not only spatially localized but also involve long-range connections [5, 6]. GNNs struggle to capture such patterns due to reliance on local aggregation. In contrast, BRICK models global coupling through synchronization, allowing it to dynamically detect these distributed, non-local disruptions, thus explaining its better performance on NIFD.

    Additionally, in smaller datasets like ADNI (135 samples) and PPMI (173 samples), all models are limited by sample size, making it harder to learn complex neural dynamics. In NIFD (1010 samples), however, the larger dataset enables BRICK to fully leverage its dynamic synchronization to learn more robust and generalizable patterns. In contrast, GNNs suffer from the expressivity limitations mentioned above and fail to use the additional data effectively.

We hope this addresses the reviewer’s concern and reassures that the reported improvements are both fair and grounded in meaningful neurophysiological mechanisms. We are committed to ensuring reproducibility of the results and will incorporate all the results and discussions in the final version.

[1] Misra, Joyneel, et al. "Learning brain dynamics for decoding and predicting individual differences." PLOS Computational Biology 17.9 (2021): e1008943.
[2] Buzsáki, György. Rhythms of the Brain. Oxford University Press, 2006.
[3] Fries, Pascal. "Rhythms for cognition: communication through coherence." Neuron 88.1 (2015): 220–235.
[4] Funahashi, Shintaro. "Prefrontal cortex and working memory processes." Neuroscience 139.1 (2006): 251–261.
[5] Seeley, William W., et al. "Neurodegenerative diseases target large-scale human brain networks." Neuron 62.1 (2009): 42–52.
[6] Zhou, Juan, et al. "Divergent network connectivity changes in behavioural variant frontotemporal dementia and Alzheimer’s disease." Brain 133.5 (2010): 1352–1367.

Once again, thank you for your thoughtful evaluation and support.

Final Decision

The paper integrates a synchronization mechanism of neural oscillations with a graph representation learning framework. Motivated by addressing the oversmoothing issue of conventional GNNs, it proposed to utilize the brain rhythms in the artificial dynamical system. This method integrates a Kuramoto model with attending memory for modeling oscillatory synchronization in brain regions. The authors argue that the proposed approaches represent a new graph learning mechanism with SOTA performance.

Strengths identified by the reviewers include:

  1. The approach seems novel. However, as noted by a reviewer, there is prior work integrating neural oscillations with graph neural networks that was not cited in the original manuscript.

  2. Experiments on human brain datasets were appreciated by most reviewers. One reviewer however expressed some concerns which were addressed during rebuttal.

  3. Good visualizations

Weaknesses identified include:

  1. Important related prior work on neural oscillators not cited. The authors indicate in their rebuttal that they will cite missing work

  2. A reviewer expressed concerns about the fairness and rigor of experimental setting, especially with respect to hyperparameters chosen.

  3. More information needs to be reported on the computational cost of the proposed methods compared to existing methods

The authors did a good job addressing the reviewer concerns, which led reviewers to update their scores. In particular, the authors provided a better theoretical justification of their work and a better justification of its biological plausibility, which, to my understanding, will be included in the final paper.

Overall the paper received 3 Accept ratings and 1 Borderline accept. Its score was significantly higher than the minimum threshold needed for acceptance so I recommend acceptance. However one of the reviewers did raise some valid concerns. The authors indicated in their rebuttal that they would address those concerns in their revised manuscript.