EvoBrain: Dynamic Multi-Channel EEG Graph Modeling for Time-Evolving Brain Networks
EvoBrain, a theoretically grounded and efficient dynamic GNN that models temporal and evolving spatial EEG patterns for seizure analysis.
Abstract
Reviews and Discussion
This paper tackles the issue that existing so-called dynamic GNN models are actually built on temporally static graphs, thus failing to reflect the evolving nature of brain connectivity, specifically in the applied field of seizure prediction. The work is situated within a broader theoretical framework in which the new model, named EvoBrain, is defined as a "time-then-graph" approach. The model is applied to seizure detection and integrates a two-stream Mamba architecture with a GCN enhanced by Laplacian Positional Encoding, significantly improving metrics such as AUROC and F1-score across different baselines and task complexities.
Strengths and Weaknesses
Strengths
Overall, I find this paper to be well organised, with its contributions clearly framed within the broader literature and showing potential impact on clinical seizure prediction. Although I do not work with seizure applications or with dynamic graph modelling (and thus cannot fully validate some of the novelty claims), I still believe the paper stands out in several key areas, namely: (1) good results in seizure prediction at 60s before seizure onset, as opposed to the 20s supposedly more common in previous literature, (2) superior and consistent results demonstrated in Table 1, and (3) the use of evolving dynamic graph structures over time, which the authors contrast with the static graphs adopted in earlier studies. These strengths are further reinforced by a theoretical framework analysing the different parts of the method and the method's motivation.
In my view, the novelty of this work comes from a novel combination of existing approaches to address previously known limitations, resulting in what appears to be a very successful model. The ability to predict seizures up to a minute in advance is not only clinically meaningful; the solution is also computationally efficient and supported by solid theoretical grounding.
Based on the significance of these strengths, I'm leaning towards acceptance of this work; however, I'm only scoring it as borderline accept for now, mostly owing to some points I need to clarify during the rebuttal period, as well as some weaknesses that I identify below.
Weaknesses
I have three main concerns about the experimental settings and problem definition.
Firstly, I have the impression that the authors are mixing together two different claims: one is the dynamic graph creation at each timestep, and the other is the indication that the "time-then-graph" method is the most expressive representation. This taxonomy, seemingly introduced by Gao and Ribeiro (2022), is new to me, but from section 2.2 it seems that all three representations could be used either with static graphs generated beforehand or with a new graph created at each timestep. In this light, I believe the experimental claim that time-then-graph is better than the others was not fairly evaluated, because the method also included dynamic graph creation on top of the time-then-graph framework, mixing the two claims. A fairer comparison would thus include dynamic graph creation in the baselines.
Secondly, no traditional ML model was used as a baseline. The authors seem to claim that this work could be useful for clinicians, but an important step towards that goal is to compare against what clinicians currently use, which presumably involves little deep learning. Baselines with a proper hyperparameter search, for instance Random Forests, XGBoost, or SVMs, would be important to report.
Finally, and this is not as important as my previous two weaknesses given the final results, the paper only uses a GCN as "the" graph neural network model. I'm sure a straightforward GCN helps deliver the fast runtimes claimed in the paper, but with so many GNN models currently in the literature, it would be interesting to understand how a couple of other GNNs would influence performance, and at what time/parameter cost (and, as I mention in Questions, it would also be important to ablate this specific step).
Other small suggestions:
- Typo in abstract: "GMN"
- I'm not sure the use of "nascent" in the abstract is clear. What does that mean, "recent"? How does that fit with the interactions claim?
- In section 4.1, it would be good to specify how the threshold to rank the correlations is defined, and maybe ablated.
- In section 4.1, it is not clear how the creation of the sparse graphs prevents information redundancy.
- In Figure 4, one would expect some measure of variation when comparing running times across different models, instead of showing only the averaged values.
Questions
As I mentioned, I am leaning towards acceptance of this work given the strengths of the paper; however, given the weaknesses I have identified, I can only recommend borderline accept at the moment. I will be happy to increase my scores if the authors satisfactorily tackle the weaknesses I have identified, as well as help me better understand some of the claims of the paper, which I leave in the following bullet points for easier identification:
- Shouldn't the application of the GCN layer at the end of the model also be ablated? As the paper is written, the two-stream Mamba handling of the different graphs at different snapshots seems able to generate a very rich spatial model of the evolution of the graph structures over time. In this sense, wouldn't it make sense to directly apply a fully connected layer on this spatial representation?
- To better evaluate the claimed novelty of the experiments, could the authors be more specific about the challenging task of seizure prediction at 60s before onset? Is it that they are the first ones using this timestep, or just that most works only use the 20s timestep?
- The dataset seems particularly imbalanced, which I'm sure is common in this area. Did the authors use techniques to account for this, for example a weighted loss function? This tends to help in these situations, but I'm not sure the authors considered it. In this sense, AUROC and F1 are definitely good choices, but I believe the paper would benefit from more fine-grained metrics, like sensitivity and specificity, in the appendix.
Limitations
The checklist says that limitations as well as potential negative societal impacts of the work were discussed, but I don't see them anywhere. Could the authors please clarify where these points are located?
Justification for Final Rating
The rebuttal period was extremely useful and I thank the authors for their availability to answer my questions. As a result I have increased the score to Accept, and indeed I think this is a very interesting work!
Formatting Issues
None.
We appreciate Reviewer w1Ha for recognizing the paper's organization, clear contributions, experiments, and solid theoretical grounding. We appreciate your time and constructive comments. We hope our clarifications below can address your concerns and raise your rating of the paper.
W1: Two different claims.
As you correctly pointed out, our contributions include two distinct components: (1) a dynamic graph creation and (2) a time-then-graph model. All three model variants (graph-then-time, time-and-graph, and time-then-graph) are compatible with either static or dynamic graph inputs.
To ensure a fair comparison, we applied our dynamic graph structure to all GNN baselines across the three model variants, as shown in Figure 3. Our time-then-graph model consistently outperforms the others even when all models use either the static or our dynamic graph structure. Moreover, applying our dynamic graph to baseline architectures also improves their performance.
W2: No traditional ML model was used.
We have conducted experiments using Random Forests and SVM on the 12-second seizure prediction task. The AUROC scores were 0.778 for Random Forests and 0.765 for SVM.
We will include all results of these traditional baselines in the final version.
W3: Other GNN models.
Thank you for your comments.
We conducted additional experiments replacing the GCN layers with a more expressive Graph Isomorphism Network (GIN). The GCN layers contain 24,441 parameters, whereas the GIN layers increase the parameter count to 39,016.
In the 12-second seizure detection task, GIN achieved an AUROC of 87.2, which is slightly lower than the 87.7 obtained by the GCN. The excessive model complexity may overfit to transient or noisy patterns rather than capturing stable dynamics.
Based on our experiments, we observed that how different components are integrated (i.e., time-then-graph) is more critical for effective dynamic EEG modeling than the complexity of individual modules.
We will include these additional findings in the final version of the paper.
S1: Typo.
Thank you for your suggestion.
S2: Nascent.
Thank you for your thoughtful comment regarding the use of the term "nascent". Our intention was to emphasize that while there have been emerging efforts to model temporal signals and graph structures jointly, the modeling of their interactions remains at an early and underdeveloped stage. These methods are often exploratory and lack consistent theoretical or empirical grounding, leading to unstable performance across tasks.
S3: Threshold to rank the correlations.
In our experiment, the value of τ (number of top neighbors) is set to 3 following prior work on EEG functional connectivity modeling. We additionally conducted experiments with τ = 5 and τ = 7, and observed less than 1% variation in performance, suggesting that most connections beyond the top-3 are not informative.
We will include these results and relevant discussion in the final version.
S4: Sparse graphs.
Thank you for the discussion. By retaining only the top-τ most correlated neighbors for each node, we eliminate weak or noisy correlations that may not reflect meaningful functional connectivity. This sparsification reduces redundant or task-irrelevant information and improves both interpretability and performance.
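The top-τ sparsification described above can be sketched in a few lines. This is a minimal illustration in plain NumPy (variable names and the 19-channel/250 Hz setup are hypothetical, not the actual EvoBrain implementation): for each snapshot, absolute Pearson correlations are computed between channels and only the τ strongest neighbors per node are kept, yielding one sparse graph per snapshot.

```python
import numpy as np

def dynamic_topk_graph(snapshot: np.ndarray, tau: int = 3) -> np.ndarray:
    """Build a sparse adjacency matrix for one EEG snapshot.

    snapshot: (n_channels, n_samples) signal segment.
    Keeps only the top-tau most correlated neighbors per node.
    """
    corr = np.abs(np.corrcoef(snapshot))  # (n, n) correlation strengths
    np.fill_diagonal(corr, 0.0)           # drop self-correlations
    adj = np.zeros_like(corr)
    for i in range(corr.shape[0]):
        top = np.argsort(corr[i])[-tau:]  # indices of the top-tau neighbors
        adj[i, top] = corr[i, top]        # keep only those weights
    return np.maximum(adj, adj.T)         # symmetrize for an undirected graph

# One graph per snapshot gives an explicitly dynamic graph sequence.
rng = np.random.default_rng(0)
snapshots = [rng.standard_normal((19, 250)) for _ in range(5)]  # e.g. 19 channels, 1 s at 250 Hz
graphs = [dynamic_topk_graph(s) for s in snapshots]
```

Because the pruning is per-node before symmetrization, each node retains at least τ weighted neighbors while weak, potentially noisy correlations are zeroed out.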
S5: Variation in running times.
Thank you for your comments, and we will report the variation across multiple runs in the final version.
Q1: Ablation of GCN.
While the two-stream Mamba effectively captures the temporal dynamics of node activity and edge strength independently, it lacks mechanisms to jointly model the interactions between node representations and their evolving connectivity patterns. Given that seizures are a network-level phenomenon, it is essential to capture not only which nodes are active, but also how their activity propagates through dynamic, task-relevant connections. The GCN layer explicitly fuses node features with their neighboring context, enabling the model to learn spatiotemporal patterns such as synchronized activation or pathological propagation across brain regions.
From our theoretical perspective of time-then-graph modeling, GCN complements the two-stream backbone by capturing cross-modal interactions between node activity and edge-defined structure.
We conducted an ablation study, removing the GCN layer and concatenating node and edge features, then applying a linear classifier. This setup yielded an AUROC of 84.7 on the 12-second seizure detection task, which is notably lower than the 87.7 achieved with the GCN layer.
We will include this ablation along with the GIN comparison in the final version.
Q2: Seizure prediction.
Thank you for the constructive comments.
Compared to seizure detection, the prediction task holds greater clinical value, as it enables early intervention by clinicians.
Seizure prediction is inherently more difficult than detection, as it requires identifying subtle pre-ictal patterns rather than overt seizure activity. Longer prediction windows (such as 60s vs. 20s) contain less seizure-indicative information, making the task even more challenging and demanding greater model capacity.
To the best of our knowledge, this is the first work to conduct seizure prediction on the large-scale TUSZ dataset, which involves a substantial number of patients and supports long prediction windows up to 60 seconds before seizure onset.
This setting is more challenging than those considered in most prior studies, which typically focus on shorter prediction windows.
We will address these points in the experimental setup section of the revision.
Q3: Imbalanced dataset.
Thank you for your suggestion.
To ensure a fair evaluation, we did not apply techniques such as weighted loss functions. Instead, we focused on metrics that are balanced and robust under class imbalance: F1 score and AUROC.
We will include sensitivity and specificity in the appendix in the final version.
Limitations and negative social impacts.
Thank you for pointing out this important aspect. We will explicitly state these concerns as part of the limitations and broader societal impacts in the final version. Below is our intended discussion:
Seizure prediction is harder because pre-ictal patterns are typically weaker and more spatially diffuse. While EvoBrain achieves the best performance among lightweight GNN-based models, LaBraM benefits from having approximately 30 times more parameters and from pretraining on multiple large-scale EEG corpora, which enhances its ability to generalize under limited and noisy pre-seizure data conditions. While we focus our evaluation on the seizure task, which is particularly critical and life-threatening among EEG applications, generalization to other tasks remains a limitation of our work.
Regarding potential negative societal impacts, we recognize key risks such as bias and system malfunction. Specifically, models trained on EEG data from specific demographic groups may exhibit biased performance when applied to broader populations, potentially leading to unequal diagnostic accuracy. Additionally, miscalibrated early seizure prediction could lead to false alarms, causing unnecessary interventions or patient distress. These challenges highlight the importance of incorporating fairness assessments, demographic audits, and human-in-the-loop strategies in future development and deployment stages.
I thank the authors for their detailed, point-by-point answer. Overall, all my points (except two) were well tackled, and I believe I'll be changing my score after the rebuttal, unless something very obvious comes from the other reviewers.
There's only two points I'd like to follow-up:
W1: I'm not sure I see where your time-then-graph model consistently outperforms the other models when using static graph structures. From what I see, all experiments were done using dynamic graphs, right? What I mean is that your dynamic graph creation claim is clearly well supported experimentally, but the "time-then-graph" claim doesn't seem to be, because all experiments were run with dynamic graphs, so it isn't clear whether the claim holds when we don't have dynamic graphs. Just to highlight again, I'm talking about the experimental claims, not the theoretical ones.
Q2: I believe the authors did not directly answer my question, instead giving quite a long answer to what I think was a very simple question. Very directly, my question was whether you were the first to try to predict seizures 60s before onset (overall, or on this specific dataset). My point is that if you were indeed the first to do so, you should point that out as a novel experimental contribution. You still talk vaguely about "most prior studies", which seems to indicate that you are not the first to use this more challenging task. I want to be clear that this is totally fine and does not change my evaluation of the paper; I was honestly just trying to better understand your claims, and making this clearer would improve the readability of the paper.
Dear Reviewer w1Ha,
We are very happy to hear that we were able to address most of your concerns, and we sincerely thank you again for your recognition. Below, we provide further responses to your remaining two questions:
(1) Please kindly focus on the purple bars in Figure 3. All models, including our EvoBrain, are built on top of a static graph structure. Our method achieves the best performance, with the highest bar in both 12s detection (subfigure a) and 60s detection (subfigure b). This clearly demonstrates that our time-then-graph outperforms the others.
(2) We apologize for the confusion. Yes, we are the first to conduct 60-second early prediction on TUSZ. We truly appreciate your thoughtful suggestion to highlight this contribution more clearly. In the revision, we will explicitly emphasize this as a novel experimental contribution.
Thank you again for your constructive comments, and we truly appreciate your feedback.
I want to thank the authors for following up on W1. It totally makes sense, and indeed all my questions are successfully tackled. I'll wait for the other reviewers until the end of the rebuttal period, thanks!
We sincerely appreciate your recognition and are pleased to hear that our responses have addressed your concerns and contributed to the increased score. We thank you again for your kind suggestions and dedication to the review process.
The authors propose EvoBrain to capture both temporal and spatial dynamics in multi-channel EEG data for accurate seizure detection. By explicitly constructing and integrating dynamic graphs, the model effectively learns the evolving characteristics of brain networks. Experiments show the effectiveness of the proposed methods.
Strengths and Weaknesses
Strengths:
- The manuscript is clearly written, and the idea of constructing and utilizing dynamic graphs is interesting and compelling.
- The study offers an in-depth exploration of the issues in the field.
Weaknesses:
- The problem definition and data description are unclear. For example, in line 39, the terms 'initial snapshot' and 'subsequent EEG snapshots' are not easy to understand, as it is not clear which parts of Figure 1 they refer to. A detailed description and visualization of the data and task would make the article more accessible to a wider audience.
- The novelty of the proposed method appears limited, as Graph Convolutional Networks and Laplacian Positional Encoding are well-established and general techniques. Could you clarify which components of your model are specifically tailored to this task? Additionally, it would be useful to highlight the differences between your approach and existing models in the literature.
- The comparison methods in the experiments seem outdated, which mostly consists of approaches from 2022 and 2023. Could you consider including comparisons with more recent methods from the past two years? This would provide a more comprehensive evaluation of the effectiveness of your approach.
- The evaluation metrics used are not comprehensive. For medical tasks, recall and precision are also essential. Why were these not analyzed in the comparative and ablation studies? F1 and AUC may not always fully reflect the clinical performance of the model.
Questions
See Weaknesses
- A detailed description and visualization of the data and task would make the article more accessible to a wider audience.
- Could you clarify which components of your model are specifically tailored to this task? Additionally, it would be useful to highlight the differences between your approach and existing models in the literature.
- You are strongly recommended to include comparisons with more recent methods from the past two years (2024-2025). This would provide a more comprehensive evaluation of the effectiveness of your approach.
- Why were recall and precision scores not analyzed in the comparative and ablation studies? F1 and AUC may not always fully reflect the clinical performance of the model.
Limitations
The description of limitations in line 365 is rather superficial, making it difficult for readers to fully grasp the future directions. The authors should carefully discuss the limitations and highlight them in the checklist.
Justification for Final Rating
Most of my concerns have been addressed during rebuttal. I will thus increase my scores.
Formatting Issues
NA
We appreciate Reviewer SFyk for recognizing the paper presentation, motivation, and in-depth theoretical exploration. We appreciate your time and constructive comments. We hope our clarifications below can address your concerns and raise your rating of the paper.
W1 and Q1: Problem definition, data and task descriptions.
Thank you for your comments. We would like to clarify that the problem definition is provided in Section 2.2, and the data and task description are detailed in Section 5.1 and Appendices F and G. The term "EEG snapshot" refers to each short segment (i.e. 1-second) of EEG data that is extracted by sliding a window over the continuous EEG signal, serving as the input to the RNN-based model (as illustrated in Figure 1).
We will revise the final version to include a more detailed description and updated visualizations.
W2 and Q2: Novelty and components specifically tailored to this task.
Our time-then-graph architecture and explicit dynamic graph structure are specifically tailored to EEG tasks. That is, our model design is fundamentally guided by brain dynamics.
The choice of the two-stream Mamba architecture is grounded in our time-then-graph design, which separates the temporal modeling of node and edge attributes for explicitly modeling brain dynamics. This architecture incorporates a selective mechanism that mirrors the neurobiological processes of selectively retaining relevant information and adaptively integrating new stimuli. In addition, we incorporate LapPE to preserve brain region specificity. We propose explicit dynamic graph structures for each EEG snapshot rather than using a single static graph, enabling it to capture transient connectivity changes of brain state, such as seizures. We hope that EvoBrain introduces a new architectural perspective to seizure research.
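The time-then-graph ordering described above can be sketched schematically. In this minimal NumPy illustration (all names and shapes are hypothetical, and a simple mean over time stands in for the two-stream Mamba encoder; this is not the actual EvoBrain implementation), each node's full temporal sequence is encoded first, and only then is a single normalized GCN-style propagation applied over the final graph:

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, F = 5, 19, 8                  # snapshots, channels (nodes), feature dim
X = rng.standard_normal((T, N, F))  # node features per snapshot

# 1) "time-then": encode each node's full temporal sequence first.
#    A mean over time stands in for the temporal encoder (e.g., Mamba).
H = X.mean(axis=0)                  # (N, F) temporal summary per node

# 2) "then-graph": one GCN-style propagation over the resulting graph.
A = (rng.random((N, N)) < 0.2).astype(float)      # toy adjacency
A = np.maximum(A, A.T)                            # symmetric
np.fill_diagonal(A, 1.0)                          # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt               # symmetric normalization
W = rng.standard_normal((F, 4))                   # learnable weight (random here)
Z = np.maximum(A_hat @ H @ W, 0.0)                # (N, 4) node embeddings, ReLU
```

The point of the ordering is that the spatial aggregation in step 2 operates on representations that already summarize each node's entire history, rather than interleaving temporal and spatial updates per snapshot.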
Thank you for your insightful comments, we will highlight this discussion more clearly in the revised version and relate it to the relevant literature.
W3 and Q3: Recent models.
Thank you for your comments.
We compared our method against LaBraM (ICLR'24), one of the most recent and powerful EEG foundation models. LaBraM is pretrained on large-scale datasets across multiple domains and possesses a larger number of parameters.
Moreover, we include two new baseline comparisons: EEGPT (NeurIPS’24) and AMAG (NeurIPS’24). EEGPT is a recent foundation model for EEG, while AMAG is a deep graph model focused on neural dynamics.
Similar to LaBraM, EEGPT is pretrained on large-scale EEG datasets across multiple domains and contains 50 million parameters, approximately 277× more than our EvoBrain model.
On the 12-second seizure detection task, EEGPT and AMAG achieve AUROC scores of 80.4 and 81.7, respectively, while our EvoBrain model achieves 87.7. This further supports the effectiveness of the proposed method.
We will include all results in the final version.
W4 and Q4: Evaluation metrics.
In seizure tasks, seizure periods are typically much shorter than non-seizure periods, and our dataset reflects this class imbalance. In such cases, accuracy can be misleading, as a model that always predicts the majority class (i.e., non-seizure) may achieve high accuracy but provide little clinical value.
In contrast, F1 score balances precision and recall, making it more appropriate when positive (seizure) instances are rare but important to detect. Similarly, AUC (Area Under the ROC Curve) evaluates the model’s ability to distinguish between classes across different thresholds and is less affected by class imbalance.
Our experimental settings are fully aligned with the reference work DCRNN [1] and GRAPHS4MER [2]. We will also include precision and recall metrics in the final version, following your suggestion.
- [1] Self-Supervised Graph Neural Networks for Improved Electroencephalographic Seizure Analysis, ICLR, 2022.
- [2] Modeling Multivariate Biosignals With Graph Neural Networks and Structured State Space Models. CHIL, 2023.
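The accuracy pitfall described above can be made concrete with a toy example (the labels here are made up purely for illustration): with 90% non-seizure segments, a degenerate model that always predicts the majority class looks strong on accuracy but is exposed by the F1 score.

```python
import numpy as np

# Toy imbalanced labels: 90% non-seizure (0), 10% seizure (1).
y_true = np.array([0] * 90 + [1] * 10)
y_majority = np.zeros(100, dtype=int)  # always predict "non-seizure"

def f1(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """F1 = harmonic mean of precision and recall on the positive class."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

accuracy = np.mean(y_majority == y_true)  # 0.9: looks strong
score = f1(y_true, y_majority)            # 0.0: no seizure ever detected
```

AUROC likewise exposes such a model, since a constant predictor has no ranking ability across thresholds (AUROC ≈ 0.5).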
Limitations.
Thank you for your comments and for pointing out this aspect, and we will enhance the discussion of limitations in the final version of our paper. Below is our intended discussion:
Seizure prediction is harder because pre-ictal patterns are typically weaker and more spatially diffuse. While EvoBrain achieves the best performance among lightweight GNN-based models, LaBraM benefits from having approximately 30 times more parameters and from pretraining on multiple large-scale EEG corpora, which enhances its ability to generalize under limited and noisy pre-seizure data conditions. While we focus our evaluation on the seizure task, which is particularly critical and life-threatening among EEG applications, generalization to other tasks remains a limitation of our work.
Thank you for providing detailed explanations. Most of my concerns have been addressed. I will thus increase my scores.
We sincerely appreciate your recognition and are pleased to hear that our responses have addressed your concerns and contributed to the increased score. We thank you again for your kind suggestions and dedication to the review process.
The authors provide a framework for EEG seizure detection and pre-seizure identification using Graph Neural Networks. They demonstrate that dynamic representations of the graph, together with temporal modeling of the graph using Mamba, better represent the dynamics of the brain network. They show how their model outperforms some of the existing models, with similar performance overall but an improvement in AUROC.
Strengths and Weaknesses
The manuscript presents a solid mathematical foundation for its claims. The authors showcase how their model outperforms other models. It is a strong manuscript.
The manuscript lacks a discussion of the limitations of the model. For instance, Table 1 shows that for seizure prediction LaBraM outperforms their method; a deeper discussion of this is needed. Also, a faster computation is generally achieved at the cost of an increased memory burden, yet a discussion of the memory consumption during training is missing. In general, the manuscript lacks a discussion of the limitations of the approach.
Questions
I wonder about the limitations of this approach and would like a deeper discussion of them, as well as a discussion of why LaBraM outperformed EvoBrain in seizure prediction.
Limitations
There is a lack of discussion about limitations, please refer to my prior comments.
Formatting Issues
none
We appreciate Reviewer qoYr for recognizing the solid mathematical foundation and acknowledging this as a strong manuscript. We appreciate your time and constructive comments. Below, please find our point-by‑point responses to your concerns and questions. We hope our clarifications below can address your concerns and raise your rating of the paper.
W1: LaBraM outperforms ours on seizure prediction task.
Thank you for your comments, and we will discuss this in the final version as a limitation. Below is our intended discussion:
In seizure detection, our method demonstrates superior performance due to its ability to highlight strong, localized network changes with high representational power. These changes are indicative of seizure onset and are effectively captured by our dynamic graph modeling.
However, in seizure prediction, pre-ictal patterns are typically weaker and more spatially diffuse. While EvoBrain achieves the best performance among lightweight GNN-based models, LaBraM benefits from having approximately 30 times more parameters and from pretraining on multiple large-scale EEG corpora, which enhances its ability to generalize under limited and noisy pre-seizure data conditions.
W2: Memory.
We appreciate the reviewer’s insightful comment regarding memory consumption and limitations. As suggested, we have measured the maximum GPU memory usage with a batch size of 1 for all GNN models.
| Model | Training (MB) | Inference (MB) |
|---|---|---|
| EvoBrain | 51.35 | 46.64 |
| GRU-GCN | 54.61 | 52.09 |
| Graphs4mer | 369.46 | 93.02 |
| DCRNN | 21.10 | 20.54 |
| EvolveGCN | 22.06 | 20.07 |
While DCRNN and EvolveGCN are indeed more memory-efficient, our EvoBrain runs over 10× faster than these baselines, as shown in Figure 4.
Among the three time-then-graph models, EvoBrain is the fastest and provides a well-balanced trade-off between memory usage and computational speed.
We will include these points in the final version of our paper.
The paper proposes EvoBrain, a dynamic multi-channel EEG modeling framework for seizure detection and prediction. The authors identify two key gaps in existing dynamic GNN-based EEG models: (1) the overuse of static graph structures, which fail to capture evolving brain connectivity, and (2) insufficiently expressive spatio-temporal modeling architectures. EvoBrain addresses these issues through (a) an explicit dynamic graph structure that updates both nodes and edges over time, and (b) a time-then-graph modeling approach using a two-stream Mamba architecture for temporal dynamics, followed by GCNs with Laplacian Positional Encoding for spatial modeling. Theoretical analysis demonstrates the expressivity advantages of this approach. Experiments on large public datasets (TUSZ, CHB-MIT) show clear gains in AUROC and F1, with higher computational efficiency than previous baselines.
Strengths and Weaknesses
Strengths:
- The motivation is well-defined, focusing on the clinical importance of modeling the evolving connectivity in EEG for seizure prediction.
- The paper provides solid theoretical analysis, rigorously comparing different spatio-temporal dynamic graph architectures, and establishes the superiority of time-then-graph with explicit dynamic graphs.
- The method is evaluated on strong baselines and large, realistic datasets, with results showing significant improvements in both performance and efficiency.
- Engineering details, ablation, and neuroscientific interpretability are well-addressed, supporting clinical relevance.
Weaknesses:
- The core contribution is primarily an integration and refinement of known components: dynamic graphs, GNNs, and sequence models—rather than a fundamentally new paradigm.
- The “explicit dynamic graph” idea is conceptually logical and increasingly common in recent GNN+EEG works; the main difference here is in more rigorous theoretical justification and cleaner engineering.
- The method focuses mainly on standard seizure detection/prediction. Generalizability to more diverse EEG tasks (multi-class classification, cross-patient generalization, multi-modal fusion) is not explored.
- Some implementation choices (two-stream Mamba, LapPE) feel like plug-and-play improvements, and the overall novelty is incremental relative to the field's direction rather than a big innovation.
Questions
no
Limitations
yes
Justification for Final Rating
I think the authors have addressed my concern. It is enough to accept.
Formatting Issues
no
We thank Reviewer Ma49 for recognizing our well-defined motivation, solid theoretical analysis, strong baselines, large realistic dataset experiments, and neuroscientific interpretability. We appreciate your time and constructive comments. Below, please find our point-by‑point responses to your concerns and questions, and we hope our clarifications below can address your concerns and raise your rating of the paper.
W1: The core contribution is primarily an integration and refinement of known components, rather than a fundamentally new paradigm.
Thank you for the discussion. Our contribution lies in the theoretical foundation for modeling brain dynamics, along with the proposal of novel seizure detection models based on this foundation.
As you mentioned, this work is the first to investigate and provide theoretical analyses for EEG modeling using dynamic graph neural networks. From a technical perspective, while EvoBrain is not a fundamentally new paradigm, it is the first time-then-graph architecture that combines explicit dynamic graph modeling for seizure tasks, guided by our theoretical foundation, and effectively captures evolving brain network dynamics.
W2: The main difference here is in more rigorous theoretical justification and cleaner engineering.
We hope that moving from intuitive modeling to a theoretically guided and clearly structured approach can provide a foundation for future research on GNN-based seizure modeling.
Several recent works (e.g., GRAPHS4MER) learn explicit dynamic graphs internally; however, they begin with a temporally fixed input graph. In contrast, our method computes the graph structure directly from individual temporal snapshots (e.g., 1 second) from the start.
W3: Generalizability to more diverse EEG tasks.
Thank you for your discussion. Our results already demonstrate the effectiveness of our method in terms of cross-patient generalization. We would like to clarify that training and test sets are composed of entirely different patients, and we will explicitly state this in Line 268 of the revision.
We also appreciate your insightful comments. We will expand the discussion to include diverse EEG tasks and multi-modal fusion modeling in the Discussion Section.
W4: Some implementation choices (two-stream Mamba, LapPE) feel like plug-and-play improvements, and the overall novelty is incremental relative to the field's direction rather than a major innovation.
Our model design is fundamentally guided by brain dynamics.
The choice of the two-stream Mamba architecture is grounded in our time-then-graph design, which separates the temporal modeling of node and edge attributes for explicitly modeling dynamics. This architecture incorporates a selective mechanism that mirrors the neurobiological processes of selectively retaining relevant information and adaptively integrating new stimuli. In addition, we incorporate LapPE to preserve brain region specificity. We hope that EvoBrain introduces a new architectural perspective to seizure detection research.
We will further clarify these points in the revised version of the paper.
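To illustrate the LapPE component mentioned above, here is a minimal sketch (our own illustration, assuming a standard normalized-Laplacian formulation; the paper's exact variant may differ) of computing Laplacian positional encodings over an EEG channel graph:

```python
import numpy as np

def laplacian_pe(A, k=3):
    """Sketch of Laplacian Positional Encoding (assumed variant).

    A: symmetric adjacency matrix of shape (n, n) over EEG channels.
    Returns the k eigenvectors of the normalized Laplacian with the
    smallest nonzero eigenvalues, used as per-node position features
    that preserve brain-region specificity in the graph.
    """
    n = A.shape[0]
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    # Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    L = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]            # drop the trivial constant mode

# Example: a ring graph over 6 channels.
A = np.zeros((6, 6))
for i in range(6):
    A[i, (i + 1) % 6] = A[(i + 1) % 6, i] = 1.0
pe = laplacian_pe(A, k=3)  # shape (6, 3): one 3-dim position per channel
```

One known caveat of this family of encodings is eigenvector sign ambiguity; implementations commonly randomize or fix signs during training.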
Thank you for providing detailed explanations and for addressing the concerns I raised in my initial review. The idea and the work itself are interesting. I need more clarification from the authors.
W1: Core Contribution and Novelty: Although the paper introduces an innovation with the "time-then-graph" architecture, I suggest that you clarify the uniqueness and significance of your innovation relative to existing methods (such as GRAPHS4MER) in the introduction section.
W2: Dynamic Graph Modeling: How does your dynamic graph modeling differ from the graph construction based on temporal snapshots? Why does it lead to greater improvements in seizure tasks?
W3: Generalizability to More Diverse EEG Tasks: Thank you for clarifying that the training and test sets come from entirely different patients. This is quite interesting. Since we are also conducting similar research, we have found that transfer learning across different subjects is challenging. I would be curious to know how the authors controlled for the effectiveness of cross-patient generalization.
W4: About the Two-Stream Mamba Architecture and LapPE Method: Your explanation of the two-stream Mamba architecture and LapPE method helped me better understand their relevance in modeling brain dynamics and seizure detection. Although these methods are indeed innovative, compared to other methods in the field, I still feel that these implementation choices are incremental improvements rather than major breakthroughs.
We sincerely thank the reviewer for the additional discussion and recognition of our work. Below, we provide further responses to your remaining concerns:
W1: Core Contribution and Novelty.
Thank you for your suggestion. We will clarify the uniqueness and significance of our method to better position it in the final version of the paper.
W2: Dynamic Graph Modeling.
The main difference lies in feature coverage. Our explicit dynamic modeling approach processes both node and edge features from temporal snapshots.
We would like to clarify the difference between existing and our dynamic graph modeling.
Previous studies such as DCRNN and GRAPHS4MER use node features X_t (the same as ours) and a single adjacency matrix A, which lacks a time dimension. We call this implicit dynamic modeling.
This means that the same graph structure A is shared for every timestep t; that is, existing methods hold a static graph.
In contrast, our explicit dynamic graph modeling uses time-varying adjacency matrices A_t, as seizures are typically network disorders. Therefore, we explicitly maintain dynamics in time-evolving node and edge features, which is the reason for the improved performance.
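To make the distinction concrete, the following is a minimal sketch (our own illustration, not the paper's implementation; the snapshot length, correlation measure, and threshold are assumptions) contrasting a single static adjacency A with per-snapshot matrices A_t built from 1-second windows:

```python
import numpy as np

def snapshot_adjacencies(eeg, fs=256, thresh=0.3):
    """Explicit dynamic graphs: one adjacency matrix per 1-second snapshot.

    eeg: array of shape (n_channels, n_samples).
    Returns A_t of shape (T, n_channels, n_channels), T = n_samples // fs,
    with edges given by thresholded absolute Pearson correlation
    computed within each snapshot (so edges evolve over time).
    """
    n_ch, n_samp = eeg.shape
    T = n_samp // fs
    A_t = np.zeros((T, n_ch, n_ch))
    for t in range(T):
        win = eeg[:, t * fs:(t + 1) * fs]
        corr = np.abs(np.corrcoef(win))   # time-varying connectivity
        corr[corr < thresh] = 0.0         # sparsify weak edges
        np.fill_diagonal(corr, 0.0)
        A_t[t] = corr
    return A_t

def static_adjacency(eeg, thresh=0.3):
    """Implicit/static baseline: one adjacency over the whole clip,
    shared across all timesteps."""
    corr = np.abs(np.corrcoef(eeg))
    corr[corr < thresh] = 0.0
    np.fill_diagonal(corr, 0.0)
    return corr
```

A downstream time-then-graph model would then pair each A_t with the node features X_t of the same snapshot, rather than reusing one graph for the entire recording.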
W3: Generalizability.
Thank you for this interesting discussion. In fact, we recognize there is still room for improvement, and we have not designed any specific modules for cross-patient generalization. Possible reasons for the improved performance are that our expressive method effectively estimates dynamic graph representations across different patients, or that the experimental datasets were well processed. Transfer learning, however, is a promising direction that deserves further exploration in future work.
W4: About the Two-Stream Mamba Architecture and LapPE Method.
Thank you for acknowledging the innovativeness of our method. Our main focus and contribution lie in investigating a foundation for learning dynamics and proposing reasonable models accordingly. We believe this is a long-term research direction, knowing that it may take several papers to thoroughly explore and develop this line of work.
Once again, thank you for the thoughtful discussion.
The author has addressed my concerns, and I have slightly increased my score as a result.
We sincerely appreciate your recognition and are pleased to hear that our responses have addressed your concerns and contributed to the increased score. We thank you again for your kind suggestions and dedication to the review process.
Hello, we would greatly appreciate your response to the authors' rebuttal, particularly at this critical point in time. Thanks, Best
EvoBrain: Dynamic Multi-channel EEG Graph Modeling for Time-evolving Brain Network
The authors propose a dynamic GNN that seamlessly integrates temporal and spatial features of electroencephalography (EEG) data with the goal of detecting seizures. The primary challenge they address is the dynamics underlying these two feature types, which reflect the brain's evolving connectivity during a seizure. They propose EvoBrain, a novel seizure detection model that integrates a two-stream Mamba architecture with a GCN enhanced by Laplacian Positional Encoding, following neurological insights. The integration of dynamic graph structures in EvoBrain allows both nodes and edges to evolve over time. Their theoretical analysis shows the expressivity advantage of explicit dynamic modeling and the time-then-graph approach over alternatives. Their evaluation of the novel and efficient model shows significant improvements of 23% in AUROC and 30% in F1 score in comparison with the dynamic GNN baseline.

All the reviewers agreed that the individual techniques exploited in the proposed methodology are not novel (i.e., graph convolutional networks with temporal evolution), or, as one reviewer succinctly described it, are plug-and-play; they also agreed that the theoretical analysis developed for the method is useful in that it provides insight into the functionality and the resulting improvement. A satisfactory number of experiments were conducted, and one reviewer insisted upon an explicit claim of a "first 60s-ahead seizure prediction" (if that was the case, which the authors confirmed), and hence a first experimental breakthrough. Despite some minor critiques, the reviewers evidently all agreed that it was a solid paper, with the majority ranking it an outright "accept". Given the reviews and upon a quick reading of the paper, the AC recommends acceptance for publication. The AC checked the provided source code location for anonymity; the link is non-operational: https://anonymous.4open.science/r/EvoBrain-FBC5