PaperHub
Score: 6.4/10
Poster · 4 reviewers
Ratings: 4, 4, 5, 3 (mean 4.0, min 3, max 5, std 0.7)
Confidence
Novelty: 2.5 · Quality: 2.5 · Clarity: 3.0 · Significance: 2.3
NeurIPS 2025

HetSyn: Versatile Timescale Integration in Spiking Neural Networks via Heterogeneous Synapses

OpenReview · PDF
Submitted: 2025-04-13 · Updated: 2025-10-29
TL;DR

We propose a SNN modeling framework that incorporates synaptic heterogeneity, an essential property largely overlooked in previous studies, and demonstrate its computational advantages and generalizability.

Abstract

Keywords
Spiking Neural Networks · Synaptic Heterogeneity · Multi-Timescale Integration · Neuron Modeling

Reviews and Discussion

Review
Rating: 4

This paper introduces HetSyn, which incorporates synaptic heterogeneity into spiking neural networks (SNNs) to enable multi-timescale integration. The authors instantiated HetSyn in combination with LIF neurons as HetSynLIF and experimentally validated it in pattern generation, delayed match-to-sample, speech recognition and visual recognition tasks.

Strengths and Weaknesses

Strengths:

  1. HetSyn introduces synaptic heterogeneity into the SNN, which is consistent with biological mechanisms.
  2. The authors analyze the proposed HetSynLIF and point out that existing LIF neuron variants can be treated as special cases of HetSynLIF.
  3. The authors demonstrated the effectiveness of the proposed method across multiple tasks.

Weaknesses:

  1. HetSynLIF needs to be described in more detail, such as how the synaptic time constants are set and their impact on performance; if the synaptic time constants are learnable, what is their evolutionary trend during training?
  2. Could the theoretical contribution of HetSynLIF to the spatio-temporal representation performance of SNNs be analyzed in depth?
  3. The authors should analyze the impact of setting up individual time constants for each synapse on the overhead of the SNN, as this incurs a large number of additional parameters.
  4. The experiments in this paper are limited to tiny tasks, making it difficult to demonstrate the effectiveness of the proposed method on challenging tasks and its scalability. Does HetSynLIF work for typical visual recognition tasks, such as CIFAR and ImageNet? Furthermore, [1] achieved significantly better performance on SHD, even with a single-layer SNN. How does this justify the performance advantage of HetSynLIF?

[1] Advancing Spatio-Temporal Processing in Spiking Neural Networks through Adaptation. arXiv. 2024.

Questions

See weaknesses.

Limitations

Yes.

Final Justification

Although the author's response indicates that this paper has limited insights into typical static object recognition (such as CIFAR and ImageNet), I maintain my positive score given the paper's other strengths.

Formatting Concerns

No formatting issues.

Author Response

We sincerely thank you for your thoughtful feedback and hope that the responses below address your concerns.


Weakness 1: HetSynLIF needs to be described in more detail, such as how the synaptic time constants are set and their impact on performance; if the synaptic time constants are learnable, what is their evolutionary trend during training?

WA1: We appreciate your attention to the setup and evolutionary trend of synaptic time constants, and provide further clarifications below.

Actually, we have included the initialization of the synaptic time constants ($\tau$), along with other task-specific configurations, in Appendices B–E to ensure reproducibility. Based on our experiments, the initialization of $\tau$ does not significantly impact performance: since $\tau$ is learnable, it adjusts to task requirements, increasing or decreasing as needed. Additionally, while we show the changes in $\tau$ before and after learning in Fig. 4D and Fig. A4, we have carefully considered your feedback and agree that the previous description lacked sufficient detail. We therefore conducted an additional experiment to characterize its evolutionary trend in more detail.

The DMS task inherently requires both short-term and long-term memory mechanisms: long-term memory is needed to retain cue information (e.g., "left" or "right") across the delay period, while short-term memory helps to suppress irrelevant noise. Ideally, training encourages the cue-channel $\tau$ to increase toward the delay duration (~800 ms), while the noise-channel $\tau$ decreases to promote fast forgetting. We initialized $\tau$ from Gaussian distributions with means $\mu \in \{100, 200, 400\}~\text{ms}$ and standard deviation $0.1\mu$, and trained the model over 5 independent runs.

| Initialization | Cue-channel $\tau$ (ms, after training) | Noise-channel $\tau$ (ms, after training) |
| --- | --- | --- |
| $\mathcal{N}(100,\ 10^2)$ | 154.38 (+54.38%) | 85.59 (-14.41%) |
| $\mathcal{N}(200,\ 20^2)$ | 252.43 (+26.22%) | 181.93 (-9.04%) |
| $\mathcal{N}(400,\ 40^2)$ | 430.70 (+7.68%) | 382.29 (-4.43%) |

We observed consistent trends across all initialization settings, with the learned $\tau$ values exhibiting a clear divergence between input pathways.

  • For synapses receiving cue signals, the mean $\tau$ increased significantly, enabling the long-term information retention required for task completion.
  • For synapses receiving noise inputs, the mean $\tau$ decreased notably, suggesting a functional adaptation to suppress irrelevant temporal integration.

The results align with our expectations and validate our hypothesis that $\tau$ is automatically learned and optimized in the direction required by the task.
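For concreteness, the initialization protocol above can be sketched in a few lines; this is our own illustrative PyTorch code rather than the paper's implementation, and the tensor shape and the positive clamp are our assumptions:

```python
import torch
import torch.nn as nn

def init_tau(n_out, n_in, mu_ms=100.0):
    # Per-synapse time constants drawn from N(mu, (0.1*mu)^2), matching the
    # protocol described above; clamping to stay positive is our own safeguard.
    tau = torch.normal(mu_ms, 0.1 * mu_ms, size=(n_out, n_in)).clamp(min=1.0)
    return nn.Parameter(tau)  # learnable, so tau can grow (cue) or shrink (noise)
```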


Weakness 2: Could the theoretical contribution of HetSynLIF to the spatio-temporal representation performance of SNNs be analyzed in depth?

WA2: We would like to provide the following analysis of the theoretical contribution of HetSynLIF to the spatio-temporal representation performance of SNNs.

Featuring synapse-specific decay factors, HetSynLIF effectively represents dynamic spatio-temporal inputs and captures the natural variation in temporal behaviors observed in biological systems. From a biological perspective, synaptic heterogeneity is a crucial factor in enabling neural systems to encode diverse temporal patterns. Our approach leverages these biological principles, applying them computationally to enhance spatio-temporal encoding in SNNs.

Empirical results show that during training, HetSyn allows each synapse to adapt its $\tau$ according to task requirements, with synapses for long-term memory increasing $\tau$ and those for short-term memory decreasing it. Additionally, the theoretical proof in Appendix A demonstrates that HetSynLIF generalizes several existing models, showing its ability to combine the strengths of these models based on the specific task requirements.

In summary, these features collectively enable HetSynLIF to excel in spatio-temporal representation performance. Although our primary focus is on its computational potential, we believe that theoretical analysis is also crucial and we will include the above analysis in the revised manuscript.
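To make the mechanism concrete, below is a minimal sketch of a LIF layer with synapse-specific, learnable decay, written from the description above rather than from the authors' code; the initialization, hard reset, and threshold handling are common choices, not details taken from the paper:

```python
import math
import torch
import torch.nn as nn

class HetSynLIF(nn.Module):
    """Sketch of LIF neurons whose synaptic traces decay with learnable,
    synapse-specific time constants (tau_syn has shape n_out x n_in)."""
    def __init__(self, n_in, n_out, dt=1.0, tau_mem=20.0, v_th=1.0, tau_init=100.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.tau_syn = nn.Parameter(torch.full((n_out, n_in), tau_init))  # ms
        self.alpha_mem = math.exp(-dt / tau_mem)  # fixed membrane decay
        self.dt, self.v_th = dt, v_th

    def forward(self, spikes):  # spikes: (T, batch, n_in), binary
        T, batch, _ = spikes.shape
        n_out, n_in = self.weight.shape
        trace = spikes.new_zeros(batch, n_out, n_in)  # per-synapse current
        v = spikes.new_zeros(batch, n_out)
        out = []
        decay = torch.exp(-self.dt / self.tau_syn.clamp(min=self.dt))
        for t in range(T):
            trace = decay * trace + spikes[t].unsqueeze(1)  # each synapse keeps its own timescale
            v = self.alpha_mem * v + (self.weight * trace).sum(dim=-1)
            spk = (v >= self.v_th).float()
            v = v * (1.0 - spk)  # hard reset after a spike
            out.append(spk)
        return torch.stack(out)  # (T, batch, n_out)
```

In training, the hard threshold would be replaced by a surrogate-gradient spike function so that gradients can flow to both `weight` and `tau_syn`.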


Weakness 3: The authors should analyze the impact of setting up individual time constants for each synapse on the overhead of the SNN, as this incurs a large number of additional parameters.

WA3: We conducted both a theoretical analysis and an additional set of experiments to evaluate the training cost. Due to the 10,000-character constraint, please refer to our response to Reviewer auyx (WA1 & QA1) for the full table of results. We summarize the conclusions below and hope these analyses help clarify the impact of our design and address your concerns.

Theoretical and empirical results indicate that HetSynLIF indeed introduces a slight increase in training cost when evaluated under identical architectures and neuron counts. Given the notable performance gains and the potential of synaptic heterogeneity in SNNs, we consider the added training cost a reasonable trade-off at this early stage. This complexity is expected and biologically plausible, as synapses greatly outnumber neurons in the brain.

Moreover, HetSynLIF achieves comparable or even superior performance with significantly fewer neurons. For example, as shown in the table in our response to Reviewer auyx (WA1 & QA1), F-HetSynLIF (20 neurons) outperforms R-HetNeuLIF (100 neurons) with lower memory and time costs. Additionally, Fig. 2D, Fig. 3B, Fig. 4A, and our response to Reviewer L2NP (WA2 & QA1) demonstrate that HetSynLIF converges faster, requiring fewer training iterations. Together, these findings suggest that the increased per-unit complexity is effectively offset and may even become an advantage in overall training cost.

In future work, we plan to explore methods to reduce the computational complexity of HetSyn, such as incorporating sparse connectivity designs. We will also explicitly discuss potential future directions in the revised manuscript.


Weakness 4.1: The experiments in this paper are limited to tiny tasks, making it difficult to demonstrate the effectiveness of the proposed method for challenging tasks and its scalability. Does HetSynLIF work for typical visual recognition tasks, such as CIFAR and ImageNet?

WA4.1: Below, we first clarify the rationale behind our task selection and then present preliminary results on an event-based visual recognition task to further support our claims.

Our work aims to explore the computational potential of synaptic heterogeneity for versatile timescale integration in SNNs. To this end, we prioritize tasks with rich or extended temporal structures. While HetSynLIF can work with typical visual recognition tasks like CIFAR and ImageNet, these datasets are static and lack meaningful temporal structure. In the SNN field, such tasks are often handled by repeating the same image over several timesteps, which does not provide meaningful temporal dynamics and is not aligned with our focus on versatile timescale integration.

Nevertheless, we have carefully considered your suggestion and conducted extra experiments on the DVS Gesture dataset, an event-based, real-world visual recognition task with richer temporal dynamics, and present preliminary results in the table below. Due to time constraints, we did not extensively optimize the architecture or hyperparameters. However, HetSynLIF outperformed HomNeuLIF in accuracy, which is sufficient to demonstrate the effectiveness of HetSyn in versatile timescale integration.

| Model | DVS Gesture |
| --- | --- |
| F-HomNeuLIF | 95.49% |
| F-HetSynLIF | 96.18% |

Weakness 4.2: [1] achieved significantly better performance on SHD, even with a single-layer SNN. How does this justify the performance advantage of HetSynLIF?

WA4.2: We apologize for not noticing this timely and insightful work [1] earlier, as it was still under review when we wrote our manuscript. After carefully reading the paper, we find its contributions both insightful and inspiring. However, we believe that the work presented in [1] does not conflict with our HetSynLIF approach.

In SNNs, the dynamics of neuron models are typically represented using differential equations. To simulate these dynamics on digital hardware, they must be discretized into a time-stepped form, with updates occurring at fixed intervals for computationally feasible approximations. A commonly used discretization method is the forward Euler scheme, favored for its simplicity and efficiency. In line with standard practice in the field, our HetSynLIF also adopts this approach. Despite its widespread use, the forward Euler method can have accuracy limitations. The work in [1] rigorously demonstrates the shortcomings of the conventional forward Euler discretization used in the ALIF model and addresses them by adopting the alternative Symplectic Euler discretization approach.
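For reference, the forward Euler scheme applied to a generic leaky membrane equation (an illustrative textbook form, not the exact equations of either paper) reads:

```latex
\tau_m \frac{dv}{dt} = -\left(v - v_{\text{rest}}\right) + R\,I(t)
\quad\Longrightarrow\quad
v_{t+\Delta t} = v_t + \frac{\Delta t}{\tau_m}\left(-(v_t - v_{\text{rest}}) + R\,I_t\right)
```

Symplectic (semi-implicit) Euler differs in that, for coupled state variables such as membrane potential and adaptation, the freshly updated value of one variable is used when stepping the other, which improves numerical stability.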

Therefore, the work in [1] and ours address distinct yet complementary challenges. While [1] focuses on addressing the stability and parameterization challenges in solving differential equations for SNNs, our work explores the computational potential of synaptic heterogeneity for versatile timescale integration and introduces a new model to leverage this potential.

We believe that incorporating this advanced discretization technique into HetSynLIF could further improve its performance. This presents a promising direction for future research, and we will cite and discuss this work in the Future Work part of the revised manuscript. We hope this response clarifies the relationship between [1] and our contributions and we welcome further discussion.


References

[1] Baronig M, et al. Advancing spatio-temporal processing through adaptation in spiking neural networks[J]. Nature Communications, 2025, 16(1): 5776.

Comment

Dear Reviewer,

Thank you for your recognition of our work and continued engagement in the review process. We would appreciate your feedback on whether our clarifications — such as those on training cost and the evolution of parameters during training — have adequately addressed your concerns. Please let us know if any further explanation would be helpful.

Thank you again for your time and consideration!

Best regards,

Authors

Comment

Thanks for the authors' reply. I would like to maintain my positive score.

Review
Rating: 4

The authors propose the use of spiking neural networks (SNNs) for classification or temporal signal generation tasks, enhanced by introducing heterogeneity in synaptic time constants. They demonstrate that these networks outperform conventional networks lacking this variability.

Strengths and Weaknesses

One of the notable strengths of this paper is its foundation in physiological data, illustrating how such biological inspiration can enhance the performance of spiking neural networks. The performance improvements are evident across the tasks presented, with classification results that surpass those of similar networks (RSNNs) without heterogeneity. The application to a delayed match-to-sample task is particularly compelling.

A limitation of this study is its incremental nature compared to previous work that has explored variability in SNNs, such as citation 22. Additionally, the paper lacks an interpretation of the parameters obtained after training, which could provide insights into the underlying mechanisms driving the improved performance.

Questions

When training your network on the delayed match-to-sample task, can you interpret the parameters obtained post-training? Specifically, in the distribution of time constants, do you observe values on the order of the delay duration (around 800 milliseconds)?

The state-of-the-art performance for the SHD dataset has been achieved using spiking neural networks with delays. Do you believe it would be feasible to combine these two approaches to further enhance performance?

Limitations

A limitation of this work is its relatively incremental progress compared to existing research, particularly citation 22.

Final Justification

The responses to my comments were encouraging; however, I will not raise my score and will keep it at 4.

Formatting Concerns

some minor typos: "The results of the experiments shows significant improvement." - "After analyzing the data. We found significant trends." - "The model was trained on a large dataset it achieved high accuracy." - ...

Author Response

We sincerely thank you for your thoughtful feedback and hope that the responses below address your concerns.


Weakness 1 & Limitation 1: A limitation of this study is its incremental nature compared to previous work that has explored variability in SNNs, such as citation 22.

WA1 & LA1: We appreciate your concern regarding the contribution of our work in comparison to prior studies, and your thoughtful engagement with the relevant literature. While we respectfully disagree with the assessment that our work represents relatively incremental progress, we appreciate the opportunity to clarify its contributions and highlight the key distinctions between our approach and prior efforts, particularly citation [22].

As we stated in the introduction, we focus on strengthening the biological foundations of SNNs by revisiting fundamental neurobiological mechanisms. Specifically, we focus on synaptic heterogeneity, a well-established phenomenon in neuroscience known to support versatile timescale integration and cognitive functions. Despite its recognized importance in biological systems, its computational potential in SNNs remains largely unexplored and highly promising. To the best of our knowledge, our work is the first to investigate this direction and is fundamentally different from existing approaches. We believe this represents a meaningful and impactful contribution to the field, which we summarize below.

  1. Bio-Grounded Design: Drawing on biological evidence, HetSyn is the first SNN modeling framework to incorporate synaptic heterogeneity for versatile timescale processing.

  2. Detailed Theoretical Proof: We provide a theoretical analysis demonstrating HetSyn’s generalization capacity with diverse spiking neuron models.

  3. Strong Empirical Results: We instantiate HetSyn as HetSynLIF and demonstrate its competitive performance across multiple tasks and benchmarks.

  4. Parameter Insight: We provide analysis of the learned time constants, offering interpretability and task-aligned insights (see Fig. 4D, Fig. A4, and our responses to Weakness 2 and Question 1).

We greatly appreciate the valuable insights provided by citation [22] on heterogeneity in SNNs, which we regard as an important step toward understanding diversity in neural computation. Nonetheless, our work differs in several important ways and provides a complementary perspective to prior work, as detailed below.

  1. Finer Level of Heterogeneity: While citation [22] investigates neuron-level heterogeneity by varying membrane properties across neurons, our work focuses on the more fine-grained and biologically fundamental synapse-level heterogeneity.

  2. Different Core Objectives: Citation [22] emphasizes neuron-level variation for improving robustness, while our work centers on synaptic-level mechanisms to support versatile timescale integration.

  3. Broader Experimental Validation: In addition to the robustness and timescale generalization tests conducted in [22], we further evaluated our model on working memory and limited-resource scenarios, thereby providing a more comprehensive validation of its capabilities.

Notably, other reviewers have also acknowledged the originality and value of our contributions. That said, we recognize that the initial manuscript may not have been sufficiently clear, and we have carefully revised the manuscript to present our ideas more clearly. Again, we sincerely thank you for your thoughtful review and we hope this clarification helps address your concerns regarding the scope and contribution of our work.


Weakness 2: Additionally, the paper lacks an interpretation of the parameters obtained after training.
Question 1: When training your network on the delayed match-to-sample task, can you interpret the parameters obtained post-training? Specifically, in the distribution of time constants, do you observe values on the order of the delay duration (around 800 milliseconds)?

WA2 & QA1: We thank you for this insightful question regarding the interpretability of learned time constants in the delayed match-to-sample (DMS) task. Motivated by your comment, we have conducted an additional analysis to explore this question in greater depth. We hope this new experiment addresses your concern, and we sincerely thank you for bringing up this important point.

The DMS task inherently requires both short-term and long-term memory mechanisms: long-term memory is needed to retain cue information (e.g., "left" or "right") across the delay period, while short-term memory helps to suppress irrelevant noise. Ideally, training encourages the cue-channel $\tau$ to increase toward the delay duration (~800 ms), while the noise-channel $\tau$ decreases to promote fast forgetting.

To test this hypothesis, we initialized $\tau$ from Gaussian distributions with means $\mu \in \{100, 200, 400\}~\text{ms}$ and standard deviation $0.1\mu$, and trained the model. For each setting, we conducted 5 independent runs and report the averaged results in the table below.

| Initialization | Cue-channel $\tau$ (ms, after training) | Noise-channel $\tau$ (ms, after training) | Synapses with $\tau > 700$ ms (after training) | Accuracy | Accuracy (masked long $\tau$) |
| --- | --- | --- | --- | --- | --- |
| $\mathcal{N}(100,\ 10^2)$ | 154.38 (+54.38%) | 85.59 (-14.41%) | 5.25% | 99.97% | 75.68% (-24.29%) |
| $\mathcal{N}(200,\ 20^2)$ | 252.43 (+26.22%) | 181.93 (-9.04%) | 6.05% | 100.00% | 76.47% (-23.53%) |
| $\mathcal{N}(400,\ 40^2)$ | 430.70 (+7.68%) | 382.29 (-4.43%) | 5.80% | 99.87% | 77.15% (-22.72%) |

We observed consistent trends across all initialization settings, with the learned $\tau$ values exhibiting a clear divergence between input pathways.

  • For synapses receiving cue signals, the mean $\tau$ increased significantly, enabling the long-term information retention required for task completion.

  • For synapses receiving noise inputs, the mean $\tau$ decreased notably, suggesting a functional adaptation to suppress irrelevant temporal integration.

We further analyzed the distribution of $\tau$ after training and observed values on the order of the delay duration (around 800 ms). Specifically, using $\tau \ge 700~\text{ms}$ as the statistical criterion, we found that approximately 5-6% of synapses exhibited time constants on the same order of magnitude as the delay duration.

We hypothesize that this small subset of long-$\tau$ synapses plays a critical role in solving the task. To verify this, we masked these long-$\tau$ synapses by resetting their $\tau$ values to their original initialization means and re-evaluated the model. The results showed a significant drop in accuracy after masking, supporting the importance of these long-timescale synapses.
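As an illustration, the masking probe can be sketched as follows (hypothetical names: `model.tau_syn` stands in for the actual parameter tensor, and `mu_init` for the per-setting initialization mean):

```python
import torch

@torch.no_grad()
def mask_long_tau(model, mu_init, threshold_ms=700.0):
    # Reset long-timescale synapses to their initialization mean, then
    # re-evaluate accuracy to probe their causal role in the DMS task.
    long_tau = model.tau_syn > threshold_ms
    model.tau_syn[long_tau] = mu_init
    return long_tau.float().mean().item()  # fraction masked (~5-6% above)
```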

In summary, we analyzed and interpreted the $\tau$ values obtained after training and demonstrated the task-aligned functional specialization of synaptic time constants. These results further support the effectiveness of HetSyn in handling versatile timescale integration by adaptively adjusting $\tau$ values based on input characteristics. We sincerely thank you for raising this insightful and technically relevant question. We will include the above analysis and results in the appendix of the revised manuscript.


Question 2: The state-of-the-art performance for the SHD dataset has been achieved using spiking neural networks with delays. Do you believe it would be feasible to combine these two approaches to further enhance performance?

QA2: We thank you for highlighting the success of delay-based SNNs on the SHD dataset. Indeed, recent works [1, 2] have demonstrated that incorporating synaptic delays can significantly improve task performance in temporally complex datasets like SHD. In recognition of their relevance and contributions, we believe it is necessary to cite these works in the revised manuscript.

Although both synaptic delays and our proposed method operate at the synaptic level, they reflect different biological properties, address distinct aspects of temporal processing, and may complement each other.

  • Synaptic delays primarily facilitate spatial-temporal synchronization by shifting spike timings, thereby aligning neural activity across channels. However, they do not alter the intrinsic temporal dynamics of individual synapses.
  • Our trainable synaptic time constants enable versatile timescale integration by dynamically adjusting how long past information is retained, allowing the model to better capture temporally diverse input patterns.

We believe that combining delays with HetSyn is not only feasible but potentially beneficial. While delays enhance spatial-temporal synchronization by aligning spikes across input channels, HetSyn enables fine-grained control over information retention across versatile timescales. Together, these mechanisms may allow SNNs to handle both precise spike timing and temporally diverse patterns more effectively, potentially leading to improved performance on complex temporal tasks. We consider this an exciting direction for future work and are interested in exploring how the synergy of delay-based alignment and versatile timescale integration can be realized in practice.
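In equation form, the two mechanisms act on a presynaptic spike train $x_j(t)$ quite differently (illustrative discrete-time forms of our own, not taken from either line of work):

```latex
\underbrace{I_{ij}(t) = w_{ij}\, x_j(t - d_{ij})}_{\text{learnable delay } d_{ij}:\ \text{shifts spike timing}}
\qquad
\underbrace{I_{ij}(t) = w_{ij} \sum_{t' \le t} e^{-(t - t')/\tau_{ij}}\, x_j(t')}_{\text{learnable time constant } \tau_{ij}:\ \text{reshapes integration}}
```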


Paper Formatting Concerns: some minor typos : "The results of the experiments shows significant improvement." - ...

Response to Paper Formatting Concerns: We are encouraged that you took a close interest in our work, and we greatly appreciate your careful reading and attention to detail. We have carefully reviewed the manuscript for any potential typos or clarity issues.


References

[1] Hammouamri I, et al. Learning delays in spiking neural networks using dilated convolutions with learnable spacings[J]. arXiv preprint arXiv:2306.17670, 2023.

[2] Baronig M, et al. Advancing spatio-temporal processing through adaptation in spiking neural networks[J]. Nature Communications, 2025, 16(1): 5776.

Comment

Dear Reviewer,

Thank you for your recognition of our work and continued engagement in the review process. We would appreciate your feedback on whether our clarifications — such as those on parameters obtained after training — have adequately addressed your concerns. Please let us know if any further explanation would be helpful.

Thank you again for your time and consideration!

Best regards,

Authors

Comment

dear authors,

the responses were satisfactory and I have no more comments or changes to make to my evaluation.

cheers

Comment

Dear Reviewer,

Thank you for your time and for acknowledging our rebuttal. We greatly appreciate your effort in reviewing our paper, especially considering your demanding workload. We are glad that our responses addressed your concerns to your satisfaction. Having carefully reviewed your comments, we believe the revisions and clarifications made in response to your feedback further strengthen the contribution of our work. We hope these adjustments align with the expectations for NeurIPS and offer clearer insights into the significance of our research, and we would be honored if you could consider these improvements in your final assessment.

Best regards,

Authors

Review
Rating: 5

This work proposes HetSyn, a variant of a spiking neural network where the synaptic time constant of each synapse is treated as an independent trainable parameter. The authors applied the proposed model to four different tasks and demonstrated that it outperforms the model where only the neuron-level membrane time constant is trainable and models without plastic decay time constants.

Strengths and Weaknesses

Strengths:

The numerical experiments clearly demonstrate the advantage of the proposed model against the model with adaptive membrane time constant in pattern generation, working memory, and speech recognition tasks.

Given the popularity of membrane time constant adaptation in the spiking neural network literature (e.g., Fang et al., ICCV 2021), I thought trainable synaptic time constants had also been implemented before, but I could not find a previous work that specifically implements this variant. Several works previously introduced trainable synaptic delays (see the references below), but these mechanisms are mathematically distinct from the trainable synaptic time constants introduced here.

Weaknesses:

The data analysis presented in Fig. 1A is misleading from a neuroscience perspective. In excitatory synapses in the cortex, there are two neuro-transmitter types: AMPA and NMDA, having different decay timescales. AMPA decays fast in a few milliseconds, while NMDA decays slowly in the order of 100 milliseconds. Fig. 1A shows a bimodal distribution which presumably corresponds to AMPA and NMDA components. The claim of ‘mildly long-tailed distribution’ is somewhat odd, given this underlying biology. Additionally, the importance of having these two neurotransmitters with different timescales has been studied extensively in computational neuroscience literature (e.g., Brunel and Wang, J Comp Neurosci, 2001; Panzeri et al., Trends Neurosci 2010).

Some of the comparisons are not well controlled. For instance, in Fig. 2D, the authors claim that the HetSynLIF model converges faster than the other models. However, the comparison was performed at a single learning rate; thus it does not rule out the possibility that this particular learning rate is optimized for HetSynLIF over others. To claim faster convergence, the hyperparameters of each model should be optimized individually for their best performance for a fair comparison.

Questions

Have you compared the proposed model with the previous models where synaptic delays are trained? In what kind of tasks do you think the proposed model performs better?

Refs:

Zhang, Malu, et al. "Supervised learning in spiking neural networks with synaptic delay-weight plasticity." Neurocomputing 409 (2020): 103-118.

Yu, Qiang, et al. "Improving multispike learning with plastic synaptic delays." IEEE Transactions on Neural Networks and Learning Systems 34.12 (2022): 10254-10265.

Limitations

Please see the comments above

Final Justification

The rebuttal effectively addressed my concerns regarding the relationship with previous related works and hyper-parameter issues. While their claims about the long-tailed synaptic time constant distribution in the brain are somewhat exaggerated, the argument is sufficiently sound to serve as motivation for artificial spiking neural networks training. Overall, I believe the manuscript meets the criteria for acceptance.

Formatting Concerns

none

Author Response

We sincerely thank you for your thoughtful feedback and hope that the responses below address your concerns.


Weaknesses

Weakness 1: The data analysis presented in Fig. 1A is misleading from a neuroscience perspective. ... Fig. 1A shows a bimodal distribution which presumably corresponds to AMPA and NMDA components. The claim of 'mildly long-tailed distribution' is somewhat odd, given this underlying biology.

WA1: Thank you for the insightful and biologically grounded comments regarding Fig. 1A. We apologize for the insufficient explanation in the original manuscript and provide further clarification below.

To the best of our knowledge, AMPA and NMDA are receptor types rather than neurotransmitters. Both are ionotropic glutamate receptors, and glutamate is the primary excitatory neurotransmitter in the cortex. As you rightly noted, biological neurons express various receptor types beyond AMPA and NMDA, each mediating post-synaptic currents with distinct decay timescales. However, our model does not aim to explicitly capture neurotransmitter- or receptor-level mechanisms. Instead, we work at a higher level of abstraction (the synapse level) and use the effective synaptic time constant, which refers to the characteristic timescale over which a post-synaptic response decays from its peak back to baseline. This effective timescale typically reflects the combined influence of multiple biophysical processes, such as neurotransmitter release and clearance, receptor kinetics (e.g., AMPA, NMDA), among others.

Regarding the bimodal distribution shown in Fig. 1A: We would like to clarify that it does not correspond to AMPA and NMDA components. As discussed above, the distribution represents the effective synaptic time constant, not receptor-specific dynamics. The data were obtained without any modification from a published dataset (Campagnola L, et al., Science, 2022 [1]), which reports empirically measured values of effective synaptic decay time constants across biological synapses. Notably, since long-tailed distributions of synaptic time constants are commonly observed in biological systems, the dataset applied a clipping procedure by setting all values above 500 ms to 500 ms. As a result, some data points accumulate at 500 ms, giving rise to an artificial secondary peak at that point. Thus, supported by biological evidence, the dataset structure, and the distribution in Fig. 1A, our description of it as a 'mildly long-tailed distribution' is appropriate and consistent with the data. To avoid potential misunderstanding, we will revise the figure in the updated manuscript by using a distinct color for the final histogram bin and excluding it from the KDE fit.
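A minimal sketch of the revised plotting procedure described above, assuming the released decay-time values are available as a NumPy array (the file name is hypothetical):

```python
import numpy as np
from scipy.stats import gaussian_kde

tau_ms = np.load("psp_decay_ms.npy")  # hypothetical path to the released values

# Values above 500 ms were clipped to 500 ms in the released dataset,
# producing an artificial pile-up at the boundary; exclude that bin
# before fitting so the KDE reflects the unclipped distribution.
unclipped = tau_ms[tau_ms < 500.0]
kde = gaussian_kde(unclipped)
grid = np.linspace(0.0, 500.0, 500)
density = kde(grid)  # plot alongside the histogram, boundary bin colored separately
```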


Weakness 2: Some of the comparisons are not well controlled. For instance, in Fig. 2D, ..., the hyperparameters of each model should be optimized individually for their best performance for a fair comparison.

WA2: We appreciate your concern regarding the fairness of the convergence comparison in Fig. 2D, and we would like to clarify our motivation and provide further evidence to support our findings. In the pattern generation task, we used a fixed learning rate of 1e-3 and kept all training settings identical across models except for their architectures to ensure fair and consistent comparisons. We emphasize that this learning rate was not selected to favor HetSynLIF. Instead, it is the default setting of the Adam optimizer in PyTorch and is widely adopted. Furthermore, we observe that many prior studies (e.g., [2, 3]) also adopt the same learning rate across models when evaluating convergence speed, suggesting that our setup is consistent with common practice in related work.

Nevertheless, we have carefully considered your suggestion and found it necessary to include additional experimental results to further support our conclusions. However, due to time and computational resource constraints, we were unable to conduct an exhaustive grid search across all models. To address this concern, we evaluated all models mentioned in Fig. 2D under four different learning rates uniformly sampled between 1e-4 and 5e-3, and we present the corresponding MSE with respect to training iterations for each model in the tables below. The results show that, under identical training iterations, HetSynLIF consistently demonstrates the fastest convergence and the lowest final MSE, suggesting that it completes the task more effectively than its counterparts. We will revise the manuscript to better clarify this point and include these additional results to strengthen the experimental rigor.

| Training Iteration (lr = 1e-4) | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- |
| F-HomNeuLIF | 0.282 | 0.264 | 0.244 | 0.225 | 0.222 |
| F-HetNeuLIF | 0.277 | 0.251 | 0.220 | 0.201 | 0.192 |
| F-HomNeuALIF | 0.261 | 0.244 | 0.232 | 0.224 | 0.221 |
| F-HetSynLIF | 0.254 | 0.227 | 0.214 | 0.160 | 0.125 |

| Training Iteration (lr = 5e-4) | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- |
| F-HomNeuLIF | 0.217 | 0.211 | 0.208 | 0.207 | 0.206 |
| F-HetNeuLIF | 0.216 | 0.211 | 0.201 | 0.167 | 0.145 |
| F-HomNeuALIF | 0.234 | 0.224 | 0.223 | 0.219 | 0.218 |
| F-HetSynLIF | 0.211 | 0.196 | 0.134 | 0.059 | 0.041 |

| Training Iteration (lr = 1e-3) | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- |
| F-HomNeuLIF | 0.213 | 0.208 | 0.206 | 0.205 | 0.204 |
| F-HetNeuLIF | 0.213 | 0.206 | 0.181 | 0.141 | 0.137 |
| F-HomNeuALIF | 0.224 | 0.220 | 0.218 | 0.211 | 0.207 |
| F-HetSynLIF | 0.209 | 0.178 | 0.139 | 0.048 | 0.021 |

| Training Iteration (lr = 5e-3) | 50 | 100 | 200 | 500 | 1000 |
| --- | --- | --- | --- | --- | --- |
| F-HomNeuLIF | 0.219 | 0.215 | 0.214 | 0.214 | 0.214 |
| F-HetNeuLIF | 0.215 | 0.193 | 0.187 | 0.163 | 0.162 |
| F-HomNeuALIF | 0.218 | 0.217 | 0.215 | 0.209 | 0.213 |
| F-HetSynLIF | 0.167 | 0.161 | 0.157 | 0.119 | 0.015 |

Questions

Question 1: Have you compared the proposed model with the previous models where synaptic delays are trained? In what kind of tasks do you think the proposed model performs better?

QA1: We sincerely thank you for the insightful comments and for recognizing the novelty of our proposed approach. To the best of our knowledge, we are the first to incorporate synaptic heterogeneity into SNNs, which offers a biologically plausible yet computationally powerful approach for versatile timescale integration.

We thank you for pointing out the two excellent works [4, 5] on synaptic delays. These studies highlight the potential of synaptic delays to enhance SNN performance, which aligns well with our motivation as stated in the introduction: "This calls for new learning paradigms, either by adapting established methods from ANNs or by drawing inspiration from biological mechanisms." Synaptic delay is indeed a crucial biological feature at the synapse level and clearly falls into the latter category. Accordingly, we will cite these two delay-based studies in the revised manuscript and acknowledge their relevance in the Introduction section.

The first part of this question: While we did not explicitly discuss synaptic delays in the main text, we did include one representative delay-based method (corresponding to the first work you mentioned) in our empirical comparisons in Table 1. We did not initially elaborate on this comparison because synaptic delays and our proposed method reflect different biological properties and address distinct aspects of temporal processing, although both operate at the synaptic level.

  • Synaptic delays primarily facilitate spatial-temporal synchronization by shifting spike timings, thereby aligning neural activity across channels. However, they do not alter the intrinsic temporal dynamics of individual synapses.
  • In contrast, our trainable synaptic time constants enable versatile timescale integration by dynamically adjusting how long past information is retained, allowing the model to better capture temporally diverse input patterns.

As a quick empirical comparison, we observed that both referenced works and our proposed model have been evaluated on the TiDigits dataset, where our model achieves higher accuracy than the delay-based models. We include this result to offer a reference point, but we emphasize that it does not constitute the core of our contribution. Our main contribution lies in introducing synaptic heterogeneity as a novel biologically inspired mechanism for SNNs, and demonstrating its effectiveness in versatile timescale integration. This expands the design space of SNNs and strengthens the link between biological plausibility and computational capability.

| Methods | Accuracy |
| --- | --- |
| ReSuMe-DW [4] | 92.45% |
| PBSNLR-DW [4] | 96.50% |
| TDP-DL [5] | 97.16% |
| HetSynLIF (Ours) | 98.99% |

The second part of this question: Based on the aforementioned differences between synaptic delays and our proposed method, we believe that our proposed model is especially beneficial in tasks that involve rich or diverse timescale structures, such as speech recognition, working memory–dependent tasks, and complex spatio-temporal information processing.


References

[1] Campagnola L, et al. Local connectivity and synaptic dynamics in mouse and human neocortex[J]. Science, 2022, 375(6585): eabj5861.

[2] Fang W, et al. Incorporating learnable membrane time constant to enhance learning of spiking neural networks[C]//Proceedings of the IEEE/CVF international conference on computer vision. 2021: 2661-2671.

[3] Liu F, et al. STNet: A novel spiking neural network combining its own time signal with the spatial signal of an artificial neural network[J]. Frontiers in Neuroscience, 2023, 17: 1151949.

[4] Zhang M, et al. Supervised learning in spiking neural networks with synaptic delay-weight plasticity[J]. Neurocomputing, 2020, 409: 103-118.

[5] Yu Q, et al. Improving multispike learning with plastic synaptic delays[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 34(12): 10254-10265.

Comment

I thank the authors for their reply, which addressed the majority of my concerns.

Regarding the distribution of synaptic time constants, could you provide more details on the data handling? Looking at the histogram in Fig. 3C of Campagnola et al., Science 2022, it seems that synapses with PSP decay times longer than 100 ms are rare. Additionally, the long-tailed components appear to be mostly attributed to inhibitory synapses, whereas excitatory connections generally exhibit fast decay times. Though this suggests my initial hypothesis about the AMPA/NMDA ratio variability was incorrect, it implies that the overall long-tailed distribution is due to different types of synapses rather than heterogeneity within a single type of synapse. While this point may not be crucial for a machine learning proceeding, it should be discussed thoroughly for scientific rigor.

Comment

Dear Reviewer,

We are glad to hear that our previous response has helped clarify your initial hypothesis. We would also like to take this opportunity to further address potential misunderstandings regarding the data distribution and its implications, which are central to the motivation of our study.

First, the data distribution shown in Fig. 1A of our paper is extracted from the publicly available dataset by Campagnola et al. (Science, 2022). In the original dataset, timescales are bounded at 500 ms, as reflected by the rightmost bar in our plot. Our initial curve fitting incorporated this boundary, inadvertently resulting in a secondary peak that should not have been included. We will clarify this and explicitly cite the data source in the caption of Fig. 1.

Second, you are right in pointing out the sparse occurrence of longer decay times (e.g., >100 ms). Our Fig. 1A includes all valid connections from the dataset, whereas Fig. 3C in the reference paper shows histograms for only the major connection classes (as indicated in the figure caption), which may explain the differing impression you referred to.

Third, biological synapses exhibit strong heterogeneity: not only between excitatory and inhibitory types, but also within specific connections. This is clearly shown in the third column of Fig. 3C in the reference, which displays broad timescale distributions for both excitatory (top) and inhibitory (bottom) synapses. Even individual connection types, such as “l5et  →  l5et” (blue) and “sst  →  vip” (pink), show marked diversity in the swarm plots. This synaptic heterogeneity directly motivates our theoretical investigation from a machine learning perspective.

Finally, we appreciate that our work resonates with experts like you who emphasize the fundamental processing mechanisms of the central nervous system. We are more than happy to incorporate your suggestions and expand our discussion accordingly in the revision.

Best regards,

Authors

Comment

Dear Reviewer,

We hope this message finds you well. Given that the discussion window will close within the next 24 hours, could you kindly let us know whether our response has addressed your concerns and whether the revisions have had any impact on your evaluation? We are grateful for your kindness and for your suggestions to improve our revision. Thanks.

Best regards,

Authors

Comment

Thank you for your reply. Overall, I believe the manuscript makes a solid contribution to the spiking neural network literature.

On a side note, I find your second and third points in the replies above somewhat contradictory. Your second point implies that very slow time constants (>100ms) result from minor connection classes, and the heterogeneity within major connection classes is relatively small. Nevertheless, if you assume that artificial neural network training reflects the learning of connection classes, not just weights, I suppose it won't be an issue. As I mentioned earlier, please describe the details of the plot and discuss the limitations of its interpretation in the revised manuscript.

Comment

We thank the reviewer for recognizing our contribution to the SNN literature. Both Points 2 and 3 are interpretations of figures in the referenced Science paper. That work analyzed specific connection classes in detail, whereas our figure plots the overall distribution directly from the complete released raw dataset without any further processing. Motivated by the overall heterogeneity observed in neuroscience experiments—as the reviewer noted—we assumed our SNNs could be trained to reflect and benefit from such heterogeneity. We will clarify the plot details and note interpretation limitations in the revision, and we sincerely appreciate the reviewer’s insightful comments, which will help strengthen the manuscript.

Review
Rating: 3

The authors propose HetSyn, a framework for modeling prominent heterogeneity that leverages the widely observed but underutilized property of synaptic heterogeneity in biological neurons. The framework uses synapse-specific time constants to reflect synaptic heterogeneity and is instantiated as HetSynLIF, achieving temporal integration across different timescales at the synaptic level. The model treats several existing neuron types (such as ALIF and heterogeneous LIF) as special cases and is evaluated on multiple tasks, including pattern generation, delayed match-to-sample, speech recognition (SHD), and visual recognition.

Strengths and Weaknesses

Strengths:

  1. Introducing synaptic heterogeneity to achieve versatile timescale integration in SNNs, grounded in empirical neurobiological evidence. This contributes to computational neuroscience and neuromorphic computing.
  2. Comprehensive evaluation, spanning multiple tasks and datasets. The design of the ablation studies and robustness tests (noise, time distortion, resource constraints) is reasonable.
  3. The paper provides theoretical analysis for the proposed method.

Weaknesses:

  1. Introducing trainable synapse-specific time constants can significantly increase the number of parameters and may increase optimization complexity. I suggest quantifying training costs (such as power consumption, memory, and time) and comparing them with existing work.
  2. This paper lacks an evaluation on real-world event-based datasets, such as DVS Gesture or DVS-CIFAR10. Given that the model emphasizes the temporal dynamics of synapses, testing it on asynchronous, event-driven data would better demonstrate its practicality and robustness.

Questions

  1. Can the authors quantify training costs (such as power consumption, memory, and time) and compare them with existing work?
  2. Can the authors either provide preliminary results on event-driven benchmarks or clearly justify why these were excluded? Would the model need adaptation to handle asynchronous input?

Limitations

The Limitation section is brief. It is recommended to discuss potential limitations, such as introducing time constants for each synapse, which significantly increases the optimization dimension and may lead to gradient instability or slow convergence.

Formatting Concerns

No Formatting Concerns

Author Response

We sincerely thank you for your thoughtful feedback and hope that the responses below address your concerns.


Weakness 1: Introducing trainable specific synaptic time constants can significantly increase the number of parameters and may increase the complexity of optimization. Suggest quantifying training costs (such as power consumption, memory, time) and comparing them with existing work.
Question 1: Can the authors quantify training costs (such as power consumption, memory, and time) and compare them with existing work?

WA1 & QA1: Thank you for this thoughtful and comprehensive comment. We agree that introducing trainable synaptic time constants may increase the model complexity, and it is indeed important to quantify the associated training costs. Motivated by this valuable suggestion, we conducted both a theoretical analysis and an additional set of experiments to evaluate the training cost, and we report the results below. We hope these analyses help clarify the impact of our design and address your concern.

Theoretical Analysis: Following the methodology of [1], we conducted a theoretical comparison of the number of parameters and operations (multiplications and accumulations) across several models, based on the neuronal dynamics that define their computation processes. These include:

  • vanilla SNNs (denoted as HomNeuLIF)
  • dendrite heterogeneity models (DH-SFNN and DH-SRNN [2])
  • neuron heterogeneity models (HetNeuLIF [1])
  • neurons with threshold adaptation (HomNeuALIF [3])
  • our proposed synaptic heterogeneity model (HetSynLIF)

For consistency, we assume each layer contains $N$ neurons and receives $M$ inputs, with average input and output spike rates denoted as $r_{\text{in}}$ and $r_{\text{out}}$, respectively. $D$ represents the number of dendritic branches per neuron, as used in [2].

| Models | Synaptic parameters | Neuronal parameters | Total parameters | Total multiplications/timestep | Total accumulations/timestep |
| --- | --- | --- | --- | --- | --- |
| F-HomNeuLIF | $MN$ | $0$ | $MN$ | $N$ | $MNr_{\text{in}}+Nr_{\text{out}}+N$ |
| F-HetNeuLIF | $MN$ | $N$ | $MN+N$ | $N$ | $MNr_{\text{in}}+Nr_{\text{out}}+N$ |
| F-HomNeuALIF | $MN$ | $0$ | $MN$ | $N$ | $MNr_{\text{in}}+Nr_{\text{out}}+N$ |
| F-HetSynLIF | $2MN$ | $0$ | $2MN$ | $MN+N$ | $MNr_{\text{in}}+Nr_{\text{out}}+N$ |
| DH-SFNN | $MN$ | $N+ND$ | $MN+(D+1)N$ | $(2D+2)N$ | $MNr_{\text{in}}+Nr_{\text{out}}+(2+3D)N$ |
| R-HomNeuLIF | $MN+NN$ | $0$ | $MN+NN$ | $N$ | $MNr_{\text{in}}+NNr_{\text{out}}+Nr_{\text{out}}+N$ |
| R-HetNeuLIF | $MN+NN$ | $N$ | $MN+NN+N$ | $N$ | $MNr_{\text{in}}+NNr_{\text{out}}+Nr_{\text{out}}+N$ |
| R-HomNeuALIF | $MN+NN$ | $0$ | $MN+NN$ | $N$ | $MNr_{\text{in}}+NNr_{\text{out}}+Nr_{\text{out}}+N$ |
| R-HetSynLIF | $2MN+2NN$ | $0$ | $2MN+2NN$ | $MN+N$ | $MNr_{\text{in}}+NNr_{\text{out}}+Nr_{\text{out}}+N$ |
| DH-SRNN | $MN+NN$ | $N+ND$ | $MN+NN+(D+1)N$ | $(2D+2)N$ | $MNr_{\text{in}}+NNr_{\text{out}}+Nr_{\text{out}}+(2+3D)N$ |
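As a quick cross-check of the feedforward rows, a tiny helper of our own that mirrors the table (recurrent variants add the corresponding $NN$ terms analogously):

```python
def total_params(M, N, model):
    # Feedforward parameter counts taken from the table above.
    counts = {
        "F-HomNeuLIF": M * N,        # synaptic weights only
        "F-HetNeuLIF": M * N + N,    # weights + one tau per neuron
        "F-HomNeuALIF": M * N,       # adaptation constants not counted as trainable
        "F-HetSynLIF": 2 * M * N,    # weights + one tau per synapse
    }
    return counts[model]

# e.g., M = N = 100: 20,000 parameters for F-HetSynLIF vs. 10,000 for F-HomNeuLIF.
```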

Experimental Analysis: We then conducted an empirical evaluation using the DMS task as a representative benchmark. Specifically, we measured the training time for 1000 iterations, the maximum GPU memory usage, and the final accuracy on a single NVIDIA RTX 4060 GPU. The results are summarized in the table below.

| n_hid=20 | F-HomNeuLIF | F-HetNeuLIF | F-HomNeuALIF | F-HetSynLIF | R-HomNeuLIF | R-HetNeuLIF | R-HomNeuALIF | R-HetSynLIF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training Time (s) | 96.4 | 106.36 | 111.26 | 136.24 | 110.11 | 124.11 | 124.47 | 176.02 |
| Peak GPU Memory (MB) | 19.56 | 19.82 | 19.56 | 20.78 | 19.57 | 20.34 | 19.57 | 31.92 |
| Accuracy (%) | ~50 | ~50 | ~65 | ~100 | ~50 | ~70 | ~100 | ~100 |

| n_hid=50 | F-HomNeuLIF | F-HetNeuLIF | F-HomNeuALIF | F-HetSynLIF | R-HomNeuLIF | R-HetNeuLIF | R-HomNeuALIF | R-HetSynLIF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training Time (s) | 96.78 | 110.1 | 112.5 | 139.22 | 111.24 | 124.97 | 125.84 | 180.6 |
| Peak GPU Memory (MB) | 22.50 | 23.84 | 22.51 | 50.15 | 22.57 | 25.23 | 22.57 | 118.75 |
| Accuracy (%) | ~50 | ~60 | ~70 | ~100 | ~50 | ~75 | ~100 | ~100 |

| n_hid=100 | F-HomNeuLIF | F-HetNeuLIF | F-HomNeuALIF | F-HetSynLIF | R-HomNeuLIF | R-HetNeuLIF | R-HomNeuALIF | R-HetSynLIF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training Time (s) | 96.11 | 105.51 | 111.96 | 140.86 | 119.54 | 138.07 | 123.53 | 194.16 |
| Peak GPU Memory (MB) | 27.57 | 30.14 | 27.58 | 99.13 | 27.79 | 32.9 | 27.83 | 79.05 |
| Accuracy (%) | ~50 | ~70 | ~75 | ~100 | ~50 | ~90 | ~100 | ~100 |

Theoretical and empirical results indicate that HetSynLIF indeed introduces a slight increase in training cost when evaluated under identical architectures and neuron counts. Given the notable performance gains and the potential of synaptic heterogeneity in SNNs, we consider the added training cost a reasonable trade-off at this early stage. This complexity is expected and biologically plausible, as synapses greatly outnumber neurons in the brain.

Moreover, HetSynLIF achieves comparable or even superior performance with significantly fewer neurons. For example, as shown in the above table, F-HetSynLIF (20 neurons) outperforms R-HetNeuLIF (100 neurons) with lower memory and time costs. Additionally, Fig. 2D, Fig. 3B, Fig. 4A, and our response to Reviewer L2NP (WA2 & QA1) demonstrate that HetSynLIF converges faster, requiring fewer training iterations. Together, these findings suggest that the increased per-unit complexity is effectively offset and may even serve as an advantage in overall training cost.

In future work, we plan to explore methods to reduce the computational complexity of HetSyn, such as incorporating sparse connectivity and low-complexity designs in [1, 3]. We will also explicitly discuss this limitation and potential future directions in the revised manuscript.


Weakness 2: This paper lacks an evaluation of event based datasets in the real world,..., event driven data will better demonstrate its practicality and robustness.
Question 2: Can the authors either provide preliminary results on event-driven benchmarks or clearly justify why these were excluded? Would the model need adaptation to handle asynchronous input?

WA2 & QA2: We sincerely thank you for highlighting the importance of evaluating our method on real-world, event-based datasets such as DVS Gesture and DVS-CIFAR10. Below, we first explain why these datasets were not included in the original submission and then present preliminary results on DVS Gesture dataset in response to your insightful suggestion.

In this work, we focus on exploring the potential of synaptic heterogeneity in versatile timescale integration. For this reason, we prioritize tasks that involve longer temporal dependencies, where the benefits of synaptic heterogeneity can be more effectively observed. While DVS datasets are indeed event-driven and asynchronous, their temporal windows are typically short (often fewer than 20 timesteps in common configurations). Given that the membrane time constant of a standard LIF neuron is commonly set to 20 ms, it is typically sufficient to capture the temporal structure within these brief input windows. As a result, these datasets may not fully exploit the diverse temporal integration capabilities that HetSyn is designed to provide. In contrast, Pattern Generation (2000 timesteps) and Delay Match-to-Sample (1050 timesteps), along with datasets such as SHD (250 timesteps, which is also an event-based dataset) and S-MNIST (784 timesteps), offer richer temporal dynamics over longer time scales, making them more suitable for evaluating the advantages of HetSyn.
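To make this concrete: with a per-step decay factor of $e^{-\Delta t/\tau}$, the fraction of a signal retained after $T$ timesteps is $e^{-T\Delta t/\tau}$. Taking the common $\tau = 20$ ms and, as an assumption, $\Delta t = 1$ ms:

```latex
\text{20-step DVS-style window: } e^{-20/20} \approx 0.37
\qquad
\text{800 ms DMS delay: } e^{-800/20} \approx 4 \times 10^{-18}
```

A standard membrane time constant thus retains a usable fraction of the input over a short event window but essentially nothing across an 800 ms delay, which is why longer-horizon tasks better expose the benefit of heterogeneous, learnable $\tau$.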

Nevertheless, out of full respect for your valuable suggestion, we conducted preliminary experiments on the DVS Gesture dataset to assess the generality of our approach. We adopted a 5Conv+1FC architecture, with HetSyn incorporated into the FC layer. The model achieved 96.18% accuracy, compared to 95.49% for the baseline HomNeuLIF configuration. This result demonstrates that even in a limited application setting, HetSyn can improve performance on event-driven data.

| Model | DVS Gesture |
| --- | --- |
| F-HomNeuLIF | 95.49% |
| F-HetSynLIF | 96.18% |

Due to time constraints, we did not extensively optimize the architecture or hyperparameters; with further tuning and broader integration of HetSyn, the results may be even better. That said, our primary contribution lies not in achieving state-of-the-art accuracy on specific benchmarks, but in proposing a general and biologically inspired mechanism to enable more flexible and effective temporal integration in SNNs. We appreciate your feedback and plan to explore broader architectural integration and evaluate HetSyn on additional datasets in future work.


Limitation 1: The Limitation section is brief. It is recommended to discuss potential limitations, such as introducing time constants for each synapse, which significantly increases the optimization dimension and may lead to gradient instability or slow convergence.

LA1: Thank you for pointing out this important issue. As discussed in WA1 & QA1, we are aware that introducing synapse-level time constants in our model inevitably increases training cost. We acknowledge that these practical limitations are indeed important considerations.

In the revised manuscript, we have included a more detailed discussion of these challenges in the Limitation section, and additionally outlined potential directions for addressing them in future work. To provide a clearer and more complete view of the resource demands associated with our approach, we also include the theoretical and empirical training cost analyses from WA1 & QA1 in the Appendix.

As for the potential concern regarding slow convergence, we in fact observe the opposite: HetSynLIF converges faster in practice (Fig. 2D, Fig. 3B, Fig. 4A). Due to space limitations, we kindly refer you to our response to Reviewer L2NP (WA2 & QA1) for detailed empirical evidence.


References

[1] Nicolas Perez-Nieves, et al. Neural heterogeneity promotes robust learning. Nature communications, 12(1):5791, 2021.

[2] Hanle Zheng, et al. Temporal dendritic heterogeneity incorporated with spiking neural networks for learning multi-timescale dynamics. Nature Communications, 15(1):277, 2024.

[3] Guillaume Bellec, et al. Long short-term memory and learning-to-learn in networks of spiking neurons. Advances in neural information processing systems, 31, 2018.

Comment

Dear Reviewer,

Thank you for completing the mandatory acknowledgement. We are writing to kindly follow up on the rebuttal we provided and hope you have had a chance to review it. We would be grateful to know whether your concerns about training cost and event-driven benchmarks have been addressed and whether your assessment of the paper has changed. If there are any additional questions or concerns, we would be more than happy to provide further clarification.

Thank you again for your time and consideration!

Best regards,

Authors

Comment

Hi Reviewer,

Sorry to bother you again, but given the limited time left, we would greatly appreciate hearing your thoughts on whether our rebuttal has addressed your concerns and whether these revisions have influenced your evaluation.

Thank you for your time and consideration. We look forward to your feedback.

Best regards,

Authors

Comment

Hi, reviewers,

Did the authors' rebuttal resolve your doubts? Please indicate your approval or raise any further questions.

Note that some reviewers submitted the "Mandatory Acknowledgement" without posting a single sentence to the authors in the discussion; such action is not permitted.

Thanks,

AC

Comment

dear AC,

I have no further comment.

knowing that reviewers usually have five papers to review, each with as many reviewers and lengthy author responses, this cuts into the time one can spend on each paper.

cheers.

Final Decision

This paper proposes HetSyn, a framework for modeling prominent heterogeneity by leveraging the widely observed but underutilized property of synaptic heterogeneity in biological neurons. The framework uses synapse-specific time constants to reflect synaptic heterogeneity and is instantiated as HetSynLIF.

After the rebuttal, three reviewers believed that their doubts were addressed well. Although Reviewer auyx keeps a negative score, the raised weaknesses and suggestions appear to be well addressed by the supplementary experiments.