Temporal Misalignment in ANN-SNN Conversion and its Mitigation via Probabilistic Spiking Neurons
Identification of a phenomenon in ANN-SNN conversion and its mitigation through a novel conversion approach.
Abstract
Reviews and Discussion
This paper analyzes the spike temporal dynamics in the ANN-SNN conversion framework and investigates the impact of spike firing timing on the stability of the conversion. It identifies a phenomenon termed temporal misalignment, where random spike rearrangements across SNN layers can lead to performance improvements. Furthermore, the paper introduces a bursting probabilistic spiking neuron to simulate and enhance this phenomenon, supported by theoretical justification, which improves performance on several tasks.
Questions For Authors
My questions are aligned with those discussed in points 1–5 above. I hope the authors will address point 1 and provide clarification on point 2. Additionally, I particularly recommend a more explicit theoretical and practical analysis of point 3.
I am generally enthusiastic about this work but may reconsider my rating if some key issues remain unclear.
Claims And Evidence
I appreciate the idea presented in this work; however, the writing is somewhat opaque. Despite reading the detailed appendices, I remain unsure whether I fully grasp the concepts. Below, I outline my understanding, which I hope the authors can confirm or correct.
- My understanding is that the effectiveness of the temporal shuffle is due to the following reasons:
SNNs require potential accumulation to fire, leading to delayed responses, with most spikes occurring in later time steps; many neurons are therefore unable to finish firing before the simulation time ends.
To mitigate this, we want earlier spike emissions. Through the shuffling effect, neurons may "overdraft" and fire earlier based on their expected firing rate, instead of waiting for potential accumulation.
The method proposed in this paper uses a Bernoulli distribution to simulate this behavior, ensuring that the spike emission rate at each time step meets the conversion requirements, rather than only ensuring accuracy for the final result after the last time step, as in standard ANN-SNN rate coding. Based on this understanding, I raise the following questions:
- The concept of shuffle described in the paper does not seem fundamental; it merely appears to achieve an even distribution of spikes. A more precise term for your method might be "expected firing rate estimation." This idea bears resemblance to the approaches in [1, 2], where spike attention outputs are predicted along the time dimension to ensure consistent expectations at each time step. Both papers address mechanisms for stabilizing variable multiplications in ANN-SNN conversion for Transformers, and could offer a simpler explanation for the phenomenon explored in this paper.
- While Theorem 1 demonstrates that the proposed SNN neuron firing rate is an unbiased estimate of the ANN firing rate at each time step, the stability of this firing rate should not be overlooked. The randomness introduced by the Bernoulli distribution could cause significant variance in the results of the same inference task. With the proposed algorithm, this variance may not converge to zero over time (and may even diverge), leading to perturbations during both training and testing. Does this introduce any stability concerns for the algorithm? Could there be considerable room for improvement? Related theoretical analysis may be found in [1, 2], and supplementary empirical analysis would also be beneficial.
[1] Jiang, Yizhou, et al. "Spatio-Temporal Approximation: A Training-Free SNN Conversion for Transformers." The Twelfth International Conference on Learning Representations. 2024.
[2] Huang, Zihan, et al. "Towards High-performance Spiking Transformers from ANN to SNN Conversion." Proceedings of the 32nd ACM International Conference on Multimedia. 2024.
Methods And Evaluation Criteria
The evaluation criteria are appropriate, using standard datasets and backbone networks.
- However, the algorithm presented in this paper appears limited to traditional convolutional networks with ReLU activation and may not extend to Vision Transformers with GELU activation and attention operations. This severely restricts its potential applications. Could the authors discuss possible future directions for overcoming this limitation?
Theoretical Claims
I reviewed the proof of Theorem 1 in Appendix B and found no apparent issues.
Experimental Designs Or Analyses
The experimental design and implementation appear sound.
- However, the probabilistic neuron introduced in this work includes a stochastic component, which incurs computational and time costs. These costs are hardware-dependent (e.g., they could significantly affect performance on NVIDIA GPUs), potentially rendering the power consumption analysis incomplete and impacting the universality of its application.
Supplementary Material
I have reviewed the theoretical analysis in Appendix B and the phenomenon analysis in Appendix G.
Relation To Existing Literature
This paper builds on prior work in the ANN-SNN conversion domain, particularly related to phase lag and unevenness, and introduces a novel analysis and solution regarding neuron dynamics. These aspects are clearly discussed in the paper.
Essential References Not Discussed
The authors might consider referencing the latest work on Transformer conversion algorithms, as noted above.
Other Strengths And Weaknesses
While the paper's overall structure is clear, the writing clarity and organization could be improved. Despite some efforts to reduce the reading difficulty (e.g., Section 3.2, lines 191–215), the explanation could be more direct in clarifying the essence of the algorithm rather than the process.
Other Comments Or Suggestions
I have no additional comments.
We thank the reviewer for their time in assessing our paper and their constructive comments.
3. We consider the setting from Theorem 1: we assume that the accumulated membrane potential in the first phase is $V$, that the threshold is $\theta > 0$, and that $0 \le V \le T\theta$ (other cases being trivial). Let us also put $N = V/\theta$. We consider two different situations:
Case 1. $N$ is an integer. In particular, Theorem 1 (b) tells us that at each time step $t$, as long as the residual potential $v_t = V - \theta s_{t-1}$ is positive (with $s_{t-1}$ the number of spikes emitted so far), the expectation of having a spike at $t$ given $v_t$ is $\frac{v_t}{\theta\,(T-t+1)}$. We note that the last quantity is never strictly bigger than 1 and never strictly below 0. In particular, the conditional expectation (of the membrane potential before spiking) satisfies $\mathbb{E}[v_{t+1} \mid v_t] = v_t\,\frac{T-t}{T-t+1}$. The total expectation then becomes $\mathbb{E}[v_{t+1}] = \mathbb{E}[v_t]\,\frac{T-t}{T-t+1}$. Initially, we have $\mathbb{E}[v_1] = V$. Solving this recurrence gives $\mathbb{E}[v_t] = V\,\frac{T-t+1}{T}$. Hence, $\mathbb{E}[s_t] = \frac{tN}{T}$.
In particular, since the total number of spikes during the $T$ steps is $N$ (Theorem 1 (c)), it follows that when $N$ is an integer in $\{0, 1, \dots, T\}$, the spike count $s_t$ is a random variable that follows the hypergeometric distribution $\mathrm{Hypergeom}(T, N, t)$. The variance of the firing rate $s_t/t$ is then $\frac{1}{t}\cdot\frac{N}{T}\cdot\frac{T-N}{T}\cdot\frac{T-t}{T-1}$. Note that at the final step $t = T$, the firing rate is precisely $N/T$ with variance 0.
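For a concrete illustration of Case 1 under the firing rule sketched above (the numbers are chosen by us purely as an example):

```latex
% Worked example (our numbers): T = 4, N = 2, firing rate r_t = s_t / t.
\mathbb{E}[r_t] = \tfrac{N}{T} = 0.5 \ \text{for every } t, \qquad
\operatorname{Var}(r_t) = \frac{1}{t}\cdot\frac{N}{T}\cdot\frac{T-N}{T}\cdot\frac{T-t}{T-1}
\approx 0.25,\ 0.083,\ 0.028,\ 0 \ \text{for } t = 1,2,3,4.
```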
Case 2. When $N$ is not an integer, we no longer have a nice expression for the probability of having a spike at $t$. Namely, the expression from Case 1 has to be replaced by its clipped version $\min\!\big(1, \max\!\big(0, \tfrac{v_t}{\theta(T-t+1)}\big)\big)$, as in this case the unclipped ratio can be strictly negative or strictly larger than 1. Hence, we cannot solve the resulting recurrence in a closed form. However, we do note that the final number of spikes at step $T$ is either $\lfloor N \rfloor$ or $\lceil N \rceil$ (Theorem 1 (c)). Both cases occur with non-zero probabilities, say $q$ and $1-q$. Then, the firing rate at the end is either $\lfloor N \rfloor / T$ with probability $q$, or $\lceil N \rceil / T$ with probability $1-q$. The expected firing rate and its variance are $\frac{q\lfloor N \rfloor + (1-q)\lceil N \rceil}{T}$ and $\frac{q(1-q)}{T^2}$, respectively. In particular, we see that the firing rate stabilizes as $T$ grows, with variance of order $1/T^2$.
It seems that our spiking process offers a continuous extension of the hypergeometric distribution (with continuous $N$), but we were not able to precisely locate this distribution in the existing literature. Further theoretical study of this process seems like a promising continuation of this work.
To shed more light on the situation, we provide the following plots. In the first plot, we fix $T$ and vary $N$ continuously, plotting the probability (blue) and the variance (orange) associated with the final spike count as $N$ varies through $[0, T]$.
We also plot, for various time steps, the probability of having a spike at that step as a function of the initial $N$. We notice that when $N$ is an integer, this probability is uniform across all steps, while in other cases it varies from step to step.
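The quantities in these plots can also be checked numerically. Below is a minimal Monte-Carlo sketch of the second (spiking) phase, assuming the firing probability at step $t$ is the residual potential divided by $\theta(T-t+1)$ and clipped to $[0,1]$; function and variable names are ours and purely illustrative.

```python
import numpy as np

def simulate_second_phase(V, theta, T, runs=200_000, seed=0):
    """Monte-Carlo sketch of the (assumed) second-phase firing rule: at step t the
    neuron fires with probability clip(v / (theta * (T - t + 1)), 0, 1), where v is
    the residual membrane potential, and each spike subtracts theta (soft reset)."""
    rng = np.random.default_rng(seed)
    v = np.full(runs, float(V))
    counts = np.zeros(runs)
    per_step_prob = []
    for t in range(1, T + 1):
        p = np.clip(v / (theta * (T - t + 1)), 0.0, 1.0)
        fire = rng.random(runs) < p
        per_step_prob.append(fire.mean())
        v -= theta * fire
        counts += fire
    rate = counts / T
    return np.round(per_step_prob, 3), round(float(rate.mean()), 3), round(float(rate.var()), 5)

print(simulate_second_phase(V=2.0, theta=1.0, T=4))  # N = 2   (integer): uniform per-step probability, zero final variance
print(simulate_second_phase(V=2.3, theta=1.0, T=4))  # N = 2.3 (non-integer): per-step probability varies, small final variance
```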
2. We thank the reviewer for the references provided. Comparison with [1] and [2]: We focus on Section 5.2 in [2] and in particular Theorem 3. We note that the assumption of the theorem is that the firing mechanism of the spiking neurons is a stationary process with a fixed, given number of produced spikes. However, in our case, as discussed above, our proposed firing mechanism is dynamic, with a bias depending on the already emitted spikes. Furthermore, the number of spikes a TPP neuron produces is ``guaranteed'' ($\lfloor N \rfloor$ or $\lceil N \rceil$), which offers more control in conversion. Based on our ablation studies (please refer to the tables provided to Reviewer BQ8A), TPP neurons consistently outperform probabilistic neurons with a fixed bias, and we expect that combining TPP with the method of [2] would further improve their respective results. We will address this more thoroughly in the future.
1. You are on point in your description of the motivation behind this work and the shuffling effect (see also Theorem 1 in Appendix G). However, we do emphasize the subtle characteristics of our proposed Bernoulli firing process.
4. Our work in progress deals with extending TPP neurons to approximate non-linear (activation) functions. Please refer to the ``Going beyond convolution architectures'' part of our answer to Reviewer UeYr.
5. We acknowledge that the stochastic component may induce extra energy consumption for TPP neurons. However, we argue that the low number of SOPs needed to achieve high accuracy may compensate for this overhead.
Thanks for the authors' response. I really appreciate this work and believe it should be accepted. If given the opportunity, I would like to see the official version with improved writing clarity.
We sincerely thank Reviewer CXU9 for their thoughtful feedback and encouraging support. We are especially grateful for their positive assessment and constructive suggestions.
The authors report a seemingly unusual phenomenon in the ANN-to-SNN conversion framework, wherein randomly rearranging the spikes output by SNN layers leads to an increase in performance. Following this, the authors introduce a probabilistic neuron model, namely TPP, which brings the accuracy of the resulting SNN closer to the baseline ANN. The authors evaluate their proposed approach on vision datasets such as CIFAR-10/100 and ImageNet, achieving SOTA performance.
Questions For Authors
(1) For SNNs operating over longer timesteps, how many runs were performed for the permuted-spikes experiment? It is very unusual that model performance would be consistently higher for all permutations.
Claims And Evidence
The primary claim is substantiated with empirical results. However, there is no strong theoretical proof behind the underlying temporal misalignment phenomenon.
Methods And Evaluation Criteria
The experimental result bolsters the core contribution of the work.
Theoretical Claims
The authors provided theoretical proofs regarding the design formulation of the TPP neurons. However, I could not find any substantial proof for the underlying temporal misalignment phenomenon.
Experimental Designs Or Analyses
The authors used standard datasets such as CIFAR-10/100, ImageNet, etc., to evaluate their approach. They also examined the membrane potential distribution across different models as a means of understanding why the base models underperformed.
Supplementary Material
No code was provided as part of the submission. Since this work largely rests on experimental findings, I think exploring the code would help the reviewers understand and appreciate the contributions more.
Relation To Existing Literature
This work pertains to the domain of ANN-SNN conversion, which is relevant in the broader neuromorphic community.
Essential References Not Discussed
The authors discussed all major references.
Other Strengths And Weaknesses
Strengths: (a) Interesting observation regarding temporal misalignment. (b) The proposed model achieves SOTA performance.
Weaknesses: (a) No strong theoretical justification behind the observed event. (b) The work mainly explores convolutional architectures. Exploring transformer-based architectures as well could provide a more complete picture.
Other Comments Or Suggestions
Transformer-based architectures could be explored as well.
Other Strengths and Weaknesses
(a) Theoretical justification:
We argue that ``temporal misalignment'' happens primarily due to the fact that in ANN-SNN conversion, SNN models will need a few time steps to accumulate enough potential to start firing.
We start by noting that in ANN-SNN conversion, at a particular activation layer of the ANN, the thresholds for the corresponding SNN spiking layer are chosen based on the distribution of the corresponding ANN activations, the standard choice being the maximum activation or a percentile of it. In any case, there will be substantial activations that fall below this threshold. We argue that a TPP neuron will produce more spikes at the first time step than a vanilla spiking neuron, because of its probabilistic nature. To make this formal, we propose the following:
Proposition. Consider an ANN neuron with ReLU activation and let its activation values follow a distribution with PDF $f$. Let further $\theta$ be the threshold of the corresponding vanilla SNN neuron. Then:
- The expectation of the spike output at the first time step of a vanilla spiking neuron is $\theta \int_{\theta}^{\infty} f(x)\,dx$, i.e., only the activation mass above the threshold contributes.
- The expectation of the (threshold-scaled) spike output at the first time step of a TPP neuron is $\int_{0}^{\theta} x f(x)\,dx + \theta \int_{\theta}^{\infty} f(x)\,dx$, which is the same as the expectation of the output of the ANN neuron with ReLU activation, since the threshold is chosen so that essentially all activations fall below it.
Many of the baselines propose to start with an initial membrane potential in order to ``encourage'' early spiking; however, there will always be a substantial portion of the values that fall below the threshold, and the above proposition still holds.
Empirical evidence for this is presented in Figures 4 and 7, to which we kindly refer. This absence of spikes in the first time steps further causes unevenness error of the spike outputs, as thoroughly discussed, for example, in AAAI’23, “Reducing ANN-SNN Conversion Error through Residual Membrane Potential.” Ideally, we expect the spikes received from the preceding layer to be uniformly distributed.
In particular, this result shows that TPP neurons, with their designed probabilistic spiking, approximate the output of the ANN neurons starting from the first time step, while for vanilla spiking neurons this approximation is coarser. Our Theorem 1 (b) then shows how this approximation evolves throughout the rest of the simulation time.
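As a quick numerical illustration of the Proposition (a sketch under simplifying assumptions: constant-current encoding, no initial membrane potential, activations essentially below the threshold, and a first-step TPP firing probability proportional to the activation; the distribution and names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 1.0
# Hypothetical ReLU activations, essentially all below the threshold theta.
acts = np.clip(rng.normal(0.3, 0.2, 100_000), 0.0, theta)

# Vanilla IF neuron with constant input a and no initial charge: a first-step spike needs a >= theta.
vanilla_step1 = theta * (acts >= theta).mean()
# TPP-style neuron (our reading): first-step spike probability proportional to a / theta.
tpp_step1 = theta * (acts / theta).mean()

print(f"ANN expected output      : {acts.mean():.4f}")
print(f"vanilla, step-1 output   : {vanilla_step1:.4f}")  # close to 0
print(f"TPP-style, step-1 output : {tpp_step1:.4f}")      # matches the ANN expectation
```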
Other Comments Or Suggestions
Going beyond convolution architectures In our current work in progress, we consider a generalization of TPP neurons designed to approximate various non-linear activation functions, as a first step towards the conversion of a wider range of ANN architectures. In particular, the design we are currently exploring is the following (please see the following figure for a visual explanation).
We consider a general, sufficiently nice activation function $g$ and a sequence of thresholds $\theta_1, \dots, \theta_T$, chosen in such a way that the resulting steps, with their lengths and heights, approximate the function $g$ in an ``optimal'' way (we do not discuss how to choose these thresholds or what optimality would actually mean). The modified spiking mechanism of our TPP neuron then fires against the step-dependent threshold $\theta_t$ at time step $t$. We note that this mechanism generalizes our TPP in that for the ReLU activation we had $\theta_t = \theta$ for all $t$. This situation requires a more sophisticated approach for theoretical insights, and we decided to keep it separate from our submission.
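A minimal sketch of the kind of step-wise construction we have in mind is given below, using GELU restricted to non-negative inputs purely as an example; the uniform-grid threshold selection is a naive placeholder, not the ``optimal'' choice left open above.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU, used here only as an example target activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def step_thresholds(fn, x_max, T):
    """Naive choice: split [0, x_max] into T equal steps and take the per-step
    increments of fn as the step heights theta_1, ..., theta_T."""
    grid = np.linspace(0.0, x_max, T + 1)
    return np.diff(fn(grid))

def step_approx(fn, x, x_max, T):
    """Step-function approximation: emit threshold theta_t for every full step covered by x."""
    thetas = step_thresholds(fn, x_max, T)
    k = np.minimum((x / x_max * T).astype(int), T)     # number of "spikes" for input x
    cum = np.concatenate([[0.0], np.cumsum(thetas)])
    return cum[k]

x = np.linspace(0.0, 4.0, 9)
print(np.round(gelu(x), 3))
print(np.round(step_approx(gelu, x, x_max=4.0, T=8), 3))
```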
Questions For Authors
For the experiments in the main body of the paper, we performed the permutations with 5 different seeds and report the mean of the results. However, we kindly refer to Appendix G, where we provide further experiments concerning permutations as well as theoretical insights into their functioning (Appendix G, Theorem 1). We would also like to point out one curiosity here, as it is still beyond our understanding. Namely, in Figure 11, we report results where, for a fixed latency and baseline model, a chosen permutation is applied to all the spiking layers of the model (so the permutation does not vary from layer to layer in a random way). We tested all 24 possible permutations of 4 elements, and for all 24 permutations the performance improved.
Supplementary material
We provide anonymized code that was used in some of the experiments.
This paper presents a new framework for ANN-SNN conversion, which is motivated by an interesting phenomenon called "temporal misalignment". The authors observe that the performance of the converted SNN improves when the temporal order of each layer's output spike trains is rearranged. Based on this observation, the authors propose a new method for ANN-SNN conversion, using a two-phase probabilistic spiking neuron which mimics the effect of "permuting spike trains". The method achieves SOTA results on various image classification datasets.
Questions For Authors
N/A
Claims And Evidence
Clear: The bio-plausibility and hardware implementation of the proposed TPP neurons are well presented.
The effectiveness and efficiency of the proposed method are verified on various image classification datasets with several network architectures, and the results are sound.
Not Clear: The proposed TPP neuron seems to ignore any temporal information of the input spike trains, since the first phase accumulates the input and the IF neuron has no decay. Then why is the Bernoulli random firing mechanism needed? I think the output timing of spikes for a TPP neuron won't affect the next layer, so it won't change the final result as long as the TPP neuron just fires the expected number of spikes?
Methods And Evaluation Criteria
The comparison with other ANN-SNN conversion methods is thorough and fair. Apart from accuracy, the spiking activity and membrane potential distribution are provided.
Theoretical Claims
No checks
Experimental Designs Or Analyses
No issues with the experimental designs in Table 1 and Table 2.
In Figure 1 and Figure 4, and even in Appendix G, it is not very clear to me how the authors do the "shuffle" or "permute". The analysis doesn't make sense if it's just a random permutation of spike trains, since the authors claim the permutation can significantly improve performance.
It is not clear how the "permutation" inspired the authors to propose the TPP neurons. Section 3.2 is not very informative about this. Why is TPP performing a (random) permutation?
Supplementary Material
I went through all sections of the supplementary material.
Relation To Existing Literature
N/A
Essential References Not Discussed
N/A
Other Strengths And Weaknesses
Weakness: The analysis of "temporal misalignment" is more empirical than theoretical to me. I'm not sure why other methods suffer from "temporal misalignment" and why the proposed TPP neuron can solve it. Based on Figure 4 and Figure 7, the baseline and probabilistic models should have completely different spike counts and firing rates, but in Figure 5 and Figure 6 the baseline and TPP have almost the same spike counts.
As mentioned in the previous sections, it is not clear how the "permutation" is related to the TPP neurons. At least it is not clearly presented in the paper.
Other Comments Or Suggestions
N/A
We thank the reviewer for their time and constructive comments.
Claims And Evidence: Not Clear
It is important to ensure not only that the expected number of spikes matches the value of the ANN activation, but also that these spikes are distributed in a uniform manner. In particular, even if all spiking neurons in one layer emitted the expected number of spikes, once we apply the subsequent weights, the next layer will not necessarily receive the expected input corresponding to the input of the ANN layer. The key issue here is the unevenness error, as thoroughly discussed in AAAI'23, "Reducing ANN-SNN Conversion Error through Residual Membrane Potential": it can happen that the expected number of spikes arrives too early or too late.
So, ideally, we expect the spikes received from the preceding layer to be uniformly distributed. However, as spikes propagate through deeper layers, their timing often becomes increasingly irregular, leading to deviations from the expected spike count and resulting in unevenness error.
For your convenience, we compared our TPP method with a naive probabilistic spiking neuron, where, after the accumulation of the membrane potential $V$, the neuron emits spikes with the constant bias $V/(\theta T)$ (so spiking becomes a stationary process). The following tables show that our design outperforms the stationary process. Our Theorem 1 (b) offers an explanation, as the dynamic bias takes into account the already emitted spikes and offers a more meaningful approximation of the ANN outputs. We will provide full tables in the revised manuscript.
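For clarity, a minimal sketch of the two firing rules being contrasted (the constant bias $V/(\theta T)$ reflects our description of the naive variant above; names and numbers are illustrative):

```python
import numpy as np

def fire_constant_bias(V, theta, T, rng):
    """Naive stationary process: spike at each step with the fixed probability V / (theta * T)."""
    p = min(1.0, V / (theta * T))
    return rng.random(T) < p

def fire_dynamic_bias(V, theta, T, rng):
    """Dynamic-bias process (our reading of TPP): the bias is updated from the spikes already emitted."""
    v, out = float(V), np.zeros(T, dtype=bool)
    for t in range(T):
        p = float(np.clip(v / (theta * (T - t)), 0.0, 1.0))
        out[t] = rng.random() < p
        v -= theta * out[t]
    return out

rng = np.random.default_rng(0)
V, theta, T = 2.3, 1.0, 8
const = np.array([fire_constant_bias(V, theta, T, rng).sum() for _ in range(20_000)])
dyn = np.array([fire_dynamic_bias(V, theta, T, rng).sum() for _ in range(20_000)])
print("constant bias: mean", const.mean(), "var", const.var())  # binomial spread of the spike count
print("dynamic bias : mean", dyn.mean(), "var", dyn.var())      # spike count pinned to 2 or 3
```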
Experimental Designs Or Analyses
The shuffling is performed in the following way. Once a baseline method is fixed, we proceed in order, layer by layer. After each spiking layer, we collect the spike trains of that layer and permute them in the temporal dimension (that is, we rearrange the spikes in time). Such permuted spike trains are then passed to the next layer, and we continue this process until the output layer. The permutation applied after each layer is random, based on the current seed.
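A minimal sketch of this layer-wise shuffling (PyTorch-style, assuming spike tensors with the temporal dimension first; names are ours):

```python
import torch

def shuffle_time(spikes: torch.Tensor, generator=None) -> torch.Tensor:
    """Randomly permute a spike train along the temporal (first) dimension."""
    perm = torch.randperm(spikes.shape[0], generator=generator)
    return spikes[perm]

def forward_with_shuffle(spiking_layers, x, generator=None):
    """Run layer by layer; after each spiking layer, permute its output spikes in time."""
    out = x
    for layer in spiking_layers:
        out = layer(out)                  # spike train of shape [T, batch, ...]
        out = shuffle_time(out, generator)
    return out
```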
Generalization to TPP: Conceptually, a permutation requires one to 1) collect the spike trains and 2) rearrange them in the temporal dimension. These two steps correspond to the two phases of our TPP neuron. In the second phase, the TPP neuron produces the spike train, and the total number of spikes is given by Theorem 1 (c). Furthermore, at any given time step there is a non-zero probability of a spike (as long as there is non-zero residual voltage). Since, in the end, the produced spike trains have a "predetermined" number of spikes and a spike is possible at every time step, it is as if TPP were acting as a permutation on a hypothetical spike train with the same number of spikes.
We hope this clarification makes the connection between permutation and TPP neurons more explicit, but please also kindly refer to Section 3.2 in our paper.
Weaknesses
Temporal misalignment in baselines We consider the situation of an ANN neuron with ReLU activation and how its output is approximated by a vanilla spiking neuron and by our proposed TPP neuron.
Proposition. Consider an ANN neuron with ReLU activation and let its activation values follow a distribution with PDF $f$. Let further $\theta$ be the threshold of the corresponding vanilla SNN neuron. Then:
- The expectation of the spike output at the first time step of a vanilla spiking neuron is $\theta \int_{\theta}^{\infty} f(x)\,dx$, i.e., only the activation mass above the threshold contributes.
- The expectation of the (threshold-scaled) spike output at the first time step of a TPP neuron is $\int_{0}^{\theta} x f(x)\,dx + \theta \int_{\theta}^{\infty} f(x)\,dx$, which is the same as the expectation of the output of the ANN neuron with ReLU activation, since the threshold is chosen so that essentially all activations fall below it.
In particular, this result shows that TPP neurons, with their designed probabilistic spiking, approximate the output of the ANN neurons starting from the first time step, while for vanilla spiking neurons this approximation is coarser (see also Figure 8 for empirical evidence of the above Proposition). Our Theorem 1 (b) then shows how this approximation evolves throughout the rest of the simulation time.
Different spike counts Note that in Figure 4 and Figure 7 the distribution of the membrane potential is presented for time steps 1 and 2. This empirically shows that the baselines do not accumulate enough membrane potential to produce spikes in the first time steps, which eventually causes the approximation errors discussed above. However, in Figures 5 and 8, we compare spike counts for latencies 8 and higher. In particular, the baselines produce more spikes; however, these spike trains are not "evenly" or "optimally" positioned in time, hence the gap in performance compared to our method.
We hope that this discussion contributes to the clarity of the paper, and we will incorporate your points in the revised manuscript.
The paper investigates the ANN-SNN (Artificial Neural Network to Spiking Neural Network) conversion process, identifying a phenomenon called "temporal misalignment," where random permutations of spike trains across SNN layers improve performance. The authors propose a novel two-phase probabilistic (TPP) spiking neuron model to address this, featuring an accumulation phase followed by probabilistic spiking based on a Bernoulli process. Main findings include improved accuracy and reduced latency in SNNs compared to baseline methods, validated across datasets such as CIFAR-10/100, CIFAR10-DVS, and ImageNet with architectures such as VGG-16, ResNet-20/34, and RegNet.
Questions For Authors
- Energy Efficiency: Why was energy consumption not evaluated alongside accuracy, given SNNs’ energy-efficient premise? Including this could strengthen the paper’s practical impact—e.g., if TPP reduces energy use, it bolsters the contribution; if not, it reveals a limitation.
- Generalization: How does TPP perform on non-classification tasks (e.g., reinforcement learning or time-series prediction)? Evidence of broader applicability could elevate the method’s significance; lack thereof might narrow its scope.
- Statistical Significance: Can you provide p-values or confidence intervals for the firing count differences (Tables 8-10)? This would clarify if TPP’s spiking changes are meaningful, potentially affecting the perceived robustness of the approach.
Claims And Evidence
The claims are generally well-supported, but the assertion that TPP neurons are universally superior might overreach, as evidence is limited to specific datasets and architectures. Generalization to other tasks is untested, and the paper lacks discussion on failure cases or limitations.
Methods And Evaluation Criteria
The focus on accuracy alone neglects energy efficiency, a key SNN advantage, which could enhance the evaluation’s completeness.
Theoretical Claims
Yes, the theorems seem correct.
Experimental Designs Or Analyses
Spiking activity analysis (Tables 8-10) shows percentage differences, but the interpretation is unclear without statistical significance tests or error bars, reducing confidence in firing rate claims.
CIFAR10-DVS (Table 2): Shows TPP outperforming direct training methods. The event-based dataset choice is apt, but the lack of latency or energy metrics limits depth.
ImageNet Results (Table 1): Compares TPP with multiple baselines across timesteps. The use of five runs with averages and deviations (partially shown) supports validity, but incomplete data (e.g., missing T=4 for some methods) hampers full assessment.
Supplementary Material
In parts
Relation To Existing Literature
The paper builds on ANN-SNN conversion literature (e.g., Rueckauer et al., 2017a; Diehl et al., 2015) and spiking neuron dynamics (e.g., Izhikevich, 2007). The temporal misalignment concept extends prior work on phase lag (Li et al., 2022) and input unevenness (Bu et al., 2022c), offering a novel interpretation.
Essential References Not Discussed
- Hunsberger & Eliasmith (2015), "Spiking Deep Networks with LIF Neurons" (Neural Computation).
- Neftci et al. (2017), "Event-Driven Deep Neural Networks" (IEEE).
- Diehl et al. (2016), "Conversion of Continuous-Valued Deep Networks to Efficient Event-Driven Networks" (arXiv).
- Lee, Donghyun, et al. (2024), "TT-SNN: Tensor Train Decomposition for Efficient Spiking Neural Network Training," Design, Automation & Test in Europe Conference & Exhibition (DATE), IEEE.
Other Strengths And Weaknesses
Strengths: Originality in identifying and leveraging temporal misalignment. Practical significance via state-of-the-art results on standard benchmarks. Clear exposition of TPP mechanics (Figures 2-3, Algorithms 1-3).
Weaknesses: Limited discussion of computational cost and energy efficiency, which are critical for SNNs. Overemphasis on accuracy without addressing trade-offs (e.g., latency vs. precision). Clarity suffers from incomplete tables (e.g., Table 1 lacks full data) and missing proof details.
Other Comments Or Suggestions
N/A
We thank the reviewer for their time in assessing our paper and constructive comments.
Essential References Not Discussed:
Thank you for pointing these out, we will include them in the revised manuscript.
Questions For Authors:
1. We reported the number of spikes in Tables 8-10 because, in general, this is a fair "measure" when comparing various methods with the same architecture and latency on neuromorphic hardware. However, we provide further tables that compute the energy consumption on specialized hardware. In particular, our approach follows Merolla et al., "A million spiking-neuron integrated circuit with a scalable communication network and interface", where synaptic operations were used to calculate the energy, as this approach has been adopted in the recent relevant literature. To estimate the energy consumption per SOP (FLOP), we use the values for the neuromorphic processor of Qiao et al., "A reconfigurable on-line learning spiking neuromorphic processor comprising 256 neurons and 128K synapses". To be fair, we also counted the potential MAC operations, in our and the baseline models, coming from the first layer, as the sample images use constant encoding.
In the following table (please refer to Table 17), we compare the energy consumption of some of the baseline methods and our approach, together with the approximated ANN energy consumption, relative to the accuracy reached. As our method reaches the ANN accuracy using a much lower number of SOPs, it provides a more energy-efficient alternative. However, we acknowledge that the stochastic component of our method can induce further energy consumption, which we were not able to measure. We will provide full tables and more comparison results in the updated manuscript.
https://drive.google.com/drive/folders/1sbhqT9Nabl8BUIDs9dt8OFpsz1sMZK6w
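For transparency, the bookkeeping behind such estimates is essentially the following (a sketch; the operation counts and per-operation energies below are placeholders, not the values used in Table 17):

```python
def estimated_energy(sops, macs, e_sop, e_mac):
    """Total inference energy: spike-driven synaptic operations (SOPs) plus the
    MAC operations of the constant-encoded first layer."""
    return sops * e_sop + macs * e_mac

# Placeholder per-operation energies (Joules) and operation counts -- substitute
# the datasheet values and the measured counts; these numbers are illustrative only.
E_SOP, E_MAC = 1.0e-12, 5.0e-12
print(estimated_energy(sops=3.2e8, macs=1.0e7, e_sop=E_SOP, e_mac=E_MAC))
```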
Generalization We applied our proposed method to a simple Reinforcement Learning (DQN) example. We compared the performance of the baseline QCFS (L=8) and TPP on the CartPole task over 20 evaluation epochs. The following table reports the average episode length (with standard deviation in parentheses) for different time horizons (T). A higher episode length indicates better performance, with a maximum possible score of 500 steps. The baseline struggles at T=4 but improves with higher latency, while TPP achieves significantly better performance at lower T values and reaches optimal performance faster.
In our current work in progress, we consider a generalization of TPP neurons designed to approximate various non-linear activation functions, as a first step towards the conversion of a wider range of ANN architectures. In particular, the design we are currently exploring is the following (please see the following figure for a visual explanation).
We consider a general, sufficiently nice activation function $g$ and a sequence of thresholds $\theta_1, \dots, \theta_T$, chosen in such a way that the resulting steps, with their lengths and heights, approximate the function $g$ in an ``optimal'' way (we do not discuss how to choose these thresholds or what optimality would actually mean). The modified spiking mechanism of our TPP neuron then fires against the step-dependent threshold $\theta_t$ at time step $t$. We note that this mechanism generalizes our TPP in that for the ReLU activation we had $\theta_t = \theta$ for all $t$. This situation requires a more sophisticated approach for theoretical insights, and we decided to keep it separate from our submission.
3. We provide the requested tests in the following tables (please refer to Tables 3-15). The results confirm that the reported changes in TPP's spiking activity are meaningful.
https://drive.google.com/drive/folders/1sbhqT9Nabl8BUIDs9dt8OFpsz1sMZK6w
This paper proposes a novel ANN-SNN conversion framework based on the phenomenon of temporal misalignment by introducing a two-phase probabilistic spiking neuron (TPP) to improve the temporal distribution of spike trains across layers, thereby enhancing the performance of the converted SNN. Reviewers generally acknowledge the method’s advantages in accuracy improvement and performance stability, with strong results on datasets such as CIFAR and ImageNet. Some concerns remain regarding the TPP neuron’s ability to preserve input temporal information, energy efficiency, and comparisons with existing conversion methods, as well as the need for additional theoretical analysis and more comprehensive ablation studies. Despite these issues, the overall idea is innovative and the experimental results are sufficiently compelling, therefore, acceptance is recommended.