The main weakness of the paper is common to many works on SNNs. Specifically, the significance, novelty, potential impact, and experimental validation are limited to the narrow field of SNNs themselves. Very rarely does an SNN paper show its advantages in the broader literature on neural networks, let alone in the real world. The present manuscript too, when evaluated in a broader scope, suffers from the same issues.

More concretely:

the method is only new for SNNs, but not for neural networks in general.
the performance is claimed to surpass the state of the art (already in the abstract), but the authors do not actually compare with the true state of the art, including non-spiking networks.
there is no experimental comparison with standard (i.e. less constrained) temporal convolutions.

Therefore, it is unclear what the true contribution of the work is, beyond a nice conceptual analogy between temporal convolutions and synaptic delays.

Secondary weaknesses:

Even within spiking networks, the work does not seem to surpass the state of the art, contrary to the authors' claims. In [1], a partly spiking neural network reached 95.6% on the GSC v0.02, where the authors report 95.35% at most. The manuscript does not cite that prior work.
The paper does not motivate sufficiently the choice of spiking neurons as a model. A paragraph explaining the advantages of SNNs in comparison with the true state of the art, i.e. ANNs, supported with citations that demonstrate them measurably, such as energy efficiency, but also rarely in other metrics such as speed of inference and training [1] and even classification accuracy [2]. Any other arguments and citations that the authors can add to support that choice would be useful.
The authors claim that there is no recurrency in their models, but a leaky integrate-and-fire neuron's leak membrane potential is equivalent to a self-recurrent connection. I understand what the authors mean, but, again in the spirit of appealing to the broader ICLR community and not only to the SNN niche, this should be clarified.

[1] Jeffares et al., Spike-inspired rank coding for fast and accurate recurrent neural networks, ICLR 2022

[2] Moraitis et al., Optimality of short-term synaptic plasticity in modelling certain dynamic environments, arXiv 2021

EDIT (adding my responses here too, for public visibility):

The authors' response dedicates a large section to address points that I did not make. To correct the record I must unfortunately reply to that section too, even though it is merely a distraction.

Nowhere did I claim that SNNs are not important or not a legitimate research direction, or that the entire field deserves rejection. I did not dismiss the paper on the basis of it being an SNN. I did point out that some of its weaknesses are frequent in the SNN literature, but pointing that out does not make those weaknesses irrelevant to this specific review. The attempt by the authors to entirely dismiss my review based on how many SNN papers per year are published and how many good reviews the paper received is an attempt to evade my specific criticisms. Worse, the aggressive style of the authors' response, and the misconstrual of my arguments as if they were a personal matter of mine is not helpful.

Again, SNNs can certainly have important advantages, and some SNNs do have them, but a neural network merely being implemented with spiking neurons does not guarantee these benefits. An SNN paper must be evaluated as any other paper, and not merely be accepted as a significant contribution because the network is spiking.

Despite this attempt to discount my comments, I continue my contribution to this process in a separate comment.

Some important weaknesses remain.

The key method that the authors used is not new, only its application is.
The so-far evaluation does not suffice to compare with other works: (a) Two of the three used datasets have received very little if any attention outside of the SNN literature. (b) Only feedforward architectures, with only 2 or 3 layers, have been tested. (c) Only spiking networks have been tested, so it is unclear whether the same results could be achieved, for example, with much smaller (and thus possibly more efficient) non-spiking networks.
The paper is missing a sufficient motivation of SNNs as a model. A paragraph with the potential benefits of SNNs should be added, citing the previously demonstrated improvements in efficiency, inference speed, and even classification accuracy, but it should also explain that these benefits are not present in all SNNs by default. Examples of such references were given in my original review.

On the other hand, the paper now does include a comparison with a more standard method, i.e. conventional temporal convolutions, and it does outperform it. Of course, the work already was a good contribution to the SNN field, but this addition makes it now a relatively convincing demonstration of the power of learned delays more generally, that is a also useful result for the broader ICLR community. Based on these, I am raising my score.