PaperHub

FEEL-SNN: Robust Spiking Neural Networks with Frequency Encoding and Evolutionary Leak Factor

NeurIPS 2024 (Poster) · 3 reviewers · Submitted: 2024-05-10 · Updated: 2024-11-06
Ratings: 5 / 7 / 4 (average 5.3/10; min 4, max 7, std 1.2)
Confidence: 5.0 · Correctness: 2.0 · Contribution: 2.3 · Presentation: 2.7

Keywords

spiking neural network · robustness · attack

Reviews and Discussion

Review (Rating: 5)

This article proposes a robustness algorithm for spiking neural networks. The algorithm combines a frequency-domain filter with a hard threshold and trainable neuron leakage parameters. The paper is motivated by biological interpretability, adopts an engineering approach in its methodology, and attempts to propose a unified robustness framework for spiking neural networks. The authors combine multiple previous methods in the experiments and report better robustness results under adversarial perturbations.

Strengths

The authors demonstrate better robustness against perturbations. The method was verified under GN, FGSM, PGD, BIM, and CW attacks.

I think the authors' motivation is very important and timely, consistent with the interests of NeurIPS.

Weaknesses

The authors claim that the robustness of SNNs lacks theoretical analysis, but in reality, the theoretical analysis they provide is not significantly different from that in StoG. The conclusions in the paper corresponding to the StoG method are similar to the theory proposed by the authors. The innovation here is not clear.

The authors state that frequency encoding is based on a cognitive motivation, i.e., the selective visual attention mechanism, rather than the coding level, which is inconsistent with the motivation behind the variable dynamic parameters proposed later as an innovation point.

The two proposed methods, FE and EL, lack a detailed ablation study to determine the effectiveness of each module. In particular, the performance of using the EL method alone is missing.

What is the difference between the EL method proposed by the authors and the method proposed by Ding et al. in ICML 2024? [Ding et al., 2024] https://arxiv.org/abs/2405.20694.

Questions

How did the authors train a network containing frequency-domain filtering (FE)? What backpropagation method is currently used to backpropagate through the FE module (e.g., what method is used, what tools are used, and what is the speed of backpropagation)?

Although the proposed FE module improves the robustness of spiking neural networks, I would like to ask whether FE disrupts the low-energy-consumption characteristic of spiking neural networks. After all, frequency-domain filtering seems more suited to combination with traditional CNNs, and the computations here are not sparse.

When conducting white-box and black-box attacks, I believe it is necessary for adversarial models with and without FE modules to be used as attack models in the validation of the paper.

Limitations

See weaknesses and questions.

Author Response

1. The innovation point of this work is not clear compared with StoG [11].

We highlight the innovation of our theoretical analysis compared to StoG from two key perspectives:

  1. Theoretical focus. For the regularizer $\|\epsilon \odot \nabla_x \mathcal{L}(x)\|_1$ in Eq. 5, StoG focuses on perturbation constraints during SNN signal transmission. In contrast, we analyze the effects of input noise and spiking neuron parameters on this regularizer (shown in Eq. 6). This shift in focus allows us to explore how these factors theoretically constrain the regularizer, offering a different perspective on enhancing robustness.

  2. Implementation of theoretical framework. Unlike StoG's stochastic gating factor, we introduce a frequency encoding method to eliminate perturbation inputs. Tab. 1 (main paper) shows that our method further enhances StoG's robustness, demonstrating that our method and idea do not conflict with StoG. Moreover, while prior studies empirically explored parameters such as the firing threshold [12] and leak factor [31] for SNN robustness, our robust constraint framework offers a theoretical basis for these findings.

In summary, our work advances the theoretical understanding of how input noise and spiking neuron parameters affect SNN robustness and proposes an innovative frequency encoding method. We will update our revised version to acknowledge the contribution of the adversarial loss constraint introduced in StoG to our theoretical analysis while emphasizing the distinctions and innovations of our approach.

2. The cognitive motivation of FE is inconsistent with dynamic parameters proposed later.

The concept of selective visual attention in [8] is described as "visual attention focuses on one region of the visual field at a time", closely resembling the retinal coding rather than a cognitive process, as described in lines 173-174. Inspired by this, we propose FE to capture different frequency information at different time steps in SNNs, and EL is used to better learn the correlation between information of different frequencies across time steps.

Furthermore, FE and EL methods are two contributions of our work, which can be used independently and in combination to further improve the robustness of SNNs.

3. Lack of performance when using the EL method alone.

We have now included the performance of our EL alone in Tab. R2 of the rebuttal appendix. It is evident that both FE and EL effectively enhance the robustness of the original methods, with FEEL further improving robustness on this foundation. For instance, under a PGD attack, the original RAT method achieves 8.87% accuracy, while our FE increases robustness to 9.70%, EL to 11.39%, and FEEL to 12.36%. This illustrates the effectiveness of each module of our method.

4. What is the difference between the EL method and DLIF [C]?

Our EL and DLIF [C] optimize the leak factor to improve SNN robustness but differ in motivation and implementation.

  1. Motivation: DLIF uses a dynamic leak factor to reduce perturbation transmission, while our approach leverages it to capture correlations across time steps via frequency encoding, thus further enhancing the learning capability of SNNs against perturbations.
  2. Implementation: DLIF dynamically learns the leak factor at each time step but shares it across neurons within a layer. Our EL learns the optimal leak factor across time steps and among individual neurons within the same layer, leading to greater robustness, as described in lines 74-78 and 212-213.

While both methods aim to enhance robustness through leak factor optimization, our EL method achieves superior results, as shown in Tab. R7. We will cite DLIF [C] in our revision to clarify the differences.

Table R7: Performance (white-box attack) comparison with DLIF [C] (CIFAR100, with the same experimental setting as Tab. 1 of [C]).

| Method | Clean | FGSM | PGD |
| --- | --- | --- | --- |
| Vanilla+DLIF | 70.79 | 6.95 | 0.08 |
| Vanilla+EL | 71.41 | 9.16 | 1.29 |

5. How did the author train a network containing frequency domain filtering (FE)?

We would like to clarify that our frequency masking operation is a data preprocessing step and does not participate in model training. As shown in Eq. 10 and Fig. 2a, after the frequency mask crops information from high-frequency to low-frequency at different time steps (Eq. 7, 8, 9), the frequency-domain images are converted back to spatial-domain images with varying frequency information at different time steps (Eq. 10). These images are then used for training. Therefore, our backpropagation method remains identical to standard training methods, using the surrogate gradient $\frac{\partial O}{\partial u} = \frac{1}{\gamma^2} \max\left(0, \gamma - |u - V_{th}|\right)$ based on the BPTT rule. More training settings are detailed in lines 218, 450-454 of the main paper and in the code provided in the Supplementary Materials.
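As a concrete illustration, the triangular surrogate gradient quoted above can be sketched as follows (a minimal NumPy stand-in for the actual PyTorch/BPTT implementation; the function name and example values are ours):

```python
import numpy as np

def surrogate_grad(u, v_th=1.0, gamma=1.0):
    """Triangular surrogate gradient used in place of the non-differentiable
    spike function during BPTT:
    dO/du = (1 / gamma^2) * max(0, gamma - |u - V_th|)."""
    return (1.0 / gamma ** 2) * np.maximum(0.0, gamma - np.abs(u - v_th))

# The gradient peaks at the firing threshold V_th and vanishes once the
# membrane potential is more than gamma away from it.
u = np.array([-0.5, 0.5, 1.0, 1.5, 2.5])
g = surrogate_grad(u)  # largest at u = V_th = 1.0, zero at u = 2.5
```

In an actual SNN, this function would supply the backward pass of the spike activation while the forward pass keeps the hard threshold.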

6. Whether FE disrupts the low energy consumption characteristics of spiking neural networks.

As addressed in the previous question, our FE functions as a data preprocessing method and does not participate in model training. Therefore, FE does not disrupt the low-energy performance of SNNs.

7. It is necessary for the adversarial model with and without FE modules to be used as attack models in the validation of the paper.

Thanks for your suggestion. The relevant experiments are included in our main paper. Fig. 3 and Fig. 4 illustrate the performance of the vanilla model with and without FE under white-box and black-box attacks, respectively. Tab. 1 presents the performance of SOTA robust SNNs with and without FE under white-box attacks. We have now added the performance of adversarial models (i.e., AT [13] and RAT [9]) with and without FE under black-box attacks in Tab. R5 of the rebuttal appendix. All experimental results confirm that FE effectively enhances the robustness of the model.

Comment

Dear Reviewer zYHj:

Thank you for your detailed feedback. Regarding the last question in the Questions part, we are unsure about the exact experimental setting of "the adversarial model with and without FE modules to be used as attack models". Do you mean using a black-box attack to evaluate the performance of the model with and without FE? We have answered the question based on this understanding; the detailed answer can be found in our Rebuttal. If you feel that the answer does not address your question, please let us know, and we will be glad to provide further responses.

Comment

Thank you for your reply. Based on your rebuttal, FE is just a preprocessing step. This means that FE could also be added to an ANN-based image classifier and has little to do with the characteristics of the SNN itself. My question is actually about how you implement the adversarial attack. Is the adversarial perturbation (1) applied to the image domain before FE, or (2) applied to the filtering result after FE? If case (1), it means that FE needs to be differentiable. How did you achieve this? If case (2), it is still necessary to make FE differentiable to achieve a true white-box attack. At the same time, for (2), my other question is what the performance would be if an SNN black-box model without FE and an SNN black-box model with FE and differentiable processing are used. Please integrate the answer into the paper after answering this question.

Comment

Thanks for your further comment. We address your concern in two parts:

1. This means that FE can also be added to the ANN-based image classifier and has little to do with the characteristics of the SNN itself.

Since ANN-based image classifiers lack temporal characteristics and timesteps, we construct five training datasets for ANN-based methods: (1) images generated by FE using all timesteps (similar to data augmentation), and (2) images generated by FE at each of the four individual timesteps.

We have already evaluated the models' performance using these five training datasets, with the results presented below (can also be found in Tab. 2 of the main paper). These results suggest that applying FE without considering temporal characteristics is less effective.

Table 2: Effect of different training datasets generated by FE. The attack is PGD with perturbation $\epsilon=4/255$, step size $\alpha=0.01$, and iteration number $k=4$. The dataset is CIFAR100 with $T=4$; the network is VGG11.

| Datasets | Clean | GN | FGSM | PGD | BIM | CW |
| --- | --- | --- | --- | --- | --- | --- |
| images generated by FE using all timesteps | 70.88 | 69.73 | 14.43 | 4.33 | 4.19 | 6.21 |
| images generated by FE at the first timestep | 62.01 | 60.73 | 9.47 | 2.28 | 2.17 | 4.79 |
| images generated by FE at the second timestep | 68.78 | 67.55 | 13.62 | 4.74 | 4.35 | 6.37 |
| images generated by FE at the third timestep | 69.96 | 69.26 | 14.60 | 5.38 | 5.20 | 6.87 |
| images generated by FE at the fourth timestep | 70.95 | 70.39 | 15.72 | 5.41 | 5.22 | 7.45 |
| FE (Ours, using the temporal characteristics of SNN) | 71.40 | 70.59 | 16.80 | 6.89 | 6.62 | 8.09 |

Please note that the models referenced in Tab. 2 are SNNs, used to ensure a fair comparison for validating frequency mask strategies, rather than to assess the effectiveness of adding FE to ANN-based classifiers. Due to the limited rebuttal time, we are unable to conduct experiments on ANN models. We will perform a fair comparison by adding FE to ANNs and include the results and discussions in the final version.

2. How do you implement the adversarial attack when FE is employed?

In our study, we implement the adversarial attack by applying adversarial perturbations to the image domain before the FE module, aligning with the first scenario you mentioned. We achieve the differentiability of FE as follows:

  • According to Eq. 7-10 of the main paper, the formulation of our FE is $\tilde{x} = \mathcal{F}^{-1}\left(\mathcal{M} \odot \mathcal{F}(x)\right)$, where $x$ is the original image, $\tilde{x}$ is the FE-encoded image, $\mathcal{F}$ and $\mathcal{F}^{-1}$ represent the Discrete Fourier Transform (DFT) and inverse DFT (Eq. 7 and Eq. 10), and $\mathcal{M}$ is the frequency mask (Eq. 8 and Eq. 9).
  • Both $\mathcal{F}$ and $\mathcal{F}^{-1}$ are differentiable [33] and can be directly implemented using the torch.fft.fft2(x) function in the PyTorch framework.
  • The frequency mask operation $\mathcal{M} \odot \mathcal{F}(x)$ involves element-wise multiplication of the frequency-domain image $\mathcal{F}(x)$ with the binary 0/1 matrix $\mathcal{M}$ of the same size, which is also differentiable, as implemented in [15].

Thus, adversarial perturbations can be generated directly through these differentiable operations.
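To make the pipeline concrete, here is a minimal NumPy sketch of the $\tilde{x} = \mathcal{F}^{-1}(\mathcal{M} \odot \mathcal{F}(x))$ operation with a per-time-step mask radius. The circular mask shape and the radius schedule are our illustrative assumptions standing in for Eqs. 7-9; the paper's implementation uses PyTorch's differentiable torch.fft operations instead:

```python
import numpy as np

def radial_mask(h, w, r):
    """Binary low-pass mask M: 1 within radius r of the centered spectrum origin."""
    yy, xx = np.mgrid[:h, :w]
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    return (dist <= r).astype(np.float64)

def frequency_encode(x, radii):
    """Apply x_tilde = IDFT(M * DFT(x)), shrinking the mask radius per time step."""
    h, w = x.shape
    spec = np.fft.fftshift(np.fft.fft2(x))        # DFT with origin centered
    out = []
    for r in radii:                               # r decreases over time steps
        masked = radial_mask(h, w, r) * spec      # element-wise frequency mask
        out.append(np.real(np.fft.ifft2(np.fft.ifftshift(masked))))
    return np.stack(out)

x = np.random.default_rng(0).random((32, 32))
seq = frequency_encode(x, radii=[16, 12, 8, 4])   # one encoded image per time step
```

Because the masks are nested, each successive time step retains only a subset of the previous step's spectral content, so the spatial energy of the encoded images is non-increasing over time.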

We will revise the paper to incorporate the above explanation as follows (revised or newly added contents are in bold):

In line 193 of Section 4.2:

“In summary, the proposed FE method, as described in Eq. 10, allows us to control the frequency mask radius $r$ at each time step, enabling the suppression of different frequency ranges. Since the DFT ($\mathcal{F}$), IDFT ($\mathcal{F}^{-1}$) (Eq. 7 and Eq. 10) and the frequency mask operation ($\mathcal{M} \odot \mathcal{F}(x)$, Eq. 8) are differentiable [33, 15], the FE module can be directly utilized to generate adversarial perturbations. Therefore, the adversarial perturbations are applied to the image domain before FE.”

In line 239 of Section 5.1:

“The attack methods include adversarial attacks (i.e., FGSM, PGD with random start, BIM, and CW, for both white-box and black-box attacks) and a common noise attack (i.e., Gaussian noise, GN). In our study, the adversarial perturbations are applied to the image domain before FE, leveraging the differentiable property of the FE module.”

References

[15] Lirong He, Qingzhong Ai, Yuqing Lei, Lili Pan, Yazhou Ren, and Zenglin Xu. Edge enhancement improves adversarial robustness in image classification. Neurocomputing, 518:122–132, 2023.

[33] Duraisamy Sundararajan. The discrete Fourier transform: theory, algorithms and applications. World Scientific, 2001.

Comment

We are glad to have such a nice discussion with you. Thanks for your insightful suggestions which significantly help improve the quality of our work. We feel quite encouraged!

Comment

I truly appreciate the effort the author has put into this work. However, I find that Table 2 does not fully convey a sense of dynamics to me. Upon revisiting Table 2, it gave me an opportunity to reflect further on your statement: "This superiority stems from our frequency encoding, which simulates selective visual attention in the biological nervous system, thereby enhancing the model’s robustness more effectively." (above Table 2 in the main content).

I’m curious about how the model represents or emulates selectivity in the retina. I wanted to share some paragraphs for your reference and to hear your thoughts on this matter. From my perspective, selective attention should be more of a data-driven approach.

"A number of studies have measured the influence of selective attention on the coding of visual stimuli by single neurons (e.g., Spitzer et al., 1988; McAdams & Maunsell 1999, 2000; Reynolds & Chelazzi 2004) and populations of single neurons (Cohen & Maunsell 2009, 2010), and they have discovered that attention appears to increase the information conveyed about stimuli. Moreover, attention-driven increases in coding appear to be specific to behavioral conditions in which an animal’s perceptual sensitivity per se, rather than simple response bias, is increased (Luo & Maunsell 2015)."

This content is from your reference [8]. If you delve into [8], you’ll find that "selective attention is defined behaviorally as a relative improvement in psychophysical performance for attended versus unattended stimuli," which seems different from "closely resembling the retinal coding rather than a cognitive process" as mentioned in the authors’ rebuttal. I would gently suggest that the authors reconsider including the bio-inspired aspect if it is not fully aligned with common viewpoints.

Comment

We sincerely appreciate your insightful feedback, which will greatly improve our paper. While there may be potential confusion between References [8] and [E], as they share the same title but have different authors, we agree with your suggestion that selective attention should be more of a data-driven approach. We agree that the current version lacks a comprehensive exploration of this viewpoint, and we will remove the bio-inspired aspects that do not fully align with common viewpoints in our revised version. Thank you for providing us with a learning opportunity and potential directions for future research. Thank you once again for your time and efforts.

References

[8] Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual review of neuroscience, 18(1):193–222.

[E] Moore, T., & Zirnsak, M. (2017). Neural mechanisms of selective visual attention. Annual review of psychology, 68(1), 47-72.

Comment

Thanks. I will increase my score to 5. Happy to see the performance improvement.

Review (Rating: 7)

This paper presents a unified framework for SNN robustness. Based on this framework, it further proposes a frequency encoding (FE) method for SNNs to decrease input perturbations and an evolutionary membrane potential leak factor (EL) to ensure that different neurons in the network learn the optimal robustness leak factor at different time steps, thus improving the robustness of SNNs. Extensive experiments are conducted to verify the effectiveness of this method.

Strengths

  1. The authors present a unified framework for SNN robustness constraints, which provides a potential explanation for the robustness improvements achieved by previous work and inspires enhancements in the encoding method and the leak factor for SNN robustness.

  2. The proposed FEEL method crops information from high-frequency to low-frequency to remove the input noise and learn the optimal robustness leak factor at different time steps. The extensive results demonstrate that the FEEL method is state-of-the-art. Both the FE and EL methods further enhance the robustness of current defense strategies.

Weaknesses

  1. The Frequency Encoding (FE) is proposed to suppress the perturbation $\varepsilon(t)$ in Eq. (6). The implementation of FE is based on the cropping operation in Eq. (9). Although such an implementation gives the benefit of $\varepsilon(t)$ suppression for $T>1$, it also brings the drawback of valid-information loss. It is not clear whether the benefits outweigh the drawbacks or vice versa. Please provide more evidence or analysis to support the performance improvement by FE (as compared with direct coding) in Table 2.

  2. Section 4.3 introduces the implementation of considering the leak factor $\lambda$ in the first term of Eq. (6). According to the objective of Eq. (6), an intuitive approach is to minimize $\lambda$. However, the authors propose a learnable leak factor, which seems to contradict this intuitive approach. Please clarify it.

  3. There is a new attack method [1] specifically designed for SNNs which outperforms attacks designed for ANNs. I wonder how the proposed method in this paper performs under such a kind of attack.

[1] Bu T, Ding J, Hao Z, et al. Rate gradient approximation attack threats deep spiking neural networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023: 7896-7906.

Questions

Please refer to the weaknesses.

Limitations

Yes

Author Response

1. Please provide more evidence or analysis to support the performance improvement by FE.

We would like to show that our FE not only improves defense accuracy but also maintains clean accuracy, from both data observation and additional experimental validation.

  1. Data observation. We visualize the frequency spectra of CIFAR10 images alongside the added GN, FGSM, PGD, BIM, and CW noise in Fig. R1 of the rebuttal appendix. As illustrated in Fig. R1, the information of the original image is concentrated in the low-frequency region (center area of the second column), while the noise information spans from the low-frequency (center area) to high-frequency (edge area) regions (third to fifth columns). The proposed FE removes noise by progressively cropping information from high-frequency to low-frequency regions over time steps (as $r$ in Eq. 9 of the main paper gradually decreases over time). Since the information of the original image is concentrated in the low-frequency region, this method minimizes the loss of valid information from the original image.

  2. Experimental validation. To further verify the effectiveness of FE, we compare it with an alternative strategy, Inverse-FE (IFE), which crops information from low-frequency to high-frequency over time steps. As shown in Tab. R6 below, IFE causes a significant drop in clean accuracy (64.81% vs. vanilla 92.64%). This demonstrates that a substantial amount of valid information is lost, verifying that valid information is concentrated in the low-frequency area. In contrast, FE not only effectively removes noise (21.56% vs. vanilla 15.59% under PGD attack) but also minimizes the loss of valid information (92.26% vs. vanilla 92.64%).

Table R6: Performance (%) of the proposed Frequency Encoding (FE) and the alternative strategy Inverse-FE (IFE). The perturbation is $\epsilon=4/255$ for all attacks, with iteration number $k=4$ and step size $\alpha=0.01$ for PGD. The dataset is CIFAR10 with time step $T=4$; the network is VGG11.

| Method | Clean | GN | FGSM | PGD | BIM | CW |
| --- | --- | --- | --- | --- | --- | --- |
| Vanilla | 92.64 | 91.28 | 35.47 | 15.59 | 14.95 | 6.92 |
| IFE | 64.81 | 64.48 | 12.33 | 4.44 | 4.25 | 4.18 |
| FE | 92.26 | 92.02 | 39.67 | 21.56 | 21.05 | 10.12 |
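The premise that natural-image energy concentrates at low frequencies while white noise spreads across the whole spectrum can be checked with a small sketch (a toy low-frequency signal stands in for a CIFAR image; this is our illustration, not the paper's experiment):

```python
import numpy as np

rng = np.random.default_rng(0)

h = w = 32
yy, xx = np.mgrid[:h, :w]
# Hypothetical smooth "image": only one cycle of variation in each direction.
image = np.sin(2 * np.pi * yy / h) + np.cos(2 * np.pi * xx / w)
# White Gaussian noise: spectrally flat on average.
noise = rng.normal(scale=1.0, size=(h, w))

def low_freq_energy_ratio(x, r=8):
    """Fraction of spectral energy within radius r of the centered DFT origin."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(x))) ** 2
    dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    return spec[dist <= r].sum() / spec.sum()

ratio_image = low_freq_energy_ratio(image)  # near 1: energy is low-frequency
ratio_noise = low_freq_energy_ratio(noise)  # roughly the mask's area fraction
```

A low-pass mask (FE) therefore keeps almost all of the smooth signal while discarding most of the noise energy, whereas a high-pass mask (IFE) does the opposite.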

2. Please clarify why not minimize $\lambda$ directly.

We did not minimize $\lambda$ directly, as doing so would hurt clean accuracy on original images. This is supported by both theoretical analysis and experimental validation.

  1. Theoretical analysis. According to Eq. 2 of the main paper, the leak factor controls the residual membrane potential between time steps. A smaller leak factor may weaken the temporal modeling capability of the SNN, leading to a decline in network performance [A]. Considering the leak factor's dual role in original information transmission (Eq. 2) and robustness enhancement (Eq. 6), we propose an evolutionary leak factor (EL). The EL dynamically learns the optimal robustness leak factor across different time steps and neurons, which also increases the expression capability of the SNN, helping maintain clean accuracy while improving robustness.

  2. Experimental validation. We compare EL with two alternative strategies. The first strategy sets all leak factors to 0. The second strategy adds L2 regularization to EL to further constrain the leak factor. As shown in Tab. R1 of the rebuttal appendix, a small leak factor significantly reduces clean accuracy (vanilla 92.64% vs. FEEL ($\|\lambda\|_2$) 88.52% vs. FEEL ($\lambda=0.0$) 81.76%), consistent with the theoretical analysis above. Besides, a small leak factor does increase the robustness of the SNN (e.g., under PGD attack, FEEL ($\lambda=0.0$) reaches 63.80% and FEEL ($\|\lambda\|_2$) 29.98%, compared to vanilla 15.59%). This also aligns with the proposed robustness framework (Eq. 6) by demonstrating that controlling the leak factor improves robustness. Our EL method ensures improvements in both robustness and clean accuracy (e.g., the PGD defense accuracy of FEEL (learnable $\lambda$) is 30.27%, compared to 15.59% for vanilla, and the clean accuracy of FEEL (learnable $\lambda$) is 92.73%, compared to 92.64% for vanilla).
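A minimal sketch of LIF dynamics with a per-time-step, per-neuron leak factor may help illustrate the design space discussed above. The hard-reset update rule is a standard LIF form, and the parameter shapes are our assumption for illustration; in training, `leak` would be a learnable tensor:

```python
import numpy as np

def lif_forward(inputs, leak, v_th=1.0):
    """LIF dynamics with an evolutionary leak factor:
    u[t] = leak[t] * u[t-1] * (1 - o[t-1]) + I[t],  o[t] = (u[t] >= V_th).
    `leak` has shape (T, N): one factor per time step AND per neuron,
    unlike a single factor shared by all neurons in a layer."""
    T, N = inputs.shape
    u = np.zeros(N)   # membrane potential
    o = np.zeros(N)   # previous spike output (hard reset)
    spikes = []
    for t in range(T):
        u = leak[t] * u * (1.0 - o) + inputs[t]   # leaky integration with reset
        o = (u >= v_th).astype(np.float64)        # fire when threshold reached
        spikes.append(o)
    return np.stack(spikes)

T, N = 4, 3
inputs = np.full((T, N), 0.6)
spikes_leaky = lif_forward(inputs, leak=np.full((T, N), 0.5))  # strong leak
spikes_full = lif_forward(inputs, leak=np.ones((T, N)))        # no leak
```

With the constant input used here, the smaller leak factor produces fewer spikes, illustrating how $\lambda$ trades temporal integration capability against the accumulation of (possibly perturbed) residual potential.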

3. The performance of the proposed method under the RGA attack [B].

We expand the results in Tab. 1 of the main paper with the RGA strategy [B]. As shown in Tab. R4 of the rebuttal appendix, under a PGD attack with RGA, AT+FEEL improves accuracy to 10.83%, compared to 8.57% for AT alone. Similar improvements are observed with other attacks. These results confirm that our methods (FE and FEEL) achieve state-of-the-art defense accuracy and enhance the robustness of existing approaches, even against SNN-specific attacks. This is because our FEEL method, which introduces frequency encoding and learnable leak factors, increases the complexity of spiking neurons, effectively countering attacks.

Comment

The author has adequately addressed my concerns, and I recommend incorporating these results in the final version.

Comment

Thanks for your acknowledgment of our work. We will incorporate the analysis and results into the final version of our paper.

Review (Rating: 4)

This paper aims to enhance the robustness of SNNs. The authors first present a unified framework for SNN robustness. They propose a frequency encoding method that filters noise in the frequency domain. Based on that, they also propose a trainable leak parameter to better constrain robustness. Experimental results on various datasets validate that both the FE and EL methods can effectively improve the robustness of SNNs against different noises.

Strengths

The frequency encoding method is novel. The FE-SNN is able to filter out noise by processing information in the frequency domain.

The authors conducted very comprehensive experiments to demonstrate the effectiveness of the proposed method. The experimental results demonstrate that FEEL can be combined with adversarial training or other robustness-enhancement algorithms to obtain more robust SNNs.

Weaknesses

The theoretical framework is not rigorous enough. It is not obvious from Eq. 6 that a smaller term 1 will result in less perturbation in the output, since a change in term 1 may also affect the other terms. The authors need more solid theory to support the FE and EL methods.

The robustness improvement is not significant. Sometimes the robustness performance of FEEL is even worse than that of FE.

Questions

See weaknesses.

Limitations

The authors are encouraged to report the additional training/inference cost of the proposed method.

Comment

Sorry for the ambiguous wording of my first question in the Weaknesses. My question is how the authors conclude from Eq. 6 that a smaller leak factor will increase robustness. It seems that if you directly reduce the leak factor, the gradient term will also change, and this may affect the whole equation. Could the authors explain more about that?

Author Response

1. Could the authors explain more about why a smaller leak factor will increase robustness based on Eq. 6, since the change of term 1 may affect other terms in Eq. 6?

We would like to discuss how the leak factor affects the other terms in Eq. 6 in two cases: 1) the leak factor $\lambda$ as a hyperparameter predefined before neural network training, and 2) the leak factor $\lambda$ as a learnable parameter during neural network training (the proposed implementation).

  1. CASE 1: $\lambda$ is a fixed value during neural network training (similar to $\epsilon$ in term 1). Hence, it will not affect the other terms in Eq. 6. To validate the correctness of our theoretical framework (i.e., that a smaller term 1 results in less perturbation in the output), we conduct additional experiments, training different neural networks with different fixed $\lambda$ (keeping the remaining settings the same as reported in the experimental settings of the main paper). As shown in Tab. R1 (rebuttal appendix), a smaller $\lambda$ results in a more robust model, indicating that a smaller term 1 results in less output perturbation. It can also be observed from Tab. R1 (rebuttal appendix) that a smaller $\lambda$ can degrade performance on clean inputs, i.e., from 92.26% at $\lambda=1.0$ to 81.76% at $\lambda=0.0$. Therefore, we implement the leak factor as a learnable parameter to mitigate this performance degradation.

  2. CASE 2: $\lambda$ is a learnable parameter updated within neural network training. It is not easy to directly analyze the influence of $\lambda$ on the other terms in Eq. 6 due to their complex relationship. Therefore, we analyze the influence by validating whether using term 1 for robustness improvement affects the effectiveness of term 2 or term 3 for the same goal. To be specific, as analyzed in line 160, RAT [9] (weight regularization) essentially minimizes term 2. We implement another comparison method by adding a gradient constraint via the L2 norm (gradient penalty regularization, GP, with the same implementation as in [D]) to minimize term 3. We compare these two methods to two alternatives of our method, implemented by additionally optimizing $\lambda$ for RAT and GP (keeping the remaining parts unchanged), denoted RAT+EL and GP+EL, respectively. As shown in Tab. 1 (main paper) and Tab. R1, Tab. R2 (rebuttal appendix), RAT+EL and GP+EL significantly improve the robustness of RAT and GP across different attack types and datasets. These results show that leveraging term 1 for robustness improvement does not interfere with the effectiveness of term 2 or term 3 for the same goal, indicating that the leak factor does not affect the other terms in Eq. 6.

In summary, the results in both cases indicate that the leak factor does not affect the other terms in Eq. 6 with respect to SNN robustness. We would be happy to discuss this further with you during the reviewer-author discussion period if you have additional questions.

2. The robustness improvement is not significant. Sometimes the robustness performance of FEEL is even worse than that of FE.

We would like to show that our robustness improvement is consistent and significant in two aspects. Kindly note that FE essentially fixes $\lambda=1$.

  1. Consistent improvement over FE by an alternative implementation for all attack types. As discussed in our response to Question 1, an alternative implementation is to strictly adhere to the theoretical framework of Eq. 6, i.e., set the leak factor to 0. As can be observed from Tab. R1 (rebuttal appendix), FEEL ($\lambda=0.0$) achieves consistent robustness improvement over FE. This observation validates the effectiveness of our theoretical framework.

  2. Significant improvement over FE and SOTA methods by the proposed implementation in terms of average performance across different attack types (i.e., FGSM, PGD, BIM, and CW). As can be observed from Fig. 3 and Fig. 4 (main paper), FEEL significantly outperforms FE (black-box: 6.8% over 4.2%; white-box: 10.8% over 4.9%; CIFAR10, VGG11, T=4). Furthermore, as described in Tab. 1 (main paper), the average improvements of StoG+FEEL (5.8%) and StoG+FE (4.6%) are 1.6 and 1.2 times larger than the improvement achieved by the SOTA method StoG [11] over the vanilla method (3.6%).

3. The additional training/inference cost of the proposed method.

We would like to introduce the training/inference cost of our method in terms of 1) Training/inference time and 2) Convergence speed comparison and analysis.

  1. Training/inference time. Tab. R3 (rebuttal appendix) presents the training/inference time of the proposed method and other SOTA methods, demonstrating that the training time added by our method is less than that of other methods under the same experimental settings (detailed in lines 218, 450-454 of the main paper). Compared to StoG [11], which optimizes additional stochastic gating factors for SNN robustness, our method demonstrates more efficient training times. In particular, our method has a significant advantage over adversarial training methods such as AT [13] and RAT [9], since it does not need additional adversarial data for training.

  2. Convergence speed comparison and analysis. As shown in Fig. R2 (rebuttal appendix), incorporating our FEEL module results in faster convergence of the training loss. This may be because EL increases the learning capacity of the network, allowing it to converge faster.

In summary, these observations further confirm the superiority of our approach in terms of training and inference costs.

Comment

Dear Reviewer uyY9:

Following extensive communication and discussion with the other two reviewers, they acknowledged the contribution of our work and subsequently raised their scores to positive ratings. In light of this and the detailed responses provided in our rebuttal, we hope our feedback has adequately addressed your concerns. We sincerely request your reconsideration of our manuscript.

If you have any further questions or comments, we would be pleased to provide additional responses. We understand that the deadline for author-reviewer discussion is approaching and are afraid you may have further comments that we cannot respond to in time once the system closes. However, we will try our best to discuss or address any remaining issues you raise in the final version of our paper.

Below, we concisely summarize our responses to your concerns to facilitate a quick review.

  • Theoretical Framework: Your first concern relates to the rationality of our theoretical framework. To address this, we provide two cases: (1) treating the leak factor as a hyperparameter and (2) treating it as a learnable parameter. These cases illustrate that the leak factor in term 1 does not affect other terms within the framework, ensuring model robustness.
  • Method Performance: Your second concern pertains to the performance of our method. We present an alternative implementation based on theoretical analysis and the average performance of FE and FEEL across various attack types, demonstrating that our robustness improvements are consistent and significant.
  • Training/Inference Costs: Your third concern involves the costs associated with training and inference. We compare (1) training/inference time and (2) convergence speed to confirm the superiority of our approach in these aspects.
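As a concrete illustration of the learnable-parameter case mentioned above, a leak factor can be kept in (0, 1) by applying a sigmoid to an unconstrained trainable value, one per neuron per time step. This is a hypothetical minimal sketch; EL's exact parameterization in the paper may differ:

```python
import math

def leak_factor(raw):
    """Map an unconstrained learnable parameter to a leak factor in (0, 1).

    Each neuron at each time step holds its own `raw` value, so the
    network can learn a distinct leak per neuron per step; gradients
    flow through the sigmoid during training.
    """
    return 1.0 / (1.0 + math.exp(-raw))

def membrane_update(v, x, raw):
    """One leaky-integration step using the learnable leak factor."""
    return leak_factor(raw) * v + x
```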

We look forward to hearing from you soon.

Best regards,

Authors of paper 3953

Comment

Dear Reviewer uyY9:

We feel incredibly fortunate to have received such valuable comments from experts like yourself. You and the other two reviewers all reviewed with the highest confidence scores and provided very professional suggestions to further improve the quality of our paper.

Following thorough discussions with the other two reviewers, we have gained significant insights, and both raised their initial ratings, from 6 to 7 and from 4 to 5, respectively. This positive feedback is really encouraging.

We are eager to engage with you and learn from your perspectives. We will await your reply until the deadline and will respond promptly to any additional concerns you may have.

We look forward to your feedback.

Best regards,

The Authors of Paper 3953

Comment

Dear Reviewer uyY9:

Thanks for your constructive suggestions on our work. Given that the discussion period is closing in the next 3 hours, if you have any further questions, please feel free to reach out to us. We will remain attentive to any new concerns and will respond promptly.

In the event that we do not receive any feedback, we assure you that all rebuttal content will be incorporated into the final version of our paper, and we will release the relevant code to ensure the reproducibility of our experiments.

Once again, we sincerely appreciate your time and contribution to improving our paper.

We look forward to your feedback.

Best regards,

The Authors of Paper 3953

Author Rebuttal

We appreciate all the reviewers for the insightful feedback. We are encouraged that they recognize the significance and urgency of our motivation [Reviewer zYHj], the novelty [Reviewer uyY9] and effectiveness [Reviewer zYHj] of our method, and the comprehensiveness [Reviewer uyY9] and extensiveness [Reviewer yJ85] of our experiments.

In response to each reviewer's comments, we have provided point-by-point replies in the corresponding sections. Figures R1-R2 and Tables R1-R5 referenced in our rebuttal are included in the newly uploaded Rebuttal Appendix PDF. The references are listed below.

We will add all additional discussions and results to the final version of our paper and release relevant codes for the reproducibility of all experiments.

Welcome further discussion during the reviewer-author discussion period if there are any additional questions.

References

[8] Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual review of neuroscience, 18(1):193–222.

[9] Ding, J., Bu, T., Yu, Z., Huang, T., & Liu, J. (2022). SNN-RAT: Robustness-enhanced spiking neural network through regularized adversarial training. Advances in Neural Information Processing Systems, 35:24780–24793.

[11] Ding, J., Yu, Z., Huang, T., & Liu, J. K. (2024). Enhancing the robustness of spiking neural networks with stochastic gating mechanisms. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 492–502.

[12] El-Allami, R., Marchisio, A., Shafique, M., & Alouani, I. (2021, February). Securing deep spiking neural networks against adversarial attacks through inherent structural parameters. In 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE) (pp. 774-779). IEEE.

[13] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.

[31] Sharmin, S., Rathi, N., Panda, P., & Roy, K. (2020). Inherent adversarial robustness of deep spiking neural networks: Effects of discrete input encoding and non-linear activations. In European Conference on Computer Vision, pages 399–414. Springer.

[A] Rathi, N., & Roy, K. (2021). Diet-snn: A low-latency spiking neural network with direct input encoding and leakage and threshold optimization. IEEE Transactions on Neural Networks and Learning Systems, 34(6), 3174-3182.

[B] Bu, T., Ding, J., Hao, Z., & Yu, Z. (2023). Rate gradient approximation attack threats deep spiking neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 7896-7906).

[C] Ding, J., Pan, Z., Liu, Y., Yu, Z., & Huang, T. (2024). Robust Stable Spiking Neural Networks. In International Conference on Machine Learning.

[D] Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A. C. (2017). Improved training of wasserstein gans. Advances in neural information processing systems, 30.

Comment

Thanks to all who contributed to this discussion. As AC, I invite authors and reviewers to kindly comment and reply on the following points which are left open. From reviewers, we need replies to the question about the significance of the robustness improvement.
From authors, we need a response to the questions of Reviewer zYHj. Considering and discussing these issues before the end of the author-reviewer rebuttal period would be very useful. Thanks, Your AC

Comment

Dear Reviewers:

Thanks for your thorough comments. We hope to know whether our response has addressed your concerns. Since the discussion period will end soon, please let us know if you have any further comments, and we will be glad to give further responses. Thanks a lot for your time and efforts!

We understand you may be in your busy schedules and have therefore provided a concise summary of our responses below to facilitate your quick review.

  • To Reviewer uyY9: Through additional experimental analysis, we validate the rationality of our theoretical framework from two perspectives and prove that our method can bring consistent and significant robustness improvements with efficient training/inference costs.

  • To Reviewer yJ85: We have further demonstrated the superiority of the FE and EL methods through frequency spectrum visualizations and various leak factor optimization techniques. Additionally, the newly introduced SNN-specific attack further confirms the effectiveness of our approach.

  • To Reviewer zYHj: From the different theoretical focus and implementation of the theoretical framework, we have highlighted the innovation of our theoretical analysis compared to previous works. The newly added ablation study illustrates the effectiveness of each module within our method.

We look forward to hearing from you soon.

Best regards,

Authors of paper 3953

Comment

Dear Area Chair qrT2:

Thanks for your time and efforts in our work. At the end of the author-reviewer discussion period, we summarize our discussion results here.

1 Reviewer uyY9 (Rating: 4, Confidence: 5)

  • The first concern is whether the leak factor in term 1 affects other terms in our theoretical framework. We demonstrate through two cases that it does not.
  • The second concern is the performance of our method. We show consistent and significant robustness improvements of our method through theoretical analysis and average performance across various attack types.
  • The third concern is the training/inference costs. We demonstrate the efficiency of our approach in both (1) training/inference time and (2) convergence speed.

2 Reviewer yJ85 (Rating: 6->7, Confidence: 5)

  • The first concern is the effect of the proposed FE method. We show that FE improves defense accuracy while maintaining clean accuracy through data observation and experimental validation.
  • The second concern is the implementation of the EL method. We provide theoretical analysis and experimental validation to demonstrate its effectiveness.
  • The third concern is the performance under SNN-specific attack. The newly added experiments confirm that both FE and FEEL achieve state-of-the-art defense accuracy and enhance robustness, even against SNN-specific attacks.

3 Reviewer zYHj (Rating: 4->5, Confidence: 5)

  • The first concern is the innovation of our theoretical analysis. We emphasize this through theoretical focus and implementation.
  • The second concern is the cognitive motivation of FE. We agree that selective attention should be more of a data-driven approach.
  • The third concern is about the ablation study. We add experiments with EL alone to illustrate the effectiveness of each module of our method.
  • The fourth concern is the difference between the EL and previous work. We emphasize this through motivation and implementation.
  • The fifth concern is how to train FE. We provide detailed experimental setups and explain the differentiable property of FE.
  • The sixth concern is the energy consumption of FE. We claim that FE does not disrupt the low-energy performance of SNNs as a data preprocessing method.
  • The seventh concern is the black-box performance of FE. All experimental results confirm that FE effectively enhances the robustness of the model both in white and black-box attacks.

After extensive discussions with Reviewers yJ85 and zYHj, they acknowledged the contribution of our work and raised their scores, from 6 to 7 and from 4 to 5, respectively. We deeply regret that Reviewer uyY9 did not participate in the discussion, so we do not know whether our response satisfied their concerns. We sincerely thank every reviewer for their suggestions and efforts. We will add all additional discussions and results to the final version of our paper and release the relevant code for the reproducibility of all experiments.

Finally, we emphasize the contributions of our paper:

  • We present a unified framework for SNN robustness, which offers a different perspective on enhancing SNN robustness.
  • Based on the proposed robustness framework, first, we propose a frequency encoding (FE) method for SNNs, which captures information of varying frequencies at different time steps, and preserves the original information while suppressing noise in different frequency ranges.
  • Second, we propose an evolutionary membrane potential leak factor (EL). EL ensures that different neurons in the network learn the optimal robustness leak factor at different time steps.
  • Experimental results validate that both our FE and EL methods consistently and significantly improve the robustness of SNNs to different types of noise, and that they can be combined with other methods to further improve robustness.
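The frequency-encoding idea above, capturing different frequency bands at different time steps, can be roughly illustrated with a hard low-pass filter in the 2D Fourier domain. This is a hedged NumPy sketch under our own assumptions; the radial mask and the cutoff schedule are hypothetical, not the paper's exact design:

```python
import numpy as np

def frequency_encode(img, t, T):
    """Hard low-pass filter of a 2D image in the Fourier domain.

    img: 2D array (H, W); t: current time step (0..T-1); T: total steps.
    The cutoff radius grows with t, so early steps keep only low
    frequencies and later steps admit progressively higher ones.
    (This schedule is illustrative, not the paper's implementation.)
    """
    H, W = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))   # DC component moved to center
    cy, cx = H // 2, W // 2
    yy, xx = np.ogrid[:H, :W]
    radius = (t + 1) / T * max(H, W) / 2    # assumed cutoff schedule
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```

Applied per time step before the SNN, such a filter preserves the low-frequency image content while attenuating high-frequency components, which is where much of the adversarial perturbation energy tends to reside.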

Thanks again for AC and all the Reviewers' time and efforts.

Best regards,

The Authors of Paper 3953

Final Decision

This paper presents a unified framework for SNN robustness. Based on this framework, the paper further proposes a frequency encoding (FE) method for SNNs to decrease input perturbations, and an evolutionary membrane potential leak factor (EL) to ensure that different neurons in the network learn the optimal robustness leak factor at different time steps, thus improving the robustness of SNNs. Experiments were conducted to verify the effectiveness of this method. Concerns raised during review about theoretical aspects, novelty, and validation were largely addressed in the rebuttal.