PaperHub
Rating: 5.2 / 10 — Poster, 6 reviewers (min 4, max 7, std dev 1.1)
Individual ratings: 7, 6, 4, 4, 5, 5
Confidence: 4.0 · Correctness: 2.8 · Contribution: 2.8 · Presentation: 2.8
NeurIPS 2024

Autonomous Driving with Spiking Neural Networks

OpenReview · PDF
Submitted: 2024-05-13 · Updated: 2024-11-06
TL;DR

We present Spiking Autonomous Driving (SAD), the first unified Spiking Neural Network (SNN) to address the energy challenges faced by autonomous driving systems through its event-driven and energy-efficient nature.

Abstract

Keywords
Spiking Neural Networks · Neuromorphic Computing · Brain-inspired Computing

Reviews and Discussion

Review (Rating: 7)

This paper introduces Spiking Autonomous Driving (SAD), an end-to-end spiking neural network (SNN) designed for autonomous driving. SAD integrates perception, prediction, and planning into a unified neuromorphic framework. It achieves competitive performance on the nuScenes dataset and significantly outperforms traditional artificial neural network (ANN) methods in terms of energy efficiency. The authors anticipate that this work will spur further research into neuromorphic computing for sustainable and robust autonomous driving solutions.

Strengths

  1. Autonomous driving can be seen as an energy-limited scenario, so I think it is meaningful to introduce SNNs to this task. To the best of my knowledge, this paper is among the first to apply SNNs to AD.
  2. This paper provides a unified framework for applying SNNs to AD. The authors employ a dual-pathway architecture, which I think is an innovative design in the use of spiking neurons.
  3. SAD achieves competitive results on BEV segmentation IoU and semantic segmentation IoU, indicating the practical viability of the proposed method. The results in Table 3 really surprised me with their great energy efficiency.

Weaknesses

  1. Notation typos. It seems that the authors make mistakes in the equations in Lines 148, 155, 157, and 169. For example, I think $X \in \{0,1\}^{C\times T\times L\times H\times W}$ instead of $X \in \mathbb{R}^{C\times T\times L\times H\times W}$, because $X$ should be a spike train rather than a floating-point matrix.
  2. I think energy efficiency is the key to this paper. However, the authors say little about this issue in Section 4.1. Can you provide more detail about how you calculate the energy consumption, along with detailed energy experiments for all 3 steps (i.e., perception, prediction, and planning)?

Questions

  1. I wonder if the numbers in your proposed SNNs are spikes or not. $\mathbb{R}$ stands for the real number field.
  2. Can you show the detailed energy results of all steps in Table 3? I think it is incomplete to show the planning step only.
  3. In Line 238, I wonder why you chose GRU to refine the trajectory. Does GRU have unique advantages on this issue?

Limitations

The authors mentioned limitations in the Conclusion Section, focusing on further validation through real-world on-vehicle testing. However, it would be better for authors to discuss limitations in more detail.

Author Response

Thank you for your thorough review and insightful comments on our manuscript. We appreciate the time and effort you've invested in providing valuable feedback. We have carefully considered your points and are pleased to address them below.

I wonder if the numbers in your proposed SNNs are spikes or not. $\mathbb{R}$ stands for the real number field.

Response: Thank you for your observation. You are correct; the SNNs indeed operate on spikes. We should clarify that the domain should be $\{0,1\}$ instead of $\mathbb{R}$, accurately reflecting the spiking nature of the neurons.

Can you show the detailed energy results of all steps in Table 3? I think it is incomplete to show the planning step only.

Response: We apologize for any confusion. To clarify, Table 3 presents the overall energy results, not just a single stage. Here's a detailed breakdown of the energy consumption for each stage:

| Step       | Energy (mJ) |
|------------|-------------|
| Perception | 41.09       |
| Prediction | 4.53        |
| Planning   | 1.30        |
| Overall    | 46.92       |

In Line 238, I wonder why you chose GRU to refine the trajectory. Does GRU have unique advantages on this issue?

Response: We chose the GRU for its effectiveness in handling temporal and spatial mixing. While both GRU and LSTM are suitable for this task, the GRU offers a slight advantage in computational efficiency: it has lower token-mixer complexity than the LSTM because it uses one fewer hidden state. This makes the GRU faster while still providing comparable performance for our specific use case.

We hope these clarifications and additional details address your concerns and improve the quality of our manuscript. Once again, we sincerely appreciate your valuable feedback, which has helped us enhance the clarity and completeness of our work.

Comment

I have thoroughly reviewed the authors' response, and they have addressed my concerns. I also read their replies to other reviewers and noticed that some reviewers had the same doubts about the energy computation as I did; my doubts about how to calculate the energy consumption were resolved by the authors' second reply to reviewer EVdw.

Comment

Thank you for your thorough review and for taking the time to read our responses to all reviewers. We are grateful for your increased rating of our paper and value your contribution to improving our work, and we appreciate your careful consideration of our explanations, especially regarding the energy computation.

Review (Rating: 6)

Although SNNs hold promise for neuromorphic computing in sustainable and safety-critical autonomous technology, they still lack evidence in complex real-world computer vision applications. In this work, the authors propose a unified end-to-end SNN called SAD that consists of three modules to generate safe trajectories, offering improved energy efficiency. The authors validate the method on the nuScenes dataset, which not only verifies its good performance but also demonstrates its effectiveness to a certain extent.

Strengths

The proposed idea is novel and relevant to the NeurIPS community.

  1. SNNs have attracted a lot of attention in recent years due to their better performance and low energy consumption, but their application potential remains to be explored. In this work, the authors investigate an end-to-end SNN-based approach applied to autonomous driving, which shows impressive performance similar to traditional neural network approaches. This increases the impact of the work.

  2. The proposed method can help reduce energy consumption and plan a safe path, which shows that SNNs have great potential to be applied in the field of autonomous driving systems.

  3. In the experiments, the authors compare the proposed method with traditional deep learning works from the last five years and obtain competitive results.

Weaknesses

Comparisons with state-of-the-art works from the last three years are lacking in the perception, prediction, and planning results in Section 4.1.

Questions

  1. In related works, the authors should add a relevant description of the connection between this research and End-to-end Autonomous Driving, as in the previous paragraph “We push...in this paper ”.

  2. In Section 4.1, the authors mention that "The results, as summarized in Tab. 1, show that our SAD method, ..., competes favorably against state-of-the-art, non-spiking artificial neural networks (ANNs)”. But I only see a specific figure of 7.43%. Please add descriptions to support it.

  3. This work lacks a discussion of the robustness of the proposed method, I would like to recommend adding experiments on more datasets.

  4. In Table 4 and Table 5, the authors do not show complete ablation experiments, e.g., SEW+SP in Table 4, SA+SR in Table 5, etc. Please provide the necessary details.

Limitations

Some details in the manuscript are not clearly stated, and the algorithmic performance needs more convincing evidence.

Author Response

We thank the reviewer for their thorough and constructive feedback on our manuscript. We appreciate the positive comments on the novelty and relevance of our work, as well as the recognition of its potential impact in the field of autonomous driving. We have carefully considered all the points raised and have addressed them in detail in the following sections. We kindly ask the reviewer to find our responses to their specific questions and concerns in the content below.

In related works, the authors should add a relevant description of the connection between this research and End-to-end Autonomous Driving, as in the previous paragraph “We push...in this paper ”.

Response: Thank you for pointing this out. We will add a paragraph that connects our research to end-to-end autonomous driving in the final version. A draft passage is provided here: "While previous work has shown SNNs' effectiveness in autonomous control tasks, our research extends to the more complex challenge of end-to-end autonomous driving. We build upon these foundations, pushing SNNs to handle the challenging, real-time decision-making required for full autonomy in dynamic, real-world environments. This work bridges the gap between simplified control tasks and the holistic approach needed for true end-to-end autonomous driving."

In Section 4.1, the authors mention that "The results, as summarized in Tab. 1, show that our SAD method, ..., competes favorably against state-of-the-art, non-spiking artificial neural networks (ANNs)”. But I only see a specific figure of 7.43%. Please add descriptions to support it.

Response: Thank you for bringing this to our attention. You're right that we should provide more comprehensive support for our statement about competing favorably against state-of-the-art ANNs. We'll revise this section to include more detailed comparisons. A draft update is provided here: "The results, as summarized in Tab. 1, show that our SAD method, which is fully implemented with spiking neural networks (SNNs), competes favorably against several state-of-the-art, non-spiking artificial neural networks (ANNs). Specifically: Our SAD method achieves a mean IoU of 35.62%, which outperforms VED (28.19%), VPN (30.36%), PON (30.52%), and Lift-Splat (34.61%). While more recent methods like IVMP (36.76%), FIERY (40.18%), and ST-P3 (42.69%) achieve higher mean IoUs, our SNN-based approach comes close to their performance, especially considering the inherent energy-efficiency of using SNNs."

This work lacks a discussion of the robustness of the proposed method, I would like to recommend adding experiments on more datasets.

Response: We appreciate your suggestion. We agree that testing on additional datasets would strengthen our work. While time constraints prevent us from conducting these experiments for this paper, we acknowledge this limitation and plan to address it in future work. Specifically, we aim to extend our experiments to include the CARLA dataset.

In Table 4 and Table 5, the authors do not show complete ablation experiments, e.g., SEW+SP in Table 4, SA+SR in Table 5, etc. Please provide the necessary details.

Response: Thank you for your observation. We appreciate the suggestion for more comprehensive ablation experiments. However, we would like to clarify that our ablation studies are designed to compare individual changes in configuration against our proposed model (shown in the last row of each table). The purpose of these experiments is to demonstrate the impact of each specific component or strategy. Combining multiple changes simultaneously (e.g., SEW+SP in Table 4 or SA+SR in Table 5) would make it difficult to isolate the effect of individual components. Moreover, as our results show, changing one configuration already leads to a decrease in performance. It's reasonable to expect that altering multiple configurations simultaneously would likely result in an even greater performance drop.

We thank the reviewer once again for their valuable feedback and insightful questions. We believe that addressing these points has helped to clarify and strengthen our work. We have provided additional context for our research, expanded on our results, acknowledged limitations, and explained our experimental design choices. We hope that these responses adequately address the reviewer's concerns and further demonstrate the significance and potential impact of our work in advancing SNN-based approaches for autonomous driving. We are committed to incorporating these improvements in the final version of our paper, should it be accepted. We appreciate the reviewer's time and expertise in evaluating our submission.

Comment

Dear Reviewer XRtX,

With the discussion period drawing to a close, we wanted to extend our heartfelt appreciation for your insightful comments and the positive feedback on our work, which has been incredibly helpful to us. We would be immensely grateful if you could kindly inform us whether our responses have resolved your concerns or if you have any additional questions that we can assist with.

Comment

Dear Reviewer XRtX,

We're writing to kindly remind you that today is the final day of our author-reviewer discussion period. If you have any remaining concerns, we would be most grateful if you could share them with us as soon as possible.

Thank you once again for your positive review. Your insights have been invaluable, and we sincerely appreciate your time and expertise.

Review (Rating: 4)

The authors adapted ST-P3 into a version with a binary spiking neuron and then stated that autonomous driving based on a spiking neural network can address the energy challenges. The authors stated that this neuromorphic technology can be a step toward sustainable and safety-critical automotive technology.

Strengths

This work has completed the end-to-end SNN-based autonomous driving pipeline, which is a heavy job.

The work has a clear architecture. This work is easy to follow.

The idea of using SNN in driving technology is interesting.

Weaknesses

The evaluation is not as thorough as that of its reference work, ST-P3, which includes both open-loop and closed-loop validation.

The implementation plan on neuromorphic hardware is not clear in this paper, which I think is the most important aspect for energy-efficient applications.

No comparison with recent state-of-the-art ANN models (later than ST-P3), as the necessity of autonomous driving using SNN is an open question. It is better to include more comparison and further discussion.

Questions

What is the advantage of the SNN solution from the perspective of a safety-critical application? As this work is oriented towards this specific application, it is better to clearly state the advantages. Do you have specific safety considerations in the design of this model? Do you include any related work on SNN robustness in this paper?

Dual pathways and other architecture designs manifest a problem of mapping on neuromorphic hardware. I suspect that SGRU is as energy efficient as simple LIF. Eqs. 6, 7, 8, and 9 imply heavy matrix computation. Can you provide an explanation for this?

ANN seems more robust than SNN (Figs. 5, 7). In the figures of ANN outputs, are the vehicle markers clearer and more accurate? What do you think SNN will do to lead to this result? Do you have any solutions to this? Can you include experiments on this to address this problem? (This question is important.)

Limitations

See questions and weaknesses.

Author Response

We thank the reviewer for their thorough examination of our work and their insightful comments. We acknowledge the challenges highlighted in your review and appreciate the opportunity to address them. In the following responses, we aim to clarify our contributions, explain our methodological choices, and discuss the implications of our findings for the field.

What is the advantage of the SNN solution from the perspective of a safety-critical application? As this work is oriented towards this specific application, it is better to clearly state the advantages. Do you have specific safety considerations in the design of this model? Do you include any related work on SNN robustness in this paper?

Response: Thank you for your question. We'd like to clarify our focus and address your points:

The primary aim of our work is to demonstrate the potential of SNNs to handle the complex requirements of low-power autonomous driving. Our current research is centered on establishing the feasibility and efficiency of SNN-based models for this application.

To our knowledge, there is ongoing research into robust SNNs [R1], demonstrating their potential in safety-critical applications. This existing work provides strong evidence for the viability of SNNs in scenarios where safety is paramount, such as autonomous driving.

Prior to this work, the application of SNNs to autonomous vehicle planning had not been achieved, so integrating robustness, interpretability, and safety-critical features is a logical next step beyond efficiency.

Moreover, we appreciate your suggestion about including related work on SNN robustness. This is a valuable addition that we plan to incorporate in the final version of our paper.

Dual pathways and other architecture designs manifest a problem of mapping on neuromorphic hardware. I suspect that SGRU is as energy efficient as simple LIF. Eqs. 6, 7, 8, and 9 imply heavy matrix computation. Can you provide an explanation for this?

Response: Thank you for highlighting this point. The dual pathways and other complex architectural designs do present challenges when mapping onto neuromorphic hardware. However, it is worth noting that ResNet-like designs have been extensively explored in the field of Spiking Neural Networks (SNNs) and successfully implemented in hardware, which suggests that dual-pathway architectures are quite straightforward to realize in neuromorphic systems. Given that most modern neuromorphic hardware consists of many smaller cores, parallel layers would be executed in separate cores, with their results merged in another core, quite similar to residual/skip connections.

Regarding the energy efficiency of SGRU compared to simple LIF neurons, your intuition may be correct. The key lies in understanding the nature of the computations in Equations 6, 7, 8, and 9. While these equations might initially appear to involve heavy matrix computations, the actual implementation in SGRU is more efficient due to the binary nature of the signals.

In the SGRU, the input $x_t \in \{0,1\}$ represents spikes. This binary representation simplifies the computations $W_{ir}x_t$, $W_{iz}x_t$, and $W_{in}x_t$. Similarly, since $h_t = (1 - z_t) \odot n_t + z_t \odot h_{t-1}$, where $z_t \in \{0,1\}$ and $n_t \in \{0,1\}$, it follows that $h_t$ is also binary. Consequently, the terms $W_{hr}h_{t-1}$, $W_{hz}h_{t-1}$, and $W_{hn}h_{t-1}$ become spike-driven operations.
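To make the spike-driven nature of these products concrete, here is a minimal numerical sketch (our own illustration, not the authors' implementation; the dimensions and weights are arbitrary). With a binary input, a weight-matrix product reduces to summing the weight columns selected by the spikes, i.e. accumulate-only arithmetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; not taken from the paper.
d_in, d_hidden = 8, 4
W_ir = rng.normal(size=(d_hidden, d_in))     # reset-gate input weights
x_t = rng.integers(0, 2, size=d_in)          # binary spike input, x_t in {0, 1}

# Dense view: a full matrix-vector product (multiply-accumulate).
dense = W_ir @ x_t

# Spike-driven view: because x_t is binary, the product is just the sum of
# the weight columns where a spike occurred -- additions only, no multiplies.
spike_driven = W_ir[:, x_t == 1].sum(axis=1)

assert np.allclose(dense, spike_driven)
print(spike_driven)
```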

Comment

ANN seems more robust than SNN (Figs. 5, 7). In the figures of ANN outputs, are the vehicle markers clearer and more accurate? What do you think SNN will do to lead to this result? Do you have any solutions to this? Can you include experiments on this to address this problem? (This question is important.)

Response: Thank you for this important observation. You're correct that the ANN outputs appear more robust than the SNN outputs in Figures 5 and 7, with clearer and potentially more accurate vehicle markers. This difference highlights a crucial challenge in the field of neuromorphic computing.

The primary reason for this disparity lies in the fundamental nature of SNNs. While SNNs offer significant advantages in terms of energy efficiency due to their discrete, spike-based processing, this same characteristic can lead to information loss during training and inference. This trade-off between efficiency and performance is a well-known challenge in the SNN domain. The discretization of information into spikes, while beneficial for efficiency, can result in a reduction of fine-grained details, potentially leading to less clear outputs compared to traditional ANNs. This performance gap is something the entire field is working to overcome.

However, it's important to note that our work represents a significant step forward in applying SNNs to complex, real-world tasks like autonomous driving. We've made substantial efforts to bridge this performance gap:

  1. We've developed novel architectural approaches that are specifically tailored to maintain spike-driven processing while improving performance in complex tasks.
  2. Our training methods have been carefully designed to maximize the information carried by spikes, helping to mitigate some of the inherent limitations of spike-based computation.
  3. We've introduced innovative components, such as our unique Spiking Token Mixer, which have allowed us to achieve competitive performance without relying on traditional ANN elements.

These efforts have made it possible to apply SNNs to autonomous driving tasks, which was previously considered extremely challenging. While there's still room for improvement, our work demonstrates that SNNs can be viable for such complex applications.

Regarding robustness, you've highlighted an important point that, while not the primary focus of this paper, is crucial for the practical application of SNNs in safety-critical systems like autonomous driving. Improving SNN robustness is indeed a vital direction for future research. There are several promising approaches to enhance SNN robustness, such as developing more advanced neuron models, exploring different surrogate gradient methods, or applying adversarial training techniques. These methods could potentially be integrated with our current work to further improve performance and reliability.

In conclusion, while the current performance gap between SNNs and ANNs is evident in our results, our work represents a significant advancement in applying SNNs to complex, real-world tasks. We've laid a foundation that future research can build upon, not only to further improve performance but also to address critical aspects like robustness. This opens up exciting possibilities for the future of energy-efficient, neuromorphic computing in autonomous driving and other demanding applications.


[R1] Ding J, Yu Z, Huang T, et al. Enhancing the robustness of spiking neural networks with stochastic gating mechanisms [AAAI]

Comment

Dear Reviewer j4mM,

As we near the conclusion of the discussion period, we wanted to extend our sincere gratitude for your valuable feedback. We would be most appreciative if you could let us know whether our responses have resolved your concerns or if there are any areas where you feel further elaboration would be beneficial.

Comment

Dear Reviewer j4mM,

This is a kind reminder that today is the last day of the author-reviewer discussion period. If you have any concerns, please let us know as soon as possible so that we can address them.

Review (Rating: 4)

This paper presents an end-to-end SNN model for autonomous driving to address its energy challenges. The model consists of three main modules: perception, prediction, and planning. It is evaluated on the nuScenes dataset.

Strengths

  1. This paper introduces the first SNN designed for end-to-end autonomous driving, integrating perception, prediction, and planning into a single model.

  2. The paper is well written.

Weaknesses

  1. The novelty is limited. The paper simply applies an SNN to autonomous driving and does not provide any special designs.

  2. Only one dataset is used.

  3. The energy computation does not consider data movement.

Questions

See Weaknesses.

Limitations

None

Author Response

We appreciate the time and effort the reviewer has dedicated to evaluating our work. Your feedback is valuable in helping us improve and clarify our research.

The novelty is limited. The paper just uses the SNN to autonomous driving and does not provide any special designs.

Response: We appreciate the reviewer's comments but respectfully disagree with the assessment regarding novelty and the lack of special designs. Our research makes several significant contributions to the field.

Firstly, unlike tasks such as object detection, semantic segmentation, and classification that have previously been addressed by SNNs, we tackle autonomous driving, a task widely recognized as highly complex in the visual domain. To our knowledge, this application of SNNs to autonomous driving is the first of its kind in the field. If there are existing papers on this topic, we kindly request the reviewer to point them out.

Secondly, the claim that there are no special designs is inaccurate. Our module in Section 3.1, "Distinct Temporal Strategies for Encoder and Decoder," combines Sequential Alignment with Sequential Repetition, which is a novel approach in the SNN domain while maintaining a spike-driven architecture. Furthermore, the training process of the Spiking Token Mixer, a core component of our encoder, is entirely unique: as detailed in Appendix B.1, we avoided using any attention or convolutional components, as well as MLP-Mixer-type token mixers. Instead, our approach achieved 72.1% accuracy on ImageNet, surpassing other Spiking Transformers.

Thirdly, the design of the Spiking GRU is an original contribution not previously proposed in other papers. Our ablation study demonstrates how our architecture outperforms alternative designs, some of which struggle to achieve meaningful performance.

Contrary to the reviewer's assertion, we adhere to Occam's razor, seeking the most appropriate design rather than pursuing novelty for its own sake. We firmly believe that truly effective designs, not merely novel ones, are what advance the machine learning community.

Only one dataset is used.

Response: We'd like to address this concern from several angles. Firstly, comparable datasets for end-to-end autonomous driving are scarce. Several prominent works in this field, such as [R1] and [R2], have also exclusively utilized the nuScenes dataset, highlighting its significance and broad acceptance in the research community. Moreover, it's not entirely accurate to say we only used one dataset. As detailed in Appendix B.1, we also tested the crucial part of our method's performance on ImageNet to verify its robustness and generalizability. We believe that the combination of in-depth analysis on a specialized autonomous driving dataset and performance verification on a general, large-scale dataset provides a comprehensive evaluation of our method's effectiveness and potential for broader applications.

The energy computation does not consider data movement.

Response: Thank you for your observation. You're correct that our energy computation does not consider data movement. This approach is consistent with standard practices in SNN power estimation. Typically, these estimations focus on computation using FLOPS [R1][R2][R3]. It's important to note that this is a common limitation in the SNN field as a whole. However, the field is evolving, and new SNN hardware based on in-memory computing is emerging [R4]. This development is part of the broader trend in neuromorphic hardware design, which aims to address energy efficiency issues, including those related to data movement.

In conclusion, we would like to emphasize that our work represents a novel and significant contribution to the field of SNNs and autonomous driving. We have introduced unique architectural designs, applied SNNs to a complex real-world problem, and provided comprehensive evaluations on both specialized and general datasets. While we acknowledge that there is always room for improvement and further exploration, we believe our research pushes the boundaries of what is possible with SNNs in autonomous driving applications. We hope that our clarifications have addressed the reviewer's concerns and demonstrated the value and novelty of our work. We remain open to further discussion and are committed to advancing this important area of research.


[R1]:Zhou, Zhaokun, et al. "Spikformer: When spiking neural network meets transformer." arXiv preprint arXiv:2209.15425 (2022).

[R2]:Yao, Man, et al. "Spike-driven transformer." Advances in neural information processing systems 36 (2024).

[R3]:Zhu, Rui-Jie, et al. "Spikegpt: Generative pre-trained language model with spiking neural networks." TMLR 2024.

[R4]: El Arrassi, Asmae, et al. "Energy-efficient SNN implementation using RRAM-based computation in-memory (CIM)." 2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC). IEEE, 2022.

Comment

Thanks for your response. My second concern has been well addressed. However, I still feel the novelty is limited and the energy computation is biased. I have raised my score. I hope the authors can provide a more detailed analysis of the energy computation.

Comment

Thank you for considering our rebuttal and raising the score. Regarding energy consumption, we have provided a detailed explanation in Appendix F. We'd like to further elaborate on two main aspects of energy consumption:

Memory Access

Memory access is critical for model inference and training latency. Techniques like Flash Attention [R1] use fused kernels to reduce memory access frequency, significantly improving the efficiency of attention kernels. However, memory access is not directly proportional to energy consumption; energy is also related to the number of synaptic operations (SOPs) [R2] when performing inference on CPUs and GPUs. SOPs refer to the number of operations performed in the neural network.

As we mentioned in the rebuttal, in neuromorphic hardware, a new paradigm called in-memory computing is emerging. Designs like IBM's NorthPole [R3] and Intel's Loihi [R4] partially adopt this concept. In these systems, synaptic weights between neurons are determined by synapse strength, greatly reducing latency and energy consumption while fully utilizing sparsity.

Compute Energy

Unlike memory access, compute energy is highly correlated with the SOPs in neural networks, even though it's not the primary bottleneck for training/inference latency. As shown in [R5], the number of neural network SOPs is almost directly proportional to energy consumption.

SNNs gain energy efficiency mainly through two aspects:

a) Binary activations

In ANNs, the SOPs primarily involve MAC (Multiply-Accumulate) operations; but in SNNs, all MAC operations can be performed without multiplication, using only addition. This can be represented mathematically as:

ANN: $y = \sum_{i} w_i x_i$, where $x_i \in \mathbb{R}$

SNN: $y = \sum_{i} w_i x_i$, where $x_i \in \{0,1\}$

b) Sparsity

The binary nature of activations results in many zeros, which neuromorphic chips can exploit to create event-driven sparsity.

These two factors—addition-only operations and sparsity—form the foundation of SNN's superior energy efficiency. In modern neuromorphic chips like Loihi 2 [R4] and Speck [R6], sparsity is the primary factor in reducing computational energy. They leverage the graph-like nature of neural networks, constructing each neuron as a router rather than representing synapses as matrices, as in traditional GPUs.

Energy Consumption Calculation

To quantify the energy efficiency of our SNN architecture, we calculate the theoretical energy consumption using the following methodology:

  1. Calculate Synaptic Operations (SOPs) for each block:

    $\operatorname{SOPs}(l) = fr \times T \times \operatorname{FLOPs}(l)$

    where:

    • $l$ is the block number
    • $fr$ is the firing rate of the input spike train
    • $T$ is the neuron time step
    • $\operatorname{FLOPs}(l)$ is the number of floating-point operations in the block
  2. Compute SNN energy consumption:

    $E_{SNN} = E_{MAC} \times \mathrm{FLOP}^{1}_{\mathrm{SNN\,Conv}} + E_{AC} \times \left(\sum_{n=2}^{N} \mathrm{SOP}^{n}_{\mathrm{SNN\,Conv}} + \sum_{m=1}^{M} \mathrm{SOP}^{m}_{\mathrm{SNN\,FC}}\right)$

    where:

    • $E_{MAC} = 4.6\,\mathrm{pJ}$ (MAC operation energy cost)
    • $E_{AC} = 0.9\,\mathrm{pJ}$ (AC operation energy cost)
    • $N$ and $M$ are the numbers of Conv and FC layers, respectively
    • The first Conv layer uses direct encoding, employing MAC operations, so FLOPs are used for its energy calculation.
  3. For comparison, ANN energy consumption:

    $E_{ANN} = E_{MAC} \times \mathrm{FLOP}_{\mathrm{ANN}}$

This approach allows us to directly compare the energy efficiency of our SNN with ANN models.
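For concreteness, a minimal sketch of how such an estimate could be computed from the formulas above (the firing rates, FLOP counts, and layer structure below are hypothetical placeholders, not values from the paper):

```python
# Illustrative energy estimate following the formulas above.
E_MAC = 4.6e-12  # J per multiply-accumulate operation
E_AC = 0.9e-12   # J per accumulate operation

T = 4  # neuron time steps (hypothetical)
# (type, firing rate, FLOPs) per block; the first Conv block uses direct encoding (MAC).
blocks = [
    {"type": "conv", "fr": 1.00, "flops": 2.0e9},  # first Conv layer: FLOPs used directly
    {"type": "conv", "fr": 0.15, "flops": 1.5e9},
    {"type": "fc",   "fr": 0.10, "flops": 0.5e9},
]

e_snn = 0.0
for i, b in enumerate(blocks):
    if i == 0:
        e_snn += E_MAC * b["flops"]          # direct-encoding layer uses MAC energy
    else:
        sops = b["fr"] * T * b["flops"]      # SOPs(l) = fr * T * FLOPs(l)
        e_snn += E_AC * sops

e_ann = E_MAC * sum(b["flops"] for b in blocks)  # E_ANN = E_MAC * FLOP_ANN

print(f"Estimated SNN energy: {e_snn * 1e3:.2f} mJ")
print(f"Estimated ANN energy: {e_ann * 1e3:.2f} mJ")
```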

We hope this explanation provides a more comprehensive justification for our energy consumption calculations. Thank you once again for your constructive comments and for giving us the opportunity to improve our paper.


[R1]:Dao, Tri, et al. "Flashattention: Fast and memory-efficient exact attention with io-awareness." Advances in Neural Information Processing Systems 35 (2022).

[R2]:Tripp, Charles Edison, et al. "Measuring the Energy Consumption and Efficiency of Deep Neural Networks: An Empirical Analysis and Design Recommendations." arXiv:2403.08151.

[R3]: Modha, Dharmendra S., et al. "Neural inference at the frontier of energy, space, and time." Science 2023.

[R4]: Orchard, Garrick, et al. "Efficient neuromorphic signal processing with loihi 2." 2021 IEEE Workshop on SIPS

[R5]:Lahmer, Seyyidahmed, et al. "Energy consumption of neural networks on nvidia edge boards: an empirical model." IEEE, 2022.

[R6]:Richter, Ole, et al. "Speck: A smart event-based vision sensor with a low latency 327k neuron convolutional neuronal network processing pipeline." (2024).

Review (Rating: 5)

This paper presents a Spiking Neural Network to reduce the energy consumption of conventional neural networks in autonomous driving. The network is an end-to-end architecture comprising perception, prediction, and planning. The model is evaluated on the nuScenes dataset and achieves comparable performance across the three modules while drawing much lower energy consumption.

Strengths

The authors propose a good architecture and describe all of the modules in detail: perception, prediction, and planning. I enjoyed the reading and found the detailed presentation of the architecture very good, especially the results, visualization, and ablation study. In general, I think this is a good paper. The reason I rate it at 5 is that I am not familiar with work on energy consumption using spiking neural networks.

Weaknesses

I do not list weaknesses here, as I am not familiar with this line of work.

Questions

  • Do the authors know if any self-driving company uses spiking neural networks in their real operations? This could justify the impact of this work.
  • Energy is saved, but what about memory and inference time?

Limitations

I think the work addresses the proposed question about reducing energy consumption. However, I do not have expert knowledge of spiking neural networks, so I shall refrain from commenting on limitations.

Author Response

Thank you for your insightful questions regarding the practical implementation of SNN in self-driving systems. While it's true that SNNs are not yet widely deployed in commercial self-driving operations, our research aims to pave the way for their future adoption by demonstrating their viability and potential advantages in this domain.

Do the author know if any self-driving company use spiking neural network in their real operation? This can justify the impact of this work.

Response: While there are several companies exploring the use of SNNs in autonomous driving, the exact information on their progress is currently not in the public domain. To the best of our knowledge, this is the first work to use an SNN to perform complex motion planning given multi-camera input.

Mercedes Benz has begun implementing neuromorphic chips in its concept car, EQXX, specifically for voice detection. Similarly, BMW has announced plans to use neuromorphic chips in their vehicles. However, we've observed that in most cases, these chips are being used to solve relatively simple tasks. This aligns with the limitations of SNNs that we discussed in our paper, where they are often perceived as unable to solve complex problems.

Secondly, Intel's simultaneous development of both the Loihi 2 neuromorphic chip and Mobileye's self-driving processors presents a unique opportunity for integrating SNNs into autonomous driving systems. This combination of technologies within the same company creates a potential pathway for implementing SNN-based self-driving models on neuromorphic hardware.

Lastly, the primary advantage of deploying SNN models on neuromorphic chips like Loihi 2 for self-driving applications would be significantly reduced power consumption. This is crucial for electric and autonomous vehicles, where energy efficiency directly impacts range and overall performance. Therefore, while not yet widely implemented, the potential benefits of SNNs in self-driving technology make it a promising area for future development and application.

Energy is saved but how about memory, inference time?

Response: You raise an important point about memory usage and inference time.

In our current stage of research, we've primarily validated the performance of our SNN-based self-driving model on standard hardware like GPUs. At this point, SNNs don't show significant advantages in terms of memory usage or inference time compared to traditional neural networks on these platforms.

However, the real potential of SNNs for self-driving applications is in their deployment on neuromorphic hardware such as Intel's Loihi 2 chip. The asynchronous data processing capabilities of Loihi 2 are expected to greatly enhance efficiency across multiple dimensions, including memory usage and inference time.

It's worth noting that previous work has already demonstrated the feasibility and benefits of implementing self-driving tasks on neuromorphic hardware. For instance, [R1] successfully applied SNNs to a self-driving classification task on the Loihi chip. While this was a specific subtask of autonomous driving, it provided strong evidence that SNNs can indeed work effectively on neuromorphic hardware for automotive applications.

In future stages of our research, we plan to implement our model on Loihi 2 or a similar neuromorphic platform. This transition from standard GPUs to specialized neuromorphic hardware is expected to unlock the full potential of SNNs for self-driving applications, addressing current limitations in memory usage and inference time while maintaining the energy efficiency advantages.

We appreciate the reviewer's thoughtful questions and comments. Your feedback has allowed us to clarify important aspects of our work and its potential impact on the field of autonomous driving.


[R1]: Viale A, Marchisio A, Martina M, et al. Carsnn: An efficient spiking neural network for event-based autonomous cars on the loihi neuromorphic research processor, 2021 IJCNN.

Comment

Thank you, authors, for providing a detailed response. I am satisfied with your answers. This is an important piece of work for reducing the energy consumption of autonomous driving, and I believe it is quite novel, as this is the first time I have read about an energy-efficient architecture using SNNs. I keep my score at 5 due to my low confidence in this area. I appreciate the authors' responses to the other reviewers and hope the overall quality of the paper can be improved regardless of the decision.

Comment

Thank you for your kind words and valuable feedback. We are deeply grateful for your response, which motivates us to continue our research in energy-efficient SNN architectures for autonomous driving.

Review (Rating: 5)

This paper introduces a unified Spiking Autonomous Driving (SAD) system based on Spiking Neural Networks (SNNs). The system is trained end-to-end and comprises three main modules: a perception module that processes inputs from multi-view cameras and constructs spatiotemporal bird's-eye views; a prediction module that forecasts future states using a novel dual-pathway approach; and a planning module that generates safe trajectories based on the predictions. The authors evaluated SAD on the nuScenes dataset, where it demonstrated competitive performance in perception, prediction, and planning tasks. By leveraging its event-driven and energy-efficient nature, SAD effectively addresses the energy challenges faced by autonomous driving systems.

Strengths

  1. This paper constructs the first end-to-end spiking neural network designed specifically for autonomous driving. By integrating perception, prediction, and planning into a single network, it enhances efficiency and reduces the complexity associated with managing these components separately.

  2. The experimental results are comprehensive and persuasive, with a thorough comparison with other works, demonstrating the significant potential of SNNs in complex real-world applications.

Weaknesses

  1. Energy Consumption Analysis: The main motivation for implementing autonomous driving tasks with SNNs in this paper is to enhance energy efficiency, with a corresponding energy efficiency analysis provided in Appendix F and Table 3. Nonetheless, I have reservations about the energy consumption calculations presented for the following reasons:
    • Since both ANNs and SNNs process the same input data (with SNNs employing direct encoding at the first layer), the difference in energy efficiency can only stem from differences in memory access or the energy consumption of MAC/AC operations. This paper only offers a quantitative comparison of energy consumption during computation (L748-L768).
    • It is well-known that energy consumption is primarily determined by memory access rather than FLOPS or IPS [1r], yet this aspect is not discussed in the paper, nor is it considered in the energy consumption calculations. This should at least be acknowledged, as it is a common shortcoming in publications claiming reduced energy consumption. Could the authors comment on how memory access impacts the energy consumption of SNNs?
    • ANN networks typically exhibit similarly sparse ReLU activations. Previous research has shown that even on general-purpose CPU hardware, the sparsity of ReLU can be utilized to accelerate (and reduce energy consumption of) inference [2r]. How would the comparison results between SNNs and ANNs change if ANN sparsity were considered?

[1r]: An Analytical Estimation of Spiking Neural Networks Energy Efficiency. ICONIP 2022

[2r]: Inducing and Exploiting Activation Sparsity for Fast Neural Network Inference. ICML 2020

  2. Inference Speed: In comparisons with six ANN methods, SNN performance only surpassed half of the existing methods. Assuming SNNs have lower energy consumption, we would consider using SNNs for autonomous driving tasks. What about their inference speed in practical deployments? Autonomous driving tasks rely on high dynamic resolution and timely responsiveness, such as detecting and avoiding obstacles on highways.

Questions

Referencing the Weakness section, I recognize that some evaluation methods are commonly used within the SNN community. However, due to differences in task formulations, explanations and clarifications are necessary.

Limitations

N/A

Author Response

We sincerely appreciate the thoughtful comments and questions raised by the reviewers. Your feedback has provided us with valuable insights that will help improve our manuscript. We are grateful for the opportunity to address these points and clarify certain aspects of our work. In the following responses, we aim to address each of your concerns comprehensively while highlighting the strengths and potential impact of our research on SNNs for autonomous driving tasks.

Since both ANNs and SNNs process the same input data (with SNNs employing direct encoding at the first layer), the difference in energy efficiency can only stem from differences in memory access or the energy consumption of MAC/AC operations. This paper only offers a quantitative comparison of energy consumption during computation (L748-L768). It is well-known that energy consumption is primarily determined by memory access rather than FLOPS or IPS [1r], yet this aspect is not discussed in the paper, nor is it considered in the energy consumption calculations. This should at least be acknowledged, as it is a common shortcoming in publications claiming reduced energy consumption. Could the authors comment on how memory access impacts the energy consumption of SNNs?

Response: Thank you for your insightful question. We acknowledge that this issue indeed exists. The field of SNNs has not specifically focused on memory access, despite the fact that data movement is indeed the primary bottleneck in modern deep learning. We will certainly address this point in our revised manuscript.

SNNs utilize binary activations, which naturally lead to activation sparsity. This sparsity can be leveraged to reduce memory access frequency and enable more efficient inference, potentially allowing for partial offloading to CPU DRAM [R1]. Additionally, the lower precision of activations in SNNs contributes to reduced memory bandwidth requirements.

Furthermore, neuromorphic hardware inspired by in-memory computing concepts, such as RRAM-based architectures, shows promise in eliminating traditional memory access costs altogether when paired with SNNs [R2]. These factors combined contribute significantly to the overall energy efficiency of SNN implementations, beyond what is captured by comparing only the MAC/AC operations.

Collectively, these characteristics of SNNs and their potential hardware implementations suggest that the energy efficiency gains may be even more substantial when considering memory access, though we agree that this aspect warrants more thorough investigation and quantification in future research.

ANN networks typically exhibit similarly sparse ReLU activations. Previous research has shown that even on general-purpose CPU hardware, the sparsity of ReLU can be utilized to accelerate (and reduce energy consumption of) inference [2r]. How would the comparison results between SNNs and ANNs change if ANN sparsity were considered?

Response: Thanks for your question. In our examination of ANNs, we focused on ST-P3, the state-of-the-art ANN for these tasks. We found that the main challenge in utilizing sparsity across the entire network stems from the varied use of activation functions. Specifically, only the encoder and decoder head employ ReLU, which exhibits sparsity. Other crucial components such as parts of the decoder, temporal module, and planning module use different activation functions like tanh (for GRU) and GeLU (for partial decoder blocks). These functions are not inherently sparse, resulting in a model that is more dense than sparse overall.

This heterogeneity in activation functions makes it difficult to fully leverage the potential benefits of sparsity throughout the ANN. In contrast, our SNN model utilizes sparse and binary activations across all layers, enabling us to fully exploit both the sparsity and binary nature of the network. This comprehensive approach to sparsity in SNNs provides a more consistent basis for comparison and potential performance advantages.
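To illustrate why only the ReLU portions of such an ANN expose exploitable sparsity, here is a small sketch of our own (not from the paper's experiments), assuming standard-normal pre-activations: ReLU produces exact zeros that hardware can skip, while tanh and GELU outputs are almost never exactly zero.

```python
import numpy as np

rng = np.random.default_rng(0)
pre_act = rng.normal(size=100_000)  # hypothetical pre-activation values

relu = np.maximum(pre_act, 0.0)
tanh = np.tanh(pre_act)
# tanh-based GELU approximation
gelu = 0.5 * pre_act * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (pre_act + 0.044715 * pre_act**3)))

for name, act in [("ReLU", relu), ("tanh", tanh), ("GELU", gelu)]:
    print(f"{name}: fraction of exactly-zero activations = {np.mean(act == 0.0):.2f}")
# ReLU zeroes out roughly half of the values (sparsity a chip can skip);
# tanh and GELU almost never produce exact zeros, so the layers stay dense.
```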

Inference Speed: In comparisons with six ANN methods, SNN performance only surpassed half of the existing methods. Assuming SNNs have lower energy consumption, we would consider using SNNs for autonomous driving tasks. What about their inference speed in practical deployments? Autonomous driving tasks rely on high dynamic resolution and timely responsiveness, such as detecting and avoiding obstacles on highways.

Response: Thank you for your question. As stated in our paper, our primary goal is to demonstrate the potential of SNNs to handle the complex requirements of low-power autonomous driving. Neuromorphic chips can achieve extremely low latency (about 200 times less) compared to conventional chips, while also maintaining low power consumption. For example, Speck [R3] can achieve <0.1 ms latency when implementing SNN on-chip, compared to 24.7 ms latency when not implemented on-chip. Intel's Loihi chip also shows superior latency when executing similar tasks [R4]. We plan to implement our algorithm on these chips to fully utilize their capabilities.

In conclusion, we thank the reviewers for their insightful comments, which have allowed us to provide a more comprehensive view of our work.


[R1]:Song, Yixin, et al. "Powerinfer: Fast large language model serving with a consumer-grade gpu." arXiv preprint arXiv:2312.12456 (2023).

[R2]:El Arrassi, Asmae, et al. "Energy-efficient SNN implementation using RRAM-based computation in-memory (CIM)." 2022 IFIP/IEEE 30th International Conference on Very Large Scale Integration (VLSI-SoC). IEEE, 2022.

[R3]: Yao M, Richter O, Zhao G, et al. Spike-based dynamic computing with asynchronous sensing-computing neuromorphic chip. Nature Communications, 2024.

[R4]: Viale A, Marchisio A, Martina M, et al. Carsnn: An efficient spiking neural network for event-based autonomous cars on the loihi neuromorphic research processor, 2021 IJCNN.

Comment

Dear Reviewer x6bz,

As the discussion period is drawing to a close, we wanted to reach out and kindly inquire if our rebuttal has adequately addressed the concerns you raised. We would be grateful for any feedback on whether our responses have clarified the points in question or if you have any additional queries we can assist with.

Comment

Thank you for the authors' response. I carefully reviewed the relevant content. The authors discussed the sparsity of SNNs compared to ANNs and their low latency during inference, which convinces me that this research is indeed meaningful. Nevertheless, many aspects of the authors' response remain commitments rather than demonstrated results, such as the deployment of the algorithm on a chip. While this requires further validation, it is not a sufficient reason for rejection. Therefore, I have decided to raise my score from 4 to 5.

Comment

Thank you for your recognition and increased score, and your decision to raise the rating from 4 to 5 is a great encouragement to us. We truly appreciate your careful review of our work and your understanding of its significance.

Indeed, implementing our algorithm on a chip is a challenging process, but we hope our work can at least demonstrate that SNNs are capable of performing autonomous driving tasks, rather than just simple operations like voice wake-up on vehicles. The latter is currently the main function of the neuromorphic chip installed in the Mercedes concept car EQXX.

We hope our research can push the boundaries of what SNNs can achieve in automotive applications. In the future, we will also attempt a hardware implementation of our approach.

Comment

We would like to express our sincere gratitude to all six reviewers for their time and effort in evaluating our work. We are particularly thankful to the four reviewers who actively participated in the discussion period: Reviewers x6bz, GVJq, EVdw, and S2mt. Their engagement and insightful feedback have been invaluable in refining and strengthening our research.

We are pleased to note that three of the four participating reviewers increased their scores by one point following our responses, indicating that our rebuttals successfully addressed many of their initial concerns. Reviewer GVJq, while maintaining their original score, still found our responses satisfactory.

All responding reviewers indicated that our rebuttals either fully or partially resolved their concerns:

  • Reviewer x6bz acknowledged the significance of our research in applying SNNs to autonomous driving tasks.
  • Reviewer GVJq recognized the potential impact of our work in advancing SNN-based approaches for autonomous driving.
  • Reviewer EVdw noted that our second concern regarding the dataset has been well addressed.
  • Reviewer S2mt found their doubts about energy calculations resolved after reading our responses.

We are also grateful for the positive feedback from the other reviewers:

  • Reviewer j4mM recognized that "this work has completed the end-to-end SNN-based autonomous driving pipeline, which is a heavy job."
  • Reviewer XRtX mentioned that "The proposed idea is novel and relevant to the NeurIPS community."

Once again, we extend our heartfelt thanks to all six reviewers and the Area Chair for their expertise and dedication in evaluating our submission. Your collective insights have been crucial in enhancing the quality and impact of our work.

Final Decision

This submission presents Spiking Autonomous Driving (SAD), the first ever end-to-end framework that uses spiking neural networks (SNNs) to build sub-blocks that target perception, prediction and planning tasks. This framework is trained end-to-end using a single cost function. SAD demonstrates comparable accuracies to existing artificial neural network (ANN) based baselines highlighted in the paper across the three tasks targeted using the nuScenes dataset. In addition thanks to the event-driven nature and single bit spike encoding of SNNs SAD achieves significant reduction in the energy consumption (up to ~70x) compared to other state-of-the-art Autonomous Driving ANNs.

The energy estimates presented are analytically calculated based on op count and energy per op. This is a good first-order estimate, but it ignores the costs of operations like Batch Normalization, the Heaviside step function, tensor fusion, etc. Also, it's unclear how the authors account for the energy of the motion planning block (bicycle model) for SAD and the other baselines. Regardless, the event-driven nature of SNNs, along with the significant reduction in bits per op for multiply-accumulate operations, will in my opinion lend SAD a notable energy-efficiency benefit. Many reviewers pointed out that this analytical energy estimation misses the cost of memory accesses and data movement. The authors addressed those concerns aptly in their rebuttal by citing the fewer bits used by SNNs for their accumulation operations and recent SNN hardware implementations using non-volatile memories that further lower the cost of data movement.

The reviewers also raised concerns related to the lack of recent baselines (post ST-P3), the lack of closed-loop validation (unlike ST-P3), and an insufficient discussion of the robustness of SNNs for deployment in a safety-critical application like autonomous driving. While the first two concerns remain unanswered, the authors intend to add a robustness analysis to their future-work section.

Despite these limitations, the paper does present a novel result by training the first SNN-based end-to-end autonomous driving framework, while demonstrating comparable accuracy to its ANN counterparts, as acknowledged by reviewers j4mM and XRtX. For this reason, I think this paper should be accepted.