Spiking neural networks (SNNs) have attracted attention for their biological plausibility and their potential for low-energy applications on neuromorphic hardware. Two mainstream approaches are used to obtain SNNs: ANN-to-SNN conversion and direct SNN training. The former achieves excellent performance at the cost of a large number of time steps (i.e., high latency), while the latter offers lower latency but suffers from suboptimal performance. To tackle this performance-latency trade-off, we propose Self-Architectural Knowledge Distillation (SAKD), an intuitive and effective method for SNNs that leverages Knowledge Distillation (KD). SAKD adopts a bilevel teacher–student training strategy: level 1 directly transfers the weights of a pre-trained ANN with the same architecture to the SNN, and level 2 encourages the SNN to mimic the ANN's behavior in both its final responses and its intermediate features. Learning from informative supervision signals provided by both labels and ANNs, SAKD achieves new state-of-the-art (SOTA) performance with few time steps on widely used classification benchmark datasets. On ImageNet-1K, with only 4 time steps, our Spiking-ResNet34 model attains a Top-1 accuracy of 70.04%, outperforming previous SOTA methods with the same architecture. Notably, our SEW-ResNet152 model reaches a Top-1 accuracy of 77.30% on ImageNet-1K, setting a new SOTA for SNNs. Furthermore, we apply SAKD to dense prediction downstream tasks, such as object detection and semantic segmentation, where it demonstrates strong generalization ability and superior performance. In conclusion, SAKD presents a promising approach for achieving both high performance and low latency in SNNs, potentially paving the way for future advances in the field.
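
The bilevel strategy described above can be illustrated with a minimal sketch. The abstract does not specify the exact loss, so everything here is an assumption made for illustration: the function names, the averaging of SNN outputs over time steps, the temperature-softened KL term for the final responses, the mean-squared-error term for the intermediate features, and the weights `alpha` and `beta` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def average_over_time(per_step_outputs):
    """Average SNN outputs across time steps (assumed rate-style readout)."""
    steps = len(per_step_outputs)
    return [sum(column) / steps for column in zip(*per_step_outputs)]


def transfer_weights(ann_weights, snn_weights):
    """Level 1 (assumed form): copy ANN parameters into the SNN wherever
    the parameter name and size match; other SNN parameters are kept."""
    for name, value in ann_weights.items():
        if name in snn_weights and len(snn_weights[name]) == len(value):
            snn_weights[name] = list(value)
    return snn_weights


def cross_entropy(logits, label):
    """Standard cross-entropy against a hard label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_steps, teacher_logits, student_feat,
                      teacher_feat, label,
                      temperature=4.0, alpha=1.0, beta=1.0):
    """Level 2 (assumed form): label loss + response KD + feature KD.
    temperature, alpha and beta are hypothetical hyperparameters."""
    student_logits = average_over_time(student_steps)
    label_term = cross_entropy(student_logits, label)
    response_term = temperature ** 2 * kl_divergence(
        softmax(teacher_logits, temperature),
        softmax(student_logits, temperature),
    )
    feature_term = mean_squared_error(student_feat, teacher_feat)
    return label_term + alpha * response_term + beta * feature_term
```

In this sketch, when the time-averaged SNN output and the intermediate features exactly match those of the ANN, both distillation terms vanish and only the label term remains; any mismatch adds a non-negative penalty.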