With the growing demand for hardware-optimized deployment of spiking neural networks (SNNs), SNN processors based on field-programmable gate arrays (FPGAs) have become a research hotspot owing to their efficiency and flexibility. However, existing designs rely on multi-timestep training and reconfigurable computing architectures, which increase computational and memory overhead and thus reduce deployment efficiency. This work presents an efficient, lightweight residual SNN accelerator that combines algorithm and hardware co-design to optimize inference energy efficiency. On the algorithm side, we employ single-timestep training, integrate grouped convolutions, and fuse batch normalization (BN) layers, compressing the network to only 0.69 M parameters; quantization-aware training (QAT) further constrains all weights and activations to 8-bit precision. On the hardware side, intra-layer resource reuse maximizes FPGA utilization, a fully pipelined cross-layer architecture improves throughput, and on-chip block RAM (BRAM) stores the network parameters and intermediate results to improve memory efficiency. Experimental results show that the proposed processor achieves a classification accuracy of 87.11% on the CIFAR-10 dataset, with an inference time of 3.98 ms per image and an energy efficiency of 183.5 FPS/W. Compared with a mainstream graphics processing unit (GPU) platform, it achieves more than double the energy efficiency; compared with other SNN processors, it achieves at least a fourfold increase in inference speed and a fivefold improvement in energy efficiency.
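To make the two algorithm-side ideas concrete (folding BN into the preceding convolution and constraining values to 8 bits during QAT), the following is a minimal PyTorch-style sketch. The helper names (`fuse_conv_bn`, `fake_quant`) and the symmetric per-tensor quantizer are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a trained BatchNorm2d into the preceding Conv2d for inference.

    Standard identity: y = gamma*(Wx + b - mu)/sqrt(var + eps) + beta,
    which equals a single convolution with rescaled weights and an adjusted bias.
    """
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding,
                      groups=conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # gamma / sigma
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    b = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = (b - bn.running_mean) * scale + bn.bias.data
    return fused

def fake_quant(x: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Symmetric per-tensor fake quantization with a straight-through estimator,
    so the forward pass sees 8-bit values during QAT-style training (illustrative)."""
    qmax = 2 ** (n_bits - 1) - 1                               # 127 for 8 bits
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (q - x).detach()                                # gradient passes through unchanged
```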
Table 1. Resource utilization of the residual SNN processor.
Resource    Used      Available   Utilization/%
LUTs        134859    425280      31.71
FF          341722    850560      40.18
BRAM        674.5     1080        62.45
DSP         3008      4272        70.41
Table 2. Performance of the processor and the GPU platform on the CIFAR-10 dataset.
Hardware platform              ZCU216 FPGA    GeForce RTX 4060 Ti
Accuracy/%                     88.11          88.33
Power/W                        1.369          51
Inference time per image/ms    3.98           0.243
FPS                            251            4115
FPS/W                          183.5          80.7
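The throughput and energy-efficiency entries in Table 2 are consistent with the reported per-image latency and power; as a quick check of the FPGA column (assuming the usual definitions FPS = 1000/latency and energy efficiency = FPS/power):

```latex
\mathrm{FPS} = \frac{1000\ \mathrm{ms/s}}{3.98\ \mathrm{ms}} \approx 251, \qquad
\frac{\mathrm{FPS}}{P} = \frac{1000/3.98}{1.369\ \mathrm{W}} \approx 183.5\ \mathrm{FPS/W}.
```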
Table 3. Performance comparison with other SNN processors on the CIFAR-10 dataset.
Platform         E3NE[21]   SCPU[22]    SiBrain[23]             Aliyev et al.[24]   This work
FPGA device      XCVU13P    Virtex-7    Virtex-7                XCVU13P             ZCU216
Frequency/MHz    150        200         200                     100                 100
SNN model        AlexNet    ResNet-11   CONVNet (VGG-11)        VGG-9               ResNet-10
Model depth      8          11          6 (11)                  9                   10
Precision/bits   6          8           8 (8)                   4                   8
Parameters/M     —          —           0.3 (9.2)               —                   0.69
LUTs/FFs         48k/50k    178k/127k   167k/136k (140k/122k)   —                   135k/342k
Accuracy/%       80.6       90.60       82.93 (90.25)           86.6                87.11
Power/W          4.7        1.738       1.628 (1.555)           0.73                1.369
Latency/ms       70         25.4        1.4 (18.9)              59                  3.98
FPS              14.3       39.43       696 (53)                16.95               251
FPS/W            3.0        22.65       438.8 (34.1)            23.21               183.5
[1] Shelhamer E, Long J, Darrell T 2016 IEEE T. Pattern Anal. 39 640
[2] Redmon J, Divvala S, Girshick R, Farhadi A 2016 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Las Vegas, June 26-July 1, 2016 p779
[3] Shi Y, Ou P, Zheng M, Tai H X, Wang Y H, Duan R N, Wu J 2024 Acta Phys. Sin. 73 104202
[4] Ying D W, Zhang S H, Deng S J, Wu H B 2023 Acta Phys. Sin. 72 144201
[5] Cao Z Q, Sai B, Lv X 2020 Acta Phys. Sin. 69 084203
[6] Maass W 1997 Neural Networks 10 1659
[7] Nunes J D, Carvalho M, Carneiro D, Cardoso J S 2022 IEEE Access 10 60738
[8] Wu C C, Zhou P J, Wang J J, Li G, Hu S G, Yu Q, Liu Y 2022 Acta Phys. Sin. 71 148401
[9] Aliyev I, Svoboda K, Adegbija T, Fellous J M 2024 IEEE 17th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) Kuala Lumpur, December 16-19, 2024 p413
[10] Merolla P A, Arthur J V, Alvarez-Icaza R, Cassidy A S, Sawada J, Akopyan F, Jackson B L, Imam N, Guo C, Nakamura Y, Brezzo B, Vo I, Esser S K, Appuswamy R, Taba B, Amir A, Flickner M D, Risk W P, Manohar R, Modha D S 2014 Science 345 668
[11] Davies M, Srinivasa N, Lin T H, Chinya G, Cao Y, Choday S H, Dimou G, Joshi P, Imam N, Jain S, Liao Y, Lin C K, Lines A, Liu R, Mathaikutty D, McCoy S, Paul A, Tse J, Venkataramanan G, Weng Y H, Wild A, Yang Y, Wang H 2018 IEEE Micro 38 82
[12] He L, Wang K, Wu C, Tao Z F, Shi X, Miao S Y, Lu S Q 2025 Sci. Sin. Inf. 55 796
[13] Gdaim S, Mtibaa A 2025 J. Real-Time Image Pr. 22 67
[14] Yan F, Zheng X W, Meng C, Li C, Liu Y P 2025 Modern Electron. Techn. 48 151
[15] Liu Y J, Chen Y H, Ye W J, Gui Y 2022 IEEE T. Circuits I 69 2553
[16] Ye W J, Chen Y H, Liu Y J 2022 IEEE T. Comput. Aid. D. 42 448
[17] Panchapakesan S, Fang Z M, Li J 2022 ACM T. Reconfig. Techn. 15 48
[18] Chen Q Y, Gao C, Fu Y X 2022 IEEE T. VLSI Syst. 30 1425
[19] Wang S Q, Wang L, Deng Y, Yang Z J, Guo S S, Kang Z Y, Guo Y F, Xu W X 2020 J. Comput. Sci. Tech. 35 475
[20] Biswal M R, Delwar T S, Siddique A, Behera P, Choi Y, Ryu J Y 2022 Sensors 22 8694
[21] Gerlinghoff D, Wang Z, Gu X, Goh R S M, Luo T 2021 IEEE T. Parall. Distr. 33 3207
[22] Chen Y H, Liu Y J, Ye W J, Chang C C 2023 IEEE T. Circuits II 70 3634
[23] Chen Y H, Ye W J, Liu Y J, Zhou H H 2024 IEEE T. Circuits I 71 6482
[24] Aliyev I, Lopez J, Adegbija T 2024 arXiv: 2411.15409 [cs.AR]
[25] Stein R B, Hodgkin A L 1967 Proc. R. Soc. Lond. B Biol. Sci. 167 64
[26] Eshraghian J K, Ward M, Neftci E O, Wang X, Lenz G, Dwivedi G 2023 Proc. IEEE 111 1016
[27] Liu H, Chai H F, Sun Q, Yun X, Li X 2023 Engineering 25 61
[28] Krizhevsky A, Sutskever I, Hinton G E 2012 Adv. Neural Inf. Process. Syst. 25 1097
[29] Huang G, Liu S, van der Maaten L, Weinberger K 2018 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Salt Lake City, US, June 18-22, 2018 p2752
[30] Krizhevsky A, Hinton G https://www.cs.toronto.edu/~kriz/cifar.html [2025-3-22]
[31] Ioffe S, Szegedy C 2015 arXiv: 1502.03167 [cs.LG]
[32] Zheng J W 2021 M. S. Thesis (Xi'an: Xidian University) (in Chinese)
[33] Jacob B, Kligys S, Chen B, Tang M, Howard A, Adam H, Kalenichenko D 2018 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Salt Lake City, June 18-22, 2018 p2704
[34] Zhou S C, Wu Y X, Ni Z K, Zhou X Y, Wen H, Zou Y H 2016 arXiv: 1606.06160 [cs.NE] https://doi.org/10.48550/arXiv.1606.06160
[35] Liu Z, Cheng K T, Huang D, Xing E P, Shen Z 2022 Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition New Orleans, June 19-24, 2022 p4942
[36] Chen Y H, Krishna T, Emer J S, Sze V 2016 IEEE J. Solid-St. Circ. 52 127