With the growing demand for hardware-optimized deployment of spiking neural networks (SNNs), SNN processors based on field-programmable gate arrays (FPGAs) have become a research hotspot owing to their efficiency and flexibility. However, existing designs rely on multi-timestep training and reconfigurable computing architectures, which increase computational and memory overhead and thus reduce deployment efficiency. This work presents an efficient, lightweight residual SNN accelerator that combines algorithm and hardware co-design to optimize inference energy efficiency. On the algorithm side, we employ single-timestep training, integrate grouped convolutions, and fuse batch normalization (BN) layers, compressing the network to only 0.69 M parameters; quantization-aware training (QAT) further constrains all weights and activations to 8-bit precision. On the hardware side, intra-layer resource reuse maximizes FPGA utilization, a fully pipelined cross-layer architecture improves throughput, and on-chip block RAM (BRAM) stores the network parameters and intermediate results to improve memory efficiency. Experimental results show that the proposed processor achieves a classification accuracy of 87.11% on the CIFAR-10 dataset, with an inference time of 3.98 ms per image and an energy efficiency of 183.5 FPS/W. Compared with a mainstream graphics processing unit (GPU) platform, it achieves more than double the energy efficiency; compared with other SNN processors, it achieves at least a fourfold increase in inference speed and a fivefold improvement in energy efficiency.
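To make the two algorithm-side ideas concrete (folding BN into the preceding convolution and constraining values to 8 bits during QAT), the following is a minimal PyTorch-style sketch. The helper names (`fuse_conv_bn`, `fake_quant`) and the symmetric per-tensor quantizer are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold a trained BatchNorm2d into the preceding Conv2d for inference.

    Standard identity: y = gamma*(Wx + b - mu)/sqrt(var + eps) + beta,
    which equals a single convolution with rescaled weights and an adjusted bias.
    """
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding,
                      groups=conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)   # gamma / sigma
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    b = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = (b - bn.running_mean) * scale + bn.bias.data
    return fused

def fake_quant(x: torch.Tensor, n_bits: int = 8) -> torch.Tensor:
    """Symmetric per-tensor fake quantization with a straight-through estimator,
    so the forward pass sees 8-bit values during QAT-style training (illustrative)."""
    qmax = 2 ** (n_bits - 1) - 1                               # 127 for 8 bits
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (q - x).detach()                                # gradient passes through unchanged
```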
Table 1. Resource utilization of the residual SNN processor.
Resource    Used      Available   Utilization/%
LUTs        134859    425280      31.71
FF          341722    850560      40.18
BRAM        674.5     1080        62.45
DSP         3008      4272        70.41
Table 2. Performance of the processor and the GPU platform on the CIFAR-10 dataset.
Hardware platform              ZCU216 FPGA    GeForce RTX 4060 Ti
Accuracy/%                     88.11          88.33
Power/W                        1.369          51
Inference time per image/ms    3.98           0.243
FPS                            251            4115
FPS/W                          183.5          80.7
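The throughput and energy-efficiency entries in Table 2 are consistent with the reported per-image latency and power; as a quick check of the FPGA column (assuming the usual definitions FPS = 1000/latency and energy efficiency = FPS/power):

```latex
\mathrm{FPS} = \frac{1000\ \mathrm{ms/s}}{3.98\ \mathrm{ms}} \approx 251, \qquad
\frac{\mathrm{FPS}}{P} = \frac{1000/3.98}{1.369\ \mathrm{W}} \approx 183.5\ \mathrm{FPS/W}.
```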
Table 3. Performance comparison with other SNN processors on the CIFAR-10 dataset.
Platform         E3NE[21]   SCPU[22]    SiBrain[23]             Aliyev et al.[24]   This work
FPGA device      XCVU13P    Virtex-7    Virtex-7                XCVU13P             ZCU216
Frequency/MHz    150        200         200                     100                 100
SNN model        AlexNet    ResNet-11   CONVNet (VGG-11)        VGG-9               ResNet-10
Model depth      8          11          6 (11)                  9                   10
Precision/bits   6          8           8 (8)                   4                   8
Parameters/M     —          —           0.3 (9.2)               —                   0.69
LUTs/FFs         48k/50k    178k/127k   167k/136k (140k/122k)   —                   135k/342k
Accuracy/%       80.6       90.60       82.93 (90.25)           86.6                87.11
Power/W          4.7        1.738       1.628 (1.555)           0.73                1.369
Latency/ms       70         25.4        1.4 (18.9)              59                  3.98
FPS              14.3       39.43       696 (53)                16.95               251
FPS/W            3.0        22.65       438.8 (34.1)            23.21               183.5
[1] Shelhamer E, Long J, Darrell T 2016 IEEE T. Pattern Anal. 39 640
[2] Redmon J, Divvala S, Girshick R, Farhadi A 2016 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Las Vegas, June 26-July 1, 2016 p779
[3] Shi Y, Ou P, Zheng M, Tai H X, Wang Y H, Duan R N, Wu J 2024 Acta Phys. Sin. 73 104202
[4] Ying D W, Zhang S H, Deng S J, Wu H B 2023 Acta Phys. Sin. 72 144201
[5] Cao Z Q, Sai B, Lv X 2020 Acta Phys. Sin. 69 084203
[6] Maass W 1997 Neural Networks 10 1659
[7] Nunes J D, Carvalho M, Carneiro D, Cardoso J S 2022 IEEE Access 10 60738
[8] Wu C C, Zhou P J, Wang J J, Li G, Hu S G, Yu Q, Liu Y 2022 Acta Phys. Sin. 71 148401
[9] Aliyev I, Svoboda K, Adegbija T, Fellous J M 2024 IEEE 17th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) Kuala Lumpur, December 16-19, 2024 p413
[10] Merolla P A, Arthur J V, Alvarez-Icaza R, Cassidy A S, Sawada J, Akopyan F, Jackson B L, Imam N, Guo C, Nakamura Y, Brezzo B, Vo I, Esser S K, Appuswamy R, Taba B, Amir A, Flickner M D, Risk W P, Manohar R, Modha D S 2014 Science 345 668
[11] Davies M, Srinivasa N, Lin T H, Chinya G, Cao Y, Choday S H, Dimou G, Joshi P, Imam N, Jain S, Liao Y, Lin C K, Lines A, Liu R, Mathaikutty D, McCoy S, Paul A, Tse J, Venkataramanan G, Weng Y H, Wild A, Yang Y, Wang H 2018 IEEE Micro 38 82
[12] He L, Wang K, Wu C, Tao Z F, Shi X, Miao S Y, Lu S Q 2025 Sci. Sin. Inf. 55 796
[13] Gdaim S, Mtibaa A 2025 J. Real-Time Image Pr. 22 67
[14] Yan F, Zheng X W, Meng C, Li C, Liu Y P 2025 Modern Electron. Techn. 48 151
[15] Liu Y J, Chen Y H, Ye W J, Gui Y 2022 IEEE T. Circuits I 69 2553
[16] Ye W J, Chen Y H, Liu Y J 2022 IEEE T. Comput. Aid. D. 42 448
[17] Panchapakesan S, Fang Z M, Li J 2022 ACM T. Reconfig. Techn. 15 48
[18] Chen Q Y, Gao C, Fu Y X 2022 IEEE T. VLSI Syst. 30 1425
[19] Wang S Q, Wang L, Deng Y, Yang Z J, Guo S S, Kang Z Y, Guo Y F, Xu W X 2020 J. Comput. Sci. Tech. 35 475
[20] Biswal M R, Delwar T S, Siddique A, Behera P, Choi Y, Ryu J Y 2022 Sensors 22 8694
[21] Gerlinghoff D, Wang Z, Gu X, Goh R S M, Luo T 2021 IEEE T. Parall. Distr. 33 3207
[22] Chen Y H, Liu Y J, Ye W J, Chang C C 2023 IEEE T. Circuits II 70 3634
[23] Chen Y H, Ye W J, Liu Y J, Zhou H H 2024 IEEE T. Circuits I 71 6482
[24] Aliyev I, Lopez J, Adegbija T 2024 arXiv: 2411.15409 [cs.AR]
[25] Stein R B, Hodgkin A L 1967 Proc. R. Soc. Lond. B Biol. Sci. 167 64
[26] Eshraghian J K, Ward M, Neftci E O, Wang X, Lenz G, Dwivedi G 2023 Proc. IEEE 111 1016
[27] Liu H, Chai H F, Sun Q, Yun X, Li X 2023 Engineering 25 61
[28] Krizhevsky A, Sutskever I, Hinton G E 2012 Adv. Neural Inf. Process. Syst. 25 1097
[29] Huang G, Liu S, van der Maaten L, Weinberger K 2018 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Salt Lake City, US, June 18-22, 2018 p2752
[30] Krizhevsky A, Hinton G https://www.cs.toronto.edu/~kriz/cifar.html [2025-3-22]
[31] Ioffe S, Szegedy C 2015 arXiv: 1502.03167 [cs.LG]
[32] Zheng J W 2021 M. S. Thesis (Xi'an: Xidian University) (in Chinese)
[33] Jacob B, Kligys S, Chen B, Tang M, Howard A, Adam H, Kalenichenko D 2018 Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Salt Lake City, June 18-22, 2018 p2704
[34] Zhou S C, Wu Y X, Ni Z K, Zhou X Y, Wen H, Zou Y H 2016 arXiv: 1606.06160 [cs.NE] https://doi.org/10.48550/arXiv.1606.06160
[35] Liu Z, Cheng K T, Huang D, Xing E P, Shen Z 2022 Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition New Orleans, June 19-24, 2022 p4942
[36] Chen Y H, Krishna T, Emer J S, Sze V 2016 IEEE J. Solid-St. Circ. 52 127