Hardware Implementation of Neural Network Accelerator Based on RISC-V

Electronics & Packaging, 2023, Vol. 23, Issue (2): 020306. doi: 10.16257/j.cnki.1681-1070.2022.1010

JU Hu, GAO Ying, TIAN Qing, ZHOU Yin

China Electronics Technology Group Corporation No. 58 Research Institute, Wuxi 214035, Jiangsu, China

Received: 2021-12-16 Published: 2023-02-23
About the author: JU Hu (born 1992), male, from Taizhou, Jiangsu; master's degree, engineer. His main research interest is neural network accelerator design.

Abstract

Few artificial intelligence (AI) processors are based on the fifth-generation open reduced instruction set computer architecture (RISC-V), while the Advanced RISC Machine (ARM) architecture suffers from an unstable supply chain and weak independent controllability. To address these problems, a system-on-chip (SoC) architecture for a neural network inference accelerator built around a RISC-V processor is designed. The SoC is built from open-source projects. The instruction set of the deep neural network accelerator is designed based on the versatile tensor accelerator (VTA) architecture. The processor and the VTA are connected through the Advanced eXtensible Interface (AXI), and data are transferred through shared memory. Convolution operations and neural network deployment are implemented with a deep learning compiler stack. Experimental results show that the designed architecture can flexibly perform a variety of mainstream deep neural network inference tasks. The number of multiply-accumulate (MAC) units can reach 1024, and data are quantized to signed 8-bit integers (INT8). The compiler stack supports compilation of mainstream neural networks. Image classification demonstrations with modified ZFNet and ResNet20 networks achieve overall accuracies of 78.95% and 84.81%, respectively, on a field-programmable gate array (FPGA).

Key words: RISC-V, neural network, versatile tensor accelerator, general matrix multiplication, deep learning compiler
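
The abstract's figures (INT8 operands, a general matrix multiplication core, up to 1024 MAC units) can be made concrete with a small software model. The Python sketch below is illustrative only and is not the authors' hardware: the 32 x 32 tile shape, the wide accumulators, the shift-and-clamp requantization step, and all function names are assumptions chosen so that one tile performs exactly 1024 multiply-accumulate operations.

```python
# Software model of one INT8 GEMM tile (illustrative sketch, not the paper's RTL).
# Assumed tile shape: a 1 x 32 input vector times a 32 x 32 weight tile = 1024 MACs.
BLOCK_IN = 32   # assumed, not taken from the paper
BLOCK_OUT = 32  # assumed, not taken from the paper

INT8_MIN, INT8_MAX = -128, 127


def to_int8(value):
    """Clamp a Python int into the signed 8-bit range."""
    return max(INT8_MIN, min(INT8_MAX, value))


def gemm_tile(inp, wgt, acc):
    """Accumulate one tile in place: acc[o] += sum_i inp[i] * wgt[o][i].

    inp: BLOCK_IN INT8 values.
    wgt: BLOCK_OUT rows of BLOCK_IN INT8 values.
    acc: BLOCK_OUT wide accumulators, modelled as plain Python ints.
    Returns the number of multiply-accumulate operations performed.
    """
    macs = 0
    for o in range(BLOCK_OUT):
        for i in range(BLOCK_IN):
            acc[o] += inp[i] * wgt[o][i]
            macs += 1
    return macs


def requantize(acc, shift):
    """Arithmetic right shift, then clamp each accumulator back to INT8."""
    return [to_int8(a >> shift) for a in acc]


if __name__ == "__main__":
    inp = [to_int8(i - 16) for i in range(BLOCK_IN)]
    wgt = [[to_int8((o + i) % 7 - 3) for i in range(BLOCK_IN)]
           for o in range(BLOCK_OUT)]
    acc = [0] * BLOCK_OUT
    print("MACs in one tile:", gemm_tile(inp, wgt, acc))
    print("first outputs:", requantize(acc, 4)[:4])
```

In hardware the two loops would be unrolled into parallel MAC units, so a tile of this assumed shape would complete in a single step; larger matrices and convolutions are then decomposed into a sequence of such tiles by the compiler stack.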

CLC number: TN495

JU Hu, GAO Ying, TIAN Qing, ZHOU Yin. Hardware Implementation of Neural Network Accelerator Based on RISC-V[J]. Electronics & Packaging, 2023, 23(2): 020306.
