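Journal of the Packaging Branch of the China Semiconductor Industry Association

Journal of the Electronic Manufacturing and Packaging Technology Branch of the Chinese Institute of Electronics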

Electronics & Packaging ›› 2023, Vol. 23 ›› Issue (2): 020306. doi: 10.16257/j.cnki.1681-1070.2022.1010

• Circuits and Systems •

Hardware Implementation of Neural Network Accelerator Based on RISC-V*

JU Hu (鞠虎), GAO Ying (高营), TIAN Qing (田青), ZHOU Ying (周颖)

  1. China Electronics Technology Group Corporation No. 58 Research Institute, Wuxi 214035, Jiangsu, China
  • Received: 2021-12-16 Published: 2023-02-23
  • About the author: JU Hu (born 1992), male, from Taizhou, Jiangsu; master's degree; engineer. His main research interest is neural network accelerator design.

Abstract: To address the scarcity of artificial intelligence (AI) processors based on the fifth-generation open reduced instruction set computer architecture (RISC-V), as well as the unstable supply chain and weak independent controllability of the Advanced RISC Machine (ARM) architecture, a system-on-chip (SoC) architecture for a neural network inference accelerator built around a RISC-V processor is designed. The SoC is built from open-source projects. The instruction set of the deep neural network accelerator is designed based on the versatile tensor accelerator (VTA) architecture. The processor and the VTA are connected through the Advanced eXtensible Interface (AXI), and data are transferred through shared memory. Convolution operations and neural network deployment are implemented on a deep learning compiler stack. Experimental results show that the designed architecture can flexibly run a variety of mainstream deep neural network inference tasks. The number of multiply-accumulate (MAC) units can reach 1024, and data are quantized to signed 8-bit integers (INT8). The compiler stack supports compilation of mainstream neural networks. Image classification demonstrations with modified ZFNet and ResNet20 networks reach overall accuracies of 78.95% and 84.81%, respectively, on a field-programmable gate array (FPGA).

Key words: RISC-V, neural network, versatile tensor accelerator, general matrix multiplication, deep learning compiler
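As a rough illustration of the general matrix multiplication that the keywords refer to, the sketch below models one tile step of an INT8 multiply-accumulate array in plain Python. The 1 × 32 × 32 tile shape is only one hypothetical way to arrive at 1024 MACs; the abstract gives the MAC count and the INT8 operand width but not the array geometry, and the names `gemm_tile`, `check_int8`, `BATCH`, `BLOCK_IN`, and `BLOCK_OUT` are illustrative rather than taken from the paper.

```python
# Hypothetical tile shape: 1 x 32 x 32 = 1024 MACs per step. This geometry is
# an assumption; the abstract states only the MAC count and INT8 operands.
BATCH, BLOCK_IN, BLOCK_OUT = 1, 32, 32

INT8_MIN, INT8_MAX = -128, 127


def check_int8(values):
    """Raise ValueError if any operand is outside the signed 8-bit range."""
    for v in values:
        if not INT8_MIN <= v <= INT8_MAX:
            raise ValueError(f"{v} is not a signed 8-bit integer")


def gemm_tile(inp, wgt, acc):
    """One tile step: acc[b][o] += sum over i of inp[b][i] * wgt[o][i].

    inp: BATCH x BLOCK_IN list of INT8 activations
    wgt: BLOCK_OUT x BLOCK_IN list of INT8 weights
    acc: BATCH x BLOCK_OUT list of accumulators, updated in place
    Returns the number of multiply-accumulate operations performed.
    """
    for row in inp + wgt:
        check_int8(row)
    macs = 0
    for b in range(BATCH):
        for o in range(BLOCK_OUT):
            total = 0
            for i in range(BLOCK_IN):
                total += inp[b][i] * wgt[o][i]
                macs += 1
            acc[b][o] += total
    return macs


# Usage: every activation is 1 and every weight is -2.
inp = [[1] * BLOCK_IN for _ in range(BATCH)]
wgt = [[-2] * BLOCK_IN for _ in range(BLOCK_OUT)]
acc = [[0] * BLOCK_OUT for _ in range(BATCH)]
macs = gemm_tile(inp, wgt, acc)
```

Python integers never overflow, so the sketch hides one hardware concern: a single INT8 product can be as large as 16384, and summing 32 of them can reach 524288, which no longer fits in 16 bits. An accelerator therefore needs accumulators wider than its operands.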

CLC number: