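Official journal of the Packaging Branch of the China Semiconductor Industry Association

Official journal of the Electronic Manufacturing and Packaging Technology Branch of the Chinese Institute of Electronics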


Electronics & Packaging ›› 2024, Vol. 24 ›› Issue (9): 090301. doi: 10.16257/j.cnki.1681-1070.2024.0121

• Circuits and Systems •

Heterogeneous SoC System Design Based on RISC-V and Reconfigurable Intelligent Acceleration Core*

QUAN Lianghua1,2, WANG Yilin1,2, LI Siyue1,2, LI Shiping3, CHEN Kai3, DENG Songfeng4, HE Guoqiang3, FENG Shuyi4, FU Yuxiang2, LI Li1

  1. School of Electronic Science and Engineering, Nanjing University, Nanjing 210023, China; 2. School of Integrated Circuits, Nanjing University, Suzhou 215163, China; 3. Jiangsu Huachuang Microsystem Co., Ltd., Nanjing 211800, China; 4. Shanghai Institute of Spacecraft Electronics Technology, Shanghai 201108, China
  • Received: 2024-04-01 Online: 2024-09-25 Published: 2024-09-25
  • About the author: QUAN Lianghua (born 1999), male, from Chengdu, Sichuan; master's student whose main research interests are artificial intelligence hardware acceleration and reconfigurable computing.

Abstract: A reconfigurable intelligent acceleration core architecture is proposed, together with a reconfigurable activation-function multiply-accumulate unit (ACT-MAC) that aims to improve the utilization of computing resources under low-power constraints. The acceleration core builds a reconfigurable computing array from ACT-MAC units and supports hardware acceleration of convolution, pooling, long short-term memory (LSTM) networks, and activation functions. It uses a ping-pong pipeline design and optimized memory allocation, which significantly improves data-processing efficiency. The acceleration core is integrated with an open-source RISC-V processor through the Nuclei Instruction Co-unit Extension (NICE) interface to form a complete system on chip (SoC). The design is implemented as a chip prototype on a Nexys Video field-programmable gate array (FPGA) board, on which the LeNet, VGG16, and LSTM networks are deployed, demonstrating the prototype's application potential in areas such as image classification and semantic recognition. Compared with recent work, the design improves digital signal processing (DSP) efficiency while maintaining a high energy-efficiency ratio, and it supports hardware acceleration for a variety of artificial intelligence algorithms, indicating broad potential in embedded application scenarios.

Key words: RISC-V, reconfigurable computation, non-linear computation, artificial intelligence, SoC
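
The abstract does not describe the internals of ACT-MAC. As a rough illustration of how one multiply-add datapath might serve both convolution-style accumulation and activation functions, the sketch below assumes a piecewise-linear approximation of the sigmoid. The class name `ActMac`, the 16-segment table, and the [-4, 4] input range are hypothetical choices for this example and are not taken from the paper.

```python
import math


def multiply_add(a, b, c):
    """The single shared datapath: one multiplier feeding one adder."""
    return a * b + c


def build_sigmoid_table(lo=-4.0, hi=4.0, segments=16):
    """Return (upper_bound, slope, intercept) chords of the sigmoid on [lo, hi]."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    step = (hi - lo) / segments
    table = []
    for i in range(segments):
        x0 = lo + i * step
        x1 = lo + (i + 1) * step
        slope = (sigmoid(x1) - sigmoid(x0)) / (x1 - x0)
        intercept = sigmoid(x0) - slope * x0
        table.append((x1, slope, intercept))
    return table


class ActMac:
    """Hypothetical reconfigurable unit with a MAC mode and an activation mode."""

    def __init__(self, lo=-4.0, hi=4.0, segments=16):
        self.lo, self.hi = lo, hi
        self.table = build_sigmoid_table(lo, hi, segments)
        self.mode = "mac"
        self.acc = 0.0

    def configure(self, mode):
        """Switch between 'mac' and 'act' and clear the accumulator."""
        if mode not in ("mac", "act"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.acc = 0.0

    def mac(self, a, b):
        """MAC mode: acc <- a * b + acc, as in a convolution inner loop."""
        if self.mode != "mac":
            raise RuntimeError("unit is not configured for MAC mode")
        self.acc = multiply_add(a, b, self.acc)
        return self.acc

    def activate(self, x):
        """Activation mode: y = slope * x + intercept on the matching segment."""
        if self.mode != "act":
            raise RuntimeError("unit is not configured for activation mode")
        x = min(max(x, self.lo), self.hi)  # saturate outside the table range
        for upper, slope, intercept in self.table:
            if x <= upper:
                return multiply_add(slope, x, intercept)
        # Guard against rounding in the last upper bound.
        _, slope, intercept = self.table[-1]
        return multiply_add(slope, x, intercept)


# MAC mode: accumulate a three-element dot product.
unit = ActMac()
for a, b in zip([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]):
    dot = unit.mac(a, b)

# Activation mode: the same multiply-add evaluates a sigmoid segment.
unit.configure("act")
approx = unit.activate(0.75)
```

Both modes call the same `multiply_add`, which is the resource-sharing idea the sketch is meant to convey; a real design would work in fixed-point arithmetic and might use a different approximation scheme altogether.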
