[1] BRAINERD J G, SHARPLESS T K. The ENIAC[J]. Proceedings of the IEEE, 1999, 87(6):1031-1041. [2] DURANTON M, DUTOIT D, MENEZO S. Key requirements for optical interconnects within data centers[M]//Optical Interconnects for Data Centers. Amsterdam: Elsevier, 2017: 75-94. [3] SCHULTE M J, IGNATOWSKI M, LOH G H, et al. Achieving exascale capabilities through heterogeneous computing[J]. IEEE Micro, 2015,35(4):26-36. [4] KANNAN A, JERGER N E, LOH G H. Enabling interposer-based disintegration of multi-core processors[C]//Proceedings of the 48th International Symposium on Microarchitecture, Waikiki, HI, USA, 2015. [5] NAFFZIGER S, LEPAK K, PARASCHOU M, et al. 2.2 AMD Chiplet architecture for high-performance server and desktop products[C]//2020 IEEE International Solid- State Circuits Conference - (ISSCC), San Francisco, CA, USA, 2020. [6] MOORE S K. Chiplets are the future of processors: Three advances boost performance, cut costs, and save power[J]. IEEE Spectrum, 2020, 57(5): 11-12. [7] 许居衍. 复归于道: 封装改道芯片业[J]. 电子与封装, 2019, 19(10): 1-3. [8] STOW D, XIE Y, SIDDIQUA T, et al. Cost-effective design of scalable high-performance systems using active and passive interposers[C]//2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Irvine, CA, USA, 2017. [9] 缪旻, 金玉丰. 微系统集成全新阶段: IC芯片与电子集成封装的融合发展[J]. 微电子学与计算机, 2021, 38(1): 1-6. [10] PAL S, PETRISKO D, KUMAR R, et al. Design space exploration for Chiplet-assembly-based processors[J]. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2020, 28(4): 1062-1073. [11] NAYFEH B A, OLUKOTUN K. A single-chip multiprocessor[J]. Computer, 1997, 30(9): 79-85. [12] UY R L. Beyond multi-core: a survey of architectural innovations on microprocessor[C]//2014 International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM), Palawan, Philippines, 2014. [13] LIN M S, HUANG T C, TSAI C C, et al. A 7 nm 4 GHz Arm?-core-based CoWoS? Chiplet design for high performance computing[C]//2019 Symposium on VLSI Circuits, Kyoto, Japan, 2019. [14] 张前贤, 莫毓昌, 潘竹生. 用于计算机体系结构教学的哈佛体系结构模拟器[J]. 中国信息技术教育, 2014(5): 108-111. [15] GAO Y X, ZHANG P. A survey of homogeneous and heterogeneous system architectures in high performance computing[C]//2016 IEEE International Conference on Smart Cloud (SmartCloud), New York, NY, USA, 2016. [16] MITTAL S, VETTER J S. A survey of CPU-GPU heterogeneous computing techniques[J]. ACM Computing Surveys, 2015, 47(4): 69-72. [17] ESMAEILZADEH H, BLEM E, ST AMANT R, et al. Dark silicon and the end of multicore scaling[C]//Proceedings of the 38th Annual International Symposium on Computer Architecture, San Jose, California, USA, 2011. [18] NURVITADHI E, KWON D, JAFARI A, et al. Why compete when you can work together: FPGA-ASIC integration for persistent RNNs[C]//2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), San Diego, CA, USA, 2019. [19] LAU J H, LI M, LI QINGQIAN M, et al. Fan-out wafer-level packaging for heterogeneous integration[J]. IEEE Transactions on Components, Packaging and Manufacturing Technology, 2018, 8(9): 1544-1560. [20] VIJAYARAGHAVAN T, KARUNANITHI A, KAYIRAN O, et al. Design and analysis of an APU for exascale computing[C]//2017 IEEE International Symposium on High Performance Computer Architecture (HPCA), Austin, TX, USA, 2017. [21] ARUNKUMAR A, BOLOTIN E, CHO B, et al. MCM-GPU: multi-chip-module GPUs for continued performance scalability[C]//Proceedings of the 44th Annual International Symposium on Computer Architecture, Toronto, Canada, 2017. [22] MOUNCE G, LYKE J, HORAN S, et al. Chiplet based approach for heterogeneous processing and packaging architectures[C]//2016 IEEE Aerospace Conference, Big Sky, MT, USA, 2016. [23] WULF W, MCKEE S. Hitting the memory wall: implications of the obvious[J]. SIGARCH Comput. Archit. News, 1995, 23:20-24. [24] ZHANG Y, ZHANG C, NAN J, et al. Perspectives of racetrack memory for large-capacity on-chip memory: from device to system[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2016, 63(5): 629-638. [25] HE X, SHAN G B, WU L S, et al. The development of 3D aerospace SRAM integration technology using silicon interposer[C]//2015 16th International Conference on Electronic Packaging Technology (ICEPT), Changsha, China, 2015. [26] PAWLOWSKI J T. Hybrid memory cube (HMC)[C]//2011 IEEE Hot Chips 23 Symposium (HCS), Stanford, CA, USA, 2011. [27] YI L, SHAN G B, LIU S, et al. High-performance processor design based on 3D on-chip cache[J]. Microprocessors and Microsystems, 2016, 47(B): 486-490. [28] JUN H, CHO J, LEE K, et al. HBM (high bandwidth memory) DRAM technology and architecture[C]//2017 IEEE International Memory Workshop (IMW), Monterey, CA, USA, 2017. [29] KIM S, KIM S, CHO K, et al. Processing-in-memory in high bandwidth memory (PIM-HBM) architecture with energy-efficient and low latency channels for high bandwidth system[C]//2019 IEEE 28th Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS), Montreal, QC, Canada, 2019. [30] SHIM W, YU S M. Technological design of 3D NAND-based compute-in-memory architecture for GB-scale deep neural network[J]. IEEE Electron Device Letters, 2021, 42(2): 160-163. [31] SHULAKER M M, HILLS G, PARK R S, et al. Three-dimensional integration of nanotechnologies for computing and data storage on a single chip[J]. Nature, 2017, 547(7661): 74-78. [32] 张明明, 王颀, 井冲, 等. 3D NAND闪存数据保持力与初始状态依赖性研究[J]. 电子学报, 2020, 48(2): 314-320. [33] 余慧, 吴昊, 陈更生, 等. 一种堆叠式3D IC的最小边界热分析方法[J]. 电子学报, 2012, 40(5): 865-870. [34] PI Y D, WANG N Y, CHEN J, et al. Anisotropic equivalent thermal conductivity model for efficient and accurate full-chip-scale numerical simulation of 3D stacked IC[J]. International Journal of Heat and Mass Transfer, 2018, 120: 361-378. [35] 曹明鹏, 吴晓鹏, 管宏山, 等. 基于对偶单元法的三维集成微系统电热耦合分析[J]. 物理学报, 2021, 70(7): 180-188. [36] ENDOH T. Nonvolatile logic and memory devices based on spintronics[C]//2015 IEEE International Symposium on Circuits and Systems (ISCAS), Lisbon, Portugal, 2015. [37] ONSORI S, ASAD A, RAAHEMIFAR K, et al. A high-performance hybrid memory architecture for embedded CMPs using a convex optimization model[C]//2015 International SoC Design Conference (ISOCC), Gyungju, South Korea, 2015. [38] CHEON J, LEE I, AHN C, et al. Non-resistance metric based read scheme for multi-level PCRAM in 25 nm technology[C]//2015 IEEE Custom Integrated Circuits Conference (CICC), San Jose, CA, USA, 2015. [39] HSIEH M C, LIAO Y C, CHIN Y W, et al. Ultra high density 3D via RRAM in pure 28 nm CMOS process[C]//2013 IEEE International Electron Devices Meeting, Washington, DC, USA, 2013. [40] HUR J, LUO Y C, WANG Z, et al. A technology path for scaling embedded FeRAM to 28 nm with 2T1C structure[C]//2021 IEEE International Memory Workshop (IMW), Dresden, Germany, 2021. [41] PEDRAM A, RICHARDSON S, HOROWITZ M, et al. Dark memory and accelerator-rich system optimization in the dark silicon era[J]. IEEE Design & Test, 2017, 34(2): 39-50. [42] HAN S, LIU X Y, MAO H Z, et al. EIE: efficient inference engine on compressed deep neural network[C]//2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), Seoul, South Korea, 2016. [43] 许居衍, 黄安君. 后摩尔时代的技术创新[J]. 电子与封装, 2020, 20(12): 120101. [44] YOO T, KIM T T H, KIM B, et al. Design of current-mode 8T SRAM compute-In-memory macro for processing neural networks[C]//2020 International SoC Design Conference (ISOCC), Yeosu, South Korea, 2020. [45] IMANI M, KIM Y, ROSING T. MPIM: Multi-purpose in-memory processing using configurable resistive memory[C]//2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), Chiba, Japan, 2017. [46] CHOI J H, PARK G H. NVM way allocation scheme to reduce NVM writes for hybrid cache architecture in chip-multiprocessors[J]. IEEE Transactions on Parallel and Distributed Systems, 2017, 28(10): 2896-2910. [47] FARBEH H, MONAZZAH A M H, ALIAGHA E, et al. A-CACHE: alternating cache allocation to conduct higher endurance in NVM-based caches[J]. IEEE Transactions on Circuits and Systems Ⅱ: Express Briefs, 2019, 66(7): 1237-1241. [48] KIMURA H, HANYU T, KAMEYAMA M, et al. Complementary ferroelectric-capacitor logic for low-power logic-in-memory VLSI[J]. IEEE Journal of Solid-State Circuits, 2004, 39(6): 919-926. [49] BREYER E T, MULAOSMANOVIC H, TROMMER J, et al. Compact FeFET circuit building blocks for fast and efficient nonvolatile logic-in-memory[J]. IEEE Journal of the Electron Devices Society, 2020, 8: 748-756. [50] HAN L, SHEN Z Y, SHAO Z L, et al. A novel ReRAM-based processing-in-memory architecture for graph computing[C]//2017 IEEE 6th Non-Volatile Memory Systems and Applications Symposium (NVMSA), Hsinchu, Taiwan, China, 2017. [51] 杨轩, 叶文强, 崔小乐. 基于RRAM延时单元的PUF设计[J]. 电子学报, 2020, 48(8): 1565-1571. [52] YANG Z X, WEI L. Logic circuit and memory design for in-memory computing applications using bipolar RRAMs[C]//2019 IEEE International Symposium on Circuits and Systems (ISCAS), Sapporo, Japan, 2019. [53] KANG W, WANG H T, WANG Z H, et al. In-memory processing paradigm for bitwise logic operations in STT–MRAM[J]. IEEE Transactions on Magnetics, 2017, 53(11): 1-4. [54] ANGIZI S, HE Z Z, FAN D L. PIMA-logic: a novel processing-in-memory architecture for highly flexible and energy-efficient logic computation[C]//2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 2018. [55] ZHOU P, ZHAO B, YANG J, et al. A durable and energy efficient main memory using phase change memory technology[C]//Proceedings of the 36th annual international symposium on Computer architecture, Austin TX USA, 2009. [56] ATTENBOROUGH K, HURKX G A M, DELHOUGNE R, et al. Phase change memory line concept for embedded memory applications[C]//2010 International Electron Devices Meeting, San Francisco, CA, USA, 2010.
|