1
Youn S, Lee J, Kim S, Park J, Kim K, Kim H. Programmable Threshold Logic Implementations in a Memristor Crossbar Array. Nano Lett 2024; 24:3581-3589. [PMID: 38471119 DOI: 10.1021/acs.nanolett.3c04073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2024]
Abstract
In this study, we demonstrate the implementation of programmable threshold logics using a 32 × 32 memristor crossbar array. Thanks to forming-free characteristics obtained by the annealing process, its accurate programming characteristics are presented by a 256-level grayscale image. By simultaneous subtraction between weighted sum and threshold values with a differential pair in an opposite way, 3-input and 4-input Boolean logics are implemented in the crossbar without additional reference bias. Also, we verify a full-adder circuit and analyze its fidelity, depending on the device programming accuracy. Lastly, we successfully implement a 4-bit ripple carry adder in the crossbar and achieve reliable operations by read-based logic operations. Compared to stateful logic driven by device switching, a 4-bit ripple carry adder on a memristor crossbar array can perform more reliably in fewer steps thanks to its read-based parallel logic operation.
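The threshold-logic idea summarized in this abstract can be illustrated in software. The following is a minimal sketch, not the authors' circuit: the function names, the weights, and the thresholds are assumptions chosen for illustration. A negative weight here stands in for what a differential device pair would provide in hardware.

```python
from itertools import product

def threshold_gate(inputs, weights, threshold):
    """Return 1 when the weighted sum of the inputs reaches the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0

def full_adder(a, b, carry_in):
    # Carry-out is a 3-input majority gate: unit weights, threshold 2.
    carry_out = threshold_gate((a, b, carry_in), (1, 1, 1), 2)
    # Sum reuses carry-out with weight -2 (assumed weights): a + b + cin - 2*cout >= 1.
    total = threshold_gate((a, b, carry_in, carry_out), (1, 1, 1, -2), 1)
    return total, carry_out

def ripple_carry_add(x, y, bits=4):
    """Add two unsigned integers bit by bit, chaining the carry between stages."""
    carry, result = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry

# Exhaustive check of the 4-bit adder against ordinary integer addition.
for x, y in product(range(16), repeat=2):
    result, carry = ripple_carry_add(x, y)
    assert result + (carry << 4) == x + y
```

Each full-adder stage needs only two threshold evaluations, which is one way to see why a read-based threshold implementation can take few steps per bit.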
Affiliation(s)
- Sangwook Youn
- Division of Materials Science and Engineering, Hanyang University, Seoul 04763, Korea
- Jungjin Lee
- Department of Electrical and Computer Engineering, Inha University, Incheon 22212, Korea
- Sungjoon Kim
- Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, Korea
- Jinwoo Park
- Division of Materials Science and Engineering, Hanyang University, Seoul 04763, Korea
- Kyuree Kim
- Division of Materials Science and Engineering, Hanyang University, Seoul 04763, Korea
- Hyungjin Kim
- Division of Materials Science and Engineering, Hanyang University, Seoul 04763, Korea
- Department of Electronic Engineering, Hanyang University, Seoul 04763, Korea
2
Hwang J, Joh H, Kim C, Ahn J, Jeon S. Monolithically Integrated Complementary Ferroelectric FET XNOR Synapse for the Binary Neural Network. ACS Appl Mater Interfaces 2024; 16:2467-2476. [PMID: 38175955 DOI: 10.1021/acsami.3c13945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
Abstract
Neuromorphic computing, which mimics the structure and principles of the human brain, has the potential to facilitate the hardware implementation of next-generation artificial intelligence systems and process large amounts of data with very low power consumption. Among neuromorphic approaches, the XNOR synapse-based Binary Neural Network (BNN) has been attracting attention due to its compact neural network parameter size and low hardware cost. Previous XNOR synapses have drawbacks, such as a trade-off between cell density and accuracy. In this work, we show nonvolatile XNOR synapses with high density and accuracy using a monolithically stacked complementary ferroelectric field-effect transistor (C-FeFET) composed of a p-type Si MFMIS-FeFET at the bottom and a 3D stackable n-type Al:IZTO MFS-FeTFT, achieving 60F² per cell (2C-FeFET). To adjust the threshold voltage and improve the switching speed (100 ns) of the n-type ferroelectric TFT, we employed a dual-gate configuration and a unique operation scheme, making them comparable to those of Si-based FeFETs. We performed array-level simulation with a 512 × 512 subarray size and a 3-bit flash ADC, demonstrating that the image recognition accuracies using the MNIST and CIFAR-10 data sets were increased by 3.17% and 14.07%, respectively, in comparison to other nonvolatile XNOR synapses. In addition, we performed system-level analysis on a 512 × 512 XNOR C-FeFET, exhibiting an outstanding throughput of 717.37 GOPS and an energy efficiency of 196.7 TOPS/W. We expect that our approach will contribute to high-density memory systems, logic-in-memory technology, and the hardware implementation of neural networks.
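For readers unfamiliar with why an XNOR cell suits binary neural networks, the arithmetic can be sketched in software. This is a generic illustration of the XNOR-popcount identity, not the C-FeFET array in this paper; the function names and the bit encoding (1 for +1, 0 for -1) are assumptions.

```python
def xnor_popcount_dot(a_bits, w_bits):
    """Dot product of two +/-1 vectors encoded as bits (1 -> +1, 0 -> -1)."""
    if len(a_bits) != len(w_bits):
        raise ValueError("vectors must have the same length")
    # XNOR is 1 where the bits agree; counting the agreements is the popcount.
    matches = sum(1 for a, w in zip(a_bits, w_bits) if a == w)
    # Each agreement contributes +1 and each disagreement -1.
    return 2 * matches - len(a_bits)

def signed_dot(a_bits, w_bits):
    """Reference implementation using explicit +/-1 multiplication."""
    to_pm1 = lambda b: 1 if b else -1
    return sum(to_pm1(a) * to_pm1(w) for a, w in zip(a_bits, w_bits))

activations = [1, 0, 1, 1]
weights = [1, 1, 0, 1]
assert xnor_popcount_dot(activations, weights) == signed_dot(activations, weights)
```

Because the multiply reduces to a bitwise agreement test, each synapse only has to store one binary weight and report a match or mismatch, which is the operation an XNOR cell provides.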
Affiliation(s)
- Junghyeon Hwang
- School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291, Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
- Hongrae Joh
- School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291, Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
- Chaeheon Kim
- School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291, Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
- Jinho Ahn
- Division of Materials Science and Engineering, Hanyang University, 222, Wangsimni-ro, Seongdong-gu, Seoul 04763, Korea
- Sanghun Jeon
- School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291, Daehak-ro, Yuseong-gu, Daejeon 34141, Korea
3
Zhao R, Gong Z, Liu Y, Chen J. A High-Precision Voltage-Quantization-Based Current-Mode Computing-in-Memory SRAM. Micromachines (Basel) 2023; 14:2180. [PMID: 38138349 PMCID: PMC10745502 DOI: 10.3390/mi14122180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 11/12/2023] [Accepted: 11/27/2023] [Indexed: 12/24/2023]
Abstract
Non-linear distortion of signals is a serious problem in computing-in-memory SRAM (CIM-SRAM) circuits in current mode. This problem greatly limits the performance of calculations and directly affects the computing power of the CIM-SRAM. In this study, the causes of non-linearity and inconsistency were investigated. Based on detailed analyses, we proposed a high-precision, fully dynamic range IV (HFIV) conversion circuit. The HFIV circuit was added to each bit line (BL) for voltage clamping and proportionally mirroring the read current. We applied the structure to numerous prior studies and evaluated them using the 55 nm complementary metal-oxide semiconductor process. The results showed the proposed HFIV circuit could increase the CIM-SRAM's calculation linearity to 99.92% (8~32 SRAM bit-cells) and 99.8% (32~64 SRAM bit-cells) with a 1.2 V supply.
Affiliation(s)
- Ruiyong Zhao
- Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhenghui Gong
- Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200031, China
- Yulan Liu
- Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jing Chen
- Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200031, China
4
Verma G, Soni S, Kaushik BK. Spin device-based image edge detection architecture for neuromorphic computing. Nanotechnology 2023; 35:055201. [PMID: 37797609 DOI: 10.1088/1361-6528/ad0056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 10/05/2023] [Indexed: 10/07/2023]
Abstract
Artificial intelligence and deep learning are today used for several applications, namely image processing, smart surveillance, edge computing, and so on. The hardware implementation of such applications has been a matter of concern due to huge area and energy requirements. The concept of computing in-memory and the use of non-volatile memory (NVM) devices have paved a path for resource-efficient hardware implementation. We propose a dual-level spin-orbit torque magnetic random-access memory (SOT-DLC MRAM) based crossbar array design for image edge detection. The presented in-memory edge detection algorithm framework provides spin-based crossbar designs that can intrinsically perform image edge detection in an energy-efficient manner. The simulation results show that energy consumption for data transfer is reduced by a factor of 8x for grayscale images, with a comparatively smaller crossbar than an equivalent CMOS design. DLC SOT-MRAM outperforms CMOS-based hardware implementation in several key aspects, offering 1.53x greater area efficiency, 14.24x lower leakage power dissipation, and 3.63x improved energy efficiency. Additionally, when compared to conventional spin-transfer torque MRAM (STT-MRAM) and SOT-MRAM, SOT-DLC MRAM achieves higher energy efficiency with a 1.07x and 1.03x advantage, respectively. Further, we extended the image edge extraction framework to the spiking domain, where an ant colony optimization (ACO) algorithm is implemented. The mathematical analysis is presented for mapping the conductance matrix of the crossbar during edge detection, with improved area and energy efficiency at the hardware implementation level. The pixel accuracy of the edge-detected image from ACO is 4.9% and 3.72% higher than that of conventional Sobel- and Canny-based edge detection, respectively.
Affiliation(s)
- Gaurav Verma
- Department of Electronics and Communication Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand-247667, India
- Cadence Design Systems, NSEZ, Noida, 201307, India
- Sandeep Soni
- Department of Electronics and Communication Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand-247667, India
- Brajesh Kumar Kaushik
- Department of Electronics and Communication Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand-247667, India
5
Du Y, Tang J, Li Y, Xi Y, Li Y, Li J, Huang H, Qin Q, Zhang Q, Gao B, Deng N, Qian H, Wu H. Monolithic 3D Integration of Analog RRAM-Based Computing-in-Memory and Sensor for Energy-Efficient Near-Sensor Computing. Adv Mater 2023:e2302658. [PMID: 37652463 DOI: 10.1002/adma.202302658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 08/14/2023] [Indexed: 09/02/2023]
Abstract
In the era of the Internet of Things, vast amounts of data generated at sensory nodes impose critical challenges on the data-transfer bandwidth and energy efficiency of computing hardware. A near-sensor computing (NSC) architecture places the processing units closer to the sensors such that the generated data can be processed almost in situ with high efficiency. This study demonstrates the monolithic three-dimensional (M3D) integration of a photosensor array, analog computing-in-memory (CIM), and Si complementary metal-oxide-semiconductor (CMOS) logic circuits, named M3D-SAIL. This approach exploits the high-bandwidth on-chip data transfer and massively parallel CIM cores to realize an energy-efficient NSC architecture. The 1st layer of the Si CMOS circuits serves as the control logic and peripheral circuits. The 2nd layer comprises a 1 k-bit one-transistor-one-resistor (1T1R) array with InGaZnOx field-effect transistor (IGZO-FET) and resistive random-access memory (RRAM) for analog CIM. The 3rd layer comprises multiple IGZO-FET-based photosensor arrays for wavelength-dependent optical sensing. The structural integrity and function of each layer are comprehensively verified. Furthermore, NSC is implemented using the M3D-SAIL architecture for a typical video keyframe-extraction task, achieving a high classification accuracy of 96.7% as well as a 31.5× lower energy consumption and 1.91× faster computing speed compared to its 2D counterpart.
Affiliation(s)
- Yiwei Du
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Jianshi Tang
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Yijun Li
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Yue Xi
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Yuankun Li
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Jiaming Li
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Heyi Huang
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Qi Qin
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Qingtian Zhang
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Bin Gao
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Ning Deng
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- He Qian
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
- Huaqiang Wu
- School of Integrated Circuits, Beijing Advanced Innovation Center for Integrated Circuits, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, 100084, China
6
Yang Y, Lv S, Li X, Wang X, Wang Q, Yuan Y, Liang S, Zhang F. An Ultra-Low-Power Analog Multiplier-Divider Compatible with Digital Code for RRAM-Based Computing-in-Memory Macros. Micromachines (Basel) 2023; 14:1482. [PMID: 37512793 PMCID: PMC10383279 DOI: 10.3390/mi14071482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 07/17/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023]
Abstract
This manuscript presents an ultra-low-power analog multiplier-divider compatible with digital code words, which is applicable to the integrated structure of resistive random-access memory (RRAM)-based computing-in-memory (CIM) macros. Current multiplication and division are accomplished by a current-mirror-based structure. Compared with digital dividers, which achieve higher precision and operation speed, analog dividers offer reduced power consumption and a simpler circuit structure in lower-precision operations, thus improving the energy efficiency. Designed and fabricated in a 55 nm CMOS process, the proposed work is capable of achieving 8-bit precision for analog current multiplication and division operations. Measurement results show that the signal delay is 1 μs when performing 8-bit operation, with a bandwidth of 1.4 MHz. The power consumption is less than 6.15 μW with a 1.2 V supply voltage. The proposed multiplier-divider can increase the operation capacity by dividing the input current and digital code while reducing the power consumption and complexity required by division, which can be further utilized in real-time operation of edge computing devices.
Affiliation(s)
- Yiming Yang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
- Shidong Lv
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
- Xiaoran Li
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
- BIT Chongqing Institute of Microelectronics and Microsystems, Chongqing 401332, China
- Xinghua Wang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
- BIT Chongqing Institute of Microelectronics and Microsystems, Chongqing 401332, China
- Yangtze Delta Region Academy of Beijing Institute of Technology, Jiaxing 314000, China
- Qian Wang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
- Yiyang Yuan
- Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China
- Sen Liang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing 100081, China
- Feng Zhang
- Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China
7
Xu S, Li X, Xie C, Chen H, Chen C, Song Z. A High-Precision Implementation of the Sigmoid Activation Function for Computing-in-Memory Architecture. Micromachines (Basel) 2021; 12:1183. [PMID: 34683234 PMCID: PMC8540118 DOI: 10.3390/mi12101183] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/16/2021] [Revised: 09/19/2021] [Accepted: 09/27/2021] [Indexed: 11/19/2022]
Abstract
Computing-In-Memory (CIM), based on non-von Neumann architecture, has lately received significant attention due to its lower delay overhead and higher energy efficiency in convolutional and fully-connected neural network computing. A growing body of work has prioritized research on memory arrays and peripheral circuits to achieve the multiply-and-accumulate (MAC) operation, but not enough attention has so far been paid to the high-precision hardware implementation of non-linear layers, which still causes time overhead and power consumption. Sigmoid is a widely used non-linear activation function, and most studies of it provide an approximation of the function expression rather than an exact match, inevitably leading to considerable error. To address this issue, we propose a high-precision circuit implementation of the sigmoid, matching the expression exactly for the first time. Simulation results with the SMIC 40 nm process suggest that the proposed high-precision sigmoid circuit closely reproduces the properties of the ideal sigmoid, with maximum and average errors between the simulated and ideal sigmoid of 2.74% and 0.21%, respectively. In addition, a multi-layer convolutional neural network based on CIM architecture employing the simulated high-precision sigmoid activation function achieves recognition accuracy on a test database of handwritten digits similar to that obtained with the ideal sigmoid in software, reaching 97.06% with online training and 97.74% with offline training.
Affiliation(s)
- Siqiu Xu
- The State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- The University of Chinese Academy of Sciences, Beijing 100049, China
- Xi Li
- The State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- Correspondence:
- Chenchen Xie
- The State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- Houpeng Chen
- The State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- Cheng Chen
- The State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- The University of Chinese Academy of Sciences, Beijing 100049, China
- Zhitang Song
- The State Key Laboratory of Functional Materials for Informatics, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China