1
Laydevant J, Marković D, Grollier J. Training an Ising machine with equilibrium propagation. Nat Commun 2024; 15:3671. PMID: 38693108; PMCID: PMC11063034; DOI: 10.1038/s41467-024-46879-4.
Abstract
Ising machines, which are hardware implementations of the Ising model of coupled spins, have been influential in the development of unsupervised learning algorithms at the origins of Artificial Intelligence (AI). However, their application to AI has been limited due to the complexities in matching supervised training methods with Ising machine physics, even though these methods are essential for achieving high accuracy. In this study, we demonstrate an efficient approach to train Ising machines in a supervised way through the Equilibrium Propagation algorithm, achieving comparable results to software-based implementations. We employ the quantum annealing procedure of the D-Wave Ising machine to train a fully-connected neural network on the MNIST dataset. Furthermore, we demonstrate that the machine's connectivity supports convolution operations, enabling the training of a compact convolutional network with minimal spins per neuron. Our findings establish Ising machines as a promising trainable hardware platform for AI, with the potential to enhance machine learning applications.
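For readers who want a concrete picture of the two-phase scheme described in this abstract, the sketch below runs Equilibrium Propagation on a small, software-simulated network with a quadratic (Hopfield-like) energy: a free phase relaxes the state with only the input clamped, a nudged phase weakly pulls the output toward the target, and the weight update is the purely local contrast between the two phases. The energy form, gradient-descent relaxation, layer sizes, and hyperparameters are illustrative assumptions and stand in for the D-Wave annealing procedure used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hid, n_out = 4, 16, 2
W1 = rng.normal(0, 0.1, (n_in, n_hid))    # input -> hidden couplings
W2 = rng.normal(0, 0.1, (n_hid, n_out))   # hidden -> output couplings
beta, lr, steps, dt = 0.5, 0.05, 30, 0.2  # nudging strength, learning rate, relaxation schedule

def relax(x, y_target=None, nudge=0.0):
    """Settle hidden/output units by gradient descent on the energy
    E = 0.5*(|h|^2 + |o|^2) - h.(W1^T x) - o.(W2^T h) [+ nudge*0.5*|o - y|^2]."""
    h = np.zeros(n_hid)
    o = np.zeros(n_out)
    for _ in range(steps):
        dh = -h + W1.T @ x + W2 @ o           # -dE/dh
        do = -o + W2.T @ h                    # -dE/do
        if y_target is not None:
            do += nudge * (y_target - o)      # weak clamping toward the target
        h, o = h + dt * dh, o + dt * do
    return h, o

def ep_update(x, y):
    """One Equilibrium Propagation step: free phase, nudged phase, local contrastive update."""
    global W1, W2
    h_free, o_free = relax(x)                       # free phase
    h_nudge, o_nudge = relax(x, y, nudge=beta)      # nudged phase
    W1 += lr / beta * (np.outer(x, h_nudge) - np.outer(x, h_free))
    W2 += lr / beta * (np.outer(h_nudge, o_nudge) - np.outer(h_free, o_free))

# toy usage: push one input/target pair through a single update
x = rng.normal(size=n_in)
y = np.array([1.0, 0.0])
ep_update(x, y)
```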
Affiliation(s)
- Jérémie Laydevant: Laboratoire Albert Fert, CNRS, Thales, Université Paris-Saclay, 91767 Palaiseau, France
- Danijela Marković: Laboratoire Albert Fert, CNRS, Thales, Université Paris-Saclay, 91767 Palaiseau, France
- Julie Grollier: Laboratoire Albert Fert, CNRS, Thales, Université Paris-Saclay, 91767 Palaiseau, France
2
Stern M, Liu AJ, Balasubramanian V. Physical effects of learning. Phys Rev E 2024; 109:024311. PMID: 38491658; DOI: 10.1103/physreve.109.024311.
Abstract
Interacting many-body physical systems ranging from neural networks in the brain to folding proteins to self-modifying electrical circuits can learn to perform diverse tasks. This learning, both in nature and in engineered systems, can occur through evolutionary selection or through dynamical rules that drive active learning from experience. Here, we show that learning in linear physical networks with weak input signals leaves architectural imprints on the Hessian of a physical system. Compared to a generic organization of the system components, (a) the effective physical dimension of the response to inputs decreases, (b) the response of physical degrees of freedom to random perturbations (or system "susceptibility") increases, and (c) the low-eigenvalue eigenvectors of the Hessian align with the task. Overall, these effects embody the typical scenario for learning processes in physical systems in the weak input regime, suggesting ways of discovering whether a physical network may have been trained.
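The three signatures listed in the abstract can be probed numerically from a Hessian alone. The sketch below is a generic illustration, not the paper's model: it compares a "generic" random stiff Hessian with a "trained" one whose softest mode is aligned with an assumed task direction, and reports (a) the effective dimension of the response to random forces, (b) the mean susceptibility, and (c) the overlap of the softest eigenvector with the task.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

def diagnostics(H, task):
    """Signatures of learning: effective response dimension, susceptibility, task alignment."""
    _, evecs = np.linalg.eigh(H)                      # eigenvalues in ascending order
    forces = rng.normal(size=(n, 500))
    resp = np.linalg.solve(H, forces)                 # linear response to random perturbations
    chi = (resp ** 2).mean()                          # (b) susceptibility
    coords = (evecs.T @ resp) ** 2
    p = coords / coords.sum(axis=0)
    eff_dim = (1.0 / (p ** 2).sum(axis=0)).mean()     # (a) participation ratio of the response
    u = task / np.linalg.norm(task)
    align = abs(evecs[:, 0] @ u)                      # (c) overlap of softest mode with the task
    return eff_dim, chi, align

task = rng.normal(size=n)
u = task / np.linalg.norm(task)

# "generic" Hessian: random stiff modes, no relation to the task
Q = np.linalg.qr(rng.normal(size=(n, n)))[0]
H_generic = Q @ np.diag(rng.uniform(5.0, 10.0, n)) @ Q.T

# "trained" Hessian: same stiff spectrum except one very soft mode aligned with the task
basis = np.linalg.qr(np.column_stack([u, rng.normal(size=(n, n - 1))]))[0]
evals = np.concatenate(([0.05], rng.uniform(5.0, 10.0, n - 1)))
H_trained = basis @ np.diag(evals) @ basis.T

print("generic (dim, chi, align):", diagnostics(H_generic, task))
print("trained (dim, chi, align):", diagnostics(H_trained, task))
```

Under these assumptions the "trained" Hessian shows a lower effective dimension, a higher susceptibility, and near-unit alignment, matching the three effects (a)-(c) described in the abstract.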
Affiliation(s)
- Menachem Stern: Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Andrea J Liu: Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, New York 10010, USA
- Vijay Balasubramanian: Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA; Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, New Mexico 87501, USA; Theoretische Natuurkunde, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium
3
Farcis L, Teixeira BMS, Talatchian P, Salomoni D, Ebels U, Auffret S, Dieny B, Mizrahi FA, Grollier J, Sousa RC, Buda-Prejbeanu LD. Spiking dynamics in dual free layer perpendicular magnetic tunnel junctions. Nano Lett 2023; 23:7869-7875. PMID: 37589447; DOI: 10.1021/acs.nanolett.3c01597.
Abstract
Spintronic devices have recently attracted a lot of attention in the field of unconventional computing due to their non-volatility for short- and long-term memory, nonlinear fast response, and relatively small footprint. Here we demonstrate experimentally how voltage driven magnetization dynamics of dual free layer perpendicular magnetic tunnel junctions can emulate spiking neurons in hardware. The output spiking rate was controlled by varying the dc bias voltage across the device. The field-free operation of this two-terminal device and its robustness against an externally applied magnetic field make it a suitable candidate to mimic the neuron response in a dense neural network. The small energy consumption of the device (4-16 pJ/spike) and its scalability are important benefits for embedded applications. This compact perpendicular magnetic tunnel junction structure could finally bring spiking neural networks to sub-100 nm size elements.
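At the network-abstraction level, this bias-controlled spiking behavior can be caricatured with a leaky integrate-and-fire unit whose drive grows with the dc bias voltage, so the output rate is set by the bias as in the experiment. The parameters and the linear voltage-to-drive mapping below are purely illustrative assumptions, not a model of the magnetization dynamics.

```python
import numpy as np

def spike_rate(v_bias, t_sim=1e-3, dt=1e-7):
    """Toy leaky integrate-and-fire unit: higher dc bias -> higher spiking rate."""
    tau, v_th, gain = 2e-6, 1.0, 5.0          # leak time constant, threshold, illustrative gain
    v, spikes = 0.0, 0
    drive = gain * v_bias                      # assumed linear voltage-to-drive mapping
    for _ in range(int(t_sim / dt)):
        v += dt / tau * (drive - v)            # leaky integration toward the drive level
        if v >= v_th:                          # threshold crossing emits a spike and resets
            spikes += 1
            v = 0.0
    return spikes / t_sim                      # spikes per second

for vb in (0.3, 0.5, 0.8, 1.1):
    print(f"bias {vb:.1f} V -> ~{spike_rate(vb):.0f} spikes/s")
```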
Affiliation(s)
- Louis Farcis: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
- Bruno M S Teixeira: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
- Philippe Talatchian: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
- David Salomoni: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
- Ursula Ebels: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
- Stéphane Auffret: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
- Bernard Dieny: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
- Frank A Mizrahi: Unité Mixte de Physique CNRS/Thales, Université Paris-Saclay, 91767 Palaiseau, France
- Julie Grollier: Unité Mixte de Physique CNRS/Thales, Université Paris-Saclay, 91767 Palaiseau, France
- Ricardo C Sousa: Université Grenoble Alpes, CEA, CNRS, Grenoble-INP, SPINTEC, Grenoble 38000, France
4
Park TJ, Deng S, Manna S, Islam ANMN, Yu H, Yuan Y, Fong DD, Chubykin AA, Sengupta A, Sankaranarayanan SKRS, Ramanathan S. Complex oxides for brain-inspired computing: a review. Adv Mater 2023; 35:e2203352. PMID: 35723973; DOI: 10.1002/adma.202203352.
Abstract
The fields of brain-inspired computing, robotics, and, more broadly, artificial intelligence (AI) seek to implement knowledge gleaned from the natural world into human-designed electronics and machines. In this review, the opportunities presented by complex oxides, a class of electronic ceramic materials whose properties can be elegantly tuned by doping, electron interactions, and a variety of external stimuli near room temperature, are discussed. The review begins with a discussion of natural intelligence at the elementary level in the nervous system, followed by collective intelligence and learning at the animal colony level mediated by social interactions. An important aspect highlighted is the vast range of spatial and temporal scales involved in learning and memory. The focus then turns to collective phenomena, such as metal-to-insulator transitions (MITs), ferroelectricity, and related examples, to highlight recent demonstrations of artificial neurons, synapses, and circuits and their learning. First-principles theoretical treatments of the electronic structure and in situ synchrotron spectroscopy of operating devices are then discussed. The implementation of the experimental characteristics in neural networks and algorithm design is then reviewed. Finally, outstanding materials challenges that require a microscopic understanding of the physical mechanisms, which will be essential for advancing the frontiers of neuromorphic computing, are highlighted.
Affiliation(s)
- Tae Joon Park: School of Materials Engineering, Purdue University, West Lafayette, IN, 47907, USA
- Sunbin Deng: School of Materials Engineering, Purdue University, West Lafayette, IN, 47907, USA
- Sukriti Manna: Center for Nanoscale Materials, Argonne National Laboratory, Argonne, IL, 60439, USA
- A N M Nafiul Islam: Department of Electrical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Haoming Yu: School of Materials Engineering, Purdue University, West Lafayette, IN, 47907, USA
- Yifan Yuan: School of Materials Engineering, Purdue University, West Lafayette, IN, 47907, USA
- Dillon D Fong: Materials Science Division, Argonne National Laboratory, Lemont, IL, 60439, USA
- Alexander A Chubykin: Department of Biological Sciences, Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN, 47907, USA
- Abhronil Sengupta: Department of Electrical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Subramanian K R S Sankaranarayanan: Center for Nanoscale Materials, Argonne National Laboratory, Argonne, IL, 60439, USA; Department of Mechanical and Industrial Engineering, University of Illinois Chicago, Chicago, IL, 60607, USA
- Shriram Ramanathan: School of Materials Engineering, Purdue University, West Lafayette, IN, 47907, USA
5
Oh S, An J, Cho S, Yoon R, Min KS. Memristor crossbar circuits implementing equilibrium propagation for on-device learning. Micromachines 2023; 14:1367. PMID: 37512678; PMCID: PMC10384638; DOI: 10.3390/mi14071367.
Abstract
Equilibrium propagation (EP) has recently been proposed as a neural network training algorithm based on a local learning concept, where only local information is used to calculate the weight updates. Despite the advantages of local learning, the numerical iteration needed to solve the EP dynamic equations makes the algorithm less practical for edge intelligence hardware. Some analog circuits have been suggested to solve the EP dynamic equations physically rather than numerically, using the original EP algorithm. However, several problems remain in terms of circuit implementation, for example the need to store the free-phase solution and the lack of essential peripheral circuits for calculating and updating synaptic weights. In this paper, a new analog circuit technique is therefore proposed to realize the EP algorithm in practical, implementable hardware. The work makes two major contributions toward this objective. First, the free-phase and nudge-phase solutions are calculated by the proposed analog circuits simultaneously rather than at different times, which eliminates the analog voltage memories, or digital memories with domain-conversion circuits, otherwise needed to store the free-phase solution temporarily. Second, a simple EP learning rule relying on a fixed amount of conductance change per programming pulse is proposed and implemented in peripheral circuits, making the weight-update circuit practical without a complicated program-verify scheme. The proposed memristor conductance-update circuit is simulated and verified for training synaptic weights on memristor crossbars; the results show that it can realize on-device learning in edge intelligence hardware.
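The "fixed amount of conductance change per programming pulse" rule naturally suggests a sign-based update: each crossbar conductance moves by one quantized step in the direction of the EP contrast between nudge-phase and free-phase co-activations. The sketch below is a software abstraction under that assumption, not the proposed circuit; the conductance window, step size, and crossbar size are illustrative.

```python
import numpy as np

G_MIN, G_MAX, G_STEP = 0.0, 1.0, 0.01     # illustrative conductance bounds and per-pulse step

def fixed_step_ep_update(G, pre_free, post_free, pre_nudge, post_nudge):
    """Move each conductance by exactly one step in the sign of the EP contrast."""
    contrast = np.outer(pre_nudge, post_nudge) - np.outer(pre_free, post_free)
    G = G + G_STEP * np.sign(contrast)     # one programming pulse per synapse, +/- fixed step
    return np.clip(G, G_MIN, G_MAX)        # conductance stays within the device window

# toy usage with random phase activities for a 4x3 crossbar
rng = np.random.default_rng(0)
G = rng.uniform(G_MIN, G_MAX, size=(4, 3))
G = fixed_step_ep_update(G,
                         pre_free=rng.random(4), post_free=rng.random(3),
                         pre_nudge=rng.random(4), post_nudge=rng.random(3))
```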
Affiliation(s)
- Seokjin Oh: School of Electrical Engineering, Kookmin University, Seoul 02707, Republic of Korea
- Jiyong An: School of Electrical Engineering, Kookmin University, Seoul 02707, Republic of Korea
- Seungmyeong Cho: School of Electrical Engineering, Kookmin University, Seoul 02707, Republic of Korea
- Rina Yoon: School of Electrical Engineering, Kookmin University, Seoul 02707, Republic of Korea
- Kyeong-Sik Min: School of Electrical Engineering, Kookmin University, Seoul 02707, Republic of Korea
6
Xiao M, Meng Q, Zhang Z, Wang Y, Lin Z. SPIDE: a purely spike-based method for training feedback spiking neural networks. Neural Netw 2023; 161:9-24. PMID: 36736003; DOI: 10.1016/j.neunet.2023.01.026.
Abstract
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware. However, most supervised SNN training methods, such as conversion from artificial neural networks or direct training with surrogate gradients, require complex computation rather than the spike-based operations of spiking neurons during training. In this paper, we study spike-based implicit differentiation on the equilibrium state (SPIDE), which extends the recently proposed training method, implicit differentiation on the equilibrium state (IDE), to supervised learning with purely spike-based computation, demonstrating the potential for energy-efficient training of SNNs. Specifically, we introduce ternary spiking neuron couples and prove that implicit differentiation can be solved by spikes based on this design, so the whole training procedure, including both forward and backward passes, is carried out as event-driven spike computation, and weights are updated locally with two-stage average firing rates. We then propose modifying the reset membrane potential to reduce the approximation error of spikes. With these key components, we can train SNNs with flexible structures in a small number of time steps and with firing sparsity during training, and the theoretical estimation of energy costs demonstrates the potential for high efficiency. Meanwhile, experiments show that even with these constraints, our trained models can still achieve competitive results on MNIST, CIFAR-10, CIFAR-100, and CIFAR10-DVS.
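As background on the implicit-differentiation idea that SPIDE builds on, the sketch below shows the generic, non-spiking version: the equilibrium of a feedback layer is found by fixed-point iteration, and the gradient is obtained from the implicit function theorem rather than by backpropagating through the iterations. The layer sizes, tanh nonlinearity, and toy squared-error loss are illustrative assumptions; the paper's contribution is realizing these quantities with purely spike-based computation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_a = 3, 5
W = 0.1 * rng.normal(size=(n_a, n_a))      # feedback (recurrent) weights
U = rng.normal(size=(n_a, n_x))            # input weights
x = rng.normal(size=n_x)
target = rng.normal(size=n_a)

sigma = np.tanh
dsigma = lambda z: 1.0 - np.tanh(z) ** 2

# 1) forward: find the equilibrium a* = sigma(W a* + U x) by fixed-point iteration
a = np.zeros(n_a)
for _ in range(100):
    a = sigma(W @ a + U @ x)

# 2) backward: implicit differentiation at the equilibrium (no unrolling of the iterations)
z = W @ a + U @ x
J = dsigma(z)[:, None] * W                       # Jacobian of the fixed-point map at a*
dL_da = a - target                               # gradient of a toy squared-error loss
g = np.linalg.solve(np.eye(n_a) - J.T, dL_da)    # solves (I - J)^T g = dL/da
dL_dW = np.outer(dsigma(z) * g, a)               # chain rule through z = W a* + U x
dL_dU = np.outer(dsigma(z) * g, x)

W -= 0.1 * dL_dW
U -= 0.1 * dL_dU
```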
Affiliation(s)
- Mingqing Xiao: National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, China
- Qingyan Meng: The Chinese University of Hong Kong, Shenzhen, China; Shenzhen Research Institute of Big Data, Shenzhen 518115, China
- Zongpeng Zhang: Center for Data Science, Academy for Advanced Interdisciplinary Studies, Peking University, China
- Yisen Wang: National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, China; Institute for Artificial Intelligence, Peking University, China
- Zhouchen Lin: National Key Laboratory of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University, China; Institute for Artificial Intelligence, Peking University, China; Peng Cheng Laboratory, China
7
Watfa M, Garcia-Ortiz A, Sassatelli G. Energy-based analog neural network framework. Front Comput Neurosci 2023; 17:1114651. PMID: 36936192; PMCID: PMC10020340; DOI: 10.3389/fncom.2023.1114651.
Abstract
Over the past decade, a body of work has emerged showing the disruptive potential of neuromorphic systems across a broad range of studies, often combining novel machine learning models and nanotechnologies. Still, the scope of investigations often remains limited to simple problems, since the process of building, training, and evaluating mixed-signal neural models is slow and laborious. In this paper, we introduce an open-source framework, called EBANA, that provides a unified, modularized, and extensible infrastructure, similar to conventional machine learning pipelines, for building and validating analog neural networks (ANNs). It uses Python as its interface language, with a syntax similar to Keras, while hiding the complexity of the underlying analog simulations. It already includes the most common building blocks and maintains sufficient modularity and extensibility to easily incorporate new concepts as well as new electrical and technological models. These features make EBANA suitable for researchers and practitioners to experiment with different design topologies and explore the various tradeoffs that exist in the design space. We illustrate the framework's capabilities by elaborating on the increasingly popular Energy-Based Models (EBMs), used in conjunction with the local Equilibrium Propagation (EP) training algorithm. Our experiments cover three datasets with up to 60,000 entries and explore network topologies generating circuits with more than 1,000 electrical nodes, which can be extensively benchmarked with ease and in reasonable time thanks to EBANA's native parallelization capability.
Affiliation(s)
- Mohamed Watfa: LIRMM, University of Montpellier, CNRS, Montpellier, France; ITEM, University of Bremen, Bremen, Germany
- Alberto Garcia-Ortiz (corresponding author): ITEM, University of Bremen, Bremen, Germany
8
Zhou G, Ji X, Li J, Zhou F, Dong Z, Yan B, Sun B, Wang W, Hu X, Song Q, Wang L, Duan S. Second-order associative memory circuit hardware implemented by the evolution from battery-like capacitance to resistive switching memory. iScience 2022; 25:105240. PMID: 36262310; PMCID: PMC9574501; DOI: 10.1016/j.isci.2022.105240.
Abstract
Memristor-based Pavlovian associative memory circuits reported to date realize only the simple conditioned reflex process. The second-order conditioned reflex makes the simple conditioned reflex process more biomimetic, but it has so far been demonstrated only in design and requires a large number of redundant circuits. A FeOx-based memristor exhibits an evolution from a battery-like capacitance (BLC) state to resistive switching (RS) memory as the number of I-V sweeps increases. The BLC is triggered by active metal ions and hydroxide ions originating from water-molecule splitting at different interfaces, while the RS memory behavior is dominated by the diffusion and migration of ions in the FeOx switching layer. This evolution shares nearly the same biophysical mechanism as second-order conditioning. It therefore makes a hardware-implemented second-order associative memory circuit feasible and simple. This work provides a novel path toward realizing associative memory circuits with second-order conditioning at the hardware level.
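To make the analogy with second-order conditioning concrete, the toy sketch below uses a Hebbian-style associative update: a bell is first paired with food until the bell alone triggers the response (first-order), then a light is paired with the already-learned bell so that the light alone also triggers it (second-order). The threshold, learning rate, and weight cap are illustrative assumptions with no relation to the device physics.

```python
# weights from conditioned stimuli to the "salivation" response
w = {"bell": 0.0, "light": 0.0}
FOOD_DRIVE, THRESH, LR = 1.0, 0.5, 0.25

def response(stimuli):
    """The response fires when the unconditioned drive plus learned drives cross threshold."""
    drive = FOOD_DRIVE * ("food" in stimuli) + sum(w[s] for s in stimuli if s in w)
    return drive >= THRESH

def pair(stimuli, trials=5):
    """Hebbian-style update: a stimulus present together with a response gets stronger."""
    for _ in range(trials):
        if response(stimuli):
            for s in stimuli:
                if s in w:
                    w[s] = min(1.0, w[s] + LR)

pair({"bell", "food"})                        # first-order conditioning: bell + food
print("bell alone:", response({"bell"}))      # True: the bell now triggers the response

pair({"light", "bell"})                       # second-order conditioning: light + learned bell
print("light alone:", response({"light"}))    # True: the light triggers the response as well
```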
Affiliation(s)
- Guangdong Zhou: College of Artificial Intelligence, School of Materials and Energy, Southwest University, Chongqing 400715, PR China
- Xiaoyue Ji: College of Electrical Engineering, Zhejiang University, Hangzhou 310027, PR China
- Jie Li: Shenzhen-Hong Kong College of Microelectronics, Southern University of Science and Technology, Shenzhen 518055, China
- Feichi Zhou: Shenzhen-Hong Kong College of Microelectronics, Southern University of Science and Technology, Shenzhen 518055, China
- Zhekang Dong: College of Electrical Engineering, Zhejiang University, Hangzhou 310027, PR China
- Bingtao Yan: College of Artificial Intelligence, School of Materials and Energy, Southwest University, Chongqing 400715, PR China
- Bai Sun: Department of Mechanics and Mechatronics Engineering, Centre for Advanced Materials Joining, Waterloo Institute for Nanotechnology, University of Waterloo, Waterloo, ON N2L 3G1, Canada
- Wenhua Wang: College of Artificial Intelligence, School of Materials and Energy, Southwest University, Chongqing 400715, PR China
- Xiaofang Hu: College of Artificial Intelligence, School of Materials and Energy, Southwest University, Chongqing 400715, PR China
- Qunliang Song: College of Artificial Intelligence, School of Materials and Energy, Southwest University, Chongqing 400715, PR China
- Lidan Wang: College of Artificial Intelligence, School of Materials and Energy, Southwest University, Chongqing 400715, PR China
- Shukai Duan: College of Artificial Intelligence, School of Materials and Energy, Southwest University, Chongqing 400715, PR China
9
Rohlfs C. A descriptive analysis of olfactory sensation and memory in Drosophila and its relation to artificial neural networks. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.10.068.
10
Feldhoff F, Toepfer H, Harczos T, Klefenz F. Periodicity pitch perception part III: sensibility and Pachinko volatility. Front Neurosci 2022; 16:736642. PMID: 35356050; PMCID: PMC8959216; DOI: 10.3389/fnins.2022.736642.
Abstract
Neuromorphic computer models are used to explain sensory perception. Auditory models generate cochleagrams, which resemble the spike distributions in the auditory nerve. Neuron ensembles along the auditory pathway transform sensory inputs step by step, and at the end pitch is represented in auditory categorical spaces. In two previous articles in this series on periodicity pitch perception, an extended auditory model was successfully used to explain periodicity pitch for various tones generated by musical instruments and for sung vowels. This third part of the series focuses on octopus cells, as they are central sensitivity elements in auditory cognition processes. A powerful numerical model is devised in which auditory nerve fiber (ANF) spike events are the inputs, triggering the impulse responses of the octopus cells. Efficient algorithms are developed and demonstrated to explain the behavior of octopus cells, with a focus on a simple event-based hardware implementation of a layer of octopus neurons. The main finding is that an octopus cell model in a local receptive field fine-tunes to a specific trajectory through a spike-timing-dependent plasticity (STDP) learning rule, with synaptic pre-activation and the dendritic back-propagating signal as the post-condition. Successful learning explains away the teacher, so there is no need for temporally precise control of plasticity that distinguishes between learning and retrieval phases. Pitch learning is cascaded: at first, octopus cells respond individually by self-adjusting to specific trajectories in their local receptive fields; then unions of octopus cells are learned collectively for pitch discrimination. Pitch estimation from inter-spike intervals is demonstrated with two input scenarios: a simple sine tone and a sung vowel. The model evaluation indicates an improvement in pitch estimation on a fixed time-scale.
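The plasticity rule sketched in this abstract, presynaptic activation gated by a dendritic back-propagating post signal, is a variant of pair-based STDP. Its generic form is sketched below with illustrative amplitudes and time constants, not the paper's octopus-cell parameters.

```python
import numpy as np

A_PLUS, A_MINUS = 0.01, 0.012      # potentiation / depression amplitudes (illustrative)
TAU_PLUS, TAU_MINUS = 5.0, 5.0     # time constants in ms (illustrative)

def stdp_dw(t_pre, t_post):
    """Pair-based STDP: pre-before-post potentiates, post-before-pre depresses."""
    dt = t_post - t_pre
    if dt >= 0:
        return A_PLUS * np.exp(-dt / TAU_PLUS)       # causal pairing -> strengthen
    return -A_MINUS * np.exp(dt / TAU_MINUS)          # anti-causal pairing -> weaken

# apply the rule to one synapse given pre- and postsynaptic spike times (ms)
w = 0.5
pre_spikes = [1.0, 6.0, 12.0]
post_spikes = [2.0, 5.0, 20.0]     # e.g. dendritic back-propagating events
for t_pre in pre_spikes:
    for t_post in post_spikes:
        w = np.clip(w + stdp_dw(t_pre, t_post), 0.0, 1.0)
print("updated weight:", w)
```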
Affiliation(s)
- Frank Feldhoff: Advanced Electromagnetics Group, Technische Universität Ilmenau, Ilmenau, Germany
- Hannes Toepfer: Advanced Electromagnetics Group, Technische Universität Ilmenau, Ilmenau, Germany
- Tamas Harczos: Fraunhofer-Institut für Digitale Medientechnologie, Ilmenau, Germany; Auditory Neuroscience and Optogenetics Laboratory, German Primate Center, Göttingen, Germany; audifon GmbH & Co. KG, Kölleda, Germany
- Frank Klefenz: Fraunhofer-Institut für Digitale Medientechnologie, Ilmenau, Germany
11
Schuman CD, Kulkarni SR, Parsa M, Mitchell JP, Date P, Kay B. Opportunities for neuromorphic computing algorithms and applications. Nat Comput Sci 2022; 2:10-19. PMID: 38177712; DOI: 10.1038/s43588-021-00184-y.
Abstract
Neuromorphic computing technologies will be important for the future of computing, but much of the work in neuromorphic computing has focused on hardware development. Here, we review recent results in neuromorphic computing algorithms and applications. We highlight characteristics of neuromorphic computing technologies that make them attractive for the future of computing and we discuss opportunities for future development of algorithms and applications on these systems.
Affiliation(s)
- Catherine D Schuman: Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA; Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA
- Shruti R Kulkarni: Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Maryam Parsa: Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA; Department of Electrical and Computer Engineering, George Mason University, Fairfax, VA, USA
- J Parker Mitchell: Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Prasanna Date: Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Bill Kay: Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
12
Wright LG, Onodera T, Stein MM, Wang T, Schachter DT, Hu Z, McMahon PL. Deep physical neural networks trained with backpropagation. Nature 2022; 601:549-555. PMID: 35082422; PMCID: PMC8791835; DOI: 10.1038/s41586-021-04223-6.
Abstract
Deep-learning models have become pervasive tools in science and engineering. However, their energy requirements now increasingly limit their scalability [1]. Deep-learning accelerators [2-9] aim to perform deep learning energy-efficiently, usually targeting the inference phase and often by exploiting physical substrates beyond conventional electronics. Approaches so far [10-22] have been unable to apply the backpropagation algorithm to train unconventional novel hardware in situ. The advantages of backpropagation have made it the de facto training method for large-scale neural networks, so this deficiency constitutes a major impediment. Here we introduce a hybrid in situ-in silico algorithm, called physics-aware training, that applies backpropagation to train controllable physical systems. Just as deep learning realizes computations with deep neural networks made from layers of mathematical functions, our approach allows us to train deep physical neural networks made from layers of controllable physical systems, even when the physical layers lack any mathematical isomorphism to conventional artificial neural network layers. To demonstrate the universality of our approach, we train diverse physical neural networks based on optics, mechanics and electronics to experimentally perform audio and image classification tasks. Physics-aware training combines the scalability of backpropagation with the automatic mitigation of imperfections and noise achievable with in situ algorithms. Physical neural networks have the potential to perform machine learning faster and more energy-efficiently than conventional electronic processors and, more broadly, can endow physical systems with automatically designed physical functionalities, for example, for robotics [23-26], materials [27-29] and smart sensors [30-32].
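The hybrid in situ-in silico structure can be caricatured in a few lines: the forward pass runs the physical system itself (here faked by a noisy, imperfect function), while the backward pass uses gradients from a differentiable digital surrogate of that system, evaluated at the physically measured outputs. The surrogate, noise model, and single-layer setup below are illustrative assumptions that mirror the structure of physics-aware training, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def physical_layer(x, theta):
    """Stand-in for a real physical system: the digital model plus unmodeled noise."""
    return np.tanh(theta @ x) + 0.01 * rng.normal(size=theta.shape[0])

def digital_model(x, theta):
    """Differentiable surrogate used only for the backward pass."""
    return np.tanh(theta @ x)

def digital_grad(x, theta, grad_out):
    """d(loss)/d(theta) for the surrogate, given the upstream gradient grad_out."""
    z = theta @ x
    return np.outer(grad_out * (1.0 - np.tanh(z) ** 2), x)

theta = 0.1 * rng.normal(size=(2, 4))
x = rng.normal(size=4)
target = np.array([0.5, -0.5])

for step in range(200):
    y = physical_layer(x, theta)            # forward pass: run the physical system (in situ)
    grad_out = y - target                   # loss gradient evaluated at the physical output
    theta -= 0.1 * digital_grad(x, theta, grad_out)   # backward pass: surrogate gradients (in silico)

print("final physical output:", physical_layer(x, theta))
```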
Affiliation(s)
- Logan G Wright: School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA; NTT Physics and Informatics Laboratories, NTT Research, Inc., Sunnyvale, CA, USA
- Tatsuhiro Onodera: School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA; NTT Physics and Informatics Laboratories, NTT Research, Inc., Sunnyvale, CA, USA
- Martin M Stein: School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA
- Tianyu Wang: School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA
- Darren T Schachter: School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, USA
- Zoey Hu: School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA
- Peter L McMahon: School of Applied and Engineering Physics, Cornell University, Ithaca, NY, USA