1
Dillavou S, Beyer BD, Stern M, Liu AJ, Miskin MZ, Durian DJ. Machine learning without a processor: Emergent learning in a nonlinear analog network. Proc Natl Acad Sci U S A 2024; 121:e2319718121. PMID: 38954545; PMCID: PMC11252732; DOI: 10.1073/pnas.2319718121.
Abstract
Standard deep learning algorithms require differentiating large nonlinear networks, a process that is slow and power-hungry. Electronic contrastive local learning networks (CLLNs) offer potentially fast, efficient, and fault-tolerant hardware for analog machine learning, but existing implementations are linear, severely limiting their capabilities. These systems differ significantly from artificial neural networks as well as the brain, so the feasibility and utility of incorporating nonlinear elements have not been explored. Here, we introduce a nonlinear CLLN: an analog electronic network made of self-adjusting nonlinear resistive elements based on transistors. We demonstrate that the system learns tasks unachievable in linear systems, including XOR (exclusive or) and nonlinear regression, without a computer. We find that our decentralized system reduces modes of training error in a characteristic order (mean, then slope, then curvature), similar to the spectral bias of artificial neural networks. The circuitry is robust to damage, retrainable in seconds, and performs learned tasks in microseconds while dissipating only picojoules of energy across each transistor. This suggests enormous potential for fast, low-power computing in edge systems such as sensors, robotic controllers, and medical devices, as well as manufacturability at scale for performing and studying emergent learning.
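As a concrete illustration of the contrastive scheme this abstract describes, the sketch below digitally simulates the coupled-learning rule that CLLNs implement physically: a free phase, a weakly clamped phase, and a purely local conductance update. It assumes a small fully connected linear resistor network (the paper's hardware is nonlinear and transistor-based), an arbitrary choice of input/ground/output nodes, and an illustrative one-output regression task; the learning rate and nudge amplitude are likewise assumptions.

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 5
edges = list(combinations(range(n_nodes), 2))   # fully connected: 10 edges
G = rng.uniform(0.5, 1.5, len(edges))           # conductances = learned DOF

def solve(G, fixed):
    """Node voltages of the resistor network with some nodes held at fixed voltage."""
    L = np.zeros((n_nodes, n_nodes))            # conductance-weighted graph Laplacian
    for g, (i, j) in zip(G, edges):
        L[i, i] += g; L[j, j] += g; L[i, j] -= g; L[j, i] -= g
    free = [k for k in range(n_nodes) if k not in fixed]
    v = np.zeros(n_nodes)
    for k, val in fixed.items():
        v[k] = val
    b = -L[np.ix_(free, list(fixed))] @ np.array(list(fixed.values()))
    v[free] = np.linalg.solve(L[np.ix_(free, free)], b)
    return v

src, gnd, out = 0, 1, 4                         # assumed input/ground/output nodes
target, eta, beta = 0.3, 0.05, 0.1              # assumed task and hyperparameters
for step in range(500):
    vF = solve(G, {src: 1.0, gnd: 0.0})                  # free phase
    nudged = vF[out] + beta * (target - vF[out])         # weak clamp toward target
    vC = solve(G, {src: 1.0, gnd: 0.0, out: nudged})     # clamped phase
    for e, (i, j) in enumerate(edges):                   # local contrastive update
        dF, dC = vF[i] - vF[j], vC[i] - vC[j]
        G[e] = max(G[e] - (eta / beta) * (dC**2 - dF**2), 0.01)
print("learned output:", round(solve(G, {src: 1.0, gnd: 0.0})[out], 3), "target:", target)
```

Each edge update uses only the voltage drop across that edge in the two phases, which is what makes the rule implementable without a processor.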
Affiliation(s)
- Sam Dillavou
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104
- Benjamin D. Beyer
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104
- Menachem Stern
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104
- Andrea J. Liu
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY 10010
- Marc Z. Miskin
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA 19104
- Douglas J. Durian
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA 19104
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY 10010
2
Baltussen MG, de Jong TJ, Duez Q, Robinson WE, Huck WTS. Chemical reservoir computation in a self-organizing reaction network. Nature 2024; 631:549-555. PMID: 38926572; PMCID: PMC11254755; DOI: 10.1038/s41586-024-07567-x.
Abstract
Chemical reaction networks, such as those found in metabolism and signalling pathways, enable cells to process information from their environment [1,2]. Current approaches to molecular information processing and computation typically pursue digital computation models and require extensive molecular-level engineering [3]. Despite considerable advances, these approaches have not reached the level of information processing capabilities seen in living systems. Here we report on the discovery and implementation of a chemical reservoir computer based on the formose reaction [4]. We demonstrate how this complex, self-organizing chemical reaction network can perform several nonlinear classification tasks in parallel, predict the dynamics of other complex systems and achieve time-series forecasting. This in chemico information processing system provides proof of principle for the emergent computational capabilities of complex chemical reaction networks, paving the way for a new class of biomimetic information processing systems.
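For readers unfamiliar with reservoir computation, the sketch below shows the scheme's defining property: only a linear readout is trained, while the reservoir itself, here a random echo-state network standing in for the formose reaction network, is left untouched. The reservoir size, spectral radius, input series, prediction horizon, and ridge penalty are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_res, T = 100, 2000
W = rng.normal(0, 1, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius below 1
w_in = rng.normal(0, 0.5, n_res)

t_axis = np.arange(T)
u = np.sin(0.1 * t_axis) * np.sin(0.031 * t_axis)  # assumed input time series
target = np.roll(u, -5)                            # task: 5-step-ahead prediction

x = np.zeros(n_res)
X = np.zeros((T, n_res))
for t in range(T):                                 # drive the fixed reservoir
    x = np.tanh(W @ x + w_in * u[t])
    X[t] = x

ridge = 1e-6                                       # train ONLY the linear readout
w_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ target)
pred = X @ w_out
print("train NRMSE:", np.sqrt(np.mean((pred - target) ** 2)) / np.std(target))
```

The appeal for chemistry is exactly this division of labor: the physical system supplies the nonlinear dynamics for free, and training reduces to one linear regression on measured states.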
Affiliation(s)
- Mathieu G Baltussen
- Institute for Molecules and Materials, Radboud University, Nijmegen, The Netherlands
- Thijs J de Jong
- Institute for Molecules and Materials, Radboud University, Nijmegen, The Netherlands
- Quentin Duez
- Institute for Molecules and Materials, Radboud University, Nijmegen, The Netherlands
- William E Robinson
- Institute for Molecules and Materials, Radboud University, Nijmegen, The Netherlands
- Wilhelm T S Huck
- Institute for Molecules and Materials, Radboud University, Nijmegen, The Netherlands
3
Meinders MBJ, Yang J, van der Linden E. Application of physics encoded neural networks to improve predictability of properties of complex multi-scale systems. Sci Rep 2024; 14:15015. PMID: 38951589; PMCID: PMC11217277; DOI: 10.1038/s41598-024-65304-w.
Abstract
Predicting the physical properties of complex multi-scale systems is a common challenge that demands analysis across many temporal and spatial scales. However, physics alone is often not sufficient, owing to a lack of knowledge of certain details of the system. With sufficient data, machine learning techniques may help; when data are relatively cumbersome to obtain, hybrid methods may come to the rescue. In this report we study various types of neural networks (NNs), including NNs into which physics information is encoded (PeNNs), and also examine the effects of NN hyperparameters. We apply the networks to predict the viscosity of an emulsion as a function of shear rate. Using network performance metrics such as the mean squared error and the coefficient of determination (R²), we show that the PeNNs always perform better than the NNs, as also confirmed by a Friedman test with a p-value smaller than 0.0002. The PeNNs capture extrapolation and interpolation very well, contrary to the NNs. In addition, we find that the NN hyperparameters, including network complexity and optimization method, do not affect these conclusions. We suggest that encoding NNs with any discipline-specific system information promises better prediction of the properties of complex systems than NNs alone, which is particularly advantageous when data are scarce. Such encoding is also scalable, allowing different properties to be combined without repeatedly retraining the NNs.
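A minimal sketch of the physics-encoding idea on the paper's example task (viscosity versus shear rate): instead of a black-box network, the output layer is a known constitutive law whose parameters are fitted by gradient descent. The Cross-type shear-thinning model, the synthetic data, and all hyperparameters below are assumptions; the paper's PeNNs encode their own physics and combine it with trainable NN layers.

```python
import torch

torch.manual_seed(0)
gdot = torch.logspace(-2, 3, 60).unsqueeze(1)                 # shear rates [1/s]
eta_true = 0.01 + (10.0 - 0.01) / (1 + (0.5 * gdot) ** 0.8)   # synthetic "truth"
eta_obs = eta_true * (1 + 0.05 * torch.randn_like(eta_true))  # noisy measurements

p = torch.nn.Parameter(torch.zeros(4))   # raw params, mapped to physical ranges

def model(g):
    eta0 = torch.exp(p[0])               # zero-shear viscosity, kept positive
    eta_inf = torch.exp(p[1] - 4)        # infinite-shear viscosity
    K = torch.exp(p[2])                  # consistency time scale
    n = torch.sigmoid(p[3]) * 2          # shear-thinning exponent
    return eta_inf + (eta0 - eta_inf) / (1 + (K * g) ** n)

opt = torch.optim.Adam([p], lr=0.05)
for step in range(2000):
    opt.zero_grad()
    # fit in log space so all decades of viscosity count equally
    loss = torch.mean((torch.log(model(gdot)) - torch.log(eta_obs)) ** 2)
    loss.backward()
    opt.step()

with torch.no_grad():
    print("fitted (eta0, eta_inf, K, n):",
          [round(v.item(), 3) for v in (torch.exp(p[0]), torch.exp(p[1] - 4),
                                        torch.exp(p[2]), torch.sigmoid(p[3]) * 2)])
```

Because the functional form carries the physics, the model extrapolates outside the training shear rates in a way a generic NN cannot, which is the behavior the abstract reports for PeNNs.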
Affiliation(s)
- Marcel B J Meinders
- Wageningen University and Research Centre, Wageningen, The Netherlands
- Wageningen Food and Biobased Research, Wageningen, The Netherlands
- Jack Yang
- Wageningen University and Research Centre, Wageningen, The Netherlands
- Wageningen University, Wageningen, The Netherlands
- Erik van der Linden
- Wageningen University and Research Centre, Wageningen, The Netherlands
- Wageningen University, Wageningen, The Netherlands
4
Laydevant J, Marković D, Grollier J. Training an Ising machine with equilibrium propagation. Nat Commun 2024; 15:3671. PMID: 38693108; PMCID: PMC11063034; DOI: 10.1038/s41467-024-46879-4.
Abstract
Ising machines, which are hardware implementations of the Ising model of coupled spins, have been influential in the development of unsupervised learning algorithms at the origins of Artificial Intelligence (AI). However, their application to AI has been limited by the difficulty of matching supervised training methods with Ising machine physics, even though these methods are essential for achieving high accuracy. In this study, we demonstrate an efficient approach to training Ising machines in a supervised way through the Equilibrium Propagation algorithm, achieving results comparable to software-based implementations. We employ the quantum annealing procedure of the D-Wave Ising machine to train a fully connected neural network on the MNIST dataset. Furthermore, we demonstrate that the machine's connectivity supports convolution operations, enabling the training of a compact convolutional network with minimal spins per neuron. Our findings establish Ising machines as a promising trainable hardware platform for AI, with the potential to enhance machine learning applications.
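The core of equilibrium propagation, sketched below on a toy XOR task: equilibrate the spin system freely, equilibrate again with a weak nudge toward the target, and update each coupling from the purely local difference of spin-spin correlations. Plain Glauber-dynamics annealing stands in for the D-Wave annealer used in the paper; the topology, temperature schedule, nudge strength, and linearized nudge field are assumptions, and since the dynamics are stochastic, the final error count can vary between runs.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 7                                        # 2 input spins, 4 hidden, 1 output
W = 0.1 * rng.normal(size=(n, n))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)
IN0, IN1, OUT = 0, 1, 6

def equilibrate(W, h, clamped, sweeps=60):
    """Anneal to an approximate equilibrium with some spins held fixed."""
    s = rng.choice([-1.0, 1.0], n)
    for k, v in clamped.items():
        s[k] = v
    for T in np.linspace(2.0, 0.1, sweeps):      # cooling schedule
        for i in range(n):
            if i in clamped:
                continue
            dE = 2 * s[i] * (W[i] @ s + h[i])    # energy cost of flipping spin i
            if dE < 0 or rng.random() < np.exp(-dE / T):
                s[i] *= -1
    return s

eta, beta = 0.05, 0.5                            # assumed learning rate / nudge
data = [((-1, -1), -1), ((-1, 1), 1), ((1, -1), 1), ((1, 1), -1)]  # XOR in +/-1
for epoch in range(300):
    for (a, b), y in data:
        clamp = {IN0: a, IN1: b}
        s0 = equilibrate(W, np.zeros(n), clamp)          # free phase
        h = np.zeros(n)
        h[OUT] = beta * (y - s0[OUT])                    # weak nudge toward target
        sb = equilibrate(W, h, clamp)                    # nudged phase
        dW = (eta / beta) * (np.outer(sb, sb) - np.outer(s0, s0))  # local EP rule
        np.fill_diagonal(dW, 0)
        W += dW
errors = sum(equilibrate(W, np.zeros(n), {IN0: a, IN1: b})[OUT] != y
             for (a, b), y in data)
print("XOR errors after training:", errors, "/ 4")
```

The point of the paper is that the expensive step here, `equilibrate`, is exactly what annealing hardware does natively, so only the cheap correlation contrast remains for the digital side.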
Affiliation(s)
- Jérémie Laydevant
- Laboratoire Albert Fert, CNRS, Thales, Université Paris-Saclay, 91767 Palaiseau, France
- Danijela Marković
- Laboratoire Albert Fert, CNRS, Thales, Université Paris-Saclay, 91767 Palaiseau, France
- Julie Grollier
- Laboratoire Albert Fert, CNRS, Thales, Université Paris-Saclay, 91767 Palaiseau, France
5
Gast R, Solla SA, Kennedy A. Neural heterogeneity controls computations in spiking neural networks. Proc Natl Acad Sci U S A 2024; 121:e2311885121. PMID: 38198531; PMCID: PMC10801870; DOI: 10.1073/pnas.2311885121.
Abstract
The brain is composed of complex networks of interacting neurons that express considerable heterogeneity in their physiology and spiking characteristics. How does this neural heterogeneity influence macroscopic neural dynamics, and how might it contribute to neural computation? In this work, we use a mean-field model to investigate computation in heterogeneous neural networks, by studying how the heterogeneity of cell spiking thresholds affects three key computational functions of a neural population: the gating, encoding, and decoding of neural signals. Our results suggest that heterogeneity serves different computational functions in different cell types. In inhibitory interneurons, varying the degree of spike threshold heterogeneity allows them to gate the propagation of neural signals in a reciprocally coupled excitatory population. Whereas homogeneous interneurons impose synchronized dynamics that narrow the dynamic repertoire of the excitatory neurons, heterogeneous interneurons act as an inhibitory offset while preserving excitatory neuron function. Spike threshold heterogeneity also controls the entrainment properties of neural networks to periodic input, thus affecting the temporal gating of synaptic inputs. Among excitatory neurons, heterogeneity increases the dimensionality of neural dynamics, improving the network's capacity to perform decoding tasks. Conversely, homogeneous networks suffer in their capacity for function generation, but excel at encoding signals via multistable dynamic regimes. Drawing from these findings, we propose intra-cell-type heterogeneity as a mechanism for sculpting the computational properties of local circuits of excitatory and inhibitory spiking neurons, permitting the same canonical microcircuit to be tuned for diverse computational tasks.
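A toy illustration of the paper's central variable, spike-threshold heterogeneity: two leaky integrate-and-fire populations receive identical drive, one with a common threshold and one with broadly distributed thresholds, and a simple normalized-variance synchrony index contrasts their population dynamics. The LIF model (the paper works with mean-field reductions of QIF networks), the drive, and all constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, dt, tau = 200, 4000, 0.1, 10.0
I = 1.6 + 0.4 * np.sin(2 * np.pi * np.arange(T) * dt / 50)   # shared drive

def synchrony(thresholds):
    """Simulate N LIF neurons under common drive; return a synchrony index."""
    v = rng.uniform(0, 1, N)
    spikes = np.zeros((T, N), dtype=bool)
    for t in range(T):
        v += dt / tau * (-v + I[t])          # leaky integration
        fired = v > thresholds
        spikes[t] = fired
        v[fired] = 0.0                       # reset after a spike
    rate = spikes.mean(axis=1)               # instantaneous population rate
    p = spikes.mean()                        # overall firing probability
    return rate.var() / max(p * (1 - p), 1e-12)

homog = synchrony(np.full(N, 1.0))                                   # one threshold
heterog = synchrony(1.0 + 0.2 * rng.standard_cauchy(N).clip(-2, 2))  # spread out
print(f"synchrony index -- homogeneous: {homog:.3f}, heterogeneous: {heterog:.3f}")
```

Qualitatively, the homogeneous population fires in lockstep (high index) while the heterogeneous one smears its spikes across the cycle (low index), the same synchronization-versus-dimensionality trade-off the abstract attributes to threshold heterogeneity.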
Affiliation(s)
- Richard Gast
- Department of Neuroscience, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611
- Aligning Science Across Parkinson’s Collaborative Research Network, Chevy Chase, MD 20815
- Sara A. Solla
- Department of Neuroscience, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611
- Ann Kennedy
- Department of Neuroscience, Feinberg School of Medicine, Northwestern University, Chicago, IL 60611
- Aligning Science Across Parkinson’s Collaborative Research Network, Chevy Chase, MD 20815
6
Momeni A, Rahmani B, Malléjac M, Del Hougne P, Fleury R. Backpropagation-free training of deep physical neural networks. Science 2023:eadi8474. PMID: 37995209; DOI: 10.1126/science.adi8474.
Abstract
Recent successes in deep learning for vision and natural language processing are attributed to larger models, but these come with energy-consumption and scalability issues. Current training of digital deep learning models relies primarily on backpropagation, which is unsuitable for physical implementation. Here, we propose a simple deep neural network architecture augmented by a physical local learning (PhyLL) algorithm, enabling supervised and unsupervised training of deep physical neural networks without detailed knowledge of the nonlinear physical layer's properties. We trained diverse wave-based physical neural networks in vowel and image classification experiments, showcasing our approach's universality. Our method improves on other hardware-aware training schemes in training speed and robustness, and reduces power consumption by eliminating the need for system modelling and thus decreasing digital computation.
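In the spirit of the backpropagation-free training this abstract describes, the sketch below trains a layer with a purely local objective, so no gradient ever flows through the fixed random map standing in for the unknown physical layer. Note the hedge: the paper's PhyLL objective compares two physical forward passes via cosine similarity, whereas this sketch uses a simpler forward-forward-style "goodness" objective; the architecture, toy data, label embedding, and hyperparameters are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
d, hidden, n = 20, 64, 512
X = rng.normal(size=(n, d))
y = (X[:, 2] * X[:, 3] > 0).astype(float)        # nonlinear toy labels

Wphys = rng.normal(size=(d, hidden)) / np.sqrt(d)
def physical(z):                                 # fixed map, never differentiated
    return np.tanh(z @ Wphys)

def embed(Z, labels):                            # write candidate label into 2 dims
    E = Z.copy(); E[:, 0] = labels; E[:, 1] = 1 - labels; return E

W = rng.normal(size=(hidden, hidden)) / np.sqrt(hidden)   # trainable layer
eta, theta = 0.02, 1.0                           # assumed step size / threshold
for step in range(3000):
    idx = rng.integers(0, n, 64)
    for labels, sign in ((y[idx], 1.0), (1 - y[idx], -1.0)):
        A = physical(embed(X[idx], labels))      # "measured" physical output
        h = np.maximum(A @ W, 0.0)               # trainable ReLU layer
        g = (h**2).sum(axis=1, keepdims=True)    # per-sample "goodness"
        p = 1 / (1 + np.exp(-sign * (g - theta)))  # want high g for true labels
        # local gradient of log-likelihood w.r.t. W only (h > 0 is the ReLU mask)
        W += eta * (A.T @ ((1 - p) * sign * 2 * h * (h > 0))) / len(idx)

# classify by comparing goodness under each candidate label
hpos = np.maximum(physical(embed(X, np.ones(n))) @ W, 0.0)
hneg = np.maximum(physical(embed(X, np.zeros(n))) @ W, 0.0)
pred = ((hpos**2).sum(1) > (hneg**2).sum(1)).astype(float)
print("train accuracy:", (pred == y).mean())
```

The design choice that matters is that `physical` is only ever called forward; its internals could be a wave system, as in the paper, without changing the training loop.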
Affiliation(s)
- Ali Momeni
- Laboratory of Wave Engineering, Department of Electrical Engineering, EPFL, Lausanne CH-1015, Switzerland
- Matthieu Malléjac
- Laboratory of Wave Engineering, Department of Electrical Engineering, EPFL, Lausanne CH-1015, Switzerland
- Romain Fleury
- Laboratory of Wave Engineering, Department of Electrical Engineering, EPFL, Lausanne CH-1015, Switzerland
7
Zhu R, Lilak S, Loeffler A, Lizier J, Stieg A, Gimzewski J, Kuncic Z. Online dynamical learning and sequence memory with neuromorphic nanowire networks. Nat Commun 2023; 14:6697. PMID: 37914696; PMCID: PMC10620219; DOI: 10.1038/s41467-023-42470-5.
Abstract
Nanowire networks (NWNs) belong to an emerging class of neuromorphic systems that exploit the unique physical properties of nanostructured materials. In addition to their neural network-like physical structure, NWNs exhibit resistive memory switching in response to electrical inputs, due to synapse-like changes in conductance at nanowire-nanowire cross-point junctions. Previous studies have demonstrated how the neuromorphic dynamics generated by NWNs can be harnessed for temporal learning tasks. This study extends those findings by demonstrating online learning from spatiotemporal dynamical features, using image classification and sequence memory recall tasks implemented on an NWN device. Applied to the MNIST handwritten digit classification task, online dynamical learning with the NWN device achieves an overall accuracy of 93.4%. Additionally, we find a correlation between the classification accuracy of individual digit classes and mutual information. The sequence memory task reveals how memory patterns embedded in the dynamical features enable online learning and recall of a spatiotemporal sequence pattern. Overall, these results provide proof of concept of online learning from spatiotemporal dynamics using NWNs, and further elucidate how memory can enhance learning.
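The "online" ingredient can be sketched with a recursive least squares (RLS) readout that updates from one streaming sample at a time, much as a readout on measured NWN voltage features would be trained without ever storing a dataset. The random tanh feature map below merely stands in for the NWN device, and the regression task, forgetting factor, and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
d_in, d_feat = 8, 100
M = rng.normal(size=(d_in, d_feat))
def device_features(u):                      # stand-in for measured NWN readings
    return np.tanh(u @ M)

P = np.eye(d_feat) / 1e-2                    # inverse feature-correlation estimate
w = np.zeros(d_feat)
lam = 0.999                                  # forgetting factor
for t in range(5000):                        # streaming samples, one at a time
    u = rng.normal(size=d_in)
    target = np.sin(u[:4].sum()) + 0.5 * u[4] * u[5]   # assumed regression task
    x = device_features(u)
    k = P @ x / (lam + x @ P @ x)            # RLS gain
    err = target - w @ x
    w += k * err                             # online weight update
    P = (P - np.outer(k, x @ P)) / lam

errs = []                                    # quick held-out check
for _ in range(500):
    u = rng.normal(size=d_in)
    errs.append((np.sin(u[:4].sum()) + 0.5 * u[4] * u[5]
                 - w @ device_features(u)) ** 2)
print("test MSE:", np.mean(errs))
```

RLS costs O(d_feat²) per sample but needs no replay buffer, which is what makes it a natural fit for learning from a continuously driven physical device.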
Affiliation(s)
- Ruomin Zhu
- School of Physics, The University of Sydney, Sydney, NSW, Australia
- Sam Lilak
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, US
- Alon Loeffler
- School of Physics, The University of Sydney, Sydney, NSW, Australia
- Joseph Lizier
- School of Computer Science, The University of Sydney, Sydney, NSW, Australia
- Centre for Complex Systems, The University of Sydney, Sydney, NSW, Australia
- Adam Stieg
- California NanoSystems Institute, University of California, Los Angeles, Los Angeles, CA, US
- WPI Center for Materials Nanoarchitectonics (MANA), National Institute for Materials Science (NIMS), Tsukuba, Japan
- James Gimzewski
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, US
- California NanoSystems Institute, University of California, Los Angeles, Los Angeles, CA, US
- WPI Center for Materials Nanoarchitectonics (MANA), National Institute for Materials Science (NIMS), Tsukuba, Japan
- Research Center for Neuromorphic AI Hardware, Kyutech, Kitakyushu, Japan
- Zdenka Kuncic
- School of Physics, The University of Sydney, Sydney, NSW, Australia
- Centre for Complex Systems, The University of Sydney, Sydney, NSW, Australia
- The University of Sydney Nano Institute, Sydney, NSW, Australia
8
Abstract
The gap between current computing algorithms and the neuromorphic hardware being developed to emulate brains is an outstanding bottleneck for neural computing technologies. Aimone and Parekh discuss the possibility of bridging this gap using theoretical computing frameworks from a neuroscience perspective.
Affiliation(s)
- James B Aimone
- Neural Exploration and Research Laboratory, Center for Computing Research, Sandia National Laboratories, Albuquerque, NM, USA
- Ojas Parekh
- Neural Exploration and Research Laboratory, Center for Computing Research, Sandia National Laboratories, Albuquerque, NM, USA