1. Tamosiunaite M, Tetzlaff C, Wörgötter F. Unsupervised learning of perceptual feature combinations. PLoS Comput Biol 2024; 20:e1011926. PMID: 38442095; PMCID: PMC10942261; DOI: 10.1371/journal.pcbi.1011926.
Abstract
In many situations it is behaviorally relevant for an animal to respond to co-occurrences of perceptual, possibly polymodal, features, while these features alone may have no importance. It is therefore crucial for animals to learn such feature combinations even though they may occur with variable intensity and frequency. Here, we present a novel unsupervised learning mechanism that is largely independent of these contingencies and allows neurons in a network to achieve specificity for different feature combinations. This is achieved by a novel correlation-based (Hebbian) learning rule, which allows for linear weight growth and is combined with a mechanism for gradually reducing the learning rate as soon as the neuron's response becomes feature-combination specific. In a set of control experiments, we show that other existing advanced learning rules cannot satisfactorily form ordered multi-feature representations. In addition, we show that networks that use this type of learning always stabilize and converge to subsets of neurons with different feature-combination specificity. Neurons with this property may thus serve as an initial stage for the processing of ecologically relevant real-world situations for an animal.
Affiliation(s)
- Minija Tamosiunaite
- Department for Computational Neuroscience, Third Physics Institute, University of Göttingen, Göttingen, Germany
- Vytautas Magnus University, Faculty of Informatics, Kaunas, Lithuania
- Christian Tetzlaff
- Computational Synaptic Physiology, Department for Neuro- and Sensory Physiology, University Medical Center Göttingen, Göttingen, Germany
- Campus Institute Data Science, Göttingen, Germany
- Florentin Wörgötter
- Department for Computational Neuroscience, Third Physics Institute, University of Göttingen, Göttingen, Germany
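The mechanism the abstract describes, neurons acquiring specificity for different feature combinations under unsupervised, correlation-based learning, can be illustrated with a toy model. The sketch below is ours, not the authors' rule: it substitutes frequency-sensitive competitive learning (a Hebbian-style winner-moves-toward-input update with win-count scaling) so that two model neurons reliably specialize on two different feature combinations regardless of initialization; all names and constants are illustrative.

```python
import numpy as np

def learn_combinations(patterns, n_neurons=2, steps=400, eta=0.1, seed=1):
    """Toy unsupervised network: each neuron specializes on one feature
    combination. Frequency-sensitive competitive learning stands in for the
    paper's actual correlation-based rule."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 1.0, size=(n_neurons, patterns.shape[1]))
    wins = np.ones(n_neurons)        # win counts keep one neuron from
    for _ in range(steps):           # capturing every input
        x = patterns[rng.integers(len(patterns))]
        d = wins * np.linalg.norm(w - x, axis=1)   # win-scaled distance
        k = int(np.argmin(d))
        w[k] += eta * (x - w[k])     # move the winner toward the input
        wins[k] += 1
    return w

# two disjoint feature combinations presented with equal frequency
combos = np.array([[1., 1., 0., 0.],
                   [0., 0., 1., 1.]])
W = learn_combinations(combos)
```

After training, each row of `W` sits near one of the two combinations, i.e., the toy network has converged to neurons with different feature-combination specificity.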
2. Norouzi A, Shahpouri S, Gordon D, Shahbakhti M, Koch CR. Safe deep reinforcement learning in diesel engine emission control. Proc Inst Mech Eng Part I J Syst Control Eng 2023; 237:1440-1453. PMID: 37692899; PMCID: PMC10483989; DOI: 10.1177/09596518231153445.
Abstract
A deep reinforcement learning application is investigated to control the emissions of a compression ignition diesel engine. The main purpose of this study is to reduce the engine-out nitrogen oxide (NOx) emissions and to minimize fuel consumption while tracking a reference engine load. First, a physics-based engine simulation model is developed in GT-Power and calibrated using experimental data. Using this model and a GT-Power/Simulink co-simulation, a deep deterministic policy gradient controller is developed. To reduce the risk of an unwanted output, a safety filter is added to the deep reinforcement learning controller. Based on the simulation results, this filter has no effect on the final trained controller; during the training process, however, it is crucial for enforcing constraints on the controller output. The developed safe reinforcement learning controller is then compared with an iterative learning controller and a deep neural network-based nonlinear model predictive controller. This comparison shows that the safe reinforcement learning controller is capable of accurately tracking an arbitrary reference input, while the iterative learning controller is limited to a repetitive reference. The comparison between nonlinear model predictive control and reinforcement learning indicates that, for this case, reinforcement learning is able to learn the optimal control output directly from experiment without the need for a model. However, to enforce output constraints for safe reinforcement learning, a simple model of the system is required. In this work, reinforcement learning was able to reduce NOx emissions more than nonlinear model predictive control; however, it suffered from slightly higher load-tracking error and higher fuel consumption.
Affiliation(s)
- Armin Norouzi
- Department of Mechanical Engineering, University of Alberta, Edmonton, AB, Canada
- Saeid Shahpouri
- Department of Mechanical Engineering, University of Alberta, Edmonton, AB, Canada
- David Gordon
- Department of Mechanical Engineering, University of Alberta, Edmonton, AB, Canada
- Mahdi Shahbakhti
- Department of Mechanical Engineering, University of Alberta, Edmonton, AB, Canada
- Charles Robert Koch
- Department of Mechanical Engineering, University of Alberta, Edmonton, AB, Canada
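The role of the safety filter described above, overriding the learned policy's action whenever a constrained output would be violated, can be sketched in a few lines. This is a hedged illustration under an assumed scalar linear output model y = a*x + b*u; it is not the paper's engine model, and every symbol name here is our own.

```python
import numpy as np

def safety_filter(u_raw, u_min, u_max, x, a, b, y_max):
    """Clip a proposed RL action so a simple model predicts a safe output.

    Assumes a scalar linear output model y = a*x + b*u with b > 0; the
    largest admissible action is the one at which the predicted output
    reaches the constraint y_max. Actuator limits are applied afterwards.
    """
    u_safe_max = (y_max - a * x) / b   # action at the constraint boundary
    u = min(u_raw, u_safe_max)         # override the policy only if needed
    return float(np.clip(u, u_min, u_max))
```

During training, every action proposed by the policy would pass through such a filter before being applied; as the abstract notes, once the policy has converged the filter rarely binds, but it matters while the policy is still exploring.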
3. Huang K, Ma X, Song R, Rong X, Tian X, Li Y. A self-organizing developmental cognitive architecture with interactive reinforcement learning. Neurocomputing 2020. DOI: 10.1016/j.neucom.2019.07.109.
5. Reinforced learning systems based on merged and cumulative knowledge to predict human actions. Inf Sci (N Y) 2014. DOI: 10.1016/j.ins.2014.02.051.
6. Chen B, Zhang A, Cao L. Autonomous intelligent decision-making system based on Bayesian SOM neural network for robot soccer. Neurocomputing 2014. DOI: 10.1016/j.neucom.2013.08.021.
7. Reduction of state space in reinforcement learning by sensor selection. Artif Life Robotics 2013. DOI: 10.1007/s10015-013-0092-2.
8. Spadaro S. The dilemma of the symbols: analogies between philosophy, biology and artificial life. SpringerPlus 2013; 2:495. PMID: 24109563; PMCID: PMC3793079; DOI: 10.1186/2193-1801-2-495.
Abstract
This article analyzes some analogies leading from Artificial Life questions about the symbol-matter connection to Artificial Intelligence questions about symbol grounding. It focuses on the notion of the interpretability of syntax and on how symbols are integrated into a unity (the "binding problem"). Using the DNA code as a model, the paper discusses how syntactic features could be defined as high-grade characteristics of the non-syntactic relations in a material-dynamic structure, following an emergentist approach. This furnishes the ground for a confutation of J. Searle's claim, made in his book "Mind: A Brief Introduction", that syntax is observer-relative. The evolving discussion also modifies the classic symbol-processing doctrine of the mind, which Searle attacks, as well as the strong Artificial Life argument that life could be implemented computationally. Lastly, the paper furnishes new support for the autonomous-systems thesis in Artificial Life and Artificial Intelligence, using, inter alia, adaptive resonance theory (ART).
Affiliation(s)
- Salvatore Spadaro
- Department of Philosophy, Sapienza University of Rome, via Carlo Fea n.2, 00161 Rome, Italy
9. Montazeri H, Moradi S, Safabakhsh R. Continuous state/action reinforcement learning: a growing self-organizing map approach. Neurocomputing 2011. DOI: 10.1016/j.neucom.2010.11.012.
10. Er MJ, Zhou Y. Automatic generation of fuzzy inference systems via unsupervised learning. Neural Netw 2008; 21:1556-66. DOI: 10.1016/j.neunet.2008.06.007.
11. Dong D, Chen C, Li H, Tarn TJ. Quantum reinforcement learning. IEEE Trans Syst Man Cybern B Cybern 2008; 38:1207-20. DOI: 10.1109/tsmcb.2008.925743.
12. Tan AH, Lu N, Xiao D. Integrating temporal difference methods and self-organizing neural networks for reinforcement learning with delayed evaluative feedback. IEEE Trans Neural Netw 2008; 19:230-44. DOI: 10.1109/tnn.2007.905839.
14. Touzet CF. Modeling and simulation of elementary robot behaviors using associative memories. Int J Adv Robot Syst 2006. DOI: 10.5772/5742.
Abstract
Today, several drawbacks impede the necessary and much-needed use of robot learning techniques in real applications. First, the time needed to synthesize any behavior is prohibitive. Second, the robot's behavior during the learning phase is, by definition, bad; it may even be dangerous. Third, except within the lazy learning approach, a new behavior implies a new learning phase. We propose in this paper to use associative memories (self-organizing maps) to encode the non-explicit model of the robot-world interaction sampled by the lazy memory, and then to generate a robot behavior by means of situations to be achieved, i.e., points on the self-organizing maps. Any behavior can be synthesized instantaneously by the definition of a goal situation. Its performance will be minimal (though not necessarily bad) and will improve by mere repetition of the behavior.
Affiliation(s)
- Claude F. Touzet
- Adaptive and Integrative Neurobiology, UMR 6149, University of Provence / CNRS, Centre St Charles - Pôle 3C - Case B. 3, Place Victor Hugo, F - 13331 Marseille Cedex 03, France
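The associative-memory idea in the abstract above, encoding sampled robot-world situations on a self-organizing map and reading out a behavior by querying a goal situation, can be sketched as follows. This is a minimal 1-D SOM of our own, with illustrative sizes and schedules; it is not Touzet's implementation.

```python
import numpy as np

def train_som(samples, n_nodes=8, epochs=40, seed=0):
    """Fit a 1-D self-organizing map to sampled situations."""
    rng = np.random.default_rng(seed)
    nodes = samples[rng.integers(len(samples), size=n_nodes)].astype(float)
    for t in range(epochs):
        eta = 0.5 * (1.0 - t / epochs)                       # decaying rate
        radius = max(int(n_nodes / 2 * (1.0 - t / epochs)), 1)
        for x in samples[rng.permutation(len(samples))]:
            k = int(np.argmin(np.linalg.norm(nodes - x, axis=1)))
            for j in range(n_nodes):
                if abs(j - k) <= radius:                     # neighborhood update
                    nodes[j] += eta * (x - nodes[j])
    return nodes

def recall(nodes, goal):
    """Associative recall: return the stored situation nearest the goal."""
    return nodes[int(np.argmin(np.linalg.norm(nodes - goal, axis=1)))]

# situations sampled while the robot wanders a unit square (toy data)
samples = np.array([[i / 9.0, j / 9.0] for i in range(10) for j in range(10)])
nodes = train_som(samples)
```

Defining a new behavior then amounts to picking a goal point on the map, e.g. `recall(nodes, np.array([0.9, 0.9]))`, with no new learning phase.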
15. Combining self-organizing maps with mixtures of experts: application to an actor-critic model of reinforcement learning in the basal ganglia. In: From Animals to Animats 9, 2006. DOI: 10.1007/11840541_33.
16. Low KH, Leow WK, Ang MH. An ensemble of cooperative extended Kohonen maps for complex robot motion tasks. Neural Comput 2005. DOI: 10.1162/0899766053630378.
Abstract
Self-organizing feature maps such as extended Kohonen maps (EKMs) have been very successful at learning sensorimotor control for mobile robot tasks. This letter presents a new ensemble approach, cooperative EKMs with indirect mapping, to achieve complex robot motion. An indirect-mapping EKM self-organizes to map from the sensory input space to the motor control space indirectly via a control parameter space. Quantitative evaluation reveals that indirect mapping can provide finer, smoother, and more efficient motion control than does direct mapping by operating in a continuous, rather than discrete, motor control space. It is also shown to outperform basis function neural networks. Furthermore, training its control parameters with recursive least squares enables faster convergence and better performance compared to gradient descent. The cooperation and competition of multiple self-organized EKMs allow a nonholonomic mobile robot to negotiate unforeseen, concave, closely spaced, and dynamic obstacles. Qualitative and quantitative comparisons with neural network ensembles employing weighted sum reveal that our method can achieve more sophisticated motion tasks even though the weighted-sum ensemble approach also operates in continuous motor control space.
Affiliation(s)
- Kian Hsiang Low
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213-3890, U.S.A.
- Wee Kheng Leow
- Department of Computer Science, National University of Singapore, Singapore 117543, Singapore
- Marcelo H. Ang
- Department of Mechanical Engineering, National University of Singapore, Singapore 119260, Singapore
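The letter above credits recursive least squares with faster convergence than gradient descent for training the EKM control parameters. A generic textbook RLS step is sketched below; the variable names and forgetting factor are our assumptions, not the paper's notation.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=0.99):
    """One recursive least-squares step.

    theta: (n, 1) parameter estimate       P: (n, n) inverse-covariance term
    phi:   (n,) regressor for this sample  y: observed scalar target
    lam:   forgetting factor in (0, 1]
    """
    phi = phi.reshape(-1, 1)
    k = P @ phi / (lam + float(phi.T @ P @ phi))   # gain vector
    err = y - float(phi.T @ theta)                 # prediction error
    theta = theta + k * err                        # correct the estimate
    P = (P - k @ phi.T @ P) / lam                  # update covariance term
    return theta, P
```

On noise-free data from, say, y = 2*x1 - x2, a couple hundred such updates recover the coefficients to high precision, which is the fast convergence the comparison with gradient descent refers to.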
17. Zhang Y, Weng J, Hwang WS. Auditory learning: a developmental method. IEEE Trans Neural Netw 2005; 16:601-16. PMID: 15940990; DOI: 10.1109/tnn.2005.845217.
Abstract
Motivated by the human autonomous development process from infancy to adulthood, we have built a robot that develops its cognitive and behavioral skills through real-time interactions with the environment. We call such a robot a developmental robot. In this paper, we present the theory and the architecture to implement a developmental robot and discuss the related techniques that address an array of challenging technical issues. As an application, experimental results on a real robot, the self-organizing, autonomous, incremental learner (SAIL), are presented with emphasis on its auditory perception and audition-related action generation. In particular, the SAIL robot conducts auditory learning from unsegmented and unlabeled speech streams without any prior knowledge about the auditory signals, such as the designated language or the phoneme models. Nor are the actions that the robot is expected to perform available before learning starts. SAIL learns the auditory commands and the desired actions from physical contact with the environment, including the trainers.
Affiliation(s)
- Yilu Zhang
- Research Center, General Motors Corporation, Warren, MI 48090, USA