1. Quantifying Reinforcement-Learning Agent’s Autonomy, Reliance on Memory and Internalisation of the Environment. Entropy 2022; 24:401. [PMID: 35327912] [PMCID: PMC8947692] [DOI: 10.3390/e24030401]
Abstract
Intuitively, the level of autonomy of an agent is related to the degree to which the agent’s goals and behaviour are decoupled from immediate control by the environment. Here, we capitalise on a recent information-theoretic formulation of autonomy and introduce an algorithm for calculating autonomy in the limit of the number of time steps approaching infinity. We tackle the question of how the autonomy level of an agent changes during training. In particular, we use the partial information decomposition (PID) framework to monitor the levels of autonomy and environment internalisation of reinforcement-learning (RL) agents. We performed experiments on two environments: a grid world, in which the agent has to collect food, and a repeating-pattern environment, in which the agent has to learn to imitate a sequence of actions by memorising it. PID also allows us to quantify how much the agent relies on its internal memory (versus how much it relies on its observations) when transitioning to its next internal state. The experiments show that specific PID terms correlate strongly with the obtained reward and with the agent’s robustness to perturbations of its observations.
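As a pointer to how such PID terms can be computed in practice, the following minimal Python sketch decomposes the information that a memory variable and an observation variable carry about an agent's next internal state, using the original Williams-Beer redundancy measure. The toy joint distribution and the choice of redundancy measure are illustrative assumptions made here, not taken from the paper.

```python
import numpy as np

# Hypothetical joint distribution p(memory, observation, next_state):
# the next internal state copies memory with prob. 0.7 and the
# observation with prob. 0.3 (an assumption for illustration only).
p = np.zeros((2, 2, 2))
for m in range(2):
    for o in range(2):
        for x in range(2):
            p[m, o, x] = 0.25 * (0.7 * (x == m) + 0.3 * (x == o))

def mi(pab):
    """Mutual information (bits) of a 2-D joint distribution."""
    pa = pab.sum(1, keepdims=True)
    pb = pab.sum(0, keepdims=True)
    nz = pab > 0
    return (pab[nz] * np.log2(pab[nz] / (pa * pb)[nz])).sum()

def specific_info(pst, t):
    """I(S; T=t): information the source carries about outcome T=t."""
    p_t = pst.sum(0)[t]
    p_s = pst.sum(1)
    ps_given_t = pst[:, t] / p_t
    nz = ps_given_t > 0
    return (ps_given_t[nz] * np.log2(pst[nz, t] / (p_s[nz] * p_t))).sum()

p_mem = p.sum(1)           # p(memory, next_state)
p_obs = p.sum(0)           # p(observation, next_state)
p_both = p.reshape(4, 2)   # both sources jointly vs next_state
p_t = p.sum((0, 1))

# Williams-Beer redundancy: expected minimum specific information.
R = sum(p_t[t] * min(specific_info(p_mem, t), specific_info(p_obs, t))
        for t in range(2))
U_mem = mi(p_mem) - R                    # unique to memory
U_obs = mi(p_obs) - R                    # unique to observation
S = mi(p_both) - R - U_mem - U_obs       # synergy
print(f"R={R:.3f}  U_mem={U_mem:.3f}  U_obs={U_obs:.3f}  S={S:.3f}")
```

With this toy distribution, the unique-to-memory term dominates, which is the kind of signature the paper tracks across training.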
2. Zhu R, Hochstetter J, Loeffler A, Diaz-Alvarez A, Nakayama T, Lizier JT, Kuncic Z. Information dynamics in neuromorphic nanowire networks. Sci Rep 2021; 11:13047. [PMID: 34158521] [PMCID: PMC8219687] [DOI: 10.1038/s41598-021-92170-7]
Abstract
Neuromorphic systems composed of self-assembled nanowires exhibit a range of neural-like dynamics arising from the interplay of their synapse-like electrical junctions and their complex network topology, and various information processing tasks have been demonstrated with such networks. Here, we investigate how these unique systems process information using information-theoretic metrics. In particular, transfer entropy (TE) and active information storage (AIS) are employed to investigate dynamical information flow and short-term memory in nanowire networks. In addition to finding that the topologically central parts of the network contribute the most to information flow, our results reveal that TE and AIS are maximised when the network transitions from a quiescent to an active state. The performance of neuromorphic networks in memory and learning tasks is shown to depend on their internal dynamical state as well as on their topological structure. Optimal performance is found when these networks are pre-initialised to the transition state where TE and AIS are maximal. Furthermore, an optimal range of information processing resources (i.e. connectivity density) is identified for performance. Overall, our results demonstrate that information dynamics is a valuable tool to study and benchmark neuromorphic systems.
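For readers who want to reproduce the flavour of these metrics, the sketch below gives plug-in (histogram) estimators of TE and AIS for discrete time series. The synthetic copy-with-lag data are an illustrative assumption, not the nanowire network simulations used in the paper.

```python
import numpy as np
from collections import Counter

def entropy(counts):
    """Shannon entropy (bits) from a Counter of observed outcomes."""
    n = sum(counts.values())
    pr = np.array([c / n for c in counts.values()])
    return -(pr * np.log2(pr)).sum()

def active_info_storage(x, k=2):
    """AIS = I(past_k ; present), plug-in estimate on a symbol sequence."""
    pairs = [(tuple(x[t - k:t]), x[t]) for t in range(k, len(x))]
    return (entropy(Counter(v for _, v in pairs))
            + entropy(Counter(p for p, _ in pairs))
            - entropy(Counter(pairs)))

def transfer_entropy(src, dst, k=1):
    """TE(src -> dst) = H(X_t,X_past) + H(X_past,Y_past)
                        - H(X_past) - H(X_t,X_past,Y_past)."""
    trips = [(tuple(dst[t - k:t]), tuple(src[t - k:t]), dst[t])
             for t in range(k, len(dst))]
    h_dp = entropy(Counter(d for d, s, x in trips))
    h_xdp = entropy(Counter((d, x) for d, s, x in trips))
    h_dsp = entropy(Counter((d, s) for d, s, x in trips))
    h_all = entropy(Counter(trips))
    return h_xdp + h_dsp - h_dp - h_all

rng = np.random.default_rng(0)
src = rng.integers(0, 2, 10_000)
dst = np.roll(src, 1)  # dst copies src with a one-step lag
print(f"AIS(src)     = {active_info_storage(src.tolist()):.3f} bits")
print(f"TE(src->dst) = {transfer_entropy(src.tolist(), dst.tolist()):.3f} bits")
```

An i.i.d. source has near-zero AIS, while the lagged copy yields close to 1 bit of transfer entropy, matching the intuition behind both measures.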
Affiliations
- Ruomin Zhu: School of Physics, The University of Sydney, Sydney, NSW 2006, Australia.
- Joel Hochstetter: School of Physics, The University of Sydney, Sydney, NSW 2006, Australia.
- Alon Loeffler: School of Physics, The University of Sydney, Sydney, NSW 2006, Australia.
- Adrian Diaz-Alvarez: International Center for Materials Nanoarchitectonics (WPI-MANA), National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan.
- Tomonobu Nakayama: School of Physics, The University of Sydney, Sydney, NSW 2006, Australia; International Center for Materials Nanoarchitectonics (WPI-MANA), National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan; Graduate School of Pure and Applied Sciences, University of Tsukuba, Tsukuba, Japan.
- Joseph T Lizier: Centre for Complex Systems, Faculty of Engineering, The University of Sydney, Sydney, NSW 2006, Australia.
- Zdenka Kuncic: School of Physics, The University of Sydney, Sydney, NSW 2006, Australia; International Center for Materials Nanoarchitectonics (WPI-MANA), National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba, Ibaraki 305-0044, Japan; Centre for Complex Systems, Faculty of Engineering, The University of Sydney, Sydney, NSW 2006, Australia; Sydney Nano Institute, The University of Sydney, Sydney, NSW 2006, Australia.
3. Fuentes J, López JL, Obregón O. Generalized Fokker-Planck equations derived from nonextensive entropies asymptotically equivalent to Boltzmann-Gibbs. Phys Rev E 2020; 102:012118. [PMID: 32794918] [DOI: 10.1103/PhysRevE.102.012118]
Abstract
We derive generalized Fokker-Planck equations (FPEs) based on two nonextensive entropy measures S_{±} that depend exclusively on the probability. These entropies were originally obtained within the superstatistics framework, and thus describe nonequilibrium systems characterized by a long-term stationary state in the presence of a spatiotemporally fluctuating intensive quantity. Moreover, the entropies S_{±} and the Boltzmann-Gibbs (BG) entropy S_{B} belong to the same asymptotic equivalence class, suggesting that S_{±} could provide a consistent thermodynamic generalization of BG. For these reasons, we expect the transport phenomena described by our models to coincide with the picture given by the conventional FPEs for systems with short-range interactions or a large number of accessible microstates, whereas for systems composed of a small number of microstates, or those with long-range interactions, the governing equations of motion should be the FPEs derived here, as long as the system fulfills the attributes mentioned above. We discuss the anomalous diffusion exhibited by the two generalized FPEs and present some numerical applications. In particular, we find models in the biological sciences, used to study congregation and aggregation behavior, whose structure coincides with that of our models.
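To make the Boltzmann-Gibbs baseline concrete, the sketch below integrates the conventional drift-diffusion FPE, dP/dt = -d/dx[f(x)P] + D d²P/dx², with an Ornstein-Uhlenbeck drift by explicit finite differences. This is only the standard limit that the generalized equations are argued to approach asymptotically; the drift, parameters, and grid are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Conventional Fokker-Planck equation with OU drift f(x) = -gamma * x:
# the BG limit that the generalized S_± equations reduce to.
gamma, D = 1.0, 0.5
L, nx = 5.0, 201
x = np.linspace(-L, L, nx)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / D          # CFL-safe explicit time step
f = -gamma * x

# Initial condition: narrow Gaussian, off-centre.
P = np.exp(-((x - 2.0) ** 2) / 0.1)
P /= P.sum() * dx

for _ in range(20_000):
    dflux = np.gradient(f * P, dx)               # drift term d/dx (f P)
    lap = np.gradient(np.gradient(P, dx), dx)    # diffusion term d2P/dx2
    P = P + dt * (-dflux + D * lap)
    P[0] = P[-1] = 0.0            # absorbing boundaries, placed far away
    P /= P.sum() * dx             # renormalise against discretisation loss

# The long-time solution should approach the BG equilibrium
# P_eq proportional to exp(-gamma x^2 / (2 D)).
P_eq = np.exp(-gamma * x**2 / (2 * D))
P_eq /= P_eq.sum() * dx
print(f"max deviation from BG equilibrium: {np.abs(P - P_eq).max():.2e}")
```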
Affiliations
- Jesús Fuentes: Departamento de Física, División de Ciencias e Ingenierías, Campus León, Universidad de Guanajuato, A.P. E-143, C.P. 37150 León, Guanajuato, México.
- José Luis López: Departamento de Física, División de Ciencias e Ingenierías, Campus León, Universidad de Guanajuato, A.P. E-143, C.P. 37150 León, Guanajuato, México.
- Octavio Obregón: Departamento de Física, División de Ciencias e Ingenierías, Campus León, Universidad de Guanajuato, A.P. E-143, C.P. 37150 León, Guanajuato, México.
4. Finn C, Lizier JT. Generalised Measures of Multivariate Information Content. Entropy 2020; 22:216. [PMID: 33285991] [PMCID: PMC7851747] [DOI: 10.3390/e22020216]
Abstract
The entropy of a pair of random variables is commonly depicted using a Venn diagram. This representation is potentially misleading, however, since the multivariate mutual information can be negative. This paper presents new measures of multivariate information content that can be accurately depicted using Venn diagrams for any number of random variables. These measures complement the existing measures of multivariate mutual information and are constructed by considering the algebraic structure of information sharing. It is shown that the distinct ways in which a set of marginal observers can share their information with a non-observing third party correspond to the elements of a free distributive lattice. The redundancy lattice from partial information decomposition is then independently derived by combining the algebraic structures of joint and shared information content.
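The failure of the naive Venn picture can be checked in a few lines. For the standard XOR example (a textbook illustration chosen here, not drawn from the paper), the three-variable mutual information computed by inclusion-exclusion comes out negative, so no area in a Venn diagram can represent it.

```python
import numpy as np
from itertools import product

# Joint distribution of an XOR gate: X, Y fair independent bits, Z = X xor Y.
p = np.zeros((2, 2, 2))
for x, y in product(range(2), repeat=2):
    p[x, y, x ^ y] = 0.25

def H(axes):
    """Joint entropy (bits) of the variables on the given axes of p."""
    marg = p.sum(axis=tuple(i for i in range(3) if i not in axes))
    nz = marg[marg > 0]
    return -(nz * np.log2(nz)).sum()

# Multivariate mutual information via inclusion-exclusion:
# I(X;Y;Z) = H(X)+H(Y)+H(Z) - H(XY)-H(XZ)-H(YZ) + H(XYZ)
mmi = (H((0,)) + H((1,)) + H((2,))
       - H((0, 1)) - H((0, 2)) - H((1, 2))
       + H((0, 1, 2)))
print(f"I(X;Y;Z) = {mmi:.1f} bits")   # -1.0: no Venn region can hold this
```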
Affiliations
- Conor Finn: Centre for Complex Systems, The University of Sydney, Sydney, NSW 2006, Australia; CSIRO Data61, Marsfield, NSW 2122, Australia.
- Joseph T. Lizier: Centre for Complex Systems, The University of Sydney, Sydney, NSW 2006, Australia.
5. Makkeh A, Chicharro D, Theis DO, Vicente R. MAXENT3D_PID: An Estimator for the Maximum-Entropy Trivariate Partial Information Decomposition. Entropy 2019; 21:862. [PMCID: PMC7515392] [DOI: 10.3390/e21090862]
Abstract
Partial information decomposition (PID) separates the contributions of sources about a target into unique, redundant, and synergistic components of information. In essence, PID answers the question of “who knows what” in a system of random variables and hence has applications to a wide spectrum of fields, ranging from the social to the biological sciences. The paper presents MAXENT3D_PID, an algorithm that computes the PID of three sources, based on a recently proposed maximum-entropy measure, using convex optimization (cone programming). We describe the algorithm and the use of its associated software, and report the results of various experiments assessing its accuracy. Moreover, the paper shows that a hierarchy of bivariate and trivariate PIDs makes it possible to obtain the finer quantities of the trivariate partial information measure.
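The core subproblem behind such maximum-entropy PID measures is to find the distribution of maximal entropy that preserves the pairwise (source, target) marginals of the observed one. In the bivariate case sketched below this optimum has a closed form (sources conditionally independent given the target), so no cone programming is needed; the trivariate problem the paper solves does require the numerical machinery it describes. The XOR input distribution is an illustrative assumption made here.

```python
import numpy as np

def mi_joint_target(p):
    """I(S1,S2 ; T) in bits for a (2,2,2) joint distribution."""
    pst = p.reshape(4, 2)
    ps = pst.sum(1, keepdims=True)
    pt = pst.sum(0, keepdims=True)
    nz = pst > 0
    return (pst[nz] * np.log2(pst[nz] / (ps * pt)[nz])).sum()

# Observed distribution: XOR, the standard purely synergistic example.
p = np.zeros((2, 2, 2))
for s1 in range(2):
    for s2 in range(2):
        p[s1, s2, s1 ^ s2] = 0.25

# Maximum-entropy distribution preserving the (S1,T) and (S2,T) marginals:
# q(s1,s2,t) = p(s1|t) p(s2|t) p(t).
p_t = p.sum((0, 1))              # p(t)
p_s1_t = p.sum(1) / p_t          # p(s1|t)
p_s2_t = p.sum(0) / p_t          # p(s2|t)
q = np.einsum('at,bt,t->abt', p_s1_t, p_s2_t, p_t)
assert np.allclose(q.sum(1), p.sum(1)) and np.allclose(q.sum(0), p.sum(0))

# The information destroyed by the max-entropy projection is what such
# measures attribute to synergy: for XOR, the full 1 bit.
print(f"I_p = {mi_joint_target(p):.3f} bits, I_q = {mi_joint_target(q):.3f} bits")
```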
Affiliations
- Abdullah Makkeh: Institute of Computer Science, University of Tartu, 51014 Tartu, Estonia.
- Daniel Chicharro: Neural Computation Laboratory, Center for Neuroscience and Cognitive Systems@UniTn, Istituto Italiano di Tecnologia, 38068 Rovereto (TN), Italy.
- Dirk Oliver Theis: Institute of Computer Science, University of Tartu, 51014 Tartu, Estonia.
- Raul Vicente: Institute of Computer Science, University of Tartu, 51014 Tartu, Estonia.
Collapse
|
6. Marinazzo D, Angelini L, Pellicoro M, Stramaglia S. Synergy as a warning sign of transitions: The case of the two-dimensional Ising model. Phys Rev E 2019; 99:040101. [PMID: 31108637] [DOI: 10.1103/PhysRevE.99.040101]
Abstract
We consider the formalism of information decomposition of target effects from multisource interactions, i.e., the problem of defining the redundant and synergistic components of the information that a set of source variables provides about a target, and apply it to the two-dimensional Ising model as a paradigm of a critically transitioning system. Intuitively, synergy is the information about the target variable that is obtained only by taking the sources together, not by considering them alone; redundancy is the information that is shared by the sources. To disentangle the components of the information at both the static and the dynamical level, the decomposition is applied, respectively, to the mutual information and to the transfer entropy between a given spin (the target) and a pair of neighboring spins (taken as the drivers). We show that a key signature of an impending phase transition, approached from the disordered side, is that the synergy peaks in the disordered phase, in both the static and the dynamic case: the synergy can thus be considered a precursor of the transition. The redundancy, instead, reaches its maximum at the critical temperature. The peak of the synergy of the transfer entropy is far more pronounced than that of the static mutual information. These results are robust with respect to the details of the information decomposition approach, as we find the same picture using two different methods. Moreover, in contrast with previous literature rooted in the notion of global transfer entropy, our results demonstrate that as few as three variables suffice to construct a precursor of the transition, providing a paradigm for the investigation of a variety of systems prone to crisis, such as financial markets, social media, or epileptic seizures.
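As a rough illustration of the static analysis, the sketch below samples a small 2D Ising lattice with Metropolis dynamics and tracks the "net synergy" I(S1,S2;T) - I(S1;T) - I(S2;T) of two neighboring spins about a target spin across temperature. Net synergy (synergy minus redundancy) is only a coarse proxy for the full decomposition used in the paper, and the lattice size, sweep counts, and temperature grid are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def metropolis(spins, beta, sweeps):
    """Single-spin-flip Metropolis updates on a 2D periodic lattice."""
    n = spins.shape[0]
    for _ in range(sweeps * n * n):
        i, j = rng.integers(0, n, 2)
        nb = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j]
              + spins[i, (j + 1) % n] + spins[i, (j - 1) % n])
        dE = 2 * spins[i, j] * nb
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

def net_synergy(samples):
    """I(S1,S2;T) - I(S1;T) - I(S2;T) from (s1, s2, t) triplet samples."""
    p = np.zeros((2, 2, 2))
    for s1, s2, t in samples:
        p[s1, s2, t] += 1
    p /= p.sum()

    def mi(pab):
        pa = pab.sum(1, keepdims=True)
        pb = pab.sum(0, keepdims=True)
        nz = pab > 0
        return (pab[nz] * np.log2(pab[nz] / (pa * pb)[nz])).sum()

    return mi(p.reshape(4, 2)) - mi(p.sum(1)) - mi(p.sum(0))

# The paper reports synergy peaking on the disordered side of Tc ~ 2.27.
n = 16
for T in [1.8, 2.1, 2.27, 2.5, 3.0, 4.0]:
    spins = rng.choice([-1, 1], size=(n, n))
    spins = metropolis(spins, 1.0 / T, 200)       # equilibrate
    samples = []
    for _ in range(50):
        spins = metropolis(spins, 1.0 / T, 5)     # decorrelate
        s = (spins + 1) // 2                      # map spins to {0, 1}
        for i in range(n):
            for j in range(n):
                samples.append((s[(i + 1) % n, j], s[i, (j + 1) % n], s[i, j]))
    print(f"T = {T:4.2f}   net synergy = {net_synergy(samples):+.4f} bits")
```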
Affiliations
- D Marinazzo: Department of Data Analysis, Ghent University, 2 Henri Dunantlaan, 9000 Ghent, Belgium.
- L Angelini: Dipartimento Interateneo di Fisica, Università degli Studi di Bari Aldo Moro and INFN, Sezione di Bari, via Orabona 4, 70126 Bari, Italy; Center of Innovative Technologies for Signal Detection and Processing (TIRES), Università degli Studi di Bari Aldo Moro, via Orabona 4, 70126 Bari, Italy.
- M Pellicoro: Dipartimento Interateneo di Fisica, Università degli Studi di Bari Aldo Moro and INFN, Sezione di Bari, via Orabona 4, 70126 Bari, Italy.
- S Stramaglia: Dipartimento Interateneo di Fisica, Università degli Studi di Bari Aldo Moro and INFN, Sezione di Bari, via Orabona 4, 70126 Bari, Italy; Center of Innovative Technologies for Signal Detection and Processing (TIRES), Università degli Studi di Bari Aldo Moro, via Orabona 4, 70126 Bari, Italy.
7. Lizier JT, Bertschinger N, Jost J, Wibral M. Information Decomposition of Target Effects from Multi-Source Interactions: Perspectives on Previous, Current and Future Work. Entropy 2018; 20:307. [PMID: 33265398] [PMCID: PMC7512824] [DOI: 10.3390/e20040307]
Abstract
The formulation of the Partial Information Decomposition (PID) framework by Williams and Beer in 2010 attracted a significant amount of attention to the problem of defining the redundant (or shared), unique and synergistic (or complementary) components of the mutual information that a set of source variables provides about a target. This attention resulted in a number of measures proposed to capture these concepts, theoretical investigations into such measures, and applications to empirical data (in particular to datasets from neuroscience). In this Special Issue on “Information Decomposition of Target Effects from Multi-Source Interactions” at Entropy, we have gathered current work on such information decomposition approaches from many of the leading research groups in the field. We begin our editorial by providing the reader with a review of previous information decomposition research, including an overview of the variety of measures proposed and of how they have been interpreted and applied in empirical investigations. We then introduce the articles included in the Special Issue one by one, categorising them similarly into: i. proposals of new measures; ii. theoretical investigations into the properties and interpretations of such approaches; and iii. applications of these measures in empirical studies. We finish by providing an outlook on the future of the field.
Affiliations
- Joseph T. Lizier: Complex Systems Research Group and Centre for Complex Systems, Faculty of Engineering & IT, The University of Sydney, NSW 2006, Australia. Correspondence: Tel.: +61-2-9351-3208.
- Nils Bertschinger: Frankfurt Institute for Advanced Studies (FIAS) and Goethe University, 60438 Frankfurt am Main, Germany.
- Jürgen Jost: Max Planck Institute for Mathematics in the Sciences, Inselstraße 22, 04103 Leipzig, Germany; Santa Fe Institute, 1399 Hyde Park Road, Santa Fe, NM 87501, USA.
- Michael Wibral: MEG Unit, Brain Imaging Center, Goethe University, 60528 Frankfurt, Germany; Max Planck Institute for Dynamics and Self-Organization, 37077 Göttingen, Germany.