1. Mages T, Rohner C. Quantifying redundancies and synergies with measures of inequality. PLoS One 2024; 19:e0313281. PMID: 39565765; PMCID: PMC11578534; DOI: 10.1371/journal.pone.0313281.
Abstract
Inequality measures provide a valuable tool for analyzing, comparing, and optimizing systems based on system models. This work studies the relations between attributes or features of an individual to understand how redundant, unique, and synergistic interactions between attributes construct inequality. For this purpose, we define a family of inequality measures (f-inequality) from f-divergences. Special cases of this family include, among others, the Pietra index and the Generalized Entropy index. We present a decomposition for any f-inequality with intuitive set-theoretic behavior that enables studying the dynamics between attributes. Moreover, we use the Atkinson index as an example to demonstrate how the decomposition can be transformed to measures beyond f-inequality. The presented decomposition provides practical insights for system analyses and complements subgroup decompositions. Additionally, the results present an interesting interpretation of Shapley values and demonstrate the close relation between decomposing measures of inequality and information.
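Background note: the construction described in the abstract can be oriented by the standard f-divergence definition; the identification of the Pietra index as the total-variation special case is textbook material and is given here only as a sketch of the setting, not the paper's exact formalism.

```latex
% f-divergence of an attribute distribution P from a reference Q
% (for inequality: Q = uniform, i.e. perfect equality):
D_f(P \,\|\, Q) = \sum_i q_i \, f\!\left(\frac{p_i}{q_i}\right),
\qquad f \text{ convex},\; f(1) = 0.
% With p_i = y_i / (N\mu) the share of individual i and q_i = 1/N,
% the choice f(t) = \tfrac{1}{2}\lvert t - 1 \rvert yields the Pietra index:
P_{\mathrm{Pietra}} = \frac{1}{2N} \sum_{i=1}^{N} \left\lvert \frac{y_i}{\mu} - 1 \right\rvert .
```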
Affiliations:
- Tobias Mages: Department of Information Technology, Uppsala University, Uppsala, Sweden
- Christian Rohner: Department of Information Technology, Uppsala University, Uppsala, Sweden
2. Wörtwein T, Allen NB, Cohn JF, Morency LP. SMURF: Statistical Modality Uniqueness and Redundancy Factorization. Proceedings of the ACM International Conference on Multimodal Interaction (ICMI) 2024; 2024:339-349. PMID: 39669698; PMCID: PMC11637459; DOI: 10.1145/3678957.3685716.
Abstract
Multimodal late fusion is a well-performing fusion method that sums the outputs of separately processed modalities, so-called modality contributions, to create a prediction; for example, summing contributions from vision, acoustics, and language to predict affective states. In this paper, our primary goal is to improve the interpretability of what modalities contribute to the prediction in late fusion models. More specifically, we want to factorize modality contributions into what is consistently shared by at least two modalities (pairwise redundant contributions) and the remaining modality-specific contributions (unique contributions). Our secondary goal is to improve robustness to missing modalities by encouraging the model to learn redundant contributions. To achieve these two goals, we propose SMURF (Statistical Modality Uniqueness and Redundancy Factorization), a late fusion method that factorizes its outputs into a) unique contributions that are uncorrelated with all other modalities and b) pairwise redundant contributions that are maximally correlated between two modalities. For our primary goal, we 1) verify SMURF's factorization on a synthetic dataset, 2) ensure that its factorization does not degrade predictive performance on eight affective datasets, and 3) observe significant relationships between its factorization and human judgments on three datasets. For our secondary goal, we demonstrate that SMURF is more robust to missing modalities at test time than three late fusion baselines.
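Background note: the correlation-based reading of "unique" (uncorrelated with all other modalities) and "pairwise redundant" (maximally correlated between two modalities) can be illustrated with a deliberately loose numpy sketch. This is not the SMURF algorithm itself, and all variable names are illustrative.

```python
import numpy as np

def unique_part(f, others):
    """Residual of contribution f after linearly regressing out the other
    modalities' contributions: one simple reading of 'uncorrelated with
    all other modalities'. f: (n,), others: (n, k)."""
    X = np.column_stack([np.ones(len(f)), others])
    beta, *_ = np.linalg.lstsq(X, f, rcond=None)
    return f - X @ beta

def redundant_part(f1, f2):
    """Projection of (centered) f1 onto (centered) f2: a crude proxy for a
    component shared between two modality contributions."""
    f2c = f2 - f2.mean()
    return (f1 - f1.mean()) @ f2c / (f2c @ f2c) * f2c

rng = np.random.default_rng(0)
shared = rng.normal(size=500)                  # signal seen by two modalities
f_vision = shared + 0.5 * rng.normal(size=500)
f_audio = shared + 0.5 * rng.normal(size=500)
f_text = rng.normal(size=500)                  # unrelated third modality

u_text = unique_part(f_text, np.column_stack([f_vision, f_audio]))
r_va = redundant_part(f_vision, f_audio)
print(np.corrcoef(u_text, f_vision)[0, 1])  # ~0: unique part is decorrelated
print(np.corrcoef(r_va, f_vision)[0, 1])    # high: shared component tracks both
```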
3. Martínez-Sánchez Á, Arranz G, Lozano-Durán A. Decomposing causality into its synergistic, unique, and redundant components. Nat Commun 2024; 15:9296. PMID: 39487116; PMCID: PMC11530654; DOI: 10.1038/s41467-024-53373-4.
Abstract
Causality lies at the heart of scientific inquiry, serving as the fundamental basis for understanding interactions among variables in physical systems. Despite its central role, current methods for causal inference face significant challenges due to nonlinear dependencies, stochastic interactions, self-causation, collider effects, and influences from exogenous factors, among others. While existing methods can effectively address some of these challenges, no single approach has successfully integrated all these aspects. Here, we address these challenges with SURD: Synergistic-Unique-Redundant Decomposition of causality. SURD quantifies causality as the increments of redundant, unique, and synergistic information gained about future events from past observations. The formulation is non-intrusive and applicable to both computational and experimental investigations, even when samples are scarce. We benchmark SURD in scenarios that pose significant challenges for causal inference and demonstrate that it offers a more reliable quantification of causality compared to previous methods.
Affiliations:
- Álvaro Martínez-Sánchez: Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Gonzalo Arranz: Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Adrián Lozano-Durán: Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, MA, USA; Graduate Aerospace Laboratories, California Institute of Technology, Pasadena, CA, USA
4. Sevostianov I, Feinerman O. Synergy as the Failure of Distributivity. Entropy (Basel) 2024; 26:916. PMID: 39593861; PMCID: PMC11592723; DOI: 10.3390/e26110916.
Abstract
The concept of emergence, or synergy in its simplest form, is widely used but lacks a rigorous definition. Our work connects information and set theory to uncover the mathematical nature of synergy as the failure of distributivity. For the trivial case of discrete random variables, we explore whether and how it is possible to get more information out of lesser parts. The approach is inspired by the role of set theory as the fundamental description of part-whole relations. If taken unaltered, synergistic behavior is forbidden by the set-theoretic axioms. However, random variables are not a perfect analogy of sets: we formalize the distinction, highlighting a single broken axiom, union/intersection distributivity. Nevertheless, it remains possible to describe information using Venn-type diagrams. The proposed multivariate theory resolves the persistent self-contradiction of partial information decomposition and reinstates it as a primary route toward a rigorous definition of emergence. Our results suggest that non-distributive variants of set theory may be used to describe emergent physical systems.
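Background note: for orientation, the axiom at stake, stated in its familiar set-theoretic form (the paper's own formalism is more general):

```latex
% Union/intersection distributivity, valid for sets:
A \cap (B \cup C) \;=\; (A \cap B) \cup (A \cap C)
% Read informationally, with "intersection" as shared information and
% "union" as union information about a target, the analogous identity can
% fail for random variables; per the abstract, this failure is precisely
% what manifests as synergy.
```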
Affiliations:
- Ivan Sevostianov: Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 7610001, Israel
5. Varley TF. A Synergistic Perspective on Multivariate Computation and Causality in Complex Systems. Entropy (Basel) 2024; 26:883. PMID: 39451959; PMCID: PMC11507062; DOI: 10.3390/e26100883.
Abstract
What does it mean for a complex system to "compute" or perform "computations"? Intuitively, we can understand complex "computation" as occurring when a system's state is a function of multiple inputs (potentially including its own past state). Here, we discuss how computational processes in complex systems can be generally studied using the concept of statistical synergy, which is information about an output that can only be learned when the joint state of all inputs is known. Building on prior work, we show that this approach naturally leads to a link between multivariate information theory and topics in causal inference, specifically, the phenomenon of causal colliders. We begin by showing how Berkson's paradox implies a higher-order, synergistic interaction between multidimensional inputs and outputs. We then discuss how causal structure learning can refine and orient analyses of synergies in empirical data, and when empirical synergies meaningfully reflect computation versus when they may be spurious. We end by proposing that this conceptual link between synergy, causal colliders, and computation can serve as a foundation on which to build a mathematically rich general theory of computation in complex systems.
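Background note: the canonical toy example of the collider-synergy link described here is the XOR gate: each input alone is uninformative about the output, the joint state determines it exactly, and conditioning on the output (the collider) induces dependence between the inputs. A quick enumeration check (a generic illustration, not code from the paper):

```python
import numpy as np
from itertools import product

def H(probs):
    p = np.asarray([q for q in probs if q > 0])
    return -(p * np.log2(p)).sum()

# Z = X XOR Y with independent fair bits X, Y: four equiprobable states
states = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]
p = 1 / len(states)

def marginal_entropy(idx):
    """Entropy of the marginal over the given coordinate subset."""
    counts = {}
    for s in states:
        key = tuple(s[i] for i in idx)
        counts[key] = counts.get(key, 0) + p
    return H(counts.values())

# I(X;Z) = H(X) + H(Z) - H(X,Z) = 0: a single input tells us nothing
print(marginal_entropy([0]) + marginal_entropy([2]) - marginal_entropy([0, 2]))
# I(X,Y;Z) = H(X,Y) + H(Z) - H(X,Y,Z) = 1 bit: purely synergistic
print(marginal_entropy([0, 1]) + marginal_entropy([2]) - marginal_entropy([0, 1, 2]))
# Conditioning on the collider (Z = 0) couples the inputs: only (0,0), (1,1) remain
print([(x, y) for x, y, z in states if z == 0])
```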
Affiliations:
- Thomas F Varley: Vermont Complex Systems Center, University of Vermont, Burlington, VT 05405, USA
6. Gehlen J, Li J, Hourican C, Tassi S, Mishra PP, Lehtimäki T, Kähönen M, Raitakari O, Bosch JA, Quax R. Bias in O-Information Estimation. Entropy (Basel) 2024; 26:837. PMID: 39451914; PMCID: PMC11507536; DOI: 10.3390/e26100837.
Abstract
Higher-order relationships are a central concept in the science of complex systems. A popular method for estimating the higher-order relationships of synergy and redundancy from data is the O-information, an information-theoretic measure composed of Shannon entropy terms that quantifies the balance between redundancy and synergy in a system. However, bias is not yet taken into account in the estimation of the O-information of discrete variables. In this paper, we explain where this bias comes from and explore it for fully synergistic, fully redundant, and fully independent simulated systems of n=3 variables. Specifically, we explore how the sample size and the number of bins affect the bias in the O-information estimate. The main finding is that the O-information of independent systems is severely biased towards synergy if the sample size is smaller than the number of jointly possible observations. This could mean that triplets identified as highly synergistic may in fact be close to independent. A bias approximation based on the Miller-Madow method is derived for the O-information. We find that for systems of n=3 variables the bias approximation can partially correct for the bias. However, simulations of fully independent systems are still required as null models to provide a benchmark of the bias of the O-information.
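Background note: a minimal plug-in implementation of the O-information with a Miller-Madow-style correction applied to each entropy term, assembled here from the standard definitions; this is a generic sketch, not the authors' code, and the bias approximation derived in the paper may differ in detail.

```python
import numpy as np

def H_mm(samples):
    """Plug-in entropy (in bits) with a Miller-Madow correction.
    samples: (N, d) array of discrete values; rows are joint observations."""
    N = len(samples)
    _, counts = np.unique(samples, axis=0, return_counts=True)
    p = counts / N
    h_plugin = -(p * np.log2(p)).sum()
    # Miller-Madow: add (observed support size - 1) / (2N), converted to bits
    return h_plugin + (len(counts) - 1) / (2 * N * np.log(2))

def o_information(X):
    """O-information: Omega = (n-2) H(X) + sum_j [H(X_j) - H(X_-j)].
    Positive values indicate redundancy-dominance, negative synergy."""
    N, n = X.shape
    omega = (n - 2) * H_mm(X)
    for j in range(n):
        omega += H_mm(X[:, [j]]) - H_mm(np.delete(X, j, axis=1))
    return omega

rng = np.random.default_rng(1)
indep = rng.integers(0, 2, size=(200, 3))       # three independent bits
xor = indep.copy()
xor[:, 2] = indep[:, 0] ^ indep[:, 1]           # synergistic XOR triplet
print(round(o_information(indep), 3))  # near 0 (the plug-in alone is biased)
print(round(o_information(xor), 3))    # about -1: synergy-dominated
```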
Grants: EU Horizon 2020 (848146 To_Aition; 755320 TAXINOMISIS); Academy of Finland (356405, 322098, 286284, 134309 (Eye), 126925, 121584, 124282, 129378 (Salve), 117797 (Gendi), 141071 (Skidi), 349708); Netherlands Organisation for Health Research and Development (ZonMw) (09120012010063); European Research Council (742927, MULTIEPIGEN project); EU (101080117, pBETTER4U: Preventing obesity through Biologically and bEhaviorally Tailored inTERventions for you).
Affiliations:
- Johanna Gehlen: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands
- Jie Li: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands
- Cillian Hourican: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands
- Stavroula Tassi: Unit of Medical Technology and Intelligent Information Systems (MEDLAB), Department of Material Science and Engineering, University of Ioannina, 45110 Ioannina, Greece; Department of Mechanical and Aeronautics Engineering, University of Patras, 26504 Rio, Greece
- Pashupati P. Mishra: Department of Clinical Chemistry, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland; Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland; Department of Clinical Chemistry, Fimlab Laboratories, 33520 Tampere, Finland
- Terho Lehtimäki: Department of Clinical Chemistry, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland; Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland; Department of Clinical Chemistry, Fimlab Laboratories, 33520 Tampere, Finland
- Mika Kähönen: Finnish Cardiovascular Research Center Tampere, Faculty of Medicine and Health Technology, Tampere University, 33720 Tampere, Finland; Department of Clinical Physiology, Tampere University Hospital, 33520 Tampere, Finland
- Olli Raitakari: Centre for Population Health Research, University of Turku and Turku University Hospital, 20520 Turku, Finland; Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, 20520 Turku, Finland; Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, 20520 Turku, Finland; InFLAMES Research Flagship, University of Turku, 20520 Turku, Finland
- Jos A. Bosch: Clinical Psychology, Faculty of Social and Behavioural Sciences, University of Amsterdam, 1018 Amsterdam, The Netherlands
- Rick Quax: Computational Science Lab, Informatics Institute, University of Amsterdam, 1098 Amsterdam, The Netherlands; Institute for Advanced Study, 1012 Amsterdam, The Netherlands
7. Ehrlich DA, Schick-Poland K, Makkeh A, Lanfermann F, Wollstadt P, Wibral M. Partial information decomposition for continuous variables based on shared exclusions: Analytical formulation and estimation. Phys Rev E 2024; 110:014115. PMID: 39161017; DOI: 10.1103/PhysRevE.110.014115.
Abstract
Describing statistical dependencies is foundational to empirical scientific research. For uncovering intricate and possibly nonlinear dependencies between a single target variable and several source variables within a system, a principled and versatile framework can be found in the theory of partial information decomposition (PID). Nevertheless, the majority of existing PID measures are restricted to categorical variables, while many systems of interest in science are continuous. In this paper, we present a novel analytic formulation for continuous redundancy (a generalization of mutual information), drawing inspiration from the concept of shared exclusions in probability space that underlies the discrete PID definition of I_{∩}^{sx}. Furthermore, we introduce a nearest-neighbor-based estimator for continuous PID and showcase its effectiveness by applying it to a simulated energy management system provided by the Honda Research Institute Europe GmbH. This work bridges the gap between the measure-theoretically postulated existence proofs for a continuous I_{∩}^{sx} and its practical application to real-world scientific problems.
8. Kolchinsky A. Partial Information Decomposition: Redundancy as Information Bottleneck. Entropy (Basel) 2024; 26:546. PMID: 39056909; PMCID: PMC11276267; DOI: 10.3390/e26070546.
Abstract
The partial information decomposition (PID) aims to quantify the amount of redundant information that a set of sources provides about a target. Here, we show that this goal can be formulated as a type of information bottleneck (IB) problem, termed the "redundancy bottleneck" (RB). The RB formalizes a tradeoff between prediction and compression: it extracts information from the sources that best predict the target, without revealing which source provided the information. It can be understood as a generalization of "Blackwell redundancy", which we previously proposed as a principled measure of PID redundancy. The "RB curve" quantifies the prediction-compression tradeoff at multiple scales. This curve can also be quantified for individual sources, allowing subsets of redundant sources to be identified without combinatorial optimization. We provide an efficient iterative algorithm for computing the RB curve.
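Background note: for orientation, the standard information bottleneck objective that the redundancy bottleneck generalizes (textbook form; the paper's RB formulation adds the source-anonymity constraint described in the abstract):

```latex
% Information bottleneck: compress X into M while preserving information about Y
\min_{p(m \mid x)} \;\; I(M; X) \;-\; \beta \, I(M; Y)
% Sweeping the tradeoff parameter \beta traces out a prediction--compression
% curve; the RB analogue extracts information predictive of the target while
% hiding which source supplied it.
```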
Affiliations:
- Artemy Kolchinsky: ICREA-Complex Systems Lab, Universitat Pompeu Fabra, 08003 Barcelona, Spain; Universal Biology Institute, The University of Tokyo, Tokyo 113-0033, Japan
9. Kay JW. A Partial Information Decomposition for Multivariate Gaussian Systems Based on Information Geometry. Entropy (Basel) 2024; 26:542. PMID: 39056905; PMCID: PMC11276306; DOI: 10.3390/e26070542.
Abstract
There is much interest in the topic of partial information decomposition, both in developing new algorithms and in developing applications. An algorithm based on standard results from information geometry was recently proposed by Niu and Quinn (2019). They considered the case of three scalar random variables from an exponential family, including both discrete distributions and a trivariate Gaussian distribution. The purpose of this article is to extend their work to the general case of multivariate Gaussian systems having vector inputs and a vector output. By making use of standard results from information geometry, explicit expressions are derived for the components of the partial information decomposition for this system. These expressions depend on a real-valued parameter which is determined by performing a simple constrained convex optimisation. Furthermore, it is proved that the theoretical properties of non-negativity, self-redundancy, symmetry and monotonicity, which were proposed by Williams and Beer (2010), are valid for the decomposition Iig derived herein. Application of these results to real and simulated data shows that the Iig algorithm does produce the results expected when clear expectations are available, although in some scenarios it can overestimate the level of the synergy and shared information components of the decomposition, and correspondingly underestimate the levels of unique information. Comparisons of the Iig and Idep (Kay and Ince, 2018) methods show that they can produce very similar results, though interesting differences also emerge. The same may be said of comparisons between the Iig and Immi (Barrett, 2015) methods.
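Background note: of the baselines compared, Immi has an especially compact form worth recalling (standard in the Gaussian PID literature; the Iig and Idep definitions are more involved and not reproduced here):

```latex
% Barrett's minimum-mutual-information (MMI) redundancy for sources X_1, X_2
% and target T:
I_{\mathrm{mmi}}(T; X_1, X_2) = \min\{\, I(T; X_1),\; I(T; X_2) \,\}
% The remaining PID atoms then follow from the usual bookkeeping:
U_i = I(T; X_i) - I_{\mathrm{mmi}}, \qquad
S = I(T; X_1, X_2) - I(T; X_1) - I(T; X_2) + I_{\mathrm{mmi}}.
```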
Affiliations:
- Jim W Kay: School of Mathematics and Statistics, University of Glasgow, Glasgow G12 8QQ, UK
10. Koçillari L, Lorenz GM, Engel NM, Celotto M, Curreli S, Malerba SB, Engel AK, Fellin T, Panzeri S. Sampling bias corrections for accurate neural measures of redundant, unique, and synergistic information. bioRxiv [Preprint] 2024:2024.06.04.597303. PMID: 38895197; PMCID: PMC11185652; DOI: 10.1101/2024.06.04.597303.
Abstract
Shannon information theory has long been a tool of choice for measuring empirically how populations of neurons in the brain encode information about cognitive variables. Recently, Partial Information Decomposition (PID) has emerged as a principled way to break down this information into components, identifying not only the unique information carried by each neuron but also whether relationships between neurons generate synergistic or redundant information. While it has long been recognized that Shannon information measures on neural activity suffer from a (mostly upward) limited-sampling estimation bias, this issue has largely been ignored in the burgeoning field of PID analysis of neural activity. We used simulations to investigate the limited-sampling bias of PID computed from discrete probabilities (suited to describing neural spiking activity). We found that PID suffers from a large bias that is uneven across components, with synergy by far the most biased. Using approximate analytical expansions, we found that the bias of synergy increases quadratically with the number of discrete responses of each neuron, whereas the biases of unique and redundant information increase only linearly or sub-linearly. Based on this understanding of the PID bias properties, we developed simple yet effective procedures that correct for the bias and greatly improve PID estimation with respect to current state-of-the-art procedures. We apply these bias-correction procedures to a dataset of 53,117 pairs of neurons in auditory cortex, posterior parietal cortex, and hippocampus of mice performing cognitive tasks, deriving precise estimates and bounds of how synergy and redundancy vary across these brain regions.
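Background note: the upward limited-sampling bias of plug-in Shannon estimates is easy to reproduce; a quick generic simulation (not the paper's procedure) for two independent discrete variables:

```python
import numpy as np

def plugin_mi(x, y):
    """Plug-in mutual information (bits) between discrete sample vectors."""
    def H(labels):
        _, c = np.unique(labels, axis=0, return_counts=True)
        p = c / c.sum()
        return -(p * np.log2(p)).sum()
    return H(x) + H(y) - H(np.column_stack([x, y]))

rng = np.random.default_rng(0)
K = 8  # number of discrete responses per neuron (alphabet size)
for N in [50, 200, 1000, 10000]:
    mi = np.mean([plugin_mi(rng.integers(0, K, N), rng.integers(0, K, N))
                  for _ in range(100)])
    # True MI is 0; the plug-in estimate stays positive, shrinking roughly as
    # (K-1)^2 / (2 N ln 2) bits. The quadratic growth in K loosely echoes the
    # quadratic synergy bias reported in the abstract.
    print(N, round(mi, 4))
```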
Affiliations:
- Loren Koçillari: Institute for Neural Information Processing, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany; Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Gabriel Matías Lorenz: Institute for Neural Information Processing, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany; Istituto Italiano di Tecnologia, Genova, Italy; Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Nicola Marie Engel: Institute for Neural Information Processing, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Marco Celotto: Institute for Neural Information Processing, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany; Istituto Italiano di Tecnologia, Genova, Italy
- Simone Blanco Malerba: Institute for Neural Information Processing, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Andreas K. Engel: Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Stefano Panzeri: Institute for Neural Information Processing, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany; Istituto Italiano di Tecnologia, Genova, Italy
11. Chicharro D, Nguyen JK. Causal Structure Learning with Conditional and Unique Information Groups-Decomposition Inequalities. Entropy (Basel) 2024; 26:440. PMID: 38920449; PMCID: PMC11202884; DOI: 10.3390/e26060440.
Abstract
The causal structure of a system imposes constraints on the joint probability distribution of variables that can be generated by the system. Archetypal constraints consist of conditional independencies between variables. However, particularly in the presence of hidden variables, many causal structures are compatible with the same set of independencies inferred from the marginal distributions of observed variables. Additional constraints allow further testing for the compatibility of data with specific causal structures. An existing family of causally informative inequalities compares the information about a set of target variables contained in a collection of variables, with a sum of the information contained in different groups defined as subsets of that collection. While procedures to identify the form of these groups-decomposition inequalities have been previously derived, we substantially enlarge the applicability of the framework. We derive groups-decomposition inequalities subject to weaker independence conditions, with weaker requirements in the configuration of the groups, and additionally allowing for conditioning sets. Furthermore, we show how constraints with higher inferential power may be derived with collections that include hidden variables, and then converted into testable constraints using data processing inequalities. For this purpose, we apply the standard data processing inequality of conditional mutual information and derive an analogous property for a measure of conditional unique information recently introduced to separate redundant, synergistic, and unique contributions to the information that a set of variables has about a target.
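Background note: the data processing inequality invoked in the last step, in its standard form:

```latex
% If X -> Y -> Z is a Markov chain (Z conditionally independent of X given Y),
% post-processing cannot create information:
X \to Y \to Z \;\;\Longrightarrow\;\; I(X; Z) \le I(X; Y),
% and conditionally, I(X; Z \mid W) \le I(X; Y \mid W) when the chain holds
% given W. The paper derives an analogous property for a measure of
% conditional unique information, turning constraints that involve hidden
% variables into testable constraints on observed ones.
```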
Affiliations:
- Daniel Chicharro: Artificial Intelligence Research Centre, Department of Computer Science, City, University of London, London EC1V 0HB, UK
- Julia K. Nguyen: Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
12. Mages T, Anastasiadi E, Rohner C. Non-Negative Decomposition of Multivariate Information: From Minimum to Blackwell-Specific Information. Entropy (Basel) 2024; 26:424. PMID: 38785673; PMCID: PMC11120422; DOI: 10.3390/e26050424.
Abstract
Partial information decompositions (PIDs) aim to categorize how a set of source variables provides information about a target variable redundantly, uniquely, or synergistically. The original proposal for such an analysis used a lattice-based approach and gained significant attention. However, finding a suitable underlying decomposition measure remains an open research question for an arbitrary number of discrete random variables. This work proposes a solution with a non-negative PID that satisfies an inclusion-exclusion relation for any f-information measure. The decomposition is constructed from a pointwise perspective of the target variable to take advantage of the equivalence between the Blackwell order and the zonogon order in this setting. Zonogons are the Neyman-Pearson regions for an indicator variable of each target state, and f-information is the expected value of quantifying their boundary. We prove that the proposed decomposition satisfies the desired axioms and guarantees non-negative partial information results. Moreover, we demonstrate how the obtained decomposition can be transformed between different decomposition lattices and that it directly provides a non-negative decomposition of Rényi information under a transformed inclusion-exclusion relation. Finally, we highlight that the decomposition behaves differently depending on the information measure used and show how it can be used for tracing partial information flows through Markov chains.
Affiliations:
- Tobias Mages: Department of Information Technology, Uppsala University, 752 36 Uppsala, Sweden
13. Menesse G, Houben AM, Soriano J, Torres JJ. Integrated information decomposition unveils major structural traits of in silico and in vitro neuronal networks. Chaos 2024; 34:053139. PMID: 38809907; DOI: 10.1063/5.0201454.
Abstract
The properties of complex networked systems arise from the interplay between the dynamics of their elements and the underlying topology. Thus, to understand their behavior, it is crucial to gather as much information as possible about their topological organization. However, in large systems such as neuronal networks, the reconstruction of such topology is usually carried out from the information encoded in the dynamics on the network, such as spike train time series, and by measuring the transfer entropy between system elements. The topological information recovered by these methods does not necessarily capture the connectivity layout, but rather the causal flow of information between elements. New theoretical frameworks, such as Integrated Information Decomposition (Φ-ID), allow one to explore the modes in which information can flow between parts of a system, opening a rich landscape of interactions between network topology, dynamics, and information. Here, we apply Φ-ID to in silico and in vitro data to decompose the usual transfer entropy measure into different modes of information transfer, namely synergistic, redundant, or unique. We demonstrate that unique information transfer is the most relevant measure for uncovering structural topological details from network activity data, while redundant information only introduces residual information for this application. Although the retrieved network connectivity is still functional, it captures more details of the underlying structural topology because it avoids taking into account emergent higher-order interactions and information redundancy between elements, which are important for functional behavior but mask the detection of the direct pairwise interactions that constitute the structural network topology.
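Background note: the quantity being decomposed is transfer entropy, shown here in its usual conditional-mutual-information form (schematic one-step history, not the paper's full atom notation):

```latex
% Transfer entropy from X to Y:
TE_{X \to Y} \;=\; I\big(X_{t-1};\, Y_t \,\big|\, Y_{t-1}\big)
% Phi-ID splits this into modes of transfer: information carried uniquely
% by X's past, redundantly with Y's own past, or synergistically by both;
% the result above singles out the unique mode as the best probe of
% structural connectivity.
```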
Affiliations:
- Gustavo Menesse: Department of Electromagnetism and Physics of the Matter & Institute Carlos I for Theoretical and Computational Physics, University of Granada, 18071 Granada, Spain; Departamento de Física, Facultad de Ciencias Exactas y Naturales, Universidad Nacional de Asunción, 111451 San Lorenzo, Paraguay
- Akke Mats Houben: Departament de Física de la Matèria Condensada, Universitat de Barcelona and Universitat de Barcelona Institute of Complex Systems (UBICS), E-08028 Barcelona, Spain
- Jordi Soriano: Departament de Física de la Matèria Condensada, Universitat de Barcelona and Universitat de Barcelona Institute of Complex Systems (UBICS), E-08028 Barcelona, Spain
- Joaquín J Torres: Department of Electromagnetism and Physics of the Matter & Institute Carlos I for Theoretical and Computational Physics, University of Granada, 18071 Granada, Spain
14. Luppi AI, Rosas FE, Mediano PAM, Menon DK, Stamatakis EA. Information decomposition and the informational architecture of the brain. Trends Cogn Sci 2024; 28:352-368. PMID: 38199949; DOI: 10.1016/j.tics.2023.11.005.
Abstract
To explain how the brain orchestrates information-processing for cognition, we must understand information itself. Importantly, information is not a monolithic entity. Information decomposition techniques provide a way to split information into its constituent elements: unique, redundant, and synergistic information. We review how disentangling synergistic and redundant interactions is redefining our understanding of integrative brain function and its neural organisation. To explain how the brain navigates the trade-offs between redundancy and synergy, we review converging evidence integrating the structural, molecular, and functional underpinnings of synergy and redundancy; their roles in cognition and computation; and how they might arise over evolution and development. Overall, disentangling synergistic and redundant information provides a guiding principle for understanding the informational architecture of the brain and cognition.
Affiliations:
- Andrea I Luppi: Division of Anaesthesia, University of Cambridge, Cambridge, UK; Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK; Montreal Neurological Institute, McGill University, Montreal, QC, Canada
- Fernando E Rosas: Department of Informatics, University of Sussex, Brighton, UK; Centre for Psychedelic Research, Department of Brain Sciences, Imperial College London, London, UK; Centre for Complexity Science, Imperial College London, London, UK; Centre for Eudaimonia and Human Flourishing, University of Oxford, Oxford, UK
- Pedro A M Mediano: Department of Computing, Imperial College London, London, UK; Department of Psychology, University of Cambridge, Cambridge, UK
- David K Menon: Department of Medicine, University of Cambridge, Cambridge, UK; Wolfson Brain Imaging Centre, University of Cambridge, Cambridge, UK
- Emmanuel A Stamatakis: Division of Anaesthesia, University of Cambridge, Cambridge, UK; Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
15. Murphy KA, Bassett DS. Information decomposition in complex systems via machine learning. Proc Natl Acad Sci U S A 2024; 121:e2312988121. PMID: 38498714; PMCID: PMC10990158; DOI: 10.1073/pnas.2312988121.
Abstract
One of the fundamental steps toward understanding a complex system is identifying variation at the scale of the system's components that is most relevant to behavior on a macroscopic scale. Mutual information provides a natural means of linking variation across scales of a system because it is independent of the functional relationship between observables. However, characterizing the manner in which information is distributed across a set of observables is computationally challenging and generally infeasible beyond a handful of measurements. Here, we propose a practical and general methodology that uses machine learning to decompose the information contained in a set of measurements by jointly optimizing a lossy compression of each measurement. Guided by the distributed information bottleneck as a learning objective, the information decomposition identifies the variation in the measurements of the system state most relevant to specified macroscale behavior. We focus our analysis on two paradigmatic complex systems: a Boolean circuit and an amorphous material undergoing plastic deformation. In both examples, the large entropy of the system state is decomposed, bit by bit, in terms of what is most related to macroscale behavior. The identification of meaningful variation in data, with the full generality brought by information theory, is made practical for studying the connection between micro- and macroscale structure in complex systems.
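Background note: the learning objective named here has a compact general form; a sketch of the distributed information bottleneck as it is usually written (the paper's exact loss and parameterization may differ):

```latex
% Each measurement X_i is lossily compressed into U_i; jointly, the
% compressed variables must remain predictive of macroscale behavior Y:
\min_{\{p(u_i \mid x_i)\}} \;\; \sum_{i=1}^{n} I(U_i; X_i) \;-\; \beta \, I(U_1, \dots, U_n; Y)
% Sweeping \beta reveals, bit by bit, how much of each measurement's
% entropy is needed to retain predictive power about Y.
```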
Affiliations:
- Kieran A. Murphy: Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA 19104
- Dani S. Bassett: Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA 19104; Department of Electrical & Systems Engineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, PA 19104; Department of Neurology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104; Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104; Department of Physics & Astronomy, College of Arts & Sciences, University of Pennsylvania, Philadelphia, PA 19104; The Santa Fe Institute, Santa Fe, NM 87501
16. Gomes AFC, Figueiredo MAT. A Measure of Synergy Based on Union Information. Entropy (Basel) 2024; 26:271. PMID: 38539782; PMCID: PMC10969115; DOI: 10.3390/e26030271.
Abstract
The partial information decomposition (PID) framework is concerned with decomposing the information that a set of (two or more) random variables (the sources) has about another variable (the target) into three types of information: unique, redundant, and synergistic. Classical information theory alone does not provide a unique way to decompose information in this manner, and additional assumptions have to be made. One often overlooked way to achieve this decomposition is via a so-called measure of union information, which quantifies the information that is present in at least one of the sources, and from which a synergy measure stems. In this paper, we introduce a new measure of union information based on adopting a communication channel perspective, compare it with existing measures, and study some of its properties. We also include a comprehensive critical review of characterizations of union information and synergy measures that have been proposed in the literature.
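Background note: the way a synergy measure "stems" from union information is a one-line subtraction; in standard two-source notation:

```latex
% Union information: what is present in at least one source about T.
% Synergy is what the joint sources add beyond that union:
S(T; X_1, X_2) \;=\; I(T; X_1, X_2) \;-\; I_{\cup}(T; X_1, X_2),
% with \max_i I(T; X_i) \,\le\, I_{\cup} \,\le\, I(T; X_1, X_2),
% so that S \ge 0.
```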
Affiliations:
- André F. C. Gomes: Instituto de Telecomunicações and LUMLIS (Lisbon ELLIS Unit), Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisboa, Portugal
17. Varley TF. Generalized decomposition of multivariate information. PLoS One 2024; 19:e0297128. PMID: 38315691; PMCID: PMC10843128; DOI: 10.1371/journal.pone.0297128.
Abstract
Since its introduction, the partial information decomposition (PID) has emerged as a powerful, information-theoretic technique useful for studying the structure of (potentially higher-order) interactions in complex systems. Despite its utility, the applicability of the PID is restricted by the need to assign elements as either "sources" or "targets", as well as by the specific structure of the mutual information itself. Here, I introduce a generalized information decomposition that relaxes the source/target distinction while still satisfying the basic intuitions about information. This approach is based on the decomposition of the Kullback-Leibler divergence, and consequently allows for the analysis of any information gained when updating from an arbitrary prior to an arbitrary posterior. As a result, any information-theoretic measure that can be written as a linear combination of Kullback-Leibler divergences admits a decomposition in the style of Williams and Beer, including the total correlation, the negentropy, and the mutual information as special cases. This paper explores how the generalized information decomposition can reveal novel insights into existing measures, as well as the nature of higher-order synergies. I show that synergistic information is intimately related to the well-known Tononi-Sporns-Edelman (TSE) complexity, and that synergistic information requires a similar integration/segregation balance as a high TSE complexity. Finally, I end with a discussion of how this approach fits into other attempts to generalize the PID and the possibilities for empirical applications.
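Background note: the class of measures covered by the decomposition (linear combinations of Kullback-Leibler divergences) includes several familiar quantities; their standard KL forms, for orientation:

```latex
% Mutual information: update from the independent prior to the true joint
I(X; Y) = D_{\mathrm{KL}}\big(p(x, y) \,\|\, p(x)\,p(y)\big)
% Total correlation: joint versus the product of all marginals
TC(X_1, \dots, X_n) = D_{\mathrm{KL}}\Big(p(x_1, \dots, x_n) \,\Big\|\, \textstyle\prod_i p(x_i)\Big)
% Negentropy (discrete case): distribution versus the uniform reference u
J(X) = D_{\mathrm{KL}}\big(p(x) \,\|\, u(x)\big)
```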
Affiliations:
- Thomas F. Varley: Department of Computer Science, University of Vermont, Burlington, VT, USA; Vermont Complex Systems Center, University of Vermont, Burlington, VT, USA
18. Koçillari L, Celotto M, Francis NA, Mukherjee S, Babadi B, Kanold PO, Panzeri S. Behavioural relevance of redundant and synergistic stimulus information between functionally connected neurons in mouse auditory cortex. Brain Inform 2023; 10:34. PMID: 38052917; PMCID: PMC10697912; DOI: 10.1186/s40708-023-00212-9.
Abstract
Measures of functional connectivity have played a central role in advancing our understanding of how information is transmitted and processed within the brain. Traditionally, these studies have focused on identifying redundant functional connectivity, which involves determining when activity is similar across different sites or neurons. However, recent research has highlighted the importance of also identifying synergistic connectivity, that is, connectivity that gives rise to information not contained in either site or neuron alone. Here, we measured redundant and synergistic functional connectivity between neurons in the mouse primary auditory cortex during a sound discrimination task. Specifically, we measured directed functional connectivity between neurons simultaneously recorded with calcium imaging, using Granger causality as the functional connectivity measure. We then used Partial Information Decomposition to quantify the amount of redundant and synergistic information about the presented sound that is carried by functionally connected or functionally unconnected pairs of neurons. We found that functionally connected pairs carry proportionally more redundant information and proportionally less synergistic information about sound than unconnected pairs, suggesting that their functional connectivity is primarily redundant. Further, synergy and redundancy coexisted whether mice made correct or incorrect perceptual discriminations. However, redundancy was much higher (both in absolute terms and in proportion to the total information available in neuron pairs) for correct behavioural choices than for incorrect ones, whereas synergy was higher in absolute terms but lower in relative terms for correct than for incorrect choices. Moreover, the proportion of redundancy reliably predicted perceptual discriminations, with the proportion of synergy adding no extra predictive power. These results suggest a crucial contribution of redundancy to correct perceptual discriminations, possibly due to the advantage it offers for information propagation, and also suggest a role of synergy in enhancing information levels during correct discriminations.
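Background note: the functional connectivity measure used here is the standard linear Granger causality score, usually written as a log-ratio of prediction-error variances (the paper pairs this measure with PID rather than modifying it):

```latex
% Granger causality from X to Y: does X's past improve prediction of Y
% beyond Y's own past?
F_{X \to Y} \;=\; \ln \frac{\operatorname{var}\big(\varepsilon_t^{(Y\text{-past only})}\big)}
                           {\operatorname{var}\big(\varepsilon_t^{(Y\text{- and }X\text{-past})}\big)}
% F > 0 indicates directed functional connectivity from X to Y.
```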
Affiliations:
- Loren Koçillari: Istituto Italiano di Tecnologia, 38068 Rovereto, Italy; Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Falkenried 94, 20251 Hamburg, Germany; Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf (UKE), 20246 Hamburg, Germany
- Marco Celotto: Istituto Italiano di Tecnologia, 38068 Rovereto, Italy; Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Falkenried 94, 20251 Hamburg, Germany; Department of Pharmacy and Biotechnology, University of Bologna, 40126 Bologna, Italy
- Nikolas A Francis: Department of Biology and Brain and Behavior Institute, University of Maryland, College Park, MD 20742, USA
- Shoutik Mukherjee: Department of Electrical and Computer Engineering and Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
- Behtash Babadi: Department of Electrical and Computer Engineering and Institute for Systems Research, University of Maryland, College Park, MD 20742, USA
- Patrick O Kanold: Department of Biomedical Engineering and Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD 21205, USA
- Stefano Panzeri: Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Falkenried 94, 20251 Hamburg, Germany
19. Varley TF, Pope M, Puxeddu MG, Faskowitz J, Sporns O. Partial entropy decomposition reveals higher-order information structures in human brain activity. Proc Natl Acad Sci U S A 2023; 120:e2300888120. PMID: 37467265; PMCID: PMC10372615; DOI: 10.1073/pnas.2300888120.
Abstract
The standard approach to modeling the human brain as a complex system is with a network, where the basic unit of interaction is a pairwise link between two brain regions. While powerful, this approach is limited by the inability to directly assess higher-order interactions involving three or more elements. In this work, we explore a method for capturing higher-order dependencies in multivariate data: the partial entropy decomposition (PED). Our approach decomposes the joint entropy of the whole system into a set of nonnegative atoms that describe the redundant, unique, and synergistic interactions composing the system's structure. PED gives insight into the mathematics of functional connectivity and its limitations. When applied to resting-state fMRI data, we find robust evidence of higher-order synergies that are largely invisible to standard functional connectivity analyses. Our approach can also be localized in time, allowing a frame-by-frame analysis of how the distributions of redundancies and synergies change over the course of a recording. We find that different ensembles of regions can transiently change from being redundancy-dominated to synergy-dominated and that this temporal pattern is structured in time. These results provide strong evidence that there is a large space of unexplored structures in human brain data that has been largely missed by a focus on bivariate network connectivity models. This synergistic structure is dynamic in time and likely will illuminate interesting links between brain and behavior. Beyond brain-specific applications, the PED provides a very general approach for understanding higher-order structures in a variety of complex systems.
Affiliations:
- Thomas F. Varley: School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN 47405; Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405
- Maria Pope: School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN 47405; Program in Neuroscience, Indiana University, Bloomington, IN 47405
- Maria Grazia Puxeddu: Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405
- Joshua Faskowitz: School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN 47405; Program in Neuroscience, Indiana University, Bloomington, IN 47405
- Olaf Sporns: School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN 47405; Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405; Program in Neuroscience, Indiana University, Bloomington, IN 47405
20. Mages T, Rohner C. Decomposing and Tracing Mutual Information by Quantifying Reachable Decision Regions. Entropy (Basel) 2023; 25:1014. PMID: 37509961; PMCID: PMC10378359; DOI: 10.3390/e25071014.
Abstract
The idea of a partial information decomposition (PID) gained significant attention for attributing the components of mutual information from multiple variables about a target to being unique, redundant/shared, or synergistic. Since the original measure for this analysis was criticized, several alternatives have been proposed, but they have failed to satisfy the desired axioms or an inclusion-exclusion principle, or have resulted in negative partial information components. To construct a measure, we interpret the achievable type I/II error pairs for predicting each state of a target variable (reachable decision regions) as notions of pointwise uncertainty. For this representation of uncertainty, we construct a distributive lattice with mutual information as a consistent valuation and obtain an algebra for the constructed measure. The resulting definition satisfies the original axioms and an inclusion-exclusion principle, and provides a non-negative decomposition for an arbitrary number of variables. We demonstrate practical applications of this approach by tracing the flow of information through Markov chains. This can be used to model and analyze the flow of information in communication networks or data processing systems.
Affiliations:
- Tobias Mages: Department of Information Technology, Uppsala University, 752 36 Uppsala, Sweden
- Christian Rohner: Department of Information Technology, Uppsala University, 752 36 Uppsala, Sweden
21. Gomes AFC, Figueiredo MAT. Orders between Channels and Implications for Partial Information Decomposition. Entropy (Basel) 2023; 25:975. PMID: 37509922; PMCID: PMC10377940; DOI: 10.3390/e25070975.
Abstract
The partial information decomposition (PID) framework is concerned with decomposing the information that a set of random variables has with respect to a target variable into three types of components: redundant, synergistic, and unique. Classical information theory alone does not provide a unique way to decompose information in this manner, and additional assumptions have to be made. Recently, Kolchinsky proposed a new general axiomatic approach to obtain measures of redundant information based on choosing an order relation between information sources (equivalently, order between communication channels). In this paper, we exploit this approach to introduce three new measures of redundant information (and the resulting decompositions) based on well-known preorders between channels, contributing to the enrichment of the PID landscape. We relate the new decompositions to existing ones, study several of their properties, and provide examples illustrating their novelty. As a side result, we prove that any preorder that satisfies Kolchinsky's axioms yields a decomposition that meets the axioms originally introduced by Williams and Beer when they first proposed PID.
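Background note: the most prominent preorder in this line of work is degradation (Blackwell's order), stated here in its usual form since the abstract builds on it without restating it:

```latex
% Channel \kappa_1 : T \to X_1 is a degradation of \kappa_2 : T \to X_2 if it
% arises from \kappa_2 by post-processing with some channel \lambda:
\kappa_1 \preceq \kappa_2 \;\iff\; \exists\, \lambda : \;\; \kappa_1 = \lambda \circ \kappa_2
% Blackwell's theorem: equivalently, every decision problem about T is solved
% at least as well from X_2 as from X_1; in particular I(T; X_1) \le I(T; X_2).
```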
Affiliations:
- André F C Gomes: Instituto de Telecomunicações and LUMLIS (Lisbon ELLIS Unit), Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisboa, Portugal
- Mário A T Figueiredo: Instituto de Telecomunicações and LUMLIS (Lisbon ELLIS Unit), Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisboa, Portugal
22. Celotto M, Bím J, Tlaie A, De Feo V, Lemke S, Chicharro D, Nili H, Bieler M, Hanganu-Opatz IL, Donner TH, Brovelli A, Panzeri S. An information-theoretic quantification of the content of communication between brain regions. bioRxiv [Preprint] 2023:2023.06.14.544903. PMID: 37398375; PMCID: PMC10312682; DOI: 10.1101/2023.06.14.544903.
Abstract
Quantifying the amount, content, and direction of communication between brain regions is key to understanding brain function. Traditional methods to analyze brain activity based on the Wiener-Granger causality principle quantify the overall information propagated by neural activity between simultaneously recorded brain regions, but do not reveal the information flow about specific features of interest (such as sensory stimuli). Here, we develop a new information-theoretic measure termed Feature-specific Information Transfer (FIT), quantifying how much information about a specific feature flows between two regions. FIT merges the Wiener-Granger causality principle with information-content specificity. We first derive FIT and prove its key properties analytically. We then illustrate and test them with simulations of neural activity, demonstrating that FIT identifies, within the total information flowing between regions, the information that is transmitted about specific features. We then analyze three neural datasets obtained with different recording methods (magneto- and electro-encephalography, and spiking activity) to demonstrate the ability of FIT to uncover the content and direction of information flow between brain regions beyond what can be discerned with traditional analytical methods. FIT can improve our understanding of how brain regions communicate by uncovering previously hidden feature-specific information flow.
Affiliations:
- Marco Celotto: Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany; Neural Computation Laboratory, Istituto Italiano di Tecnologia, Rovereto (TN), Italy; Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Jan Bím: Datamole, s.r.o., Vitezne namesti 577/2 Dejvice, 160 00 Praha 6, Czech Republic
- Alejandro Tlaie: Neural Computation Laboratory, Istituto Italiano di Tecnologia, Rovereto (TN), Italy
- Vito De Feo: Artificial Intelligence Team, Future Health Technology, and Brain-Computer Interfaces laboratories, School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK
- Stefan Lemke: Department of Cell Biology and Physiology, University of North Carolina, Chapel Hill, United States
- Daniel Chicharro: Department of Computer Science, City, University of London, London, UK
- Hamed Nili: Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany
- Malte Bieler: Mobile Technology Lab, School of Economics, Innovation and Technology, University College Kristiania, Oslo, Norway
- Ileana L. Hanganu-Opatz: Institute of Developmental Neurophysiology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Tobias H. Donner: Section Computational Cognitive Neuroscience, Department of Neurophysiology and Pathophysiology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Andrea Brovelli: Institut de Neurosciences de la Timone, UMR 7289, Aix Marseille Université, CNRS, Marseille, France
- Stefano Panzeri: Department of Excellence for Neural Information Processing, Center for Molecular Neurobiology (ZMNH), University Medical Center Hamburg-Eppendorf (UKE), Hamburg, Germany; Neural Computation Laboratory, Istituto Italiano di Tecnologia, Rovereto (TN), Italy
23. Dutta S, Hamman F. A Review of Partial Information Decomposition in Algorithmic Fairness and Explainability. Entropy (Basel) 2023; 25:795. PMID: 37238550; DOI: 10.3390/e25050795.
Abstract
Partial Information Decomposition (PID) is a body of work within information theory that allows one to quantify the information that several random variables provide about another random variable, either individually (unique information), redundantly (shared information), or only jointly (synergistic information). This review article surveys recent and emerging applications of partial information decomposition in algorithmic fairness and explainability, which are of immense importance given the growing use of machine learning in high-stakes applications. For instance, PID, in conjunction with causality, has enabled the disentanglement of the non-exempt disparity, i.e., the part of the overall disparity that is not due to critical job necessities. Similarly, in federated learning, PID has enabled the quantification of tradeoffs between local and global disparities. We introduce a taxonomy that highlights the role of PID in algorithmic fairness and explainability along three main avenues: (i) quantifying the legally non-exempt disparity for auditing or training; (ii) explaining contributions of various features or data points; and (iii) formalizing tradeoffs among different disparities in federated learning. Lastly, we review techniques for the estimation of PID measures and discuss some challenges and future directions.
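As a minimal executable companion to the definitions above, the sketch below implements one early redundancy measure, Williams and Beer's I_min, and recovers the unique and synergistic atoms from the usual consistency equations. It illustrates the PID quantities in general, not the authors' fairness estimators, and the function names are assumptions.

```python
import numpy as np

def mi(pxt):
    # Mutual information (bits) from a 2-D joint table p(x, t).
    px, pt = pxt.sum(axis=1), pxt.sum(axis=0)
    m = pxt > 0
    return float(np.sum(pxt[m] * np.log2(pxt[m] / np.outer(px, pt)[m])))

def pid_imin(p):
    # Williams-Beer PID for two sources X1, X2 and target T, from p[x1, x2, t].
    p = p / p.sum()
    pt = p.sum(axis=(0, 1))
    p1, p2 = p.sum(axis=1), p.sum(axis=0)          # p(x1, t) and p(x2, t)

    def spec(pxt):
        # Specific information I(T = t; X) for each value t.
        px = pxt.sum(axis=1)
        s = np.zeros(len(pt))
        for t in range(len(pt)):
            for x in range(pxt.shape[0]):
                if pxt[x, t] > 0:
                    s[t] += pxt[x, t] / pt[t] * np.log2(pxt[x, t] / (px[x] * pt[t]))
        return s

    red = float(np.sum(pt * np.minimum(spec(p1), spec(p2))))
    i1, i2 = mi(p1), mi(p2)
    ij = mi(p.reshape(-1, p.shape[2]))             # I(T; X1, X2)
    return {"redundant": red, "unique_1": i1 - red,
            "unique_2": i2 - red, "synergistic": ij - i1 - i2 + red}

# XOR: T = X1 ^ X2. Each source alone says nothing; the pair gives 1 bit.
xor = np.zeros((2, 2, 2))
for a in range(2):
    for b in range(2):
        xor[a, b, a ^ b] = 0.25
print(pid_imin(xor))   # redundant ~0, uniques ~0, synergistic ~1
```

The XOR distribution is the canonical case of purely synergistic information, the kind of joint-only contribution that fairness audits must treat differently from unique or shared disparity.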
Collapse
Affiliation(s)
- Sanghamitra Dutta
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742, USA
| | - Faisal Hamman
- Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742, USA
| |
Collapse
|
24
|
van Enk SJ. Pooling probability distributions and partial information decomposition. Phys Rev E 2023; 107:054133. [PMID: 37329048 DOI: 10.1103/physreve.107.054133] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2023] [Accepted: 05/09/2023] [Indexed: 06/18/2023]
Abstract
Notwithstanding various attempts to construct a partial information decomposition (PID) for multiple variables by defining synergistic, redundant, and unique information, there is no consensus on how one ought to precisely define any of these quantities. One aim here is to illustrate how that ambiguity (or, more positively, freedom of choice) may arise. Using the basic idea that information equals the average reduction in uncertainty when going from an initial to a final probability distribution, synergistic information will likewise be defined as a difference between two entropies. One term is uncontroversial and characterizes "the whole" information that source variables carry jointly about a target variable T. The other term is then meant to characterize the information carried by the "sum of its parts." Here we interpret that concept as requiring a suitable probability distribution aggregated ("pooled") from multiple marginal distributions (the parts). Ambiguity arises in the definition of the optimum way to pool two (or more) probability distributions. Independent of the exact definition of optimum pooling, the concept of pooling leads to a lattice that differs from the often-used redundancy-based lattice. One can associate not just a number (an average entropy) with each node of the lattice, but also (pooled) probability distributions. As an example, one simple and reasonable approach to pooling is presented, which naturally gives rise to the overlap between different probability distributions as a crucial quantity characterizing both synergistic and unique information.
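The construction lends itself to a short sketch. Below, the "whole" is the usual joint mutual information, and the "sum of its parts" is scored by pooling the two single-source posteriors with a naive-Bayes (normalized product over prior) rule; by the paper's own argument this pooling rule is one choice among many, so the snippet illustrates the framework rather than the paper's specific optimum.

```python
import numpy as np

def pooled_synergy(p):
    # p[x1, x2, t]: joint distribution of two source variables and a target T.
    p = p / p.sum()
    pt = p.sum(axis=(0, 1))                                # prior p(t)
    px1x2 = p.sum(axis=2)                                  # p(x1, x2)
    post1 = p.sum(axis=1) / p.sum(axis=(1, 2))[:, None]    # p(t | x1)
    post2 = p.sum(axis=0) / p.sum(axis=(0, 2))[:, None]    # p(t | x2)

    def kl(q, r):
        m = q > 0
        return np.sum(q[m] * np.log2(q[m] / r[m]))

    # "Whole": average reduction in uncertainty from prior to joint posterior.
    whole = sum(px1x2[i, j] * kl(p[i, j] / px1x2[i, j], pt)
                for i in range(p.shape[0]) for j in range(p.shape[1])
                if px1x2[i, j] > 0)                        # = I(T; X1, X2)

    # "Parts": same reduction, but toward a pooled posterior (one simple rule).
    parts = 0.0
    for i in range(p.shape[0]):
        for j in range(p.shape[1]):
            if px1x2[i, j] == 0:
                continue
            q = post1[i] * post2[j] / np.where(pt > 0, pt, 1.0)
            q /= q.sum()
            parts += px1x2[i, j] * kl(q, pt)
    return whole, parts, whole - parts                     # synergy = whole - parts

# XOR target: each marginal posterior (and hence any pool of them) is flat.
p = np.zeros((2, 2, 2))
for i in range(2):
    for j in range(2):
        p[i, j, i ^ j] = 0.25
print(pooled_synergy(p))   # whole = 1.0, parts = 0.0, synergy = 1.0
```

Swapping in a different pooling rule changes only the construction of `q`, which is exactly where the paper locates the freedom of choice.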
Collapse
Affiliation(s)
- S J van Enk
- Department of Physics, University of Oregon, Eugene, Oregon 97403, USA
| |
Collapse
|
25
|
Varley TF, Pope M, Faskowitz J, Sporns O. Multivariate information theory uncovers synergistic subsystems of the human cerebral cortex. Commun Biol 2023; 6:451. [PMID: 37095282 PMCID: PMC10125999 DOI: 10.1038/s42003-023-04843-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Accepted: 04/14/2023] [Indexed: 04/26/2023] Open
Abstract
One of the most well-established tools for modeling the brain is the functional connectivity network, which is constructed from pairs of interacting brain regions. While powerful, the network model is limited by the restriction that only pairwise dependencies are considered, so potentially higher-order structures are missed. Here, we explore how multivariate information theory reveals higher-order dependencies in the human brain. We begin with a mathematical analysis of the O-information, showing analytically and numerically how it is related to previously established information-theoretic measures of complexity. We then apply the O-information to brain data, showing that synergistic subsystems are widespread in the human brain. Highly synergistic subsystems typically sit between canonical functional networks and may serve an integrative role. We then use simulated annealing to find maximally synergistic subsystems, finding that such systems typically comprise ≈10 brain regions recruited from multiple canonical brain systems. Though ubiquitous, highly synergistic subsystems are invisible when considering pairwise functional connectivity, suggesting that higher-order dependencies form a kind of shadow structure that has gone unrecognized by established network-based analyses. We assert that higher-order interactions in the brain represent an under-explored space that, made accessible with the tools of multivariate information theory, may offer novel scientific insights.
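For readers who want the O-information in executable form, a plug-in sketch on discrete samples follows. The formula is the standard one; the toy systems (an XOR triplet and a triplicated bit) are assumptions chosen to make the sign convention visible, not the paper's neuroimaging pipeline.

```python
import numpy as np
from collections import Counter

def entropy(rows):
    # Plug-in joint entropy (bits); `rows` is an (n_samples, k) array.
    counts = np.array(list(Counter(map(tuple, rows)).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def o_information(data):
    # Omega = (n - 2) H(X) + sum_i [ H(X_i) - H(X_{-i}) ].
    # Omega < 0: synergy-dominated; Omega > 0: redundancy-dominated.
    n = data.shape[1]
    omega = (n - 2) * entropy(data)
    for i in range(n):
        omega += entropy(data[:, [i]]) - entropy(np.delete(data, i, axis=1))
    return omega

rng = np.random.default_rng(0)
a = rng.integers(0, 2, (20_000, 2))
xor = np.column_stack([a, a[:, 0] ^ a[:, 1]])                   # synergistic triplet
copies = np.repeat(rng.integers(0, 2, (20_000, 1)), 3, axis=1)  # redundant triplet
print(o_information(xor))     # ~ -1 bit
print(o_information(copies))  # ~ +1 bit
```

The same estimator, applied to sets of parcellated brain signals, is what makes a search for maximally synergistic subsystems (e.g., by simulated annealing over subsets) possible.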
Collapse
Affiliation(s)
- Thomas F Varley
- School of Informatics, Computing & Engineering, Indiana University, Bloomington, IN, 47405, USA.
- Department of Psychological & Brain Sciences, Indiana University, Bloomington, IN, 47405, USA.
| | - Maria Pope
- School of Informatics, Computing & Engineering, Indiana University, Bloomington, IN, 47405, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, 47405, USA
| | - Joshua Faskowitz
- Department of Psychological & Brain Sciences, Indiana University, Bloomington, IN, 47405, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, 47405, USA
| | - Olaf Sporns
- School of Informatics, Computing & Engineering, Indiana University, Bloomington, IN, 47405, USA
- Department of Psychological & Brain Sciences, Indiana University, Bloomington, IN, 47405, USA
- Program in Neuroscience, Indiana University, Bloomington, IN, 47405, USA
| |
Collapse
|
26
|
Mares I, Mares C, Dobrica V, Demetrescu C. Selection of Optimal Palmer Predictors for Increasing the Predictability of the Danube Discharge: New Findings Based on Information Theory and Partial Wavelet Coherence Analysis. ENTROPY (BASEL, SWITZERLAND) 2022; 24:1375. [PMID: 37420396 DOI: 10.3390/e24101375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/30/2022] [Revised: 09/22/2022] [Accepted: 09/23/2022] [Indexed: 07/09/2023]
Abstract
The purpose of this study was to obtain synergistic information and details in the time-frequency domain of the relationships between the Palmer drought indices in the upper and middle Danube River basin and the discharge (Q) in the lower basin. Four indices were considered: the Palmer drought severity index (PDSI), the Palmer hydrological drought index (PHDI), the weighted PDSI (WPLM) and the Palmer Z-index (ZIND). These indices were quantified through the first principal component (PC1) of an empirical orthogonal function (EOF) decomposition, obtained from hydro-meteorological parameters at 15 stations located along the Danube River basin. The influences of these indices on the Danube discharge were tested, both simultaneously and with certain lags, via linear and nonlinear methods drawing on elements of information theory. Linear connections were generally obtained for synchronous links within the same season, and nonlinear ones for predictors considered with certain lags (in advance) relative to the discharge predictand. The redundancy-synergy index was also considered, in order to eliminate redundant predictors. Only a few cases were found in which all four predictors could be considered together to establish a significant informational basis for the discharge evolution. For the fall season, nonstationarity was examined through wavelet analysis applied to the multivariate case, using partial wavelet coherence (pwc). The results differed depending on which predictor was kept in the pwc and which were excluded.
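The redundancy-synergy index mentioned above admits a compact sketch: on discretized series it can be written as I(Q; P1, P2) − I(Q; P1) − I(Q; P2), negative when two predictors largely duplicate each other. The sign convention, synthetic series, and tercile binning below are assumptions for illustration; the study's actual predictors are the PC1 series of the four Palmer indices.

```python
import numpy as np
from collections import Counter

def entropy(rows):
    # Plug-in joint entropy (bits); `rows` is an (n_samples, k) array.
    counts = np.array(list(Counter(map(tuple, rows)).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mi(a, b):
    # I(A; B) between column arrays of discrete symbols.
    return entropy(a) + entropy(b) - entropy(np.column_stack([a, b]))

def rsi(x1, x2, t):
    # Redundancy-synergy index: I(T; X1, X2) - I(T; X1) - I(T; X2).
    # Negative -> redundant predictors; positive -> synergistic predictors.
    x12 = np.column_stack([x1, x2])
    t = t[:, None]
    return mi(x12, t) - mi(x1[:, None], t) - mi(x2[:, None], t)

# Tercile-bin two strongly overlapping "drought indices" and a noisy "discharge".
rng = np.random.default_rng(1)
pdsi = rng.normal(size=5_000)
phdi = pdsi + 0.1 * rng.normal(size=5_000)   # nearly a duplicate of pdsi
q = pdsi + rng.normal(size=5_000)
terciles = lambda v: np.digitize(v, np.quantile(v, [1 / 3, 2 / 3]))
print(rsi(terciles(pdsi), terciles(phdi), terciles(q)))   # < 0: mostly redundant
```

Plug-in estimates of this kind carry a small bias at finite sample sizes, so in practice the sign of a near-zero index should be read cautiously.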
Collapse
Affiliation(s)
- Ileana Mares
- Institute of Geodynamics of the Romanian Academy, R-020032 Bucharest, Romania
| | - Constantin Mares
- Institute of Geodynamics of the Romanian Academy, R-020032 Bucharest, Romania
| | - Venera Dobrica
- Institute of Geodynamics of the Romanian Academy, R-020032 Bucharest, Romania
| | - Crisan Demetrescu
- Institute of Geodynamics of the Romanian Academy, R-020032 Bucharest, Romania
| |
Collapse
|
27
|
Kay JW, Schulz JM, Phillips WA. A Comparison of Partial Information Decompositions Using Data from Real and Simulated Layer 5b Pyramidal Cells. ENTROPY 2022; 24:e24081021. [PMID: 35893001 PMCID: PMC9394329 DOI: 10.3390/e24081021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/13/2022] [Revised: 07/18/2022] [Accepted: 07/20/2022] [Indexed: 02/04/2023]
Abstract
Partial information decomposition allows the joint mutual information between an output and a set of inputs to be divided into components that are synergistic, shared, or unique to each input. We consider five different decompositions and compare their results using data from layer 5b pyramidal cells in two different studies. The first study was on the amplification of somatic action potential output by apical dendritic input and its regulation by dendritic inhibition. We find that two of the decompositions produce much larger estimates of synergy and shared information than the others, as well as large levels of unique misinformation. When within-neuron differences in the components are examined, the five methods produce more similar results for all but the shared information component, for which two methods reach a different statistical conclusion from the others. The methods also differ somewhat in the expression of unique information asymmetry, which is significantly larger, on average, under dendritic inhibition. Three of the methods support a previous conclusion that apical amplification is reduced by dendritic inhibition. The second study used a detailed compartmental model to produce action potentials for many combinations of the numbers of basal and apical synaptic inputs. Decompositions of the entire data set produce differences similar to those in the first study. Two analyses of decompositions are conducted on subsets of the data. In the first, the decompositions reveal a bifurcation in unique information asymmetry: for three of the methods, this suggests that apical drive switches to basal drive as the strength of the basal input increases, while the other two show changing mixtures of information and misinformation. Decompositions produced using the second set of subsets show that all five decompositions provide support, to varying extents, for properties of cooperative context-sensitivity.
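The comparison strategy, applying several candidate redundancy measures to the same data and contrasting the resulting atoms, can be sketched generically: once any bivariate redundancy function is fixed, the remaining atoms follow from the consistency equations. The two measures below (minimum mutual information, and I_min) are simple stand-ins chosen for brevity, not the five decompositions actually compared in the paper.

```python
import numpy as np

def mi(pxt):
    # Mutual information (bits) from a 2-D joint table p(x, t).
    px, pt = pxt.sum(axis=1), pxt.sum(axis=0)
    m = pxt > 0
    return float(np.sum(pxt[m] * np.log2(pxt[m] / np.outer(px, pt)[m])))

def decompose(p, redundancy):
    # Turn any bivariate redundancy function into a full two-source PID.
    p = p / p.sum()
    i1, i2 = mi(p.sum(axis=1)), mi(p.sum(axis=0))
    ij = mi(p.reshape(-1, p.shape[2]))
    red = redundancy(p)
    return dict(red=red, unq1=i1 - red, unq2=i2 - red,
                syn=ij - i1 - i2 + red)

def mmi_red(p):
    # Minimum-mutual-information redundancy: min_i I(T; X_i).
    return min(mi(p.sum(axis=1)), mi(p.sum(axis=0)))

def imin_red(p):
    # Williams-Beer I_min: expected minimum specific information over t.
    pt = p.sum(axis=(0, 1))
    def spec(pxt):
        px = pxt.sum(axis=1)
        s = np.zeros(len(pt))
        for t in range(len(pt)):
            for x in range(pxt.shape[0]):
                if pxt[x, t] > 0:
                    s[t] += pxt[x, t] / pt[t] * np.log2(pxt[x, t] / (px[x] * pt[t]))
        return s
    return float(np.sum(pt * np.minimum(spec(p.sum(axis=1)), spec(p.sum(axis=0)))))

rng = np.random.default_rng(3)
p = rng.random((2, 2, 2))            # a generic random joint distribution
print(decompose(p, mmi_red))
print(decompose(p, imin_red))        # the atoms typically disagree
```

On generic distributions the two redundancy functions assign different atoms, which is the phenomenon the paper quantifies on real and simulated pyramidal-cell data.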
Collapse
Affiliation(s)
- Jim W. Kay
- School of Mathematics and Statistics, University of Glasgow, Glasgow G12 8QQ, UK
| | - Jan M. Schulz
- Department of Biomedicine, University of Basel, 4001 Basel, Switzerland
| | | |
Collapse
|
28
|
Newman EL, Varley TF, Parakkattu VK, Sherrill SP, Beggs JM. Revealing the Dynamics of Neural Information Processing with Multivariate Information Decomposition. ENTROPY (BASEL, SWITZERLAND) 2022; 24:930. [PMID: 35885153 PMCID: PMC9319160 DOI: 10.3390/e24070930] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/30/2022] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 11/16/2022]
Abstract
The varied cognitive abilities and rich adaptive behaviors enabled by the animal nervous system are often described in terms of information processing. This framing raises the issue of how biological neural circuits actually process information, and some of the most fundamental outstanding questions in neuroscience center on understanding the mechanisms of neural information processing. Classical information theory has long been recognized as a natural framework within which information processing can be described, and recent advances in the field of multivariate information theory offer new insights into the structure of computation in complex systems. In this review, we provide an introduction to the conceptual and practical issues associated with using multivariate information theory to analyze information processing in neural circuits, and discuss recent empirical work in this vein. Specifically, we provide an accessible introduction to the partial information decomposition (PID) framework. PID reveals redundant, unique, and synergistic modes by which neurons integrate information from multiple sources. We focus particularly on the synergistic mode, which quantifies the "higher-order" information carried in the patterns of multiple inputs that is not reducible to input from any single source. Recent work in a variety of model systems has revealed that synergistic dynamics are ubiquitous in neural circuitry and show reliable structure-function relationships, emerging disproportionately in neuronal rich clubs, downstream of recurrent connectivity, and in the convergence of correlated activity. We draw on the existing literature on higher-order information dynamics in neuronal networks to illustrate the insights that have been gained by taking an information decomposition perspective on neural activity. Finally, we briefly discuss promising future directions for information decomposition approaches to neuroscience, such as work on behaving animals, multi-target generalizations of PID, and time-resolved local analyses.
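One of the future directions named here, time-resolved local analyses, rests on pointwise (local) information terms, which can be negative ("misinformation") on individual samples. A minimal sketch, with assumed variable names and toy data:

```python
import numpy as np

def local_mi(x, y):
    # Pointwise mutual information i(x_t; y_t) = log2 p(x,y) / (p(x) p(y)),
    # evaluated at every sample; negative values mark misinformative moments.
    xv, xi = np.unique(x, return_inverse=True)
    yv, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xv), len(yv)))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px, py = joint.sum(axis=1), joint.sum(axis=0)
    return np.log2(joint[xi, yi] / (px[xi] * py[yi]))

rng = np.random.default_rng(2)
stim = rng.integers(0, 2, 10_000)
resp = np.where(rng.random(10_000) < 0.8, stim, 1 - stim)  # 80%-faithful encoder
lmi = local_mi(stim, resp)
print(lmi.mean())         # averages back to I(stim; resp), ~0.28 bits
print((lmi < 0).mean())   # ~20% of samples are locally misinformative
```

Local terms of this sort are what time-resolved PID variants build on, assigning redundancy, uniqueness, and synergy moment by moment rather than on average.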
Collapse
Affiliation(s)
- Ehren L. Newman
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
| | - Thomas F. Varley
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
| | - Vibin K. Parakkattu
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
| | | | - John M. Beggs
- Department of Physics, Indiana University, Bloomington, IN 47405, USA
| |
Collapse
|