1. Marzen SE, Riechers PM, Crutchfield JP. Complexity-calibrated benchmarks for machine learning reveal when prediction algorithms succeed and mislead. Sci Rep 2024;14:8727. PMID: 38622279; PMCID: PMC11018857; DOI: 10.1038/s41598-024-58814-0.
Abstract
Recurrent neural networks are used to forecast time series in finance, climate, language, and many other domains. Reservoir computers are a particularly easily trainable form of recurrent neural network. Recently, a "next-generation" reservoir computer was introduced in which the memory trace involves only a finite number of previous symbols. We explore the inherent limitations of finite-past memory traces in this intriguing proposal. A lower bound from Fano's inequality shows that, on highly non-Markovian processes generated by large probabilistic state machines, next-generation reservoir computers with reasonably long memory traces have an error probability at least ~60% higher than the minimal attainable error probability in predicting the next observation. More generally, it appears that popular recurrent neural networks fall far short of optimally predicting such complex processes. These results highlight the need for a new generation of optimized recurrent neural network architectures. Alongside this finding, we present concentration-of-measure results for randomly generated but complex processes. One conclusion is that large probabilistic state machines, specifically large ε-machines, are key to generating challenging and structurally unbiased stimuli for ground-truthing recurrent neural network architectures.
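
For reference, the bound invoked above has the standard Fano form (a generic textbook statement, not the paper's specific derivation). For a predictor of the next symbol over a finite alphabet that sees only a length-k memory trace, the error probability P_e satisfies

```latex
% Fano's inequality for next-symbol prediction from a finite memory trace:
H_b(P_e) + P_e \log_2\!\left(|\mathcal{A}| - 1\right)
  \;\ge\; H\!\left[X_{t+1} \,\middle|\, X_{t-k+1:t}\right]
```

where H_b is the binary entropy function. Because the conditional entropy on the right can be computed exactly for a known ε-machine, numerically inverting the inequality yields a lower bound on the error probability of any predictor restricted to that memory trace.
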
Affiliation(s)
- Sarah E Marzen
- W. M. Keck Science Department of Pitzer, Scripps, and Claremont McKenna College, Claremont, CA, 91711, USA.
- Paul M Riechers
- Beyond Institute for Theoretical Science, San Francisco, CA, USA
- James P Crutchfield
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, CA, 95616, USA

2. Soriano J, Marzen S. How Well Can We Infer Selection Benefits and Mutation Rates from Allele Frequencies? Entropy (Basel) 2023;25:e25040615. PMID: 37190403; PMCID: PMC10137336; DOI: 10.3390/e25040615.
Abstract
Experimentalists observe allele frequency distributions and try to infer mutation rates and selection coefficients. How easy is this? We calculate limits to their ability in the context of the Wright-Fisher model, first by finding the maximal amount of information that allele frequencies can provide about the mutation rate and selection coefficient (at least 2 bits per allele) and then by finding how organisms would have shaped their mutation rates and selection coefficients so as to maximize the information transfer.
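
For context, the sketch below simulates the haploid two-allele Wright-Fisher model the abstract refers to, with selection, symmetric mutation, and binomial drift; all parameter values are illustrative placeholders, not those analyzed in the paper.

```python
import numpy as np

def wright_fisher_trajectory(N=1000, s=0.01, u=1e-3, p0=0.5, generations=500, seed=0):
    """Allele-frequency trajectory under the haploid two-allele Wright-Fisher model
    with selection coefficient s and symmetric mutation rate u."""
    rng = np.random.default_rng(seed)
    p, freqs = p0, [p0]
    for _ in range(generations):
        # Selection: allele A has relative fitness (1 + s).
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Symmetric mutation between the two alleles.
        p_mut = p_sel * (1 - u) + (1 - p_sel) * u
        # Drift: binomial resampling of N individuals.
        p = rng.binomial(N, p_mut) / N
        freqs.append(p)
    return np.array(freqs)

# An experimentalist observes trajectories like this and must infer (s, u) from them.
print(wright_fisher_trajectory()[:10])
```
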
Affiliation(s)
- Jonathan Soriano
- W. M. Keck Science Department, Pitzer, Scripps, and Claremont McKenna College, Claremont, CA 91711, USA
- Sarah Marzen
- W. M. Keck Science Department, Pitzer, Scripps, and Claremont McKenna College, Claremont, CA 91711, USA

3. Loomis SP, Crutchfield JP. Exploring predictive states via Cantor embeddings and Wasserstein distance. Chaos 2022;32:123115. PMID: 36587324; DOI: 10.1063/5.0102603.
Abstract
Predictive states for stochastic processes are a nonparametric and interpretable construct with relevance across a multitude of modeling paradigms. Recent progress on the self-supervised reconstruction of predictive states from time-series data has focused on the use of reproducing kernel Hilbert spaces. Here, we examine how Wasserstein distances may be used to detect predictive equivalences in symbolic data. We compute Wasserstein distances between distributions over sequences ("predictions") using a finite-dimensional embedding of sequences based on the Cantor set for the underlying geometry. We show that exploratory data analysis using the resulting geometry via hierarchical clustering and dimension reduction provides insight into the temporal structure of processes ranging from the relatively simple (e.g., generated by finite-state hidden Markov models) to the very complex (e.g., generated by infinite-state indexed grammars).
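
A rough sketch of the pipeline described above, on hypothetical data: binary future sequences are embedded as points of the Cantor set, empirical prediction distributions are compared with a one-dimensional Wasserstein distance, and histories are grouped by hierarchical clustering. The embedding and clustering choices here are illustrative simplifications, not the paper's exact construction.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from scipy.cluster.hierarchy import linkage, fcluster

def cantor_embed(seq):
    """Map a binary sequence (tuple of 0/1) to a point of the middle-thirds Cantor set."""
    return sum(2 * s / 3 ** (i + 1) for i, s in enumerate(seq))

def prediction_distance(futures_a, futures_b):
    """Wasserstein distance between two empirical 'prediction' distributions,
    each given as a list of future symbol sequences."""
    xa = [cantor_embed(f) for f in futures_a]
    xb = [cantor_embed(f) for f in futures_b]
    return wasserstein_distance(xa, xb)

# Hypothetical example: futures observed after three different histories.
rng = np.random.default_rng(1)
futures = {
    "h0": [tuple(rng.integers(0, 2, 6)) for _ in range(200)],             # i.i.d. fair coin
    "h1": [tuple(rng.integers(0, 2, 6)) for _ in range(200)],             # same law as h0
    "h2": [tuple((rng.random(6) < 0.9).astype(int)) for _ in range(200)], # biased toward 1s
}
keys = list(futures)
# Condensed pairwise distance matrix, then hierarchical clustering into predictive groups.
dists = [prediction_distance(futures[a], futures[b])
         for i, a in enumerate(keys) for b in keys[i + 1:]]
labels = fcluster(linkage(dists, method="average"), t=2, criterion="maxclust")
print(dict(zip(keys, labels)))   # h0 and h1 should share a cluster; h2 should not
```
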
Affiliation(s)
- Samuel P Loomis
- Complexity Sciences Center and Department of Physics and Astronomy, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
- James P Crutchfield
- Complexity Sciences Center and Department of Physics and Astronomy, University of California at Davis, One Shields Avenue, Davis, California 95616, USA

4. Brodu N, Crutchfield JP. Discovering causal structure with reproducing-kernel Hilbert space ε-machines. Chaos 2022;32:023103. PMID: 35232043; DOI: 10.1063/5.0062829.
Abstract
We merge computational mechanics' definition of causal states (predictively equivalent histories) with reproducing-kernel Hilbert space (RKHS) representation inference. The result is a widely applicable method that infers causal structure directly from observations of a system's behaviors whether they are over discrete or continuous events or time. A structural representation, a finite- or infinite-state kernel ε-machine, is extracted by a reduced-dimension transform that gives an efficient representation of causal states and their topology. In this way, the system dynamics are represented by a stochastic (ordinary or partial) differential equation that acts on causal states. We introduce an algorithm to estimate the associated evolution operator. Paralleling the Fokker-Planck equation, it efficiently evolves causal-state distributions and makes predictions in the original data space via an RKHS functional mapping. We demonstrate these techniques, together with their predictive abilities, on discrete-time, discrete-value infinite Markov-order processes generated by finite-state hidden Markov models with (i) finite or (ii) uncountably infinite causal states and (iii) continuous-time, continuous-value processes generated by thermally driven chaotic flows. The method robustly estimates causal structure in the presence of varying external and measurement noise levels and for very high-dimensional data.
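
As a loose illustration of comparing conditional future distributions in an RKHS (not the authors' algorithm), the sketch below measures the maximum mean discrepancy between future samples observed after different histories under a Gaussian kernel and groups histories whose discrepancy is small; the data, kernel width, and threshold are hypothetical.

```python
import numpy as np

def mmd2(x, y, sigma=0.2):
    """Squared maximum mean discrepancy between two 1-D samples under a Gaussian kernel."""
    k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Hypothetical continuous-valued futures observed after three histories.
rng = np.random.default_rng(0)
futures = {
    "h0": rng.normal(0.0, 1.0, 300),
    "h1": rng.normal(0.0, 1.0, 300),   # same conditional law as h0
    "h2": rng.normal(2.0, 0.5, 300),   # a distinct predictive state
}
# Greedy grouping: a history joins a group if its MMD to that group's representative
# is small (a crude stand-in for causal-state clustering in the RKHS).
threshold, groups = 0.05, []
for name, fut in futures.items():
    for g in groups:
        if mmd2(fut, futures[g[0]]) < threshold:
            g.append(name)
            break
    else:
        groups.append([name])
print(groups)   # expected: [['h0', 'h1'], ['h2']]
```
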
Affiliation(s)
- Nicolas Brodu
- Geostat Team-Geometry and Statistics in Acquisition Data, INRIA Bordeaux Sud Ouest, 200 rue de la Vieille Tour, 33405 Talence Cedex, France
- James P Crutchfield
- Complexity Sciences Center and Department of Physics and Astronomy, University of California at Davis, One Shields Avenue, Davis, California 95616, USA

5. Jurgens AM, Crutchfield JP. Divergent predictive states: The statistical complexity dimension of stationary, ergodic hidden Markov processes. Chaos 2021;31:083114. PMID: 34470245; DOI: 10.1063/5.0050460.
Abstract
Even simply defined finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains, the consequences are dramatic. Their predictive models are generically infinite-state. Until recently, one could determine neither their intrinsic randomness nor their structural complexity. The prequel to this work introduced methods to accurately calculate the Shannon entropy rate (randomness) and to constructively determine their minimal (though infinite) set of predictive features. Leveraging this, we address the complementary challenge of determining how structured hidden Markov processes are by calculating their statistical complexity dimension: the information dimension of the minimal set of predictive features. This tracks the divergence rate of the minimal memory resources required to optimally predict a broad class of truly complex processes.
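
To make the information dimension above concrete: it can be estimated by coarse-graining a set of predictive features at resolution ε and fitting the growth of the bin-occupancy entropy H_ε against log2(1/ε). The sketch below does this for a uniform measure on the middle-thirds Cantor set, purely as an illustration of the quantity, not of the paper's processes.

```python
import numpy as np

def information_dimension(points, eps_list):
    """Estimate d1 = lim H_eps / log2(1/eps) from the slope of the
    bin-occupancy entropy versus log2(1/eps)."""
    hs = []
    for eps in eps_list:
        bins = np.floor(points / eps).astype(int)
        _, counts = np.unique(bins, return_counts=True)
        p = counts / counts.sum()
        hs.append(-(p * np.log2(p)).sum())
    return np.polyfit(np.log2(1.0 / np.array(eps_list)), hs, 1)[0]

# Hypothetical feature set: points of the middle-thirds Cantor set with uniform weights.
rng = np.random.default_rng(0)
digits = rng.integers(0, 2, size=(100_000, 20)) * 2          # ternary digits 0 or 2
cantor_points = (digits / 3.0 ** np.arange(1, 21)).sum(axis=1)
print(information_dimension(cantor_points, [3.0 ** -k for k in range(2, 8)]))
# Expected near log(2)/log(3) ≈ 0.63, the Cantor measure's information dimension.
```
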
Affiliation(s)
- Alexandra M Jurgens
- Complexity Sciences Center, Physics and Astronomy Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
- James P Crutchfield
- Complexity Sciences Center, Physics and Astronomy Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA

6. Elliott TJ, Yang C, Binder FC, Garner AJP, Thompson J, Gu M. Extreme Dimensionality Reduction with Quantum Modeling. Phys Rev Lett 2020;125:260501. PMID: 33449713; DOI: 10.1103/physrevlett.125.260501.
Abstract
Effective and efficient forecasting relies on identifying the relevant information contained in past observations (the predictive features) and isolating it from the rest. When the future of a process bears a strong dependence on its behavior far into the past, there are many such features to store, necessitating complex models with extensive memories. Here, we highlight a family of stochastic processes whose minimal classical models must devote unboundedly many bits to tracking the past. For this family, we identify quantum models of equal accuracy that can store all relevant information within a single two-dimensional quantum system (qubit). This represents the ultimate limit of quantum compression and highlights an immense practical advantage of quantum technologies for the forecasting and simulation of complex systems.
Affiliation(s)
- Thomas J Elliott
- Department of Mathematics, Imperial College London, London SW7 2AZ, United Kingdom
- Complexity Institute, Nanyang Technological University, Singapore 637335, Singapore
- Nanyang Quantum Hub, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Chengran Yang
- Complexity Institute, Nanyang Technological University, Singapore 637335, Singapore
- Nanyang Quantum Hub, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Felix C Binder
- Institute for Quantum Optics and Quantum Information, Austrian Academy of Sciences, Boltzmanngasse 3, Vienna 1090, Austria
- Andrew J P Garner
- Institute for Quantum Optics and Quantum Information, Austrian Academy of Sciences, Boltzmanngasse 3, Vienna 1090, Austria
- Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore
- Jayne Thompson
- Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore
- Mile Gu
- Complexity Institute, Nanyang Technological University, Singapore 637335, Singapore
- Nanyang Quantum Hub, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, Singapore 117543, Singapore

7. Street S. Upper Limit on the Thermodynamic Information Content of an Action Potential. Front Comput Neurosci 2020;14:37. PMID: 32477088; PMCID: PMC7237712; DOI: 10.3389/fncom.2020.00037.
Abstract
In computational neuroscience, spiking neurons are often analyzed as computing devices that register bits of information, with each action potential carrying at most one bit of Shannon entropy. Here, I question this interpretation by using Landauer's principle to estimate an upper limit for the quantity of thermodynamic information that can be processed within a single action potential in a typical mammalian neuron. A straightforward calculation shows that an action potential in a typical mammalian cortical pyramidal cell can process up to approximately 3.4 × 10^11 bits of thermodynamic information, or about 4.9 × 10^11 bits of Shannon entropy. This result suggests that an action potential can, in principle, carry much more than a single bit of Shannon entropy.
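
The order-of-magnitude arithmetic behind such an estimate is easy to reproduce. In the sketch below, the Landauer limit at body temperature is standard physics, while the energy budget per action potential is a placeholder assumption chosen only to show the calculation, not the figure used in the paper.

```python
from math import log

k_B = 1.380649e-23      # Boltzmann constant, J/K
T = 310.0               # approximate mammalian body temperature, K
landauer_joules_per_bit = k_B * T * log(2)   # ~2.97e-21 J: minimum dissipation per erased bit

# Placeholder energy budget for one action potential (illustrative assumption only).
energy_per_spike = 1e-9  # J
max_bits = energy_per_spike / landauer_joules_per_bit
print(f"Landauer limit at 310 K: {landauer_joules_per_bit:.2e} J/bit")
print(f"Upper bound on bits per spike for the assumed energy: {max_bits:.2e}")
```
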
Affiliation(s)
- Sterling Street
- Department of Biology, University of Georgia, Athens, GA, United States

8. Rundle JB, Giguere A, Turcotte DL, Crutchfield JP, Donnellan A. Global Seismic Nowcasting With Shannon Information Entropy. Earth and Space Science 2019;6:191-197. PMID: 30854411; PMCID: PMC6392127; DOI: 10.1029/2018ea000464.
Abstract
Seismic nowcasting uses counts of small earthquakes as proxy data to estimate the current dynamical state of an earthquake fault system. The result is an earthquake potential score that characterizes the current state of progress of a defined geographic region through its nominal earthquake "cycle." The count of small earthquakes since the last large earthquake is the natural time that has elapsed since the last large earthquake (Varotsos et al., 2006, https://doi.org/10.1103/PhysRevE.74.021123). In addition to natural time, earthquake sequences can also be analyzed using Shannon information entropy ("information"), an idea that was pioneered by Shannon (1948, https://doi.org/10.1002/j.1538-7305.1948.tb01338.x). As a first step toward adding seismic information entropy to the nowcasting method, we incorporate magnitude information into the natural time counts by using event self-information. We find in this first application of seismic information entropy that the earthquake potential score values are similar to the values obtained using only natural time. However, other characteristics of earthquake sequences, including the interevent time intervals, or the departure of higher magnitude events from the magnitude-frequency scaling line, may contain additional information.
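
As a small illustration of weighting events by self-information (not the paper's exact implementation), each event can contribute -log2 p(m) under an assumed, discretized Gutenberg-Richter magnitude-frequency law instead of a raw count of one; the b-value and magnitudes below are hypothetical.

```python
import numpy as np

def self_information_weights(magnitudes, b=1.0, m_min=3.0, dm=0.1):
    """Assign each event -log2 p(m) under a truncated, discretized
    Gutenberg-Richter law p(m) ∝ 10^(-b (m - m_min)) for m >= m_min."""
    bins = np.arange(m_min, magnitudes.max() + dm, dm)
    p = 10.0 ** (-b * (bins - m_min))
    p /= p.sum()
    idx = np.clip(np.round((magnitudes - m_min) / dm).astype(int), 0, len(p) - 1)
    return -np.log2(p[idx])

# Hypothetical small-earthquake catalog: each event now contributes its self-information
# rather than a count of 1, so rarer (larger) events carry more weight.
mags = np.array([3.1, 3.4, 3.2, 4.0, 3.3, 5.1])
weights = self_information_weights(mags)
print(weights, weights.sum())
```
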
Affiliation(s)
- John B. Rundle
- Department of Physics, University of California, Davis, CA, USA
- Santa Fe Institute, Santa Fe, NM, USA
- Department of Earth and Planetary Science, University of California, Davis, CA, USA
- Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
- Tohoku University, Sendai, Japan
- Donald L. Turcotte
- Department of Earth and Planetary Science, University of California, Davis, CA, USA
- James P. Crutchfield
- Department of Physics, University of California, Davis, CA, USA
- Santa Fe Institute, Santa Fe, NM, USA
- Andrea Donnellan
- Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA

9. Spinney RE, Lizier JT. Characterizing information-theoretic storage and transfer in continuous time processes. Phys Rev E 2018;98:012314. PMID: 30110808; DOI: 10.1103/physreve.98.012314.
Abstract
The characterization of information processing is an important task in complex systems science. Information dynamics is a quantitative methodology for modeling the intrinsic information processing conducted by a process represented as a time series, but to date has only been formulated in discrete time. Building on previous work which demonstrated how to formulate transfer entropy in continuous time, we give a total account of information processing in this setting, incorporating information storage. We find that a convergent rate of predictive capacity, comprising the transfer entropy and active information storage, does not exist, arising through divergent rates of active information storage. We identify that active information storage can be decomposed into two separate quantities that characterize predictive capacity stored in a process: active memory utilization and instantaneous predictive capacity. The latter involves prediction related to path regularity and so solely inherits the divergent properties of the active information storage, while the former permits definitions of pathwise and rate quantities. We formulate measures of memory utilization for jump and neural spiking processes and illustrate measures of information processing in synthetic neural spiking models and coupled Ornstein-Uhlenbeck models. The application to synthetic neural spiking models demonstrates that active memory utilization for point processes consists of discontinuous jump contributions (at spikes) interrupting a continuously varying contribution (relating to waiting times between spikes), complementing the behavior previously demonstrated for transfer entropy in these processes.
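
For orientation, the discrete-time quantity whose continuous-time behavior the paper analyzes, active information storage A(k) = I(X_{t-k:t-1}; X_t), can be estimated with a simple plug-in estimator; the sketch below applies it to a hypothetical binary Markov chain and is not the paper's continuous-time formulation.

```python
import numpy as np
from collections import Counter

def active_information_storage(x, k=2):
    """Plug-in estimate of A(k) = H(X_t) + H(X_{t-k:t-1}) - H(X_{t-k:t}) in bits."""
    def entropy(symbols):
        counts = np.array(list(Counter(symbols).values()), dtype=float)
        p = counts / counts.sum()
        return -(p * np.log2(p)).sum()
    past = [tuple(x[i - k:i]) for i in range(k, len(x))]
    joint = [tuple(x[i - k:i + 1]) for i in range(k, len(x))]
    present = list(x[k:])
    return entropy(present) + entropy(past) - entropy(joint)

# Hypothetical binary Markov chain with persistence 0.9: its present is partly
# predictable from its past, so A(k) should be clearly positive (~0.53 bits).
rng = np.random.default_rng(0)
x = [0]
for _ in range(200_000):
    x.append(x[-1] if rng.random() < 0.9 else 1 - x[-1])
print(active_information_storage(np.array(x), k=2))
```
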
Affiliation(s)
- Richard E Spinney
- Complex Systems Research Group and Centre for Complex Systems, Faculty of Engineering and Information Technologies, University of Sydney, Sydney, New South Wales 2006, Australia
- Joseph T Lizier
- Complex Systems Research Group and Centre for Complex Systems, Faculty of Engineering and Information Technologies, University of Sydney, Sydney, New South Wales 2006, Australia

10. Marzen S. Intrinsic Computation of a Monod-Wyman-Changeux Molecule. Entropy (Basel) 2018;20:e20080599. PMID: 33265688; PMCID: PMC7513124; DOI: 10.3390/e20080599.
Abstract
Causal states are minimal sufficient statistics of prediction of a stochastic process, their coding cost is called statistical complexity, and the implied causal structure yields a sense of the process' "intrinsic computation". We discuss how statistical complexity changes with slight changes to the underlying model, in this case a biologically motivated dynamical model of a Monod-Wyman-Changeux molecule. Perturbations to kinetic rates cause statistical complexity to jump from finite to infinite. The same is not true for excess entropy, the mutual information between past and future, or for the molecule's transfer function. We discuss the implications of this for the relationship between intrinsic and functional computation of biological sensory systems.
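
To make "statistical complexity" concrete: it is the Shannon entropy of the stationary distribution over causal states. The sketch below computes it for the textbook two-state Even Process ε-machine, chosen only as a simple illustration; it is not the Monod-Wyman-Changeux model studied in the paper.

```python
import numpy as np

# Labeled transition matrices T[x][i, j] = Pr(emit x, go to state j | state i)
# for the Even Process ε-machine: state A emits 0 or 1 equiprobably (1 -> B),
# state B must emit 1 and return to A, so blocks of 1s between 0s have even length.
T = {
    0: np.array([[0.5, 0.0], [0.0, 0.0]]),
    1: np.array([[0.0, 0.5], [1.0, 0.0]]),
}
M = T[0] + T[1]                      # state-to-state transition matrix
evals, evecs = np.linalg.eig(M.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi = pi / pi.sum()                   # stationary distribution over causal states
C_mu = -(pi * np.log2(pi)).sum()     # statistical complexity, in bits
print(pi, C_mu)                      # expected: [2/3, 1/3], C_mu ≈ 0.918 bits
```
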
Affiliation(s)
- Sarah Marzen
- Physics of Living Systems Group, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA

11. Rupe A, Crutchfield JP. Local causal states and discrete coherent structures. Chaos 2018;28:075312. PMID: 30070532; DOI: 10.1063/1.5021130.
Abstract
Coherent structures form spontaneously in nonlinear spatiotemporal systems and are found at all spatial scales in natural phenomena from laboratory hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary climate dynamics. Phenomenologically, they appear as key components that organize the macroscopic behaviors in such systems. Despite a century of effort, they have eluded rigorous analysis and empirical prediction, with progress being made only recently. As a step toward this, we present a formal theory of coherent structures in fully discrete dynamical field theories. It builds on the notion of structure introduced by computational mechanics, generalizing it to a local spatiotemporal setting. The analysis employs local causal states as its main tool; these are used to uncover a system's hidden spatiotemporal symmetries and to identify coherent structures as spatially localized deviations from those symmetries. The approach is behavior-driven in the sense that it does not rely on directly analyzing spatiotemporal equations of motion; rather, it considers only the spatiotemporal fields a system generates. As such, it offers an unsupervised approach to discover and describe coherent structures. We illustrate the approach by analyzing coherent structures generated by elementary cellular automata, comparing the results with an earlier, dynamic-invariant-set approach that decomposes fields into domains, particles, and particle interactions.
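
A crude, finite-depth sketch of the local causal-state idea (not the authors' full method): spacetime points of an elementary cellular automaton are grouped by their past lightcones, and past-lightcone classes are merged when their empirical distributions over future lightcones coincide. The rule, depths, and exact-match merging criterion below are illustrative simplifications.

```python
import numpy as np
from collections import defaultdict

def eca_step(row, rule=110):
    """One synchronous update of an elementary CA with periodic boundaries."""
    l, r = np.roll(row, 1), np.roll(row, -1)
    idx = 4 * l + 2 * row + r
    table = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    return table[idx]

def spacetime_field(width=200, steps=200, rule=110, seed=0):
    rng = np.random.default_rng(seed)
    rows = [rng.integers(0, 2, width, dtype=np.uint8)]
    for _ in range(steps):
        rows.append(eca_step(rows[-1], rule))
    return np.array(rows)

field = spacetime_field()
width, h_p = field.shape[1], 2
futures_seen = defaultdict(list)
for t in range(h_p, field.shape[0] - 1):
    for x in range(width):
        # Past lightcone of depth 2 (plus the present cell) and future lightcone of depth 1.
        past = tuple(field[t - d, (x + np.arange(-d, d + 1)) % width].tobytes()
                     for d in range(h_p, 0, -1)) + (int(field[t, x]),)
        future = field[t + 1, (x + np.arange(-1, 2)) % width].tobytes()
        futures_seen[past].append(future)

def signature(futs):
    """Empirical future-lightcone distribution, coarsely rounded for exact matching
    (a proper statistical test would be used in practice)."""
    vals, counts = np.unique(futs, return_counts=True)
    return tuple(zip(vals.tolist(), np.round(counts / counts.sum(), 2).tolist()))

states = defaultdict(list)
for past, futs in futures_seen.items():
    states[signature(futs)].append(past)
print(f"{len(futures_seen)} past-lightcone classes -> {len(states)} local causal states")
```
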
Affiliation(s)
- Adam Rupe
- Complexity Sciences Center, Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
- James P Crutchfield
- Complexity Sciences Center, Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA

12. Bayat Mokhtari E, Lawrence JJ, Stone EF. Data Driven Models of Short-Term Synaptic Plasticity. Front Comput Neurosci 2018;12:32. PMID: 29872388; PMCID: PMC5972196; DOI: 10.3389/fncom.2018.00032.
Abstract
Simple models of short-term synaptic plasticity that incorporate facilitation and/or depression have been created in abundance for different synapse types and circumstances. The analysis of these models has included computing mutual information between a stochastic input spike train and some sort of representation of the postsynaptic response. While this approach has proven useful in many contexts, for the purpose of determining the type of process underlying a stochastic output train, it ignores the ordering of the responses, leaving an important characterizing feature on the table. In this paper we use a broader class of information measures on output only, and specifically construct hidden Markov models (HMMs) (known as ε-machines or causal state models) to differentiate between synapse types and classify the complexity of the process. We find that the machines allow us to differentiate between processes in a way not possible by considering distributions alone. We are also able to understand these differences in terms of the dynamics of the model used to create the output response, bringing the analysis full circle. Hence this technique provides a complementary description of the synaptic filtering process, and potentially expands the interpretation of future experimental results.
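
For context on the kind of forward model such analyses start from, the sketch below implements one widely used phenomenological description of facilitation and depression (a Tsodyks-Markram-style synapse) driven by a Poisson spike train; the parameter values are illustrative, not fitted to the paper's data.

```python
import numpy as np

def tm_synapse_responses(spike_times, U=0.2, tau_f=0.6, tau_d=0.3):
    """Relative postsynaptic response amplitudes for a facilitating/depressing synapse.
    u: utilization (facilitation variable); x: fraction of available resources (depression)."""
    u, x, last_t = U, 1.0, None
    amps = []
    for t in spike_times:
        if last_t is not None:
            dt = t - last_t
            u = U + (u - U) * np.exp(-dt / tau_f)       # facilitation decays back to U
            x = 1.0 + (x - 1.0) * np.exp(-dt / tau_d)   # resources recover toward 1
        u = u + U * (1.0 - u)    # spike-triggered increase in utilization
        amps.append(u * x)       # response proportional to released resources
        x = x - u * x            # depletion of available resources
        last_t = t
    return np.array(amps)

# Hypothetical Poisson input at 20 Hz: the ordered sequence of amplitudes
# (not just their histogram) is what the causal-state analysis above operates on.
rng = np.random.default_rng(0)
spikes = np.cumsum(rng.exponential(1 / 20.0, size=30))
print(tm_synapse_responses(spikes).round(3))
```
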
Affiliation(s)
- Elham Bayat Mokhtari
- Department of Mathematical Sciences, The University of Montana, Missoula, MT, United States
- J Josh Lawrence
- Pharmacology and Neuroscience, Texas Tech University Health Sciences Center, Lubbock, TX, United States
- Emily F Stone
- Department of Mathematical Sciences, The University of Montana, Missoula, MT, United States

13. Riechers PM, Crutchfield JP. Spectral simplicity of apparent complexity. I. The nondiagonalizable metadynamics of prediction. Chaos 2018;28:033115. PMID: 29604656; DOI: 10.1063/1.4985199.
Abstract
Virtually all questions that one can ask about the behavioral and structural complexity of a stochastic process reduce to a linear algebraic framing of a time evolution governed by an appropriate hidden-Markov process generator. Each type of question (correlation, predictability, predictive cost, observer synchronization, and the like) induces a distinct generator class. Answers are then functions of the class-appropriate transition dynamic. Unfortunately, these dynamics are generically nonnormal, nondiagonalizable, singular, and so on. Tractably analyzing these dynamics relies on adapting the recently introduced meromorphic functional calculus, which specifies the spectral decomposition of functions of nondiagonalizable linear operators, even when the function poles and zeros coincide with the operator's spectrum. Along the way, we establish special properties of the spectral projection operators that demonstrate how they capture the organization of subprocesses within a complex system. Circumventing the spurious infinities of alternative calculi, this leads in the sequel, Part II [P. M. Riechers and J. P. Crutchfield, Chaos 28, 033116 (2018)], to the first closed-form expressions for complexity measures, couched either in terms of the Drazin inverse (negative-one power of a singular operator) or the eigenvalues and projection operators of the appropriate transition dynamic.
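
To make the Drazin inverse tangible: for an ergodic Markov chain with transition matrix T and stationary distribution π, the operator I - T is singular yet has a group (Drazin) inverse obtainable from the chain's fundamental matrix. The toy check below is a generic linear-algebra illustration, not the paper's meromorphic functional calculus.

```python
import numpy as np

T = np.array([[0.9, 0.1],
              [0.4, 0.6]])              # toy ergodic transition matrix

# Stationary distribution pi (left eigenvector of T for eigenvalue 1).
evals, evecs = np.linalg.eig(T.T)
pi = np.real(evecs[:, np.argmax(np.real(evals))])
pi /= pi.sum()

A = np.eye(2) - T                        # singular: (I - T) annihilates constants
W = np.outer(np.ones(2), pi)             # rank-one projector onto the stationary state
A_drazin = np.linalg.inv(A + W) - W      # group/Drazin inverse via the fundamental matrix

# Verify the defining group-inverse identities numerically.
print(np.allclose(A @ A_drazin @ A, A))
print(np.allclose(A_drazin @ A @ A_drazin, A_drazin))
print(np.allclose(A @ A_drazin, A_drazin @ A))
```
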
Affiliation(s)
- Paul M Riechers
- Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
- James P Crutchfield
- Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, California 95616, USA

14. James RG, Mahoney JR, Crutchfield JP. Information trimming: Sufficient statistics, mutual information, and predictability from effective channel states. Phys Rev E 2017;95:060102. PMID: 28709305; DOI: 10.1103/physreve.95.060102.
Abstract
One of the most basic characterizations of the relationship between two random variables, X and Y, is the value of their mutual information. Unfortunately, calculating it analytically and estimating it empirically are often stymied by the extremely large dimension of the variables. One might hope to replace such a high-dimensional variable by a smaller one that preserves its relationship with the other. It is well known that either X (or Y) can be replaced by its minimal sufficient statistic about Y (or X) while preserving the mutual information. While intuitively reasonable, it is not obvious or straightforward that both variables can be replaced simultaneously. We demonstrate that this is in fact possible: the information X's minimal sufficient statistic preserves about Y is exactly the information that Y's minimal sufficient statistic preserves about X. We call this procedure information trimming. As an important corollary, we consider the case where one variable is a stochastic process' past and the other its future. In this case, the mutual information is the channel transmission rate between the channel's effective states. That is, the past-future mutual information (the excess entropy) is the amount of information about the future that can be predicted using the past. Translating our result about minimal sufficient statistics, this is equivalent to the mutual information between the forward- and reverse-time causal states of computational mechanics. We close by discussing multivariate extensions to this use of minimal sufficient statistics.
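
A worked toy example of the claim above (not taken from the paper): merging x-values that share the same conditional distribution p(y|x), which is how the minimal sufficient statistic acts on X, leaves the mutual information with Y unchanged.

```python
import numpy as np

def mutual_information(pxy):
    """I(X;Y) in bits from a joint distribution array p[x, y]."""
    px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
    mask = pxy > 0
    return (pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum()

# Toy joint distribution: x = 0 and x = 1 have identical conditionals p(y|x),
# so they belong to the same class of the minimal sufficient statistic about Y.
pxy = np.array([[0.20, 0.05],
                [0.16, 0.04],
                [0.05, 0.50]])

# Merge x-values whose conditional distributions p(y|x) coincide (up to rounding).
cond = np.round(pxy / pxy.sum(1, keepdims=True), 10)
_, classes = np.unique(cond, axis=0, return_inverse=True)
trimmed = np.zeros((classes.max() + 1, pxy.shape[1]))
np.add.at(trimmed, classes, pxy)

print(mutual_information(pxy), mutual_information(trimmed))   # identical values
```
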
Affiliation(s)
- Ryan G James
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
- John R Mahoney
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
- James P Crutchfield
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA

15. Marzen SE, Crutchfield JP. Nearly maximally predictive features and their dimensions. Phys Rev E 2017;95:051301. PMID: 28618578; DOI: 10.1103/physreve.95.051301.
Abstract
Scientific explanation often requires inferring maximally predictive features from a given data set. Unfortunately, the collection of minimal maximally predictive features for most stochastic processes is uncountably infinite. In such cases, one compromises and instead seeks nearly maximally predictive features. Here, we derive upper bounds on the rates at which the number and the coding cost of nearly maximally predictive features scale with desired predictive power. The rates are determined by the fractal dimensions of a process' mixed-state distribution. These results, in turn, show how widely used finite-order Markov models can fail as predictors and that mixed-state predictive features can offer a substantial improvement.
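
For readers unfamiliar with mixed states: they are belief distributions over a generator's hidden states, updated after each observed symbol, and the fractal dimension referred to above is that of the set such beliefs visit. The sketch below iterates the update for an arbitrary two-state hidden Markov model chosen purely for illustration.

```python
import numpy as np

# Labeled transition matrices T[x][i, j] = Pr(emit x, go to j | state i)
# for an arbitrary two-state hidden Markov model (illustrative only).
T = {
    0: np.array([[0.6, 0.1], [0.2, 0.1]]),
    1: np.array([[0.0, 0.3], [0.0, 0.7]]),
}

def sample_mixed_states(T, n=50_000, seed=0):
    """Iterate the belief (mixed-state) update mu' ∝ mu T^(x) along a sampled symbol path."""
    rng = np.random.default_rng(seed)
    mu = np.array([0.5, 0.5])
    states = []
    for _ in range(n):
        probs = {x: (mu @ Tx).sum() for x, Tx in T.items()}   # Pr(next symbol = x | belief)
        x = 0 if rng.random() < probs[0] else 1
        mu = mu @ T[x]
        mu = mu / mu.sum()                                    # Bayesian belief update
        states.append(mu.copy())
    return np.array(states)

mixed = sample_mixed_states(T)
print(mixed[:5])
print("distinct beliefs (rounded):", len(np.unique(np.round(mixed, 6), axis=0)))
```
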
Affiliation(s)
- Sarah E Marzen
- Physics of Living Systems Group, Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Department of Physics, University of California at Berkeley, Berkeley, California 94720-5800, USA
- James P Crutchfield
- Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, California 95616, USA

16. Relationship in Pacemaker Neurons Between the Long-Term Correlations of Membrane Voltage Fluctuations and the Corresponding Duration of the Inter-Spike Interval. J Membr Biol 2017;250:249-257. PMID: 28417145; DOI: 10.1007/s00232-017-9956-z.
Abstract
Several studies have examined the behavior of voltage and frequency fluctuations in neural electrical activity. Here, we explored the association between the behavior of the voltage fluctuations in the inter-spike segment (VFIS) and the inter-spike intervals (ISI) of F1 pacemaker neurons from H. aspersa, by disturbing intracellular calcium handling with cadmium and caffeine. The scaling exponent α of the VFIS, as provided by detrended fluctuation analysis, was evaluated in conjunction with the corresponding ISI duration to estimate the determination coefficient R² (48-50 intervals per neuron, N = 5). The time-varying scaling exponent α(t) of the VFIS was also studied (20 segments per neuron, N = 11). The R² obtained in control conditions was 0.683 ([0.647 0.776] lower and upper quartiles), 0.405 [0.381 0.495] with cadmium, and 0.151 [0.118 0.222] with caffeine (P < 0.05). A non-uniform scaling exponent α(t) showing a profile throughout the duration of the VFIS was further identified. A significant reduction of long-term correlations by cadmium was confirmed in the first part of this profile (P = 0.0001), but no significant reduction was detected with caffeine. Our findings support the view that the behavior of the VFIS is associated with the activation of different populations of ionic channels, which set the neuronal membrane potential and are mediated by intracellular calcium handling. Thus, we provide evidence that the behavior of the VFIS, as quantified by the scaling exponent α, conveys insight into the mechanisms regulating the excitability of pacemaker neurons.
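
The scaling exponent α referred to above comes from detrended fluctuation analysis (DFA). A generic first-order DFA implementation is sketched below as an illustration of the procedure, not of the authors' analysis pipeline; the white-noise input is a sanity check whose exponent should be near 0.5.

```python
import numpy as np

def dfa_alpha(signal, window_sizes=None):
    """First-order detrended fluctuation analysis: returns the scaling exponent alpha."""
    x = np.asarray(signal, dtype=float)
    y = np.cumsum(x - x.mean())                       # integrated, mean-subtracted profile
    if window_sizes is None:
        window_sizes = np.unique(np.logspace(1, np.log10(len(x) // 4), 15).astype(int))
    fluctuations = []
    for n in window_sizes:
        n_windows = len(y) // n
        segments = y[: n_windows * n].reshape(n_windows, n)
        t = np.arange(n)
        rms = []
        for seg in segments:
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # linear detrend per window
            rms.append(np.sqrt(np.mean((seg - trend) ** 2)))
        fluctuations.append(np.mean(rms))
    # Slope of log F(n) versus log n gives the scaling exponent.
    return np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)[0]

rng = np.random.default_rng(0)
print(dfa_alpha(rng.normal(size=20_000)))   # expected close to 0.5 for white noise
```
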

17. Mahoney JR, Aghamohammadi C, Crutchfield JP. Occam's Quantum Strop: Synchronizing and Compressing Classical Cryptic Processes via a Quantum Channel. Sci Rep 2016;6:20495. PMID: 26876796; PMCID: PMC4753439; DOI: 10.1038/srep20495.
Abstract
A stochastic process' statistical complexity stands out as a fundamental property: the minimum information required to synchronize one process generator to another. How much information is required, though, when synchronizing over a quantum channel? Recent work demonstrated that representing causal similarity as quantum state-indistinguishability provides a quantum advantage. We generalize this to synchronization and offer a sequence of constructions that exploit extended causal structures, finding a substantial increase in the quantum advantage. We demonstrate that maximum compression is determined by the process' cryptic order, a classical, topological property closely allied to Markov order, itself a measure of historical dependence. We introduce an efficient algorithm that computes the quantum advantage and close by noting that the advantage comes at a cost: one trades off prediction for generation complexity.
Affiliation(s)
- John R Mahoney
- Complexity Sciences Center and Department of Physics, University of California at Davis, One Shields Avenue, Davis, CA 95616, USA
- Cina Aghamohammadi
- Complexity Sciences Center and Department of Physics, University of California at Davis, One Shields Avenue, Davis, CA 95616, USA
- James P Crutchfield
- Complexity Sciences Center and Department of Physics, University of California at Davis, One Shields Avenue, Davis, CA 95616, USA