1
|
Marzen SE, Riechers PM, Crutchfield JP. Complexity-calibrated benchmarks for machine learning reveal when prediction algorithms succeed and mislead. Sci Rep 2024; 14:8727. [PMID: 38622279 PMCID: PMC11018857 DOI: 10.1038/s41598-024-58814-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2023] [Accepted: 04/03/2024] [Indexed: 04/17/2024] Open
Abstract
Recurrent neural networks are used to forecast time series in finance, climate, language, and from many other domains. Reservoir computers are a particularly easily trainable form of recurrent neural network. Recently, a "next-generation" reservoir computer was introduced in which the memory trace involves only a finite number of previous symbols. We explore the inherent limitations of finite-past memory traces in this intriguing proposal. A lower bound from Fano's inequality shows that, on highly non-Markovian processes generated by large probabilistic state machines, next-generation reservoir computers with reasonably long memory traces have an error probability that is at least ∼ 60 % higher than the minimal attainable error probability in predicting the next observation. More generally, it appears that popular recurrent neural networks fall far short of optimally predicting such complex processes. These results highlight the need for a new generation of optimized recurrent neural network architectures. Alongside this finding, we present concentration-of-measure results for randomly-generated but complex processes. One conclusion is that large probabilistic state machines-specifically, large ϵ -machines-are key to generating challenging and structurally-unbiased stimuli for ground-truthing recurrent neural network architectures.
Collapse
Affiliation(s)
- Sarah E Marzen
- W. M. Keck Science Department of Pitzer, Scripps, and Claremont McKenna College, Claremont, CA, 91711, USA.
| | - Paul M Riechers
- Beyond Institute for Theoretical Science, San Francisco, CA, USA
| | - James P Crutchfield
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, CA, 95616, USA
| |
Collapse
|
2
|
Brodu N, Crutchfield JP. Discovering causal structure with reproducing-kernel Hilbert space ε-machines. CHAOS (WOODBURY, N.Y.) 2022; 32:023103. [PMID: 35232043 DOI: 10.1063/5.0062829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/08/2021] [Accepted: 01/10/2022] [Indexed: 06/14/2023]
Abstract
We merge computational mechanics' definition of causal states (predictively equivalent histories) with reproducing-kernel Hilbert space (RKHS) representation inference. The result is a widely applicable method that infers causal structure directly from observations of a system's behaviors whether they are over discrete or continuous events or time. A structural representation-a finite- or infinite-state kernel ϵ-machine-is extracted by a reduced-dimension transform that gives an efficient representation of causal states and their topology. In this way, the system dynamics are represented by a stochastic (ordinary or partial) differential equation that acts on causal states. We introduce an algorithm to estimate the associated evolution operator. Paralleling the Fokker-Planck equation, it efficiently evolves causal-state distributions and makes predictions in the original data space via an RKHS functional mapping. We demonstrate these techniques, together with their predictive abilities, on discrete-time, discrete-value infinite Markov-order processes generated by finite-state hidden Markov models with (i) finite or (ii) uncountably infinite causal states and (iii) continuous-time, continuous-value processes generated by thermally driven chaotic flows. The method robustly estimates causal structure in the presence of varying external and measurement noise levels and for very high-dimensional data.
Collapse
Affiliation(s)
- Nicolas Brodu
- Geostat Team-Geometry and Statistics in Acquisition Data, INRIA Bordeaux Sud Ouest, 200 rue de la Vieille Tour, 33405 Talence Cedex, France
| | - James P Crutchfield
- Complexity Sciences Center and Department of Physics and Astronomy, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| |
Collapse
|
3
|
Jurgens AM, Crutchfield JP. Ambiguity rate of hidden Markov processes. Phys Rev E 2021; 104:064107. [PMID: 35030952 DOI: 10.1103/physreve.104.064107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2021] [Accepted: 11/18/2021] [Indexed: 06/14/2023]
Abstract
The ε-machine is a stochastic process's optimal model-maximally predictive and minimal in size. It often happens that to optimally predict even simply defined processes, probabilistic models-including the ε-machine-must employ an uncountably infinite set of features. To constructively work with these infinite sets we map the ε-machine to a place-dependent iterated function system (IFS)-a stochastic dynamical system. We then introduce the ambiguity rate that, in conjunction with a process's Shannon entropy rate, determines the rate at which this set of predictive features must grow to maintain maximal predictive power over increasing horizons. We demonstrate, as an ancillary technical result that stands on its own, that the ambiguity rate is the (until now missing) correction to the Lyapunov dimension of an IFS's attracting invariant set. For a broad class of complex processes, this then allows calculating their statistical complexity dimension-the information dimension of the minimal set of predictive features.
Collapse
Affiliation(s)
- Alexandra M Jurgens
- Complexity Sciences Center, Physics Department University of California at Davis Davis, California 95616, USA
| | - James P Crutchfield
- Complexity Sciences Center, Physics Department University of California at Davis Davis, California 95616, USA
| |
Collapse
|
4
|
Jurgens AM, Crutchfield JP. Divergent predictive states: The statistical complexity dimension of stationary, ergodic hidden Markov processes. CHAOS (WOODBURY, N.Y.) 2021; 31:083114. [PMID: 34470245 DOI: 10.1063/5.0050460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/15/2021] [Accepted: 07/04/2021] [Indexed: 06/13/2023]
Abstract
Even simply defined, finite-state generators produce stochastic processes that require tracking an uncountable infinity of probabilistic features for optimal prediction. For processes generated by hidden Markov chains, the consequences are dramatic. Their predictive models are generically infinite state. Until recently, one could determine neither their intrinsic randomness nor structural complexity. The prequel to this work introduced methods to accurately calculate the Shannon entropy rate (randomness) and to constructively determine their minimal (though, infinite) set of predictive features. Leveraging this, we address the complementary challenge of determining how structured hidden Markov processes are by calculating their statistical complexity dimension-the information dimension of the minimal set of predictive features. This tracks the divergence rate of the minimal memory resources required to optimally predict a broad class of truly complex processes.
Collapse
Affiliation(s)
- Alexandra M Jurgens
- Complexity Sciences Center, Physics and Astronomy Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| | - James P Crutchfield
- Complexity Sciences Center, Physics and Astronomy Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| |
Collapse
|
5
|
Venegas-Li AE, Jurgens AM, Crutchfield JP. Measurement-induced randomness and structure in controlled qubit processes. Phys Rev E 2020; 102:040102. [PMID: 33212600 DOI: 10.1103/physreve.102.040102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2019] [Accepted: 09/14/2020] [Indexed: 06/11/2023]
Abstract
When an experimentalist measures a time series of qubits, the outcomes constitute a classical stochastic process. We show that projective measurement induces high complexity in these processes in two specific senses: They are inherently random (finite Shannon entropy rate) and they require infinite memory for optimal prediction (divergent statistical complexity). We identify nonorthogonality of the quantum states as the mechanism underlying the resulting complexities and examine the influence that measurement choice has on the randomness and structure of measured qubit processes. We introduce quantitative measures of this complexity and provide efficient algorithms for their estimation.
Collapse
Affiliation(s)
- Ariadna E Venegas-Li
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| | - Alexandra M Jurgens
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| | - James P Crutchfield
- Complexity Sciences Center and Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| |
Collapse
|
6
|
Prediction and Dissipation in Nonequilibrium Molecular Sensors: Conditionally Markovian Channels Driven by Memoryful Environments. Bull Math Biol 2020; 82:25. [PMID: 31993762 DOI: 10.1007/s11538-020-00694-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2018] [Accepted: 12/31/2019] [Indexed: 10/25/2022]
Abstract
Biological sensors must often predict their input while operating under metabolic constraints. However, determining whether or not a particular sensor is evolved or designed to be accurate and efficient is challenging. This arises partly from the functional constraints being at cross purposes and partly since quantifying the prediction performance of even in silico sensors can require prohibitively long simulations, especially when highly complex environments drive sensors out of equilibrium. To circumvent these difficulties, we develop new expressions for the prediction accuracy and thermodynamic costs of the broad class of conditionally Markovian sensors subject to complex, correlated (unifilar hidden semi-Markov) environmental inputs in nonequilibrium steady state. Predictive metrics include the instantaneous memory and the total predictable information (the mutual information between present sensor state and input future), while dissipation metrics include power extracted from the environment and the nonpredictive information rate. Success in deriving these formulae relies on identifying the environment's causal states, the input's minimal sufficient statistics for prediction. Using these formulae, we study large random channels and the simplest nontrivial biological sensor model-that of a Hill molecule, characterized by the number of ligands that bind simultaneously-the sensor's cooperativity. We find that the seemingly impoverished Hill molecule can capture an order of magnitude more predictable information than large random channels.
Collapse
|
7
|
Marzen S. Intrinsic Computation of a Monod-Wyman-Changeux Molecule. ENTROPY (BASEL, SWITZERLAND) 2018; 20:e20080599. [PMID: 33265688 PMCID: PMC7513124 DOI: 10.3390/e20080599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/12/2018] [Revised: 08/06/2018] [Accepted: 08/10/2018] [Indexed: 06/12/2023]
Abstract
Causal states are minimal sufficient statistics of prediction of a stochastic process, their coding cost is called statistical complexity, and the implied causal structure yields a sense of the process' "intrinsic computation". We discuss how statistical complexity changes with slight changes to the underlying model- in this case, a biologically-motivated dynamical model, that of a Monod-Wyman-Changeux molecule. Perturbations to kinetic rates cause statistical complexity to jump from finite to infinite. The same is not true for excess entropy, the mutual information between past and future, or for the molecule's transfer function. We discuss the implications of this for the relationship between intrinsic and functional computation of biological sensory systems.
Collapse
Affiliation(s)
- Sarah Marzen
- Physics of Living Systems Group, Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
Collapse
|
8
|
Marzen SE, Crutchfield JP. Optimized bacteria are environmental prediction engines. Phys Rev E 2018; 98:012408. [PMID: 30110764 DOI: 10.1103/physreve.98.012408] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2018] [Indexed: 06/08/2023]
Abstract
Experimentalists observe phenotypic variability even in isogenic bacteria populations. We explore the hypothesis that in fluctuating environments this variability is tuned to maximize a bacterium's expected log-growth rate, potentially aided by epigenetic (all inheritable nongenetic) markers that store information about past environments. Crucially, we assume a time delay between sensing and action, so that a past epigenetic marker is used to generate the present phenotypic variability. We show that, in a complex, memoryful environment, the maximal expected log-growth rate is linear in the instantaneous predictive information-the mutual information between a bacterium's epigenetic markers and future environmental states. Hence, under resource constraints, optimal epigenetic markers are causal states-the minimal sufficient statistics for prediction-or lossy approximations thereof. We propose new theoretical investigations into and new experiments on bacteria phenotypic bet-hedging in fluctuating complex environments.
Collapse
Affiliation(s)
- Sarah E Marzen
- Department of Physics, Physics of Living Systems Group, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - James P Crutchfield
- Complexity Sciences Center and Department of Physics, University of California at Davis, Davis, California 95616, USA
| |
Collapse
|
9
|
Rupe A, Crutchfield JP. Local causal states and discrete coherent structures. CHAOS (WOODBURY, N.Y.) 2018; 28:075312. [PMID: 30070532 DOI: 10.1063/1.5021130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/01/2018] [Accepted: 04/27/2018] [Indexed: 06/08/2023]
Abstract
Coherent structures form spontaneously in nonlinear spatiotemporal systems and are found at all spatial scales in natural phenomena from laboratory hydrodynamic flows and chemical reactions to ocean, atmosphere, and planetary climate dynamics. Phenomenologically, they appear as key components that organize the macroscopic behaviors in such systems. Despite a century of effort, they have eluded rigorous analysis and empirical prediction, with progress being made only recently. As a step in this, we present a formal theory of coherent structures in fully discrete dynamical field theories. It builds on the notion of structure introduced by computational mechanics, generalizing it to a local spatiotemporal setting. The analysis' main tool employs the local causal states, which are used to uncover a system's hidden spatiotemporal symmetries and which identify coherent structures as spatially localized deviations from those symmetries. The approach is behavior-driven in the sense that it does not rely on directly analyzing spatiotemporal equations of motion, rather it considers only the spatiotemporal fields a system generates. As such, it offers an unsupervised approach to discover and describe coherent structures. We illustrate the approach by analyzing coherent structures generated by elementary cellular automata, comparing the results with an earlier, dynamic-invariant-set approach that decomposes fields into domains, particles, and particle interactions.
Collapse
Affiliation(s)
- Adam Rupe
- Complexity Sciences Center, Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| | - James P Crutchfield
- Complexity Sciences Center, Physics Department, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| |
Collapse
|
10
|
Riechers PM, Crutchfield JP. Spectral simplicity of apparent complexity. I. The nondiagonalizable metadynamics of prediction. CHAOS (WOODBURY, N.Y.) 2018; 28:033115. [PMID: 29604656 DOI: 10.1063/1.4985199] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Virtually all questions that one can ask about the behavioral and structural complexity of a stochastic process reduce to a linear algebraic framing of a time evolution governed by an appropriate hidden-Markov process generator. Each type of question-correlation, predictability, predictive cost, observer synchronization, and the like-induces a distinct generator class. Answers are then functions of the class-appropriate transition dynamic. Unfortunately, these dynamics are generically nonnormal, nondiagonalizable, singular, and so on. Tractably analyzing these dynamics relies on adapting the recently introduced meromorphic functional calculus, which specifies the spectral decomposition of functions of nondiagonalizable linear operators, even when the function poles and zeros coincide with the operator's spectrum. Along the way, we establish special properties of the spectral projection operators that demonstrate how they capture the organization of subprocesses within a complex system. Circumventing the spurious infinities of alternative calculi, this leads in the sequel, Part II [P. M. Riechers and J. P. Crutchfield, Chaos 28, 033116 (2018)], to the first closed-form expressions for complexity measures, couched either in terms of the Drazin inverse (negative-one power of a singular operator) or the eigenvalues and projection operators of the appropriate transition dynamic.
Collapse
Affiliation(s)
- Paul M Riechers
- Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| | - James P Crutchfield
- Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| |
Collapse
|
11
|
Riechers PM, Crutchfield JP. Spectral simplicity of apparent complexity. II. Exact complexities and complexity spectra. CHAOS (WOODBURY, N.Y.) 2018; 28:033116. [PMID: 29604661 DOI: 10.1063/1.4986248] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The meromorphic functional calculus developed in Part I overcomes the nondiagonalizability of linear operators that arises often in the temporal evolution of complex systems and is generic to the metadynamics of predicting their behavior. Using the resulting spectral decomposition, we derive closed-form expressions for correlation functions, finite-length Shannon entropy-rate approximates, asymptotic entropy rate, excess entropy, transient information, transient and asymptotic state uncertainties, and synchronization information of stochastic processes generated by finite-state hidden Markov models. This introduces analytical tractability to investigating information processing in discrete-event stochastic processes, symbolic dynamics, and chaotic dynamical systems. Comparisons reveal mathematical similarities between complexity measures originally thought to capture distinct informational and computational properties. We also introduce a new kind of spectral analysis via coronal spectrograms and the frequency-dependent spectra of past-future mutual information. We analyze a number of examples to illustrate the methods, emphasizing processes with multivariate dependencies beyond pairwise correlation. This includes spectral decomposition calculations for one representative example in full detail.
Collapse
Affiliation(s)
- Paul M Riechers
- Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| | - James P Crutchfield
- Complexity Sciences Center, Department of Physics, University of California at Davis, One Shields Avenue, Davis, California 95616, USA
| |
Collapse
|