1
Terada Y, Toyoizumi T. Chaotic neural dynamics facilitate probabilistic computations through sampling. Proc Natl Acad Sci U S A 2024; 121:e2312992121. [PMID: 38648479 PMCID: PMC11067032 DOI: 10.1073/pnas.2312992121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Accepted: 02/13/2024] [Indexed: 04/25/2024]
Abstract
Cortical neurons exhibit highly variable responses over trials and time. Theoretical works posit that this variability arises potentially from chaotic network dynamics of recurrently connected neurons. Here, we demonstrate that chaotic neural dynamics, formed through synaptic learning, allow networks to perform sensory cue integration in a sampling-based implementation. We show that the emergent chaotic dynamics provide neural substrates for generating samples not only of a static variable but also of a dynamical trajectory, where generic recurrent networks acquire these abilities with a biologically plausible learning rule through trial and error. Furthermore, the networks generalize their experience in the stimulus-evoked samples to the inference without partial or all sensory information, which suggests a computational role of spontaneous activity as a representation of the priors as well as a tractable biological computation for marginal distributions. These findings suggest that chaotic neural dynamics may serve for the brain function as a Bayesian generative model.
Affiliation(s)
- Yu Terada
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Saitama 351-0198, Japan
- Department of Neurobiology, University of California, San Diego, La Jolla, CA 92093
- The Institute for Physics of Intelligence, The University of Tokyo, Tokyo 113-0033, Japan
- Taro Toyoizumi
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Saitama 351-0198, Japan
- Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8656, Japan
2
Liang X, Livingstone S, Griffin J. Adaptive MCMC for Bayesian Variable Selection in Generalised Linear Models and Survival Models. Entropy (Basel) 2023; 25:1310. [PMID: 37761609 PMCID: PMC10528396 DOI: 10.3390/e25091310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 08/30/2023] [Accepted: 09/05/2023] [Indexed: 09/29/2023]
Abstract
Developing an efficient computational scheme for high-dimensional Bayesian variable selection in generalised linear models and survival models has always been a challenging problem due to the absence of closed-form solutions to the marginal likelihood. The Reversible Jump Markov Chain Monte Carlo (RJMCMC) approach can be employed to jointly sample models and coefficients, but the effective design of the trans-dimensional jumps of RJMCMC can be challenging, making it hard to implement. Alternatively, the marginal likelihood can be derived conditional on latent variables using a data-augmentation scheme (e.g., Pólya-gamma data augmentation for logistic regression) or using other estimation methods. However, suitable data-augmentation schemes are not available for every generalised linear model and survival model, and estimating the marginal likelihood using a Laplace approximation or a correlated pseudo-marginal method can be computationally expensive. In this paper, three main contributions are presented. Firstly, we present an extended Point-wise implementation of Adaptive Random Neighbourhood Informed proposal (PARNI) to efficiently sample models directly from the marginal posterior distributions of generalised linear models and survival models. Secondly, in light of the recently proposed approximate Laplace approximation, we describe an efficient and accurate estimation method for marginal likelihood that involves adaptive parameters. Additionally, we describe a new method to adapt the algorithmic tuning parameters of the PARNI proposal by replacing Rao-Blackwellised estimates with the combination of a warm-start estimate and the ergodic average. We present numerous numerical results from simulated data and eight high-dimensional genetic mapping data-sets to showcase the efficiency of the novel PARNI proposal compared with the baseline add-delete-swap proposal.
Affiliation(s)
- Xitong Liang
- Department of Statistical Science, University College London, London WC1E 6BT, UK; (S.L.); (J.G.)
3
Zhu Y. Convergence Rates for the Constrained Sampling via Langevin Monte Carlo. Entropy (Basel) 2023; 25:1234. [PMID: 37628264 PMCID: PMC10453724 DOI: 10.3390/e25081234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 08/08/2023] [Accepted: 08/15/2023] [Indexed: 08/27/2023]
Abstract
Sampling from constrained distributions has posed significant challenges in terms of algorithmic design and non-asymptotic analysis, which are frequently encountered in statistical and machine-learning models. In this study, we propose three sampling algorithms based on Langevin Monte Carlo with the Metropolis-Hastings steps to handle the distribution constrained within some convex body. We present a rigorous analysis of the corresponding Markov chains and derive non-asymptotic upper bounds on the convergence rates of these algorithms in total variation distance. Our results demonstrate that the sampling algorithm, enhanced with the Metropolis-Hastings steps, offers an effective solution for tackling some constrained sampling problems. The numerical experiments are conducted to compare our methods with several competing algorithms without the Metropolis-Hastings steps, and the results further support our theoretical findings.
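As a rough illustration of the general idea of combining a Langevin proposal with a Metropolis-Hastings step on a convex body, the sketch below targets a standard Gaussian restricted to a ball. It is not one of the three algorithms analysed in the paper; the target, the ball constraint, the function name `constrained_mala`, and all parameter values are assumptions chosen for illustration. Proposals that leave the body have zero target density and are therefore always rejected.

```python
import math
import random


def constrained_mala(n_steps=5000, dim=2, step=0.1, radius=1.0, seed=0):
    """Metropolis-adjusted Langevin sampler for a standard Gaussian
    truncated to the ball {x : |x| <= radius} (illustrative target only)."""
    rng = random.Random(seed)

    def log_density(x):
        # Unnormalised log-density; minus infinity outside the convex body.
        sq = sum(v * v for v in x)
        return -0.5 * sq if sq <= radius * radius else -math.inf

    def drift(x):
        # x + step * grad log pi(x), where grad log pi(x) = -x inside the body.
        return [v - step * v for v in x]

    def log_proposal(x_to, x_from):
        # Log-density (up to a constant) of the Gaussian move x_from -> x_to,
        # whose variance per coordinate is 2 * step.
        mean = drift(x_from)
        return -sum((a - m) ** 2 for a, m in zip(x_to, mean)) / (4.0 * step)

    x = [0.0] * dim
    noise_sd = math.sqrt(2.0 * step)
    samples, accepted = [], 0
    for _ in range(n_steps):
        proposal = [m + noise_sd * rng.gauss(0.0, 1.0) for m in drift(x)]
        # Metropolis-Hastings log acceptance ratio.
        log_alpha = (log_density(proposal) + log_proposal(x, proposal)
                     - log_density(x) - log_proposal(proposal, x))
        if math.log(1.0 - rng.random()) < log_alpha:
            x = proposal
            accepted += 1
        samples.append(list(x))
    return samples, accepted / n_steps
```

Calling `constrained_mala()` returns the chain and the empirical acceptance rate; because rejected moves repeat the current state, every stored sample lies inside the ball.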
Affiliation(s)
- Yuanzheng Zhu
- School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China
4
Zhang BJ, Marzouk YM, Spiliopoulos K. Geometry-informed irreversible perturbations for accelerated convergence of Langevin dynamics. Stat Comput 2022; 32:78. [PMID: 36156938 PMCID: PMC9485103 DOI: 10.1007/s11222-022-10147-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Accepted: 09/03/2022] [Indexed: 06/16/2023]
Abstract
We introduce a novel geometry-informed irreversible perturbation that accelerates convergence of the Langevin algorithm for Bayesian computation. It is well documented that there exist perturbations to the Langevin dynamics that preserve its invariant measure while accelerating its convergence. Irreversible perturbations and reversible perturbations (such as Riemannian manifold Langevin dynamics (RMLD)) have separately been shown to improve the performance of Langevin samplers. We consider these two perturbations simultaneously by presenting a novel form of irreversible perturbation for RMLD that is informed by the underlying geometry. Through numerical examples, we show that this new irreversible perturbation can improve estimation performance over irreversible perturbations that do not take the geometry into account. Moreover we demonstrate that irreversible perturbations generally can be implemented in conjunction with the stochastic gradient version of the Langevin algorithm. Lastly, while continuous-time irreversible perturbations cannot impair the performance of a Langevin estimator, the situation can sometimes be more complicated when discretization is considered. To this end, we describe a discrete-time example in which irreversibility increases both the bias and variance of the resulting estimator.
Affiliation(s)
- Benjamin J. Zhang
- Department of Aeronautics and Astronautics, Center for Computational Science and Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Youssef M. Marzouk
- Department of Aeronautics and Astronautics, Center for Computational Science and Engineering, Massachusetts Institute of Technology, Cambridge, USA
5
Kim S, Kim JK, Ahn KW. A calibrated Bayesian method for the stratified proportional hazards model with missing covariates. Lifetime Data Anal 2022; 28:169-193. [PMID: 35034213 PMCID: PMC8977246 DOI: 10.1007/s10985-021-09542-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2020] [Accepted: 12/21/2021] [Indexed: 06/14/2023]
Abstract
Missing covariates are commonly encountered when evaluating covariate effects on survival outcomes. Excluding missing data from the analysis may lead to biased parameter estimation and a misleading conclusion. The inverse probability weighting method is widely used to handle missing covariates. However, obtaining asymptotic variance in frequentist inference is complicated because it involves estimating parameters for propensity scores. In this paper, we propose a new approach based on an approximate Bayesian method without using Taylor expansion to handle missing covariates for survival data. We consider a stratified proportional hazards model so that it can be used for the non-proportional hazards structure. Two cases for missing pattern are studied: a single missing pattern and multiple missing patterns. The proposed estimators are shown to be consistent and asymptotically normal, which matches the frequentist asymptotic properties. Simulation studies show that our proposed estimators are asymptotically unbiased and the credible region obtained from posterior distribution is close to the frequentist confidence interval. The algorithm is straightforward and computationally efficient. We apply the proposed method to a stem cell transplantation data set.
Affiliation(s)
- Soyoung Kim
- Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, 53226-0509, USA.
- Jae-Kwang Kim
- Department of Statistics, Iowa State University, 2438 Osborn Dr Ames, Ames, IA, 50011-1090, USA
- Kwang Woo Ahn
- Division of Biostatistics, Medical College of Wisconsin, Milwaukee, WI, 53226-0509, USA
6
Bottolo L, Banterle M, Richardson S, Ala-Korpela M, Järvelin MR, Lewin A. A computationally efficient Bayesian seemingly unrelated regressions model for high-dimensional quantitative trait loci discovery. J R Stat Soc Ser C Appl Stat 2021; 70:886-908. [PMID: 35001978 PMCID: PMC7612194 DOI: 10.1111/rssc.12490] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/02/2023]
Abstract
Our work is motivated by the search for metabolite quantitative trait loci (QTL) in a cohort of more than 5000 people. There are 158 metabolites measured by NMR spectroscopy in the 31-year follow-up of the Northern Finland Birth Cohort 1966 (NFBC66). These metabolites, as with many multivariate phenotypes produced by high-throughput biomarker technology, exhibit strong correlation structures. Existing approaches for combining such data with genetic variants for multivariate QTL analysis generally ignore phenotypic correlations or make restrictive assumptions about the associations between phenotypes and genetic loci. We present a computationally efficient Bayesian seemingly unrelated regressions model for high-dimensional data, with cell-sparse variable selection and sparse graphical structure for covariance selection. Cell sparsity allows different phenotype responses to be associated with different genetic predictors and the graphical structure is used to represent the conditional dependencies between phenotype variables. To achieve feasible computation of the large model space, we exploit a factorisation of the covariance matrix. Applying the model to the NFBC66 data with 9000 directly genotyped single nucleotide polymorphisms, we are able to simultaneously estimate genotype-phenotype associations and the residual dependence structure among the metabolites. The R package BayesSUR with full documentation is available at https://cran.r-project.org/web/packages/BayesSUR/.
Affiliation(s)
- Leonardo Bottolo
- Department of Medical Genetics, University of Cambridge, Cambridge, UK
- The Alan Turing Institute, London, UK
- MRC Biostatistics Unit, Cambridge, UK
- Marco Banterle
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
- Sylvia Richardson
- The Alan Turing Institute, London, UK
- MRC Biostatistics Unit, Cambridge, UK
- Mika Ala-Korpela
- Computational Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, Oulu, Finland
- NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland
- Marjo-Riitta Järvelin
- Center for Life Course Health Research, University of Oulu, Oulu, Finland
- Biocenter Oulu, University of Oulu, Oulu, Finland
- Department of Epidemiology and Biostatistics, Imperial College London, London, UK
- MRC-PHE Centre for Environment and Health, Imperial College London, London, UK
- Department of Life Sciences, Brunel University London, Uxbridge, UK
- Alex Lewin
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
7
Brown DA, McMahan CS, Self SW. Sampling Strategies for Fast Updating of Gaussian Markov Random Fields. Am Stat 2019; 75:52-65. [PMID: 33716305 PMCID: PMC7954130 DOI: 10.1080/00031305.2019.1595144] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/29/2018] [Revised: 02/02/2019] [Accepted: 03/06/2019] [Indexed: 10/27/2022]
Abstract
Gaussian Markov random fields (GMRFs) are popular for modeling dependence in large areal datasets due to their ease of interpretation and computational convenience afforded by the sparse precision matrices needed for random variable generation. Typically in Bayesian computation, GMRFs are updated jointly in a block Gibbs sampler or componentwise in a single-site sampler via the full conditional distributions. The former approach can speed convergence by updating correlated variables all at once, while the latter avoids solving large matrices. We consider a sampling approach in which the underlying graph can be cut so that conditionally independent sites are updated simultaneously. This algorithm allows a practitioner to parallelize updates of subsets of locations or to take advantage of 'vectorized' calculations in a high-level language such as R. Through both simulated and real data, we demonstrate computational savings that can be achieved versus both single-site and block updating, regardless of whether the data are on a regular or an irregular lattice. The approach provides a good compromise between statistical and computational efficiency and is accessible to statisticians without expertise in numerical analysis or advanced computing.
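The idea of cutting the graph so that conditionally independent sites can be updated together can be sketched on a regular lattice with a two-colour (checkerboard) partition. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes a zero-mean GMRF with precision matrix tau * (kappa * I + D - W) on an n-by-n grid with first-order neighbours, and the function name and parameter values are hypothetical. Under that precision, each site's full conditional is Gaussian with mean (sum of neighbours) / (kappa + number of neighbours) and precision tau * (kappa + number of neighbours). Sites of one colour have neighbours only of the other colour, so the inner loop over a colour class could be replaced by a vectorized or parallel update without changing the sampler.

```python
import math
import random


def checkerboard_gibbs(n=8, sweeps=200, tau=1.0, kappa=0.5, seed=0):
    """Gibbs sampler for a GMRF on an n x n lattice, updating one colour
    class of a checkerboard partition at a time (illustrative model only)."""
    rng = random.Random(seed)
    x = [[0.0] * n for _ in range(n)]

    # Two-colour the lattice: first-order neighbours always differ in colour.
    colours = [
        [(i, j) for i in range(n) for j in range(n) if (i + j) % 2 == c]
        for c in (0, 1)
    ]

    for _ in range(sweeps):
        for sites in colours:
            # Sites in this class are conditionally independent given the
            # other class, so the order of these updates does not matter.
            for i, j in sites:
                nbrs = [
                    x[a][b]
                    for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                    if 0 <= a < n and 0 <= b < n
                ]
                weight = kappa + len(nbrs)
                cond_mean = sum(nbrs) / weight
                cond_sd = 1.0 / math.sqrt(tau * weight)
                x[i][j] = rng.gauss(cond_mean, cond_sd)
    return x
```

On an irregular lattice the same scheme would need a general graph colouring with more than two colours; the checkerboard is the special case for a regular grid.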
Affiliation(s)
- D Andrew Brown
- School of Mathematical and Statistical Sciences, Clemson University, Clemson, SC, USA 29634-0975
8
Abstract
Signatures of recent historical admixture are ubiquitous in human populations. We present a mechanistic model of admixture with two source populations, encompassing recurrent admixture periods and study the distribution of admixture fractions for finite but arbitrary genome size. We provide simulation-based methods to estimate the introgression parameters and discuss the implications of reaching stationarity on estimability of parameters when there are recurrent admixture events with different rates.
Affiliation(s)
- Erkan Ozge Buzbas
- Department of Statistical Science, University of Idaho, United States.
- Paul Verdu
- CNRS/MNHN/Université Paris Diderot/Sorbonne Paris Cité, France
9
Scribner KT, Soiseth C, McGuire J, Sage GK, Thorsteinson L, Nielsen JL, Knudsen E. Genetic assessment of the effects of streamscape succession on coho salmon Oncorhynchus kisutch colonization in recently deglaciated streams. J Fish Biol 2017; 91:195-218. [PMID: 28523791 DOI: 10.1111/jfb.13337] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/25/2016] [Accepted: 04/20/2017] [Indexed: 06/07/2023]
Abstract
Measures of genetic diversity within and among populations and historical geomorphological data on stream landscapes were used in model simulations based on approximate Bayesian computation (ABC) to examine hypotheses of the relative importance of stream features (geomorphology and age) associated with colonization events and gene flow for coho salmon Oncorhynchus kisutch breeding in recently deglaciated streams (50-240 years b.p.) in Glacier Bay National Park (GBNP), Alaska. Population estimates of genetic diversity including heterozygosity and allelic richness declined significantly and monotonically from the oldest and largest to youngest and smallest GBNP streams. Interpopulation variance in allele frequency increased with increasing distance between streams (r = 0·435, P < 0·01) and was inversely related to stream age (r = -0·281, P < 0·01). The most supported model of colonization involved ongoing or recent (<10 generations before sampling) colonization originating from large populations outside Glacier Bay proper into all other GBNP streams sampled. Results here show that sustained gene flow from large source populations is important to recently established O. kisutch metapopulations. Studies that document how genetic and demographic characteristics of newly founded populations vary associated with successional changes in stream habitat are of particular importance to and have significant implications for, restoration of declining or repatriation of extirpated populations in other regions of the species' native range.
Affiliation(s)
- K T Scribner
- Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI, 48824-1222, U.S.A
- Department of Integrative Biology, Michigan State University, East Lansing, MI, 48824-1222, U.S.A
- C Soiseth
- Glacier Bay National Park and Preserve, P. O. Box 140, Gustavus, AK, 99826, U.S.A
- J McGuire
- Department of Integrative Biology, Michigan State University, East Lansing, MI, 48824-1222, U.S.A
- G K Sage
- U. S. Geological Survey, Alaska Science Center, 4210 University Drive, Anchorage, AK, 99508, U.S.A
- L Thorsteinson
- Alaska Region, U. S. Geological Survey, 250 Egan Drive, Juneau, AK, 99801, U.S.A
- J L Nielsen
- U. S. Geological Survey, Alaska Science Center, 4210 University Drive, Anchorage, AK, 99508, U.S.A
- E Knudsen
- U. S. Geological Survey, Alaska Science Center, 4210 University Drive, Anchorage, AK, 99508, U.S.A
10
Torabi M. Zero-inflated spatio-temporal models for disease mapping. Biom J 2017; 59:430-444. [PMID: 28187237 DOI: 10.1002/bimj.201600120] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 05/27/2016] [Revised: 12/06/2016] [Accepted: 12/06/2016] [Indexed: 11/07/2022]
Abstract
In this paper, our aim is to analyze geographical and temporal variability of disease incidence when spatio-temporal count data have excess zeros. To that end, we consider random effects in zero-inflated Poisson models to investigate geographical and temporal patterns of disease incidence. Spatio-temporal models that employ conditionally autoregressive smoothing across the spatial dimension and B-spline smoothing over the temporal dimension are proposed. The analysis of these complex models is computationally difficult from the frequentist perspective. On the other hand, the advent of the Markov chain Monte Carlo algorithm has made the Bayesian analysis of complex models computationally convenient. Recently developed data cloning method provides a frequentist approach to mixed models that is also computationally convenient. We propose to use data cloning, which yields to maximum likelihood estimation, to conduct frequentist analysis of zero-inflated spatio-temporal modeling of disease incidence. One of the advantages of the data cloning approach is that the prediction and corresponding standard errors (or prediction intervals) of smoothing disease incidence over space and time is easily obtained. We illustrate our approach using a real dataset of monthly children asthma visits to hospital in the province of Manitoba, Canada, during the period April 2006 to March 2010. Performance of our approach is also evaluated through a simulation study.
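The principle behind data cloning can be seen in a toy conjugate example, which is far simpler than the zero-inflated spatio-temporal models of the paper and is offered only as a sketch. Assume independent Poisson counts with a Gamma(shape, rate) prior on the rate; the prior values, the example counts, and the function name `cloned_posterior` are all illustrative assumptions. Treating the data as if K independent copies had been observed gives a Gamma posterior whose mean approaches the maximum likelihood estimate (the sample mean) as K grows, while K times the posterior variance approaches the usual large-sample variance estimate (sample mean divided by sample size).

```python
def cloned_posterior(counts, clones, prior_shape=1.0, prior_rate=1.0):
    """Posterior mean and clone-scaled variance for a Poisson rate with a
    Gamma prior when the data are replicated `clones` times."""
    total, n = sum(counts), len(counts)
    shape = prior_shape + clones * total   # Gamma posterior shape
    rate = prior_rate + clones * n         # Gamma posterior rate
    mean = shape / rate
    variance = shape / rate ** 2
    # Multiplying by the number of clones recovers a variance estimate
    # for the maximum likelihood estimator in the limit of many clones.
    return mean, clones * variance


counts = [3, 0, 2, 5, 1, 4]  # made-up counts for illustration
for k in (1, 10, 1000):
    post_mean, scaled_var = cloned_posterior(counts, k)
    print(k, round(post_mean, 4), round(scaled_var, 4))
```

In non-conjugate models the cloned posterior is not available in closed form and would typically be explored by Markov chain Monte Carlo, with the same limiting interpretation.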
Affiliation(s)
- Mahmoud Torabi
- Department of Community Health Sciences, University of Manitoba, Winnipeg, MB R3E 0W3, Canada
11
Torabi M. Hierarchical multivariate mixture generalized linear models for the analysis of spatial data: An application to disease mapping. Biom J 2016; 58:1138-50. [PMID: 27374632 DOI: 10.1002/bimj.201500248] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 11/24/2015] [Revised: 04/25/2016] [Accepted: 05/03/2016] [Indexed: 11/09/2022]
Abstract
Disease mapping of a single disease has been widely studied in the public health setup. Simultaneous modeling of related diseases can also be a valuable tool both from the epidemiological and from the statistical point of view. In particular, when we have several measurements recorded at each spatial location, we need to consider multivariate models in order to handle the dependence among the multivariate components as well as the spatial dependence between locations. It is then customary to use multivariate spatial models assuming the same distribution through the entire population density. However, in many circumstances, it is a very strong assumption to have the same distribution for all the areas of population density. To overcome this issue, we propose a hierarchical multivariate mixture generalized linear model to simultaneously analyze spatial Normal and non-Normal outcomes. As an application of our proposed approach, esophageal and lung cancer deaths in Minnesota are used to show the outperformance of assuming different distributions for different counties of Minnesota rather than assuming a single distribution for the population density. Performance of the proposed approach is also evaluated through a simulation study.
Affiliation(s)
- Mahmoud Torabi
- Department of Community Health Sciences, University of Manitoba, S113 Medical Services Building, 750 Bannatyne Ave., Winnipeg, MB, Canada, R3E 0W3.
12
Abstract
This paper presents a Bayesian hierarchical spatiotemporal method of interpolation, termed as Markov Cube Kriging (MCK). The classical Kriging methods become computationally prohibitive, especially for large datasets due to the O(n^3) matrix decomposition. MCK offers novel and computationally efficient solutions to address spatiotemporal misalignment, mismatch in the spatiotemporal scales and missing values across space and time in large spatiotemporal datasets. MCK is flexible in that it allows for non-separable spatiotemporal structure and nonstationary covariance at the hierarchical spatiotemporal scales. Employing MCK we developed estimates of daily concentration of fine particulates matter ≤2.5 μm in aerodynamic diameter (PM2.5) at 2.5 km spatial grid for the Cleveland Metropolitan Statistical Area, 2000 to 2009. Our validation and cross-validation suggest that MCK achieved robust prediction of spatiotemporal random effects and underlying hierarchical and nonstationary spatiotemporal structure in air pollution data. MCK has important implications for environmental epidemiology and environmental sciences for exposure quantification and collocation of data from different sources, available at different spatiotemporal scales.
Affiliation(s)
- Dong Liang
- Department of Epidemiology, University of Iowa, Iowa City, IA 52242, USA
- Naresh Kumar
- Department of Epidemiology and Public Health, University of Miami, 1425 NW 10 Ave, Suite 308C, Miami, FL 33136, USA