1
|
Qi D. Unambiguous Models and Machine Learning Strategies for Anomalous Extreme Events in Turbulent Dynamical System. ENTROPY (BASEL, SWITZERLAND) 2024; 26:522. [PMID: 38920531 PMCID: PMC11202851 DOI: 10.3390/e26060522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Revised: 06/03/2024] [Accepted: 06/14/2024] [Indexed: 06/27/2024]
Abstract
Data-driven modeling methods are studied for turbulent dynamical systems with extreme events under an unambiguous model framework. New neural network architectures are proposed to effectively learn the key dynamical mechanisms, including multiscale coupling and strong instability, and to gain robust skill for long-time prediction that is resistant to the accumulated model errors from the data-driven approximation. The machine learning model overcomes the inherent limitations of traditional long short-term memory networks by exploiting a conditional Gaussian structure informed by the essential physical dynamics. The model performance is demonstrated on a prototype model of idealized geophysical flow and passive tracers, which admits analytical solutions with representative statistical features. The trained model shows many attractive properties in recovering the hidden dynamics from a limited dataset and sparse observation times, with uniformly high skill and persistent numerical stability in predicting both the trajectory and statistical solutions across statistical regimes away from the training regime. The model framework shows promise for application to a wider class of turbulent systems with complex structures.
Affiliation(s)
- Di Qi
  - Department of Mathematics, Purdue University, 150 North University Street, West Lafayette, IN 47907, USA
|
2
|
Jeon Y, Chang W, Jeong S, Han S, Park J. A Bayesian convolutional neural network-based generalized linear model. Biometrics 2024; 80:ujae057. [PMID: 38888097 DOI: 10.1093/biomtc/ujae057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2022] [Revised: 03/02/2024] [Accepted: 05/24/2024] [Indexed: 06/20/2024]
Abstract
Convolutional neural networks (CNNs) provide flexible function approximations for a wide variety of applications when the input variables are in the form of images or spatial data. Although CNNs often outperform traditional statistical models in prediction accuracy, statistical inference, such as estimating the effects of covariates and quantifying the prediction uncertainty, is not trivial due to the highly complicated model structure and overparameterization. To address this challenge, we propose a new Bayesian approach by embedding CNNs within the generalized linear models (GLMs) framework. We use extracted nodes from the last hidden layer of CNN with Monte Carlo (MC) dropout as informative covariates in GLM. This improves accuracy in prediction and regression coefficient inference, allowing for the interpretation of coefficients and uncertainty quantification. By fitting ensemble GLMs across multiple realizations from MC dropout, we can account for uncertainties in extracting the features. We apply our methods to biological and epidemiological problems, which have both high-dimensional correlated inputs and vector covariates. Specifically, we consider malaria incidence data, brain tumor image data, and fMRI data. By extracting information from correlated inputs, the proposed method can provide an interpretable Bayesian analysis. The algorithm can be broadly applicable to image regressions or correlated data analysis by enabling accurate Bayesian inference quickly.
Affiliation(s)
- Yeseul Jeon
  - Department of Statistics and Data Science, Yonsei University, Seoul 03722, South Korea
- Won Chang
  - Division of Statistics and Data Science, University of Cincinnati, Cincinnati, Ohio 45221, United States
  - Department of Statistics, Seoul National University, Seoul 08826, South Korea
- Seonghyun Jeong
  - Department of Statistics and Data Science, Yonsei University, Seoul 03722, South Korea
  - Department of Applied Statistics, Yonsei University, Seoul 03722, South Korea
- Sanghoon Han
  - Department of Psychology, Yonsei University, Seoul 03722, South Korea
- Jaewoo Park
  - Department of Statistics and Data Science, Yonsei University, Seoul 03722, South Korea
  - Department of Applied Statistics, Yonsei University, Seoul 03722, South Korea
|
3
|
Mahata A, Padhi R, Apte A. Variability of echo state network prediction horizon for partially observed dynamical systems. Phys Rev E 2023; 108:064209. [PMID: 38243433 DOI: 10.1103/physreve.108.064209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/10/2023] [Indexed: 01/21/2024]
Abstract
Study of dynamical systems using partial state observation is an important problem due to its applicability to many real-world systems. We address the problem by studying an echo state network (ESN) framework with partial state input and partial or full state output. Application to the Lorenz system and Chua's oscillator (both numerically simulated and experimental systems) demonstrates the effectiveness of our method. We show that the ESN, as an autonomous dynamical system, is capable of making short-term predictions up to a few Lyapunov times. However, the prediction horizon has high variability depending on the initial condition, an aspect that we explore in detail using the distribution of the prediction horizon. Further, using a variety of statistical metrics to compare the long-term dynamics of the ESN predictions with numerically simulated or experimental dynamics, and observing similar results, we show that the ESN can effectively learn the system's dynamics even when trained with noisy numerical or experimental data sets. Thus, we demonstrate the potential of ESNs to serve as cheap surrogate models for simulating the dynamics of systems where complete observations are unavailable.
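The prediction-horizon measurement mentioned in this abstract can be illustrated with a minimal sketch. It assumes, without confirmation from the paper, that the horizon is the first time at which the normalized forecast error exceeds a threshold, expressed in Lyapunov times. The function name `prediction_horizon`, the default threshold of 0.3, and the RMS normalization are hypothetical choices made only for illustration.

```python
import math


def prediction_horizon(truth, forecast, dt, lyapunov_exponent, threshold=0.3):
    """Return the time, in Lyapunov times, at which the forecast leaves the truth.

    truth, forecast: equal-length lists of state vectors (lists of floats).
    Assumed criterion (not from the paper): the Euclidean error divided by
    the RMS norm of the true trajectory first exceeds `threshold`.
    """
    if not truth or len(truth) != len(forecast):
        raise ValueError("truth and forecast must be non-empty and equal in length")
    # RMS norm of the true trajectory, used to make the error dimensionless.
    rms = math.sqrt(sum(sum(v * v for v in state) for state in truth) / len(truth))
    if rms == 0.0:
        raise ValueError("true trajectory has zero norm")
    for step, (t_state, f_state) in enumerate(zip(truth, forecast)):
        err = math.sqrt(sum((a - b) ** 2 for a, b in zip(t_state, f_state)))
        if err / rms > threshold:
            # Physical time step * dt, converted to Lyapunov times.
            return step * dt * lyapunov_exponent
    # The forecast never left the tolerance band within the available window.
    return len(truth) * dt * lyapunov_exponent


truth = [[1.0], [1.0], [1.0], [1.0]]
forecast = [[1.0], [1.1], [1.5], [2.0]]
horizon = prediction_horizon(truth, forecast, dt=0.5, lyapunov_exponent=2.0)
```

Calling such a function once per initial condition and collecting the results would give an empirical distribution of horizons, which is the kind of variability the abstract refers to.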
Affiliation(s)
- Ajit Mahata
  - Department of Data Science, Indian Institute of Science Education and Research, IISER Pune 411008, India
- Reetish Padhi
  - Department of Data Science, Indian Institute of Science Education and Research, IISER Pune 411008, India
- Amit Apte
  - Department of Data Science, Indian Institute of Science Education and Research, IISER Pune 411008, India
  - International Centre for Theoretical Sciences (ICTS-TIFR), Bengaluru 560089, India
|
4
|
|
5
|
Bayesian Modeling of Discrete-Time Point-Referenced Spatio-Temporal Data. J Indian Inst Sci 2022. [DOI: 10.1007/s41745-022-00298-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
6
|
Huang H, Castruccio S, Genton MG. Forecasting high‐frequency spatio‐temporal wind power with dimensionally reduced echo state networks. J R Stat Soc Ser C Appl Stat 2022. [DOI: 10.1111/rssc.12540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/27/2022]
Affiliation(s)
- Huang Huang
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, Indiana, USA
- Marc G. Genton
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
|
7
|
Parker PA, Holan SH, Wills SA. A general Bayesian model for heteroskedastic data with fully conjugate full-conditional distributions. J STAT COMPUT SIM 2021. [DOI: 10.1080/00949655.2021.1925279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Paul A. Parker
  - Department of Statistics, University of Missouri, Columbia, MO, USA
- Scott H. Holan
  - Department of Statistics, University of Missouri, Columbia, MO, USA
- Skye A. Wills
  - USDA-Natural Resources Conservation Service, National Soil Survey Center, Lincoln, NE, USA
|
8
|
Sakemi Y, Morino K, Leleu T, Aihara K. Model-size reduction for reservoir computing by concatenating internal states through time. Sci Rep 2020; 10:21794. [PMID: 33311595 PMCID: PMC7733507 DOI: 10.1038/s41598-020-78725-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/07/2020] [Accepted: 11/23/2020] [Indexed: 11/18/2022]
Abstract
Reservoir computing (RC) is a machine learning algorithm that can learn complex time series from data very rapidly based on the use of high-dimensional dynamical systems, such as random networks of neurons, called “reservoirs.” To implement RC in edge computing, it is highly important to reduce the amount of computational resources that RC requires. In this study, we propose methods that reduce the size of the reservoir by inputting the past or drifting states of the reservoir to the output layer at the current time step. To elucidate the mechanism of model-size reduction, the proposed methods are analyzed based on information processing capacity proposed by Dambre et al. (Sci Rep 2:514, 2012). In addition, we evaluate the effectiveness of the proposed methods on time-series prediction tasks: the generalized Hénon-map and NARMA. On these tasks, we found that the proposed methods were able to reduce the size of the reservoir up to one tenth without a substantial increase in regression error.
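The idea of feeding past reservoir states to the output layer can be sketched as a feature-construction step. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `concat_delayed_states` and the `delays` parameter are hypothetical, and training the readout (for example by linear regression) is not shown.

```python
def concat_delayed_states(states, delays):
    """Build readout inputs by concatenating the reservoir state at several lags.

    states: list of reservoir state vectors x(0), x(1), ..., x(T-1).
    delays: list of non-negative integer lags, e.g. [0, 2] uses x(t) and x(t-2).
    Returns one feature vector per time step t >= max(delays).
    """
    if not delays or any(d < 0 for d in delays):
        raise ValueError("delays must be a non-empty list of non-negative integers")
    start = max(delays)  # earliest step for which every lagged state exists
    features = []
    for t in range(start, len(states)):
        row = []
        for d in delays:
            row.extend(states[t - d])  # append the state from d steps ago
        features.append(row)
    return features


# Toy two-node reservoir trajectory over four time steps.
states = [[0, 1], [2, 3], [4, 5], [6, 7]]
features = concat_delayed_states(states, delays=[0, 2])
```

With N reservoir nodes and k lags, the readout receives N * k inputs, which suggests how a smaller reservoir might supply as many readout features as a larger one.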
Affiliation(s)
- Yusuke Sakemi
  - Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo, 153-8505, Japan
  - NEC Corporation, 1753 Shimonumabe Nakahara-ku, Kanagawa, 211-8666, Japan
- Kai Morino
  - Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo, 153-8505, Japan
  - Interdisciplinary Graduate School of Engineering Sciences, Kyushu University, 6-1 Kasuga-Koen, Kasuga-shi, Fukuoka, 816-8580, Japan
- Timothée Leleu
  - Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo, 153-8505, Japan
  - International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo Institutes for Advanced Study, The University of Tokyo, Tokyo, 113-0033, Japan
- Kazuyuki Aihara
  - Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba Meguro-ku, Tokyo, 153-8505, Japan
  - International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo Institutes for Advanced Study, The University of Tokyo, Tokyo, 113-0033, Japan
|
9
|
|
10
|
Spatiotemporal forecast with local temporal drift applied to weather patterns in Patagonia. SN APPLIED SCIENCES 2020. [DOI: 10.1007/s42452-020-2814-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/24/2022]
|
11
|
Joseph MB. Neural hierarchical models of ecological populations. Ecol Lett 2020; 23:734-747. [PMID: 31970895 DOI: 10.1111/ele.13462] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/16/2019] [Revised: 10/17/2019] [Accepted: 12/23/2019] [Indexed: 01/20/2023]
Abstract
Neural networks are increasingly being used in science to infer hidden dynamics of natural systems from noisy observations, a task typically handled by hierarchical models in ecology. This article describes a class of hierarchical models parameterised by neural networks - neural hierarchical models. The derivation of such models analogises the relationship between regression and neural networks. A case study is developed for a neural dynamic occupancy model of North American bird populations, trained on millions of detection/non-detection time series for hundreds of species, providing insights into colonisation and extinction at a continental scale. Flexible models are increasingly needed that scale to large data and represent ecological processes. Neural hierarchical models satisfy this need, providing a bridge between deep learning and ecological modelling that combines the function representation power of neural networks with the inferential capacity of hierarchical models.
Affiliation(s)
- Maxwell B Joseph
  - Earth Lab, Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO, 80303, USA
|
12
|
Covas E, Benetos E. Optimal neural network feature selection for spatial-temporal forecasting. CHAOS (WOODBURY, N.Y.) 2019; 29:063111. [PMID: 31266334 DOI: 10.1063/1.5095060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/07/2019] [Accepted: 05/24/2019] [Indexed: 06/09/2023]
Abstract
Neural networks, and in general machine learning techniques, have been widely employed in forecasting time series and more recently in predicting spatial-temporal signals. All of these approaches involve some kind of feature selection regarding what past data and what neighbor data to use for forecasting. In this article, we show extensive empirical evidence on how to independently construct the optimal feature selection or input representation used by the input layer of a feed forward neural network for the purpose of forecasting spatial-temporal signals. The approach is based on results from the dynamical systems theory, namely, nonlinear embedding theorems. We demonstrate it for a variety of spatial-temporal signals and show that the optimal input layer representation consists of a grid, with spatial-temporal lags determined by the minimum of the mutual information of the spatial-temporal signals and the number of points taken in space-time decided by the embedding dimension of the signal. We present evidence of this proposal by running a Monte Carlo simulation of several combinations of input layer feature designs and show that the one predicted by the nonlinear embedding theorems seems to be optimal or close to being optimal. In total, we show evidence in four unrelated systems: a series of coupled Hénon maps, a series of coupled ordinary differential equations (Lorenz-96) phenomenologically modeling atmospheric dynamics, the Kuramoto-Sivashinsky equation, a partial differential equation used in studies of instabilities in laminar flame fronts, and finally real physical data from sunspot areas in the Sun (in latitude and time) from 1874 to 2015. These four examples cover the range from simple toy models to complex nonlinear dynamical simulations and real data. Finally, we also compare our proposal against alternative feature selection methods and show that it also works for other machine learning forecasting models.
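The grid-shaped input representation described in this abstract can be sketched as a space-time delay embedding. The exact grid layout used in the paper is not given here, so this sketch assumes a symmetric spatial stencil, periodic boundaries in space, and lags supplied by the caller. The function name `spacetime_embedding` and all of its parameters are hypothetical.

```python
def spacetime_embedding(field, t, x, time_lag, space_lag, n_time, n_space):
    """Collect a grid of lagged values around site x at time t.

    field: list of rows, field[time][site].
    time_lag, space_lag: spacing of the grid in time and space (in the
        abstract these would come from the first minimum of the mutual
        information; here they are simply passed in).
    n_time: number of time levels (t, t - time_lag, ...).
    n_space: number of neighbours taken on each side of x.
    Assumption: the spatial domain is periodic.
    """
    if t - (n_time - 1) * time_lag < 0:
        raise ValueError("not enough history for the requested embedding")
    width = len(field[0])
    vector = []
    for i in range(n_time):
        row = field[t - i * time_lag]  # go back i temporal lags
        for j in range(-n_space, n_space + 1):
            vector.append(row[(x + j * space_lag) % width])  # wrap around in space
    return vector


# Toy field whose value encodes its own coordinates: 10 * time + site.
field = [[10 * time + site for site in range(5)] for time in range(4)]
inputs = spacetime_embedding(field, t=3, x=0, time_lag=2, space_lag=1,
                             n_time=2, n_space=1)
```

The resulting vector would serve as the input layer of a forecasting network; here its length is n_time * (2 * n_space + 1), a count that the abstract ties to the embedding dimension of the signal.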
Affiliation(s)
- E Covas
  - CITEUC, Geophysical and Astronomical Observatory, University of Coimbra, 3040-004 Coimbra, Portugal
- E Benetos
  - School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, United Kingdom
|
13
|
Prediction of North Atlantic Oscillation Index with Convolutional LSTM Based on Ensemble Empirical Mode Decomposition. ATMOSPHERE 2019. [DOI: 10.3390/atmos10050252] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/16/2022]
Abstract
The North Atlantic Oscillation (NAO) is the most significant mode of the atmosphere in the North Atlantic, and it plays an important role in regulating the local weather and climate and even those of the entire Northern Hemisphere. Therefore, it is vital to predict NAO events. Since the NAO event can be quantified by the NAO index, an effective neural network model EEMD-ConvLSTM, which is based on Convolutional Long Short-Term Memory (ConvLSTM) with Ensemble Empirical Mode Decomposition (EEMD), is proposed for NAO index prediction in this paper. EEMD is applied to preprocess NAO index data, which are issued by the National Oceanic and Atmospheric Administration (NOAA), and NAO index data are decomposed into several Intrinsic Mode Functions (IMFs). After being filtered by the energy threshold, the remaining IMFs are used to reconstruct new NAO index data as the input of ConvLSTM. In order to evaluate the performance of EEMD-ConvLSTM, six methods were selected as the benchmark, which included traditional models, machine learning algorithms, and other deep neural networks. Furthermore, we forecast the NAO index with EEMD-ConvLSTM and the Rolling Forecast (RF) and compared the results with those of Global Forecast System (GFS) and the averaging of 11 Medium Range Forecast (MRF) model ensemble members (ENSM) provided by the NOAA Climate Prediction Center. The experimental results show that EEMD-ConvLSTM not only has the highest reliability from evaluation metrics, but also can better capture the variation trend of the NAO index data.
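The filtering step in this abstract, where IMFs are screened by an energy threshold and the survivors are summed into a new series, can be sketched as follows. The decomposition itself is not shown, and the precise threshold rule is not stated in the abstract. This sketch assumes an IMF is kept when its share of the total energy reaches a chosen fraction; the function name `reconstruct_from_imfs` and the default of 0.05 are hypothetical.

```python
def reconstruct_from_imfs(imfs, energy_fraction=0.05):
    """Sum the IMFs whose share of the total energy is at least `energy_fraction`.

    imfs: list of equal-length series (lists of floats), assumed to come
        from a prior decomposition such as EEMD.
    Energy is taken as the sum of squared samples (an assumption).
    """
    energies = [sum(v * v for v in imf) for imf in imfs]
    total = sum(energies)
    if total == 0:
        raise ValueError("all IMFs have zero energy")
    # Keep only the components that carry enough of the signal energy.
    kept = [imf for imf, e in zip(imfs, energies) if e / total >= energy_fraction]
    length = len(imfs[0])
    # Pointwise sum of the retained components.
    return [sum(imf[k] for imf in kept) for k in range(length)]


imfs = [
    [0.1, -0.1, 0.1, -0.1],  # low-energy component, filtered out
    [1.0, 2.0, 1.0, 2.0],
    [3.0, 3.0, 3.0, 3.0],
]
series = reconstruct_from_imfs(imfs)
```

The reconstructed series would then play the role of the network input described in the abstract.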
|
14
|
Wikle CK. Comparison of Deep Neural Networks and Deep Hierarchical Models for Spatio-Temporal Data. JOURNAL OF AGRICULTURAL, BIOLOGICAL AND ENVIRONMENTAL STATISTICS 2019. [DOI: 10.1007/s13253-019-00361-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/24/2022]
|
15
|
Bayesian Recurrent Neural Network Models for Forecasting and Quantifying Uncertainty in Spatial-Temporal Data. ENTROPY 2019; 21:e21020184. [PMID: 33266899 PMCID: PMC7514666 DOI: 10.3390/e21020184] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 12/29/2018] [Revised: 02/03/2019] [Accepted: 02/12/2019] [Indexed: 11/20/2022]
Abstract
Recurrent neural networks (RNNs) are nonlinear dynamical models commonly used in the machine learning and dynamical systems literature to represent complex dynamical or sequential relationships between variables. Recently, as deep learning models have become more common, RNNs have been used to forecast increasingly complicated systems. Dynamical spatio-temporal processes represent a class of complex systems that can potentially benefit from these types of models. Although the RNN literature is expansive and highly developed, uncertainty quantification is often ignored. Even when considered, the uncertainty is generally quantified without the use of a rigorous framework, such as a fully Bayesian setting. Here we attempt to quantify uncertainty in a more formal framework while maintaining the forecast accuracy that makes these models appealing, by presenting a Bayesian RNN model for nonlinear spatio-temporal forecasting. Additionally, we make simple modifications to the basic RNN to help accommodate the unique nature of nonlinear spatio-temporal data. The proposed model is applied to a Lorenz simulation and two real-world nonlinear spatio-temporal forecasting applications.
|