1.
Martin J, Elster C. Aleatoric Uncertainty for Errors-in-Variables Models in Deep Regression. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-11066-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
A Bayesian treatment of deep learning allows for the computation of uncertainties associated with the predictions of deep neural networks. We show how the concept of Errors-in-Variables can be used in Bayesian deep regression to also account for the uncertainty associated with the input of the employed neural network. The presented approach thereby exploits a relevant, but generally overlooked, source of uncertainty and yields a decomposition of the predictive uncertainty into an aleatoric and an epistemic part that is more complete and, in many cases, more consistent from a statistical perspective. We discuss the approach using various simulated and real examples and observe that using an Errors-in-Variables model leads to an increase in the uncertainty while preserving the prediction performance of models without Errors-in-Variables. For examples with a known regression function, we observe that this ground truth is substantially better covered by the Errors-in-Variables model, indicating that the presented approach leads to a more reliable uncertainty estimation.
2.
Pavone A, Svensson J, Krychowiak M, Hergenhahn U, Winters V, Kornejew P, Kwak S, Hoefel U, Koenig R, Wolf RC. Neural network surrogates of Bayesian diagnostic models for fast inference of plasma parameters. THE REVIEW OF SCIENTIFIC INSTRUMENTS 2021; 92:033531. [PMID: 33820062 DOI: 10.1063/5.0043772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2021] [Accepted: 02/28/2021] [Indexed: 06/12/2023]
Abstract
We present a framework for training artificial neural networks (ANNs) as surrogate Bayesian models for the inference of plasma parameters from diagnostic data collected at nuclear fusion experiments, with the purpose of providing a fast approximation of conventional Bayesian inference. Because of the complexity of the models involved, conventional Bayesian inference can require tens of minutes for analyzing one single measurement, while hundreds of thousands can be collected during a single plasma discharge. The ANN surrogates can reduce the analysis time down to tens/hundreds of microseconds per single measurement. The core idea is to generate the training data by sampling them from the joint probability distribution of the parameters and observations of the original Bayesian model. The network can be trained to learn the reconstruction of plasma parameters from observations and the model joint probability distribution from plasma parameters and observations. Previous work has validated the application of such a framework to the former case at the Wendelstein 7-X and Joint European Torus experiments. Here, we first give a description of the general methodological principles allowing us to generate the training data, and then we show an example application of the reconstruction of the joint probability distribution of an effective ion charge Zeff-bremsstrahlung model from data collected at the latest W7-X experimental campaign. One key feature of such an approach is that the network is trained exclusively on data generated with the Bayesian model, requiring no experimental data. This allows us to replicate the training scheme and generate fast, surrogate ANNs for any validated Bayesian diagnostic model.
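The core idea in this abstract, generating training data by sampling from the joint distribution of parameters and observations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the uniform prior, the toy forward model `y = 2 * theta + noise`, and the names `sample_joint` and `fit_line` are hypothetical, and an ordinary least-squares line stands in for the neural network described in the paper.

```python
import random

def sample_joint(n, rng, noise_sd=0.05):
    """Draw (theta, y) pairs from a toy joint p(theta) * p(y | theta)."""
    pairs = []
    for _ in range(n):
        theta = rng.uniform(1.0, 3.0)               # hypothetical prior over the parameter
        y = 2.0 * theta + rng.gauss(0.0, noise_sd)  # toy forward model plus observation noise
        pairs.append((theta, y))
    return pairs

def fit_line(pairs):
    """Least-squares fit theta ~ a * y + b (a stand-in for the ANN surrogate)."""
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((y - mean_y) * (t - mean_t) for t, y in pairs)
    var = sum((y - mean_y) ** 2 for _, y in pairs)
    a = cov / var
    b = mean_t - a * mean_y
    return a, b

rng = random.Random(0)
train = sample_joint(2000, rng)       # training set comes only from the model
a, b = fit_line(train)                # "train" the surrogate
test_pairs = sample_joint(500, rng)   # fresh draws for evaluation
rmse = (sum((a * y + b - t) ** 2 for t, y in test_pairs) / len(test_pairs)) ** 0.5
print(f"slope={a:.3f} intercept={b:.3f} rmse={rmse:.3f}")
```

As in the abstract, no experimental data enters the fit: both the training and the evaluation pairs are drawn from the model itself.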
Affiliation(s)
- A Pavone, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- J Svensson, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- M Krychowiak, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- U Hergenhahn, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- V Winters, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- P Kornejew, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- S Kwak, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- U Hoefel, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- R Koenig, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
- R C Wolf, Max-Planck-Institute for Plasma Physics, Greifswald 17491, Germany
3.
Matos F, Svensson J, Pavone A, Odstrčil T, Jenko F. Deep learning for Gaussian process soft x-ray tomography model selection in the ASDEX Upgrade tokamak. THE REVIEW OF SCIENTIFIC INSTRUMENTS 2020; 91:103501. [PMID: 33138591 DOI: 10.1063/5.0020680] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/02/2020] [Accepted: 09/29/2020] [Indexed: 06/11/2023]
Abstract
Gaussian process tomography (GPT) is a method used for obtaining real-time tomographic reconstructions of the plasma emissivity profile in tokamaks, given some model for the underlying physical processes involved. GPT can also be used, thanks to Bayesian formalism, to perform model selection, i.e., comparing different models and choosing the one with maximum evidence. However, the computations involved in this particular step may become slow for data with high dimensionality, especially when comparing the evidence for many different models. Using measurements collected by the Soft X-Ray (SXR) diagnostic in the ASDEX Upgrade tokamak, we train a convolutional neural network to map SXR tomographic projections to the corresponding GPT model whose evidence is highest. We then compare the network's results, and the time required to calculate them, with those obtained through analytical Bayesian formalism. In addition, we use the network's classifications to produce tomographic reconstructions of the plasma emissivity profile.
Affiliation(s)
- F Matos, Max Planck Institute for Plasma Physics, Boltzmannstr. 2, 85748 Garching, Germany
- J Svensson, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, 17491 Greifswald, Germany
- A Pavone, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, 17491 Greifswald, Germany
- T Odstrčil, Plasma Science and Fusion Center, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- F Jenko, Max Planck Institute for Plasma Physics, Boltzmannstr. 2, 85748 Garching, Germany
4.
Hoefel U, Hirsch M, Kwak S, Pavone A, Svensson J, Stange T, Hartfuß HJ, Schilling J, Weir G, Oosterbeek JW, Bozhenkov S, Braune H, Brunner KJ, Chaudhary N, Damm H, Fuchert G, Knauer J, Laqua H, Marsen S, Moseev D, Pasch E, Scott ER, Wilde F, Wolf R. Bayesian modeling of microwave radiometer calibration on the example of the Wendelstein 7-X electron cyclotron emission diagnostic. THE REVIEW OF SCIENTIFIC INSTRUMENTS 2019; 90:043502. [PMID: 31042980 DOI: 10.1063/1.5082542] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/21/2018] [Accepted: 03/17/2019] [Indexed: 06/09/2023]
Abstract
This paper reports a novel approach to the absolute intensity calibration of an electron cyclotron emission (ECE) spectroscopy system. Typically, an ECE radiometer consists of tens of separated frequency channels corresponding to different plasma locations. An absolute calibration of the overall diagnostic, including near-plasma optics and transmission line, is achieved with blackbody sources at LN2 temperature and room temperature via a hot/cold calibration mirror unit. As the thermal emission of the calibration source is typically a few thousand times lower than the receiver noise temperature, coherent averaging over several hours is required to get a sufficient signal-to-noise ratio. A forward model suitable for any radiometer calibration using the hot/cold method and a periodic switch between the two sources has been developed and used to extract the voltage difference between the hot and cold temperature sources via Bayesian analysis. In contrast to the classical analysis, which evaluates only the reference temperatures, the forward model takes into account intermediate effective temperatures caused by the finite beam width and thus uses all available data optimally. This allows the evaluation of weak channels where a classical analysis would not be feasible, is statistically rigorous, and provides a measurement of the beam width. By using a variance scaling factor, a model-sensitive adaptation of the absolute uncertainties can be implemented, which will be used for the combined diagnostic Bayesian modeling analysis.
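For orientation, the classical hot/cold analysis that this abstract contrasts with can be sketched as a two-point linear calibration. This is not the paper's Bayesian forward model. It assumes a linear channel response `V = gain * T + offset`, and the function names, the voltages, and the reference temperatures (77 K for LN2, 293 K for room temperature) are hypothetical values chosen for illustration.

```python
def two_point_calibration(v_hot, v_cold, t_hot, t_cold):
    """Return (gain, offset) of an assumed linear response V = gain * T + offset."""
    gain = (v_hot - v_cold) / (t_hot - t_cold)  # volts per kelvin
    offset = v_cold - gain * t_cold             # voltage at T = 0 under the linear model
    return gain, offset

def to_temperature(v, gain, offset):
    """Invert the linear response to get a temperature from a voltage."""
    return (v - offset) / gain

# Hypothetical averaged voltages for one channel (assumed values).
gain, offset = two_point_calibration(v_hot=1.086, v_cold=0.654,
                                     t_hot=293.0, t_cold=77.0)
t_measured = to_temperature(0.9, gain, offset)
print(f"gain={gain:.6f} V/K offset={offset:.3f} V T={t_measured:.1f} K")
```

Only the hot-minus-cold voltage difference sets the gain, which is why the abstract treats that difference as the quantity to extract; the sketch uses just the two reference points and ignores the intermediate effective temperatures that the forward model exploits.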
Affiliation(s)
- Udo Hoefel, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Matthias Hirsch, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Sehyun Kwak, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Andrea Pavone, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Jakob Svensson, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Torsten Stange, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Hans-Jürgen Hartfuß, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Jonathan Schilling, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Gavin Weir, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Sergey Bozhenkov, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Harald Braune, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Kai-Jakob Brunner, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Neha Chaudhary, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Hannes Damm, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Golo Fuchert, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Jens Knauer, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Heinrich Laqua, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Stefan Marsen, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Dmitry Moseev, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Ekkehard Pasch, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Evan R Scott, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Fabian Wilde, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany
- Robert Wolf, Max Planck Institute for Plasma Physics, Wendelsteinstr. 1, D-17491 Greifswald, Germany