1. Schwabe D, Becker K, Seyferth M, Klaß A, Schaeffter T. The METRIC-framework for assessing data quality for trustworthy AI in medicine: a systematic review. NPJ Digit Med 2024; 7:203. [PMID: 39097662] [PMCID: PMC11297942] [DOI: 10.1038/s41746-024-01196-4]
Abstract
The adoption of machine learning (ML) and, more specifically, deep learning (DL) applications into all major areas of our lives is underway. The development of trustworthy AI is especially important in medicine due to the large implications for patients' lives. While trustworthiness concerns various aspects including ethical, transparency and safety requirements, we focus on the importance of data quality (training/test) in DL. Since data quality dictates the behaviour of ML products, evaluating data quality will play a key part in the regulatory approval of medical ML products. We perform a systematic review following PRISMA guidelines using the databases Web of Science, PubMed and ACM Digital Library. We identify 5408 studies, out of which 120 records fulfil our eligibility criteria. From this literature, we synthesise the existing knowledge on data quality frameworks and combine it with the perspective of ML applications in medicine. As a result, we propose the METRIC-framework, a specialised data quality framework for medical training data comprising 15 awareness dimensions, along which developers of medical ML applications should investigate the content of a dataset. This knowledge helps to reduce biases as a major source of unfairness, increase robustness, facilitate interpretability and thus lays the foundation for trustworthy AI in medicine. The METRIC-framework may serve as a base for systematically assessing training datasets, establishing reference datasets, and designing test datasets, which has the potential to accelerate the approval of medical ML products.
Affiliation(s)
- Daniel Schwabe
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany.
- Katinka Becker
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Martin Seyferth
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Andreas Klaß
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Tobias Schaeffter
- Division Medical Physics and Metrological Information Technology, Physikalisch-Technische Bundesanstalt, Berlin, Germany
- Department of Medical Engineering, Technical University Berlin, Berlin, Germany
- Einstein Centre for Digital Future, Berlin, Germany
2. Chang KW, Karthikesh MS, Zhu Y, Hudson HM, Barbay S, Bundy D, Guggenmos DJ, Frost S, Nudo RJ, Wang X, Yang X. Photoacoustic imaging of squirrel monkey cortical responses induced by peripheral mechanical stimulation. J Biophotonics 2024; 17:e202300347. [PMID: 38171947] [PMCID: PMC10961203] [DOI: 10.1002/jbio.202300347]
Abstract
Non-human primates (NHPs) are crucial models for studies of neuronal activity. Emerging photoacoustic imaging modalities offer excellent tools for studying NHP brains with high sensitivity and high spatial resolution. In this research, a photoacoustic microscopy (PAM) device was used to provide a label-free quantitative characterization of cerebral hemodynamic changes due to peripheral mechanical stimulation. A 5 × 5 mm area within the somatosensory cortex region of an adult squirrel monkey was imaged. A deep, fully connected neural network was characterized and applied to the PAM images of the cortex to enhance the vessel structures after mechanical stimulation on the forelimb digits. The quality of the PAM images was improved significantly with a neural network while preserving the hemodynamic responses. The functional responses to the mechanical stimulation were characterized based on the improved PAM images. This study demonstrates the capability of PAM combined with machine learning for functional imaging of the NHP brain.
Affiliation(s)
- Kai-Wei Chang
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, Michigan, 48109, United States
- Yunhao Zhu
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, Michigan, 48109, United States
- Heather M. Hudson
- Landon Center on Aging, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Department of Rehabilitation Medicine, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Scott Barbay
- Landon Center on Aging, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Department of Rehabilitation Medicine, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- David Bundy
- Landon Center on Aging, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Department of Rehabilitation Medicine, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- David J. Guggenmos
- Landon Center on Aging, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Department of Rehabilitation Medicine, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Shawn Frost
- Landon Center on Aging, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Department of Rehabilitation Medicine, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Randolph J. Nudo
- Landon Center on Aging, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Department of Rehabilitation Medicine, University of Kansas Medical Center, Kansas City, Kansas, 66160, United States
- Xueding Wang
- Department of Biomedical Engineering, University of Michigan, Ann Arbor, Michigan, 48109, United States
- Xinmai Yang
- Bioengineering Graduate Program and Institute for Bioengineering Research, University of Kansas, Lawrence, Kansas, 66045, United States
- Department of Mechanical Engineering, University of Kansas, Lawrence, Kansas, 66045, United States
3. Lacan A, Sebag M, Hanczar B. GAN-based data augmentation for transcriptomics: survey and comparative assessment. Bioinformatics 2023; 39:i111-i120. [PMID: 37387181] [DOI: 10.1093/bioinformatics/btad239]
Abstract
MOTIVATION Transcriptomics data are becoming more accessible due to high-throughput and less costly sequencing methods. However, data scarcity prevents exploiting deep learning models' full predictive power for phenotype prediction. Artificially enhancing the training sets, namely data augmentation, is suggested as a regularization strategy. Data augmentation corresponds to label-invariant transformations of the training set (e.g. geometric transformations on images and syntax parsing on text data). Such transformations are, unfortunately, unknown in the transcriptomic field. Therefore, deep generative models such as generative adversarial networks (GANs) have been proposed to generate additional samples. In this article, we analyze GAN-based data augmentation strategies with respect to performance indicators and the classification of cancer phenotypes. RESULTS This work highlights a significant boost in binary and multiclass classification performances due to augmentation strategies. Without augmentation, training a classifier on only 50 RNA-seq samples yields an accuracy of, respectively, 94% and 70% for binary and tissue classification. In comparison, we achieved 98% and 94% accuracy when adding 1000 augmented samples. Richer architectures and more expensive training of the GAN yield better augmentation performance and generated data quality overall. Further analysis of the generated data shows that several performance indicators are needed to assess its quality correctly. AVAILABILITY AND IMPLEMENTATION All data used for this research are publicly available and come from The Cancer Genome Atlas. Reproducible code is available on the GitLab repository: https://forge.ibisc.univ-evry.fr/alacan/GANs-for-transcriptomics.
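The augmentation workflow evaluated in this abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' code: a class-conditional Gaussian sampler stands in for a trained GAN generator, and the dataset shapes (50 real samples, 1000 synthetic ones) mirror the numbers quoted above; all names and dimensions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a small RNA-seq training set: 50 samples, 20 "genes", 2 classes.
X = np.vstack([rng.normal(0.0, 1.0, (25, 20)), rng.normal(1.5, 1.0, (25, 20))])
y = np.array([0] * 25 + [1] * 25)

def fit_generator(X, y):
    """Class-conditional Gaussian sampler -- a crude stand-in for a
    trained GAN generator (the paper trains actual GANs)."""
    stats = {c: (X[y == c].mean(0), X[y == c].std(0)) for c in np.unique(y)}
    def sample(c, n):
        mu, sd = stats[c]
        return rng.normal(mu, sd, (n, mu.size))
    return sample

sample = fit_generator(X, y)

# Augment the 50 real samples with 1000 synthetic ones (500 per class).
X_syn = np.vstack([sample(0, 500), sample(1, 500)])
y_syn = np.array([0] * 500 + [1] * 500)
X_aug, y_aug = np.vstack([X, X_syn]), np.concatenate([y, y_syn])
```

A classifier would then be trained on `(X_aug, y_aug)` instead of `(X, y)`; the abstract reports accuracy gains of this kind on TCGA data.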
Affiliation(s)
- Alice Lacan
- IBISC, University Paris-Saclay (Univ. Evry), Evry 91000, France
- Michèle Sebag
- TAU, CNRS-INRIA-LISN, University Paris-Saclay, Gif-sur-Yvette 91190, France
- Blaise Hanczar
- IBISC, University Paris-Saclay (Univ. Evry), Evry 91000, France
4. Sum J, Leung CS. Regularization Effect of Random Node Fault/Noise on Gradient Descent Learning Algorithm. IEEE Trans Neural Netw Learn Syst 2023; 34:2619-2632. [PMID: 34487503] [DOI: 10.1109/tnnls.2021.3107051]
Abstract
For decades, adding fault/noise during training by gradient descent has been a technique for getting a neural network (NN) tolerant to persistent fault/noise or getting an NN with better generalization. In recent years, this technique has been readvocated in deep learning to avoid overfitting. Yet, the objective function of such fault/noise injection learning has been misinterpreted as the desired measure (i.e., the expected mean squared error (MSE) of the training samples) of the NN with the same fault/noise. The aims of this article are: 1) to clarify the above misconception and 2) to investigate the actual regularization effect of adding node fault/noise when training by gradient descent. Based on the previous works on adding fault/noise during training, we speculate on the reason why the misconception appears. In the sequel, it is shown that the learning objective of adding random node fault during gradient descent learning (GDL) for a multilayer perceptron (MLP) is identical to the desired measure of the MLP with the same fault. If additive (resp. multiplicative) node noise is added during GDL for an MLP, the learning objective is not identical to the desired measure of the MLP with such noise. For radial basis function (RBF) networks, it is shown that the learning objective is identical to the corresponding desired measure for all three fault/noise conditions. Empirical evidence is presented to support the theoretical results and hence clarify that the objective function of fault/noise injection learning might not be interpretable as the desired measure of the NN with the same fault/noise. Afterward, the regularization effect of adding node fault/noise during training is revealed for the case of RBF networks. Notably, it is shown that the regularization effect of adding additive or multiplicative node noise (MNN) during training an RBF network is a reduction of network complexity. When dropout regularization is applied in RBF networks, its effect is the same as adding MNN during training.
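The dropout-as-multiplicative-node-noise view discussed above is easy to state in code. Below is a hedged sketch (the function names and sizes are mine, not the paper's): a Gaussian RBF hidden layer whose activations are multiplied by an i.i.d. Bernoulli mask during training, i.e. dropout seen as multiplicative node noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_hidden(X, centers, width=1.0):
    # Gaussian radial-basis hidden activations.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def forward(X, centers, w, width=1.0, p_drop=0.0, train=False):
    h = rbf_hidden(X, centers, width)
    if train and p_drop > 0.0:
        # Dropout as multiplicative node noise: each hidden node is scaled
        # by an i.i.d. Bernoulli/(1-p) factor, so E[mask] = 1.
        h = h * rng.binomial(1, 1.0 - p_drop, h.shape) / (1.0 - p_drop)
    return h @ w
```

At evaluation time (`train=False`) the network is deterministic; the stochastic mask acts only during training, which is where the abstract locates the complexity-reducing regularization effect.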
5. Pinot R, Meunier L, Yger F, Gouy-Pailler C, Chevaleyre Y, Atif J. On the robustness of randomized classifiers to adversarial examples. Mach Learn 2022. [DOI: 10.1007/s10994-022-06216-6]
Abstract
This paper investigates the theory of robustness against adversarial attacks. We focus on randomized classifiers (i.e. classifiers that output random variables) and provide a thorough analysis of their behavior through the lens of statistical learning theory and information theory. To this aim, we introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics. Equipped with this definition, we make two new contributions. The first one consists in devising a new upper bound on the adversarial generalization gap of randomized classifiers. More precisely, we devise bounds on the generalization gap and the adversarial gap (i.e. the gap between the risk and the worst-case risk under attack) of randomized classifiers. The second contribution presents a simple yet efficient noise injection method to design robust randomized classifiers. We show that our results are applicable to a wide range of machine learning models under mild hypotheses. We further corroborate our findings with experimental results using deep neural networks on standard image datasets, namely CIFAR-10 and CIFAR-100. On these tasks, we manage to design robust models that simultaneously achieve state-of-the-art accuracy (over 0.82 clean accuracy on CIFAR-10) and enjoy guaranteed robust accuracy bounds (0.45 against ℓ2 adversaries with magnitude 0.5 on CIFAR-10).
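The noise-injection idea behind randomized classifiers can be illustrated with a tiny example. This is a sketch under my own assumptions (a fixed linear base classifier, isotropic Gaussian input noise, majority vote over draws), not the paper's exact construction or its certified bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

def base_classifier(x):
    # Deterministic linear base classifier (assumed, for illustration only).
    w = np.array([1.0, -1.0])
    return int(x @ w > 0)

def randomized_predict(x, sigma=0.5, n_draws=200):
    """Randomized classifier: inject Gaussian noise into the input and take
    a majority vote over the noisy copies. Each call outputs a random
    variable; averaging smooths the decision boundary."""
    votes = [base_classifier(x + sigma * rng.normal(size=x.shape))
             for _ in range(n_draws)]
    return int(np.mean(votes) > 0.5)
```

Inputs far from the decision boundary keep their label with overwhelming probability, which is the intuition the paper formalizes with probability-metric Lipschitz bounds.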
6. Bai S, Duan F, Chapeau-Blondeau F, Abbott D. Generalization of stochastic-resonance-based threshold networks with Tikhonov regularization. Phys Rev E 2022; 106:L012101. [PMID: 35974493] [DOI: 10.1103/physreve.106.l012101]
Abstract
Injecting artificial noise into a feedforward threshold neural network allows it to become trainable by gradient-based methods and also enlarges the parameter space as well as the range of synaptic weights. This configuration constitutes a stochastic-resonance-based threshold neural network, where the noise level can adaptively converge to a nonzero optimal value for finding a local minimum of the loss criterion. We prove theoretically that the injected noise plays the role of a generalized Tikhonov regularizer for training the designed threshold network. Experiments on regression and classification problems demonstrate that the generalization of the stochastic-resonance-based threshold network is improved by the injection of noise. The feasibility of injecting noise into the threshold neural network opens up the potential for adaptive stochastic resonance in machine learning.
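The core mechanism, noise making a hard threshold trainable by gradient methods, can be seen in a few lines. In this hedged sketch (my notation, not the paper's), averaging a step nonlinearity over injected Gaussian noise yields the smooth Gaussian CDF Φ(u/σ), which gradient descent can differentiate.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

def hard_threshold(u):
    # Non-differentiable step unit: its gradient is zero almost everywhere.
    return (np.asarray(u) > 0).astype(float)

def noisy_threshold_mean(u, sigma, n=20000):
    # Monte-Carlo average of the threshold output with injected Gaussian noise.
    return hard_threshold(u + sigma * rng.normal(size=n)).mean()

def smooth_surrogate(u, sigma):
    # E[step(u + N(0, sigma^2))] = Phi(u / sigma): a differentiable
    # activation that standard gradient-based training can use.
    return 0.5 * (1.0 + erf(u / (sigma * sqrt(2.0))))
```

The noise level σ becomes a trainable smoothing parameter, which is how a nonzero optimal noise level (stochastic resonance) can emerge during training.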
Affiliation(s)
- Saiya Bai
- Institute of Complexity Science, College of Automation, Qingdao University, Qingdao 266071, People's Republic of China
- Fabing Duan
- Institute of Complexity Science, College of Automation, Qingdao University, Qingdao 266071, People's Republic of China
- François Chapeau-Blondeau
- Laboratoire Angevin de Recherche en Ingénierie des Systèmes, Université d'Angers, 62 Avenue Notre Dame du Lac, 49000 Angers, France
- Derek Abbott
- Centre for Biomedical Engineering and School of Electrical and Electronic Engineering, University of Adelaide, Adelaide, South Australia 5005, Australia
7. Zhu J, Gienger M, Kober J. Learning Task-Parameterized Skills From Few Demonstrations. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3150013]
8. Bal C, Demir S. JMASM 55: MATLAB Algorithms and Source Codes of 'cbnet' Function for Univariate Time Series Modeling with Neural Networks (MATLAB). J Mod Appl Stat Methods 2021. [DOI: 10.22237/jmasm/1608553080]
Abstract
Artificial Neural Networks (ANN) can be designed as a nonparametric tool for time series modeling. MATLAB serves as a powerful environment for ANN modeling. Although the Neural Network Time Series Tool (ntstool) is useful for modeling time series, more detailed functions could provide more comprehensive analysis results. For these purposes, the cbnet function has been developed with properties such as an input lag generator, a step-ahead forecaster, a trial-and-error-based network selection strategy, alternative network selection with various performance measures, and a global repetition feature to obtain more alternative networks; its MATLAB algorithms and source codes have been introduced. A detailed comparison with the ntstool is carried out, showing that the cbnet function covers the shortcomings of ntstool.
Affiliation(s)
- Cagatay Bal
- Muğla Sitki Kocman University, Muğla, Turkey
9. A noise injection strategy for graph autoencoder training. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05283-x]
10. Sum J, Leung CS, Ho K. A Limitation of Gradient Descent Learning. IEEE Trans Neural Netw Learn Syst 2020; 31:2227-2232. [PMID: 31398136] [DOI: 10.1109/tnnls.2019.2927689]
Abstract
Over decades, gradient descent has been applied to develop learning algorithms to train a neural network (NN). In this brief, a limitation of applying such an algorithm to train an NN with persistent weight noise is revealed. Let V(w) be the performance measure of an ideal NN. V(w) is applied to develop the gradient descent learning (GDL) algorithm. With weight noise, the desired performance measure (denoted as J(w)) is E[V(w̃)|w], where w̃ is the noisy weight vector. Applying GDL to train an NN with weight noise, the actual learning objective is clearly not V(w) but another scalar function L(w). For decades, there has been a misconception that L(w) = J(w), and hence, the actual model attained by the GDL is the desired model. However, we show that it might not be: 1) with persistent additive weight noise, the actual model attained is the desired model, as L(w) = J(w); and 2) with persistent multiplicative weight noise, the actual model attained is unlikely to be the desired model, as L(w) ≠ J(w). Accordingly, the properties of the models attained as compared with the desired models are analyzed and the learning curves are sketched. Simulation results on 1) a simple regression problem and 2) the MNIST handwritten digit recognition are presented to support our claims.
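The gap between the injection objective L(w) and the desired measure J(w) = E[V(w̃)|w] under multiplicative noise can be checked numerically. The sketch below uses my own toy setup (a single tanh unit with arbitrary constants), not the paper's experiments: the mean noise-injection update is E[V'(w̃)], while the gradient of J(w) carries an extra (1+b) chain-rule factor from w̃ = w(1+b), and the two differ.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y, w, s = 1.0, -1.0, 1.2, 0.3   # toy input, target, weight, noise level (assumed)

def dV(wt):
    # d/dw~ of V(w~) = (tanh(w~ x) - y)^2, evaluated at the noisy weight w~.
    t = np.tanh(wt * x)
    return 2.0 * (t - y) * (1.0 - t ** 2) * x

b = s * rng.normal(size=1_000_000)
wt = w * (1.0 + b)                       # multiplicative weight noise w~ = w(1 + b)

g_injection = dV(wt).mean()              # mean update of noise-injection GDL
g_desired = ((1.0 + b) * dV(wt)).mean()  # gradient of J(w) = E[V(w~)|w]
```

With additive noise w̃ = w + b the chain-rule factor is 1 and the two gradients coincide, matching case 1 of the abstract.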
11. Sum J, Leung CS. Learning Algorithm for Boltzmann Machines With Additive Weight and Bias Noise. IEEE Trans Neural Netw Learn Syst 2019; 30:3200-3204. [PMID: 30668482] [DOI: 10.1109/tnnls.2018.2889072]
Abstract
This brief presents analytical results on the effect of additive weight/bias noise on a Boltzmann machine (BM), in which the unit output is in {-1, 1} instead of {0, 1}. With such noise, it is found that the state distribution is yet another Boltzmann distribution but the temperature factor is elevated. Thus, the desired gradient ascent learning algorithm is derived, and the corresponding learning procedure is developed. This learning procedure is compared with the learning procedure applied to train a BM with noise. It is found that these two procedures are identical. Therefore, the learning algorithm for noise-free BMs is suitable for implementing as an online learning algorithm for an analog circuit-implemented BM, even if the variances of the additive weight noise and bias noise are unknown.
12. Bertleff M, Domsch S, Weingärtner S, Zapp J, O'Brien K, Barth M, Schad LR. Diffusion parameter mapping with the combined intravoxel incoherent motion and kurtosis model using artificial neural networks at 3 T. NMR Biomed 2017; 30:e3833. [PMID: 28960549] [DOI: 10.1002/nbm.3833]
Abstract
Artificial neural networks (ANNs) were used for voxel-wise parameter estimation with the combined intravoxel incoherent motion (IVIM) and kurtosis model facilitating robust diffusion parameter mapping in the human brain. The proposed ANN approach was compared with conventional least-squares regression (LSR) and state-of-the-art multi-step fitting (LSR-MS) in Monte-Carlo simulations and in vivo in terms of estimation accuracy and precision, number of outliers and sensitivity in the distinction between grey (GM) and white (WM) matter. Both the proposed ANN approach and LSR-MS yielded visually increased parameter map quality. Estimations of all parameters (perfusion fraction f, diffusion coefficient D, pseudo-diffusion coefficient D*, kurtosis K) were in good agreement with the literature using ANN, whereas LSR-MS resulted in D* overestimation and LSR yielded increased values for f and D*, as well as decreased values for K. Using ANN, outliers were reduced for the parameters f (ANN, 1%; LSR-MS, 19%; LSR, 8%), D* (ANN, 21%; LSR-MS, 25%; LSR, 23%) and K (ANN, 0%; LSR-MS, 0%; LSR, 15%). Moreover, ANN enabled significant distinction between GM and WM based on all parameters, whereas LSR facilitated this distinction only based on D and LSR-MS on f, D and K. Overall, the proposed ANN approach was found to be superior to conventional LSR, posing a powerful alternative to the state-of-the-art method LSR-MS with several advantages in the estimation of IVIM-kurtosis parameters, which might facilitate increased applicability of enhanced diffusion models at clinical scan times.
Affiliation(s)
- Marco Bertleff
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Sebastian Domsch
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Sebastian Weingärtner
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN, USA
- Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, MN, USA
- Jascha Zapp
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
- Kieran O'Brien
- Center for Advanced Imaging, University of Queensland, St Lucia, QLD, Australia
- Markus Barth
- Center for Advanced Imaging, University of Queensland, St Lucia, QLD, Australia
- Lothar R Schad
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
13. Domsch S, Mürle B, Weingärtner S, Zapp J, Wenz F, Schad LR. Oxygen extraction fraction mapping at 3 Tesla using an artificial neural network: A feasibility study. Magn Reson Med 2017; 79:890-899. [DOI: 10.1002/mrm.26749]
Affiliation(s)
- Sebastian Domsch
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Germany
- Bettina Mürle
- Department of Neuroradiology, Medical Faculty Mannheim, Heidelberg University, Germany
- Sebastian Weingärtner
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Germany
- Electrical and Computer Engineering, University of Minnesota, Minneapolis, Minnesota, USA
- Center for Magnetic Resonance Research, University of Minnesota, Minneapolis, Minnesota, USA
- Jascha Zapp
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Germany
- Frederik Wenz
- Department of Radiation Oncology, Medical Faculty Mannheim, Heidelberg University, Germany
- Lothar R. Schad
- Computer Assisted Clinical Medicine, Medical Faculty Mannheim, Heidelberg University, Germany
14. White KR, Stefanski LA, Wu Y. Variable Selection in Kernel Regression Using Measurement Error Selection Likelihoods. J Am Stat Assoc 2017; 112:1587-1597. [PMID: 29628539] [DOI: 10.1080/01621459.2016.1222287]
Abstract
This paper develops a nonparametric shrinkage and selection estimator via the measurement error selection likelihood approach recently proposed by Stefanski, Wu, and White. The Measurement Error Kernel Regression Operator (MEKRO) has the same form as the Nadaraya-Watson kernel estimator, but optimizes a measurement error model selection likelihood to estimate the kernel bandwidths. Much like LASSO or COSSO solution paths, MEKRO results in solution paths depending on a tuning parameter that controls shrinkage and selection via a bound on the harmonic mean of the pseudo-measurement error standard deviations. We use small-sample-corrected AIC to select the tuning parameter. Large-sample properties of MEKRO are studied and small-sample properties are explored via Monte Carlo experiments and applications to data.
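Since MEKRO has the Nadaraya-Watson form, the estimator itself is compact; the paper's contribution is the measurement-error selection likelihood used to fit the bandwidths, which is not reproduced here. A hedged sketch with assumed names:

```python
import numpy as np

def nadaraya_watson(X_train, y_train, X_query, h):
    """Nadaraya-Watson regression with a per-coordinate bandwidth vector h.
    Driving h[j] toward infinity flattens the kernel in coordinate j and
    effectively deletes that variable -- the selection mechanism that
    MEKRO's selection likelihood tunes (bandwidth fitting is omitted)."""
    d = (X_query[:, None, :] - X_train[None, :, :]) / h   # scaled differences
    K = np.exp(-0.5 * (d ** 2).sum(axis=-1))              # Gaussian product kernel
    return (K @ y_train) / K.sum(axis=1)
```

A small bandwidth makes the prediction track the nearest training point, while a very large bandwidth in every coordinate collapses the prediction to the overall mean of `y_train`, which is what "removing" all variables means here.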
Affiliation(s)
- Kyle R White
- Department of Statistics, North Carolina State University, Raleigh, NC 27695
- Leonard A Stefanski
- Department of Statistics, North Carolina State University, Raleigh, NC 27695
- Yichao Wu
- Department of Statistics, North Carolina State University, Raleigh, NC 27695
15. Abbasi E, Shiri ME, Ghatee M. A regularized root–quartic mixture of experts for complex classification problems. Knowl Based Syst 2016. [DOI: 10.1016/j.knosys.2016.07.018]
16. Ohno H. Uniforming the dimensionality of data with neural networks for materials informatics. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.04.017]
17. Schmid T, Bogdan M, Günzel D. Discerning apical and basolateral properties of HT-29/B6 and IPEC-J2 cell layers by impedance spectroscopy, mathematical modeling and machine learning. PLoS One 2013; 8:e62913. [PMID: 23840862] [PMCID: PMC3698131] [DOI: 10.1371/journal.pone.0062913]
Abstract
Quantifying changes in partial resistances of epithelial barriers in vitro is a challenging and time-consuming task in physiology and pathophysiology. Here, we demonstrate that electrical properties of epithelial barriers can be estimated reliably by combining impedance spectroscopy measurements, mathematical modeling and machine learning algorithms. Conventional impedance spectroscopy is often used to estimate epithelial capacitance as well as epithelial and subepithelial resistance. Based on this, the more refined two-path impedance spectroscopy makes it possible to further distinguish transcellular and paracellular resistances. In a next step, transcellular properties may be further divided into their apical and basolateral components. The accuracy of these derived values, however, strongly depends on the accuracy of the initial estimates. To obtain adequate accuracy in estimating subepithelial and epithelial resistance, artificial neural networks were trained to estimate these parameters from model impedance spectra. Spectra that reflect behavior of either HT-29/B6 or IPEC-J2 cells as well as the data scatter intrinsic to the used experimental setup were created computationally. To prove the proposed approach, reliability of the estimations was assessed with both modeled and measured impedance spectra. Transcellular and paracellular resistances obtained by such neural network-enhanced two-path impedance spectroscopy are shown to be sufficiently reliable to derive the underlying apical and basolateral resistances and capacitances. As an exemplary perturbation of pathophysiological importance, the effect of forskolin on the apical resistance of HT-29/B6 cells was quantified.
Affiliation(s)
- Thomas Schmid
- Department of Mathematics and Computer Science, Universität Leipzig, Leipzig, Germany
- Martin Bogdan
- Department of Mathematics and Computer Science, Universität Leipzig, Leipzig, Germany
- Dorothee Günzel
- Institute of Clinical Physiology, Charité Universitätsmedizin Berlin, Berlin, Germany
18.
19. Ge Z, Song Z, Gao F. Statistical Prediction of Product Quality in Batch Processes with Limited Batch-Cycle Data. Ind Eng Chem Res 2012. [DOI: 10.1021/ie202554r]
Affiliation(s)
- Zhiqiang Ge
- State Key Laboratory of Industrial Control Technology, Institute of Industrial Process Control, Zhejiang University, Hangzhou 310027, Zhejiang, P. R. China
- Zhihuan Song
- State Key Laboratory of Industrial Control Technology, Institute of Industrial Process Control, Zhejiang University, Hangzhou 310027, Zhejiang, P. R. China
20. Bachtiar LR, Unsworth CP, Newcomb RD, Crampin EJ. Using artificial neural networks to classify unknown volatile chemicals from the firings of insect olfactory sensory neurons. Annu Int Conf IEEE Eng Med Biol Soc 2011:2752-5. [PMID: 22254911] [DOI: 10.1109/iembs.2011.6090754]
Abstract
The olfactory system detects volatile chemical compounds, known as odour molecules or odorants. Such odorants have a diverse chemical structure, which in turn interacts with the receptors of the olfactory system. The insect olfactory system provides a unique opportunity to directly measure the firing rates generated by individual olfactory sensory neurons (OSNs) stimulated by odorants, and to use these data to inform their classification. In this work, we demonstrate that it is possible to use the firing rates from an array of OSNs of the vinegar fly, Drosophila melanogaster, to train an Artificial Neural Network (ANN), as a series of Multi-Layer Perceptrons (MLPs), to differentiate between eight distinct chemical classes. We demonstrate that the MLPs, when trained on 108 odorants with both clean and 10% noise-injected data, can reliably identify 87% of an unseen validation set of chemicals when noise injection is used. In addition, the noise-injected MLPs provide a more accurate level of identification. This demonstrates that a series of 10% noise-injected MLPs provides a robust method for classifying chemicals from the firing rates of OSNs and paves the way to a future realisation of an artificial olfactory biosensor.
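The "10% noise injected data" idea can be sketched as simple training-set augmentation. This is one plausible reading only: the abstract does not specify the exact noise model, so the scaling below (Gaussian noise at 10% of each feature's standard deviation) and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def inject_noise(X, level=0.10):
    """Add zero-mean Gaussian noise at `level` (10%) of each feature's
    standard deviation -- an assumed reading of '10% noise injection';
    the paper's exact scheme is not given in the abstract."""
    return X + level * X.std(axis=0, keepdims=True) * rng.normal(size=X.shape)

def augment(X, y, copies=5, level=0.10):
    # Replicate each firing-rate vector with fresh noise to enlarge and
    # regularise the MLP training set.
    X_aug = np.vstack([inject_noise(X, level) for _ in range(copies)])
    return X_aug, np.tile(y, copies)
```

An MLP trained on the augmented set sees many noisy variants of each firing-rate pattern, which is the robustness mechanism the abstract credits for the improved identification rate.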
Affiliation(s)
- Luqman R Bachtiar
- Department of Engineering Science, The University of Auckland, Auckland 1010, New Zealand.
21
Sum JPF, Leung CS, Ho KIJ. On-line node fault injection training algorithm for MLP networks: objective function and convergence analysis. IEEE Trans Neural Netw Learn Syst 2012; 23:211-222. [PMID: 24808501] [DOI: 10.1109/tnnls.2011.2178477]
Abstract
Improving the fault tolerance of a neural network has been studied for more than two decades, and various training algorithms have been proposed in its wake. The on-line node fault injection-based algorithm is one of them: hidden nodes randomly output zeros during training. While the idea is simple, theoretical analyses of this algorithm are far from complete. This paper presents its objective function and a convergence proof. We consider three cases for multilayer perceptrons (MLPs): (1) MLPs with a single linear output node; (2) MLPs with multiple linear output nodes; and (3) MLPs with a single sigmoid output node. For the convergence proof, we show that the algorithm converges with probability one. For the objective function, we show that the corresponding objective functions of cases (1) and (2) have the same form: both consist of a mean squared error term, a regularizer term, and a weight decay term. For case (3), the objective function is slightly different from that of cases (1) and (2). With the objective functions derived, we can compare the similarities and differences among various algorithms and cases.
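The "hidden nodes randomly output zeros during training" mechanism can be sketched as a masked forward pass. This is a hypothetical illustration in the spirit of the algorithm (it resembles what is now called dropout), not the authors' code; the tanh activation and parameter names are assumptions:

```python
import numpy as np

def forward_with_node_faults(x, W1, b1, W2, b2, p=0.2, rng=None):
    """One forward pass of an MLP (case 1: single linear output node)
    in which each hidden node independently outputs zero with
    probability p, as in on-line node fault injection training."""
    rng = np.random.default_rng(rng)
    h = np.tanh(x @ W1 + b1)            # hidden activations
    mask = (rng.random(h.shape) >= p)   # True = node survives this pass
    return (h * mask) @ W2 + b2         # faulty nodes contribute nothing

# Tiny example: 4 inputs, 8 hidden nodes, 1 linear output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
y = forward_with_node_faults(rng.normal(size=(1, 4)), W1, b1, W2, b2, p=0.2, rng=1)
# With p=1.0 every node is faulty, so only the output bias survives.
y_all_faulty = forward_with_node_faults(np.ones((1, 4)), W1, b1, W2, b2, p=1.0, rng=2)
```

Training with gradients computed through such randomly masked passes is what yields the mean squared error plus regularizer and weight decay terms derived in the paper.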
22

23

24
Ho K, Leung CS, Sum J. Objective functions of online weight noise injection training algorithms for MLPs. IEEE Trans Neural Netw 2010; 22:317-23. [PMID: 21189237] [DOI: 10.1109/tnn.2010.2095881]
Abstract
Injecting weight noise during training has been a simple strategy for improving the fault tolerance of multilayer perceptrons (MLPs) for almost two decades, and several online training algorithms have been proposed in this regard. However, there are some misconceptions about the objective functions being minimized by these algorithms. Some existing results incorrectly assume that the prediction error of a trained MLP affected by weight noise is equivalent to the objective function of a weight noise injection algorithm. In this brief, we clarify these misconceptions. Two weight noise injection scenarios are considered: one based on additive weight noise injection and the other on multiplicative weight noise injection. To avoid the misconceptions, we use their mean updating equations to analyze the objective functions. For injecting additive weight noise during training, we show that the true objective function is identical to the prediction error of a faulty MLP whose weights are affected by additive weight noise; it consists of the conventional mean squared error and a smoothing regularizer. For injecting multiplicative weight noise during training, we show that the objective function differs from the prediction error of a faulty MLP whose weights are affected by multiplicative weight noise. With our results, some existing misconceptions regarding MLP training with weight noise injection can now be resolved.
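The two scenarios can be sketched in a few lines. The `perturb_weights` helper, the Gaussian noise model, and the value of `sigma` are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def perturb_weights(W, mode="additive", sigma=0.01, rng=None):
    """Return a noisy copy of a weight matrix for the two scenarios
    analysed here: additive noise w + e and multiplicative noise
    w * (1 + e), with e ~ N(0, sigma^2) drawn independently per weight."""
    rng = np.random.default_rng(rng)
    e = rng.normal(0.0, sigma, W.shape)
    if mode == "additive":
        return W + e
    if mode == "multiplicative":
        return W * (1.0 + e)
    raise ValueError(f"unknown mode: {mode}")

# Key structural difference: multiplicative noise scales with the weight
# magnitude, so zero weights stay exactly zero; additive noise does not.
W = np.zeros((3, 3))
W_mult = perturb_weights(W, "multiplicative", sigma=0.1, rng=0)
W_add = perturb_weights(W, "additive", sigma=0.1, rng=0)
```

This magnitude dependence is one reason the two scenarios lead to different objective functions in the analysis.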
Affiliation(s)
- Kevin Ho
- Department of Computer Science and Communication Engineering, Providence University, Taichung 43301, Taiwan.
25
Grandvalet Y. Anisotropic noise injection for input variables relevance determination. IEEE Trans Neural Netw 2000; 11:1201-12. [PMID: 18249847] [DOI: 10.1109/72.883393]
Abstract
There are two archetypal ways to control the complexity of a flexible regressor: subset selection and ridge regression. In neural-networks jargon, they are known, respectively, as pruning and weight decay. These techniques may also be adapted to estimate which features of the input space are relevant for predicting the output variables. Relevance is given by a binary indicator for subset selection and by a continuous rating for ridge regression. This paper shows how to achieve such a rating for a multilayer perceptron trained with noise (or jitter). Noise injection (NI) is modified so as to heavily penalize irrelevant features. The proposed algorithm is attractive as it requires the tuning of a single parameter, which controls the complexity of the model (effective number of parameters) together with the rating of feature relevances (effective input space dimension). Bounds on the effective number of parameters support the claim that the stability of this adaptive scheme is enforced by the constraints applied to the admissible set of relevance indices. The good properties of the algorithm are confirmed by satisfactory experimental results on simulated data sets.
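The per-feature (anisotropic) noise idea can be sketched as follows. The `anisotropic_noise_injection` helper and its fixed relevance vector are illustrative assumptions; the actual algorithm adapts these noise levels during training under a single complexity parameter:

```python
import numpy as np

def anisotropic_noise_injection(X, relevance, base_sigma=1.0, rng=None):
    """Perturb each input feature with its own noise level: features
    rated irrelevant (relevance near 0) are heavily corrupted, while
    relevant ones (near 1) are left almost untouched, so the network
    learns to ignore the former."""
    rng = np.random.default_rng(rng)
    sigmas = base_sigma * (1.0 - np.asarray(relevance, dtype=float))
    return X + rng.normal(size=X.shape) * sigmas

# Feature 0 fully relevant (no noise), feature 1 fully irrelevant.
X = np.zeros((200, 2))
out = anisotropic_noise_injection(X, relevance=[1.0, 0.0], base_sigma=1.0, rng=0)
```

The continuous relevance rating thus translates directly into a per-input noise variance, in contrast to the isotropic noise of standard jitter training.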
Affiliation(s)
- Y Grandvalet
- Heudiasyc, UMR CNRS 6599, Université de Technologie de Compiègne, 60205 Compiègne Cedex, France.
26

27
Ho KIJ, Leung CS, Sum J. Convergence and objective functions of some fault/noise-injection-based online learning algorithms for RBF networks. IEEE Trans Neural Netw 2010; 21:938-47. [PMID: 20388593] [DOI: 10.1109/tnn.2010.2046179]
Abstract
In the last two decades, many online fault/noise injection algorithms have been developed to obtain fault tolerant neural networks. However, little theoretical work on their convergence and objective functions has been reported. This paper studies six common fault/noise-injection-based online learning algorithms for radial basis function (RBF) networks, namely 1) injecting additive input noise, 2) injecting additive/multiplicative weight noise, 3) injecting multiplicative node noise, 4) injecting multiweight fault (random disconnection of weights), 5) injecting multinode fault during training, and 6) weight decay with injecting multinode fault. Based on the Gladyshev theorem, we show that these six online algorithms converge almost surely. Moreover, the true objective functions being minimized are derived. For injecting additive input noise during training, the objective function is identical to that of the Tikhonov regularizer approach. For injecting additive/multiplicative weight noise during training, the objective function is the simple mean squared training error; thus, injecting additive/multiplicative weight noise during training cannot improve the fault tolerance of an RBF network. Similar to injecting additive input noise, the objective functions of the other fault/noise-injection-based online algorithms contain a mean squared error term and a specialized regularization term.
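The Tikhonov equivalence stated for additive input noise can be made concrete with the standard small-noise expansion (a well-known textbook result in the spirit of Bishop's 1995 analysis, not an equation reproduced from this paper). For a regressor $f$, target $y$, and zero-mean input noise $e$ with covariance $\sigma^2 I$,

```latex
\mathbb{E}_{e}\!\left[\big(f(x+e)-y\big)^{2}\right]
  \;\approx\; \big(f(x)-y\big)^{2}
  \;+\; \sigma^{2}\,\lVert \nabla_{x} f(x) \rVert^{2}
```

i.e. the mean squared error plus a smoothness (Tikhonov-style) regularizer, where curvature and higher-order terms in $\sigma$ have been dropped.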
Affiliation(s)
- Kevin I-J Ho
- Department of Computer Science and Communication Engineering, Providence University, Sha-Lu 433, Taiwan.
28
Zur RM, Jiang Y, Pesce LL, Drukker K. Noise injection for training artificial neural networks: a comparison with weight decay and early stopping. Med Phys 2010; 36:4810-8. [PMID: 19928111] [DOI: 10.1118/1.3213517]
Abstract
The purpose of this study was to investigate the effect of a noise injection method on the "overfitting" problem of artificial neural networks (ANNs) in two-class classification tasks. The authors compared ANNs trained with noise injection to ANNs trained with two other methods for avoiding overfitting: weight decay and early stopping. They also evaluated an automatic algorithm for selecting the magnitude of the noise injection. They performed simulation studies of an exclusive-or classification task with training datasets of 50, 100, and 200 cases (half normal and half abnormal) and an independent testing dataset of 2000 cases. They also compared the methods using a breast ultrasound dataset of 1126 cases. For simulated training datasets of 50 cases, the area under the receiver operating characteristic curve (AUC) was greater (by 0.03) when training with noise injection than when training without any regularization, and the improvement was greater than those from weight decay and early stopping (both of 0.02). For training datasets of 100 cases, noise injection and weight decay yielded similar increases in the AUC (0.02), whereas early stopping produced a smaller increase (0.01). For training datasets of 200 cases, the increases in the AUC were negligibly small for all methods (0.005). For the ultrasound dataset, ANNs trained with noise injection had a greater average AUC than ANNs trained without regularization and a slightly greater average AUC than ANNs trained with weight decay. These results indicate that training ANNs with noise injection can reduce overfitting to a greater degree than early stopping and to a similar degree as weight decay.
Affiliation(s)
- Richard M Zur
- Department of Radiology, The University of Chicago, 5841 South Maryland Avenue, MC2026, Chicago, Illinois 60637, USA.
29
Linder R, Richards T, Wagner M. Microarray data classified by artificial neural networks. Methods Mol Biol 2007; 382:345-72. [PMID: 18220242] [DOI: 10.1007/978-1-59745-304-2_22]
Abstract
Systems biology has enjoyed explosive growth in both the number of people participating in this area of research and the number of publications on the topic. The field encompasses the in silico analysis of high-throughput data as provided by DNA or protein microarrays. Along with the increasing availability of microarray data, attention is focused on methods of analyzing the expression rates. One important type of analysis is the classification task, for example, distinguishing different types of cell functions or tumors. Recently, interest has turned toward artificial neural networks (ANN), which have many appealing characteristics, such as an exceptional degree of accuracy, the ability to model nonlinear relationships, and independence from certain assumptions regarding the data distribution. The current work reviews advantages as well as disadvantages of neural networks in the context of microarray analysis, and comparisons are drawn to alternative methods. Selected solutions are discussed, and finally algorithms for the effective combination of multiple ANNs are presented. The development of approaches that use ANN-processed microarray data to run cell and tissue simulations may be slated for future investigation.
Affiliation(s)
- Roland Linder
- Institute of Medical Informatics, University of Lübeck, Germany
30
Unsworth CP, Coghill G. Excessive noise injection training of neural networks for markerless tracking in obscured and segmented environments. Neural Comput 2006; 18:2122-45. [PMID: 16846389] [DOI: 10.1162/neco.2006.18.9.2122]
Abstract
In this letter, we demonstrate that the generalization properties of a neural network (NN) can be extended to encompass objects that obscure or segment the original image in its foreground or background. We achieve this by piloting an extension of the noise injection training technique, which we term excessive noise injection (ENI), on a simple feedforward multilayer perceptron (MLP) network trained with vanilla backward error propagation. Six tests are reported that show the ability of an NN to distinguish six similar states of motion of a simplified human figure that has become obscured by moving vertical and horizontal bars and by random blocks, at different levels of obscuration. Four more extensive tests are then reported to determine the bounds of the technique. The results from the ENI network were compared to results from the same NN trained on clean states only. The results provide strong evidence that it is possible to track a human subject behind objects using this technique, and thus it lends itself to a real-time markerless tracking system from a single video stream.
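The obscuration training described above can be sketched as an occlusion-style augmentation. The `occlude` helper, block count, block size, and fill values are illustrative assumptions, not the ENI scheme's actual parameters:

```python
import numpy as np

def occlude(img, n_blocks=3, block=4, rng=None):
    """Paste random-intensity blocks over a training image so the
    network learns to recognise the figure even when foreground
    objects obscure it. Works on a copy; the input is untouched."""
    rng = np.random.default_rng(rng)
    out = img.copy()
    h, w = out.shape
    for _ in range(n_blocks):
        r = int(rng.integers(0, h - block + 1))
        c = int(rng.integers(0, w - block + 1))
        out[r:r + block, c:c + block] = rng.random()
    return out

# Example: a 16x16 "figure" image occluded by three 4x4 blocks.
img = np.zeros((16, 16))
img[4:12, 6:10] = 1.0                      # simplified human figure
aug = occlude(img, n_blocks=3, block=4, rng=0)
```

Training on many such randomly occluded variants of each motion state is the essence of extending generalization to obscured scenes.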
Affiliation(s)
- C P Unsworth
- Department of Engineering Science, University of Auckland, Auckland 1001, New Zealand.
31
Hua J, Lowey J, Xiong Z, Dougherty ER. Noise-injected neural networks show promise for use on small-sample expression data. BMC Bioinformatics 2006; 7:274. [PMID: 16737545] [PMCID: PMC1524820] [DOI: 10.1186/1471-2105-7-274]
Abstract
Background

Overfitting the data is a salient issue for classifier design in small-sample settings. This is why selecting a classifier from a constrained family of classifiers, ones that do not possess the potential to too finely partition the feature space, is typically preferable. But overfitting is not merely a consequence of the classifier family; it is highly dependent on the classification rule used to design a classifier from the sample data. Thus, it is possible to consider families that are rather complex but for which there are classification rules that perform well for small samples. Such classification rules can be advantageous because they facilitate satisfactory classification when the class-conditional distributions are not easily separated and the sample is not large. Here we consider neural networks, from the perspectives of classical design based solely on the sample data and from noise-injection-based design.

Results

This paper provides an extensive simulation-based comparative study of noise-injected neural-network design. It considers a number of different feature-label models across various small sample sizes using varying amounts of noise injection. Besides comparing noise-injected neural-network design to classical neural-network design, the paper compares it to a number of other classification rules. Our particular interest is with the use of microarray data for expression-based classification for diagnosis and prognosis. To that end, we consider noise-injected neural-network design as it relates to a study of survivability of breast cancer patients.

Conclusion

The conclusion is that in many instances noise-injected neural network design is superior to the other tested methods, and in almost all cases it does not perform substantially worse than the best of the other methods. Since the amount of noise injected is consequential, the effect of differing amounts of injected noise must be considered.
Affiliation(s)
- Jianping Hua
- Computational Biology Division, Translational Genomics Research Institute, Phoenix, USA
- James Lowey
- Computational Biology Division, Translational Genomics Research Institute, Phoenix, USA
- Zixiang Xiong
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, USA
- Edward R Dougherty
- Computational Biology Division, Translational Genomics Research Institute, Phoenix, USA
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, USA
32
Adaptive noise injection for input variables relevance determination. 2005. [DOI: 10.1007/bfb0020198]
33
Seghouane AK, Moudden Y, Fleury G. Regularizing the effect of input noise injection in feedforward neural networks training. Neural Comput Appl 2004. [DOI: 10.1007/s00521-004-0411-6]
34
Guo P, Lyu M, Chen C. Regularization parameter estimation for feedforward neural networks. IEEE Trans Syst Man Cybern B Cybern 2003; 33:35-44. [DOI: 10.1109/tsmcb.2003.808176]
35

36
Skurichina M, Raudys S, Duin R. k-nearest neighbors directed noise injection in multilayer perceptron training. IEEE Trans Neural Netw 2000; 11:504-11. [DOI: 10.1109/72.839019]