1
Mejia AF, Bolin D, Yue YR, Wang J, Caffo BS, Nebel MB. Template independent component analysis with spatial priors for accurate subject-level brain network estimation and inference. J Comput Graph Stat 2022; 32:413-433. PMID: 37377728; PMCID: PMC10292763; DOI: 10.1080/10618600.2022.2104289.
Abstract
Independent component analysis (ICA) is commonly applied to functional magnetic resonance imaging (fMRI) data to extract independent components (ICs) representing functional brain networks. While ICA produces reliable group-level estimates, single-subject ICA often produces noisy results. Template ICA is a hierarchical ICA model using empirical population priors to produce more reliable subject-level estimates. However, this and other hierarchical ICA models assume unrealistically that subject effects are spatially independent. Here, we propose spatial template ICA (stICA), which incorporates spatial priors into the template ICA framework for greater estimation efficiency. Additionally, the joint posterior distribution can be used to identify brain regions engaged in each network using an excursion set approach. By leveraging spatial dependencies and avoiding massive multiple comparisons, stICA has high power to detect true effects. We derive an efficient expectation-maximization algorithm to obtain maximum likelihood estimates of the model parameters and posterior moments of the latent fields. Based on analysis of simulated data and fMRI data from the Human Connectome Project, we find that stICA produces estimates that are more accurate and reliable than benchmark approaches, and identifies larger and more reliable areas of engagement. The algorithm is computationally tractable, achieving convergence within 12 hours for whole-cortex fMRI analysis.
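At its core, the template prior acts like conjugate Gaussian shrinkage of the noisy single-subject map toward the population template. The sketch below illustrates only that per-voxel shrinkage idea; it deliberately omits the spatial dependence that stICA adds, and all names and values are invented for illustration:

```python
import numpy as np

def template_shrinkage(y, template_mean, template_var, noise_var):
    """Posterior mean and variance of a subject-level IC map under a
    conjugate Gaussian template prior, treating voxels independently.
    Simplified: real stICA also models spatial dependence of the
    subject deviations."""
    post_var = 1.0 / (1.0 / template_var + 1.0 / noise_var)
    post_mean = post_var * (template_mean / template_var + y / noise_var)
    return post_mean, post_var

rng = np.random.default_rng(0)
s0 = rng.normal(0.0, 1.0, size=1000)           # template mean map
truth = s0 + rng.normal(0.0, 0.5, size=1000)   # subject map = template + deviation
y = truth + rng.normal(0.0, 1.0, size=1000)    # noisy single-subject estimate
est, _ = template_shrinkage(y, s0, template_var=0.25, noise_var=1.0)
# the shrinkage estimate should have lower error than the raw subject map
print(np.mean((est - truth) ** 2), np.mean((y - truth) ** 2))
```

Because the posterior mean borrows strength from the template, its mean squared error is far below that of the raw subject-level estimate, which is the sense in which hierarchical priors improve single-subject reliability.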
Affiliation(s)
- Amanda F. Mejia, Department of Statistics, Indiana University, Bloomington, IN 47408
- David Bolin, CEMSE Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Yu Ryan Yue, Paul H. Chook Department of Information Systems and Statistics, Baruch College, The City University of New York, New York, NY 10010
- Jiongran Wang, Department of Statistics, Indiana University, Bloomington, IN 47408
- Brian S. Caffo, Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205
- Mary Beth Nebel, Center for Neurodevelopmental and Imaging Research, Kennedy Krieger Institute, Baltimore, MD 21205; Department of Neurology, Johns Hopkins University, Baltimore, MD 21205
2
Rice K, Ye L. Expressing regret: a unified view of credible intervals. Am Stat 2022; 76:248-256. PMID: 36035272; PMCID: PMC9401190; DOI: 10.1080/00031305.2022.2039764.
Abstract
Posterior uncertainty is typically summarized as a credible interval, an interval in the parameter space that contains a fixed proportion - usually 95% - of the posterior's support. For multivariate parameters, credible sets perform the same role. There are of course many potential 95% intervals from which to choose, yet even standard choices are rarely justified in any formal way. In this paper we give a general method, focusing on the loss function that motivates an estimate - the Bayes rule - around which we construct a credible set. The set contains all points which, as estimates, would have minimally-worse expected loss than the Bayes rule: we call this excess expected loss 'regret'. The approach can be used for any model and prior, and we show how it justifies all widely-used choices of credible interval/set. Further examples show how it provides insights into more complex estimation problems.
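Under squared-error loss the Bayes rule is the posterior mean m, and the regret of using an estimate t instead is (t - m)^2, so the regret-based credible set is an interval centred at the mean. A minimal sketch from posterior draws (the function name and the normal example are invented here, not taken from the paper):

```python
import numpy as np

def regret_interval(samples, level=0.95):
    """Credible interval via the regret construction under squared-error
    loss: keep every estimate t whose excess expected loss over the
    Bayes rule, (t - m)^2, is below a cutoff chosen so that the
    resulting set carries `level` posterior probability."""
    m = samples.mean()                 # Bayes rule under squared error
    regret = (samples - m) ** 2        # regret of each draw used as an estimate
    r = np.quantile(regret, level)     # cutoff giving `level` posterior mass
    return m - np.sqrt(r), m + np.sqrt(r)

rng = np.random.default_rng(1)
post = rng.normal(2.0, 1.0, size=200_000)
lo, hi = regret_interval(post)
print(lo, hi)   # close to the familiar 2 +/- 1.96 for a normal posterior
```

For a symmetric posterior this recovers the standard central interval, which is one way the regret view "justifies all widely-used choices"; for skewed posteriors or other loss functions the construction gives different, loss-motivated sets.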
Affiliation(s)
- Kenneth Rice, Department of Biostatistics, University of Washington
- Lingbo Ye, Department of Biostatistics, University of Washington
3
Partington G, Cro S, Mason A, Phillips R, Cornelius V. Design and analysis features used in small population and rare disease trials: A targeted review. J Clin Epidemiol 2021; 144:93-101. PMID: 34910979; DOI: 10.1016/j.jclinepi.2021.12.009.
Abstract
OBJECTIVE Trials in rare diseases and small populations designed in a frequentist framework often require infeasibly large sample sizes to detect minimum clinically important differences. A targeted review was performed to investigate which design and analysis methods such trials use when facing restricted recruitment. STUDY DESIGN AND SETTING A targeted review searching EMBASE and MEDLINE for Phase II-IV RCTs reporting 'rare' disease or 'small population' in the title or abstract, published since 2009. RESULTS A total of 6,128 articles were screened, with 64 trials eligible (4 Bayesian, 60 frequentist). Frequentist trials had planned power ranging from 72% to 90% (median: 80%) but reported recruiting a mean of 6.6% below the planned sample size (n=38) [median 0%, IQR (-5%, 5%)]; most used a standard type 1 error rate (52 used 5% and 1 used 1%); and the average assumed standardised effect size was high (0.7), with 50% of trials missing their assumed level. Of the 4 Bayesian trials, 3 used informative priors; 2 performed sensitivity analyses for the impact of priors on the design and 1 on the analysis. Historical data, expert consensus, or both were used to construct the informative priors. The equivalent frequentist designs would have required 30%-2400% more participants than the Bayesian trials. CONCLUSION Bayesian trials achieved smaller sample sizes through the use of informative priors. Most frequentist trials did not achieve their target sample size. Bayesian methods offer promising solutions for such trials but are underutilised.
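To see why the large assumed standardised effects reported in the review keep frequentist sample sizes feasible, the standard two-arm normal-approximation formula n = 2(z_{1-alpha/2} + z_{1-beta})^2 / delta^2 per arm can be evaluated at the review's typical values. This is a textbook calculation, not one taken from the paper:

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(delta, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-arm comparison of means at
    standardized effect size `delta`, using the usual normal
    approximation (two-sided test)."""
    z = NormalDist().inv_cdf
    n = 2 * (z(1 - alpha / 2) + z(power)) ** 2 / delta ** 2
    return ceil(n)

# the review's average assumed effect (0.7) versus a more modest one (0.3)
print(n_per_arm(0.7), n_per_arm(0.3))
```

At delta = 0.7 roughly 33 participants per arm suffice, while delta = 0.3 needs around 175 per arm, which illustrates the pressure toward optimistic effect-size assumptions in small population trials.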
Affiliation(s)
- Giles Partington, Imperial Clinical Trials Unit, Imperial College London, 1st Floor Stadium House, 68 Wood Lane, London W12 7RH, United Kingdom
- Suzie Cro, Imperial Clinical Trials Unit, Imperial College London, 1st Floor Stadium House, 68 Wood Lane, London W12 7RH, United Kingdom
- Alexina Mason, Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom
- Rachel Phillips, Imperial Clinical Trials Unit, Imperial College London, 1st Floor Stadium House, 68 Wood Lane, London W12 7RH, United Kingdom
- Victoria Cornelius, Imperial Clinical Trials Unit, Imperial College London, 1st Floor Stadium House, 68 Wood Lane, London W12 7RH, United Kingdom
4
Abstract
In logistic regression, separation occurs when a linear combination of predictors perfectly discriminates the binary outcome. Because finite-valued maximum likelihood parameter estimates do not exist under separation, Bayesian regressions with informative shrinkage of the regression coefficients offer a suitable alternative. Classical studies of separation imply that efficiency in estimating regression coefficients may also depend upon the choice of intercept prior, yet relatively little attention has been given to whether and how to shrink the intercept parameter. Alternative prior distributions for the intercept are proposed that downweight implausibly extreme regions of the parameter space, rendering regression estimates that are less sensitive to separation. Through simulation and the analysis of exemplar datasets, differences across priors, stratified by established statistics measuring the degree of separation, are quantified. Relative to diffuse priors, these proposed priors generally yield more efficient estimation of the regression coefficients themselves when the data are nearly separated. They are equally efficient in non-separated datasets, making them suitable for default use. Modest differences were observed with respect to out-of-sample discrimination. These numerical studies also highlight the interplay between priors for the intercept and the regression coefficients: findings are more sensitive to the choice of intercept prior when using a weakly informative prior on the regression coefficients than an informative shrinkage prior.
Affiliation(s)
- Ryan P. Barbaro, Division of Pediatric Critical Care and Child Health Evaluation and Research Unit, University of Michigan, Ann Arbor, USA
- Ananda Sen, Department of Biostatistics and Department of Family Medicine, University of Michigan, Ann Arbor, USA
5
Nasution MD, Wang X. Statistical issues and advances in cancer precision medicine research. J Biopharm Stat 2018. PMID: 29513634; DOI: 10.1080/10543406.2017.1405013.
Affiliation(s)
- Xiaofei Wang, Biostatistics & Bioinformatics, Duke University School of Medicine, Durham, North Carolina, USA
6
Parker AE, Pitts B, Lorenz L, Stewart PS. Polynomial accelerated solutions to a LARGE Gaussian model for imaging biofilms: in theory and finite precision. J Am Stat Assoc 2018; 113:1431-1442. PMID: 30906085; DOI: 10.1080/01621459.2017.1409121.
Abstract
Three-dimensional confocal scanning laser microscope images offer dramatic visualizations of the action of living biofilms before and after interventions. Here we use confocal microscopy to study the effect of a treatment over time that causes a biofilm to swell and contract due to osmotic pressure changes. From these data, our goal is to reconstruct biofilm surfaces, to estimate the effect of the treatment on the biofilm's volume, and to quantify the related uncertainties. We formulate the associated massive linear Bayesian inverse problem and then solve it using iterative samplers from large multivariate Gaussians that exploit well-established polynomial acceleration techniques from numerical linear algebra. Because of a general equivalence with linear solvers, these polynomial accelerated iterative samplers have known convergence rates, stopping criteria, and perform well in finite precision. An explicit algorithm is provided, for the first time, for an iterative sampler that is accelerated by the synergistic implementation of preconditioned conjugate gradient and Chebyshev polynomials.
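The solver-sampler equivalence can be illustrated with the simplest member of this family: an unaccelerated component-wise Gibbs (Gauss-Seidel-type) sweep targeting N(A^{-1}b, A^{-1}) for a given precision matrix A. The Chebyshev/CG acceleration that the paper develops is omitted here, and all names are invented for this toy sketch:

```python
import numpy as np

def gibbs_gaussian(A, b, n_iter, rng):
    """Component-wise Gibbs sampler for N(A^{-1} b, A^{-1}), given the
    precision matrix A. Each sweep is the stochastic analogue of a
    Gauss-Seidel solver iteration; polynomial acceleration (Chebyshev,
    preconditioned CG) speeds up convergence of such sweeps."""
    d = len(b)
    mu = np.linalg.solve(A, b)          # target mean (solver's fixed point)
    x = np.zeros(d)
    draws = np.empty((n_iter, d))
    for k in range(n_iter):
        for i in range(d):
            # conditional mean of x_i given the other coordinates
            off_diag = A[i] @ (x - mu) - A[i, i] * (x[i] - mu[i])
            cond_mean = mu[i] - off_diag / A[i, i]
            x[i] = cond_mean + rng.normal() / np.sqrt(A[i, i])
        draws[k] = x
    return draws

rng = np.random.default_rng(2)
A = np.array([[2.0, 0.8], [0.8, 2.0]])   # toy precision matrix
b = np.array([1.0, 0.0])
draws = gibbs_gaussian(A, b, 20_000, rng)
print(draws[5_000:].mean(axis=0))        # approaches A^{-1} b
```

After burn-in, the empirical mean of the draws converges to the linear-solver solution A^{-1}b, which is exactly the "general equivalence with linear solvers" the abstract invokes.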
Affiliation(s)
- Albert E Parker, Department of Mathematical Sciences and Center for Biofilm Engineering, Montana State University, Bozeman, Montana 59715
- Betsey Pitts, Center for Biofilm Engineering, Montana State University, Bozeman, Montana 59715
- Lindsey Lorenz, Center for Biofilm Engineering, Montana State University, Bozeman, Montana 59715
- Philip S Stewart, Department of Chemical and Biological Engineering and Center for Biofilm Engineering, Montana State University, Bozeman, Montana 59715
7
Sabo RT, Bello G. Optimal and lead-in adaptive allocation for binary outcomes: a comparison of Bayesian methodologies. Commun Stat Theory Methods 2017; 46:2823-2836. PMID: 29081575; DOI: 10.1080/03610926.2015.1053929.
Abstract
We compare posterior and predictive estimators and probabilities in response-adaptive randomization designs for two- and three-group clinical trials with binary outcomes. Adaptation based upon posterior estimates is discussed, as are two predictive probability algorithms: one using the traditional definition, the other using a skeptical distribution. Optimal and natural lead-in designs are covered. Simulation studies show that efficacy comparisons lead to more adaptation than center comparisons, though at some power loss, and that skeptically predictive efficacy comparisons and natural lead-in approaches lead to less adaptation but offer reduced allocation variability. Though nuanced, these results help clarify the power-adaptation trade-off in adaptive randomization.
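A hedged sketch of one common posterior-based allocation rule for binary outcomes may help fix ideas: a generic tempered "probability of being best" rule under independent Beta posteriors, not necessarily any of the specific algorithms compared in the paper (all names invented):

```python
import numpy as np

def allocation_probs(successes, failures, c=0.5, n_draws=100_000, rng=None):
    """Posterior-based adaptive allocation for binary outcomes: each
    arm's weight is P(that arm has the highest response rate) under
    independent Beta(1 + s, 1 + f) posteriors, raised to a tempering
    power c < 1 to damp the adaptation (reducing allocation variability)."""
    rng = rng or np.random.default_rng()
    draws = np.column_stack([rng.beta(1 + s, 1 + f, n_draws)
                             for s, f in zip(successes, failures)])
    p_best = np.bincount(draws.argmax(axis=1),
                         minlength=len(successes)) / n_draws
    w = p_best ** c
    return w / w.sum()

# arm 0 has 12/20 responses, arm 1 has 7/20
probs = allocation_probs(successes=[12, 7], failures=[8, 13],
                         rng=np.random.default_rng(3))
print(probs)   # favours the better-performing arm
```

Tuning c trades off how aggressively allocation follows the accumulating evidence against the variability of the allocation sequence, which is the trade-off the simulation studies quantify.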
Affiliation(s)
- Roy T Sabo, Department of Biostatistics, Virginia Commonwealth University, 830 East Main Street, Richmond, VA 23298-0032, USA
- Ghalib Bello, Department of Biostatistics, Virginia Commonwealth University, 830 East Main Street, Richmond, VA 23298-0032, USA
8
Morris JS. Statistical Methods for Proteomic Biomarker Discovery based on Feature Extraction or Functional Modeling Approaches. Stat Interface 2012; 5:117-135. PMID: 23814640; PMCID: PMC3693398; DOI: 10.4310/sii.2012.v5.n1.a11.
Abstract
In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high-dimensional functional and image data that present many analytical challenges, which must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational aspects of comparative proteomic studies and summarizes contributions that I, along with numerous collaborators, have made. First, there is an overview of comparative proteomics technologies, followed by a discussion of important experimental design and preprocessing issues that must be considered before statistical analysis can be done. Next, the two key approaches to analyzing proteomics data, feature extraction and functional modeling, are described. Feature extraction involves detection and quantification of discrete features, such as peaks or spots, that theoretically correspond to different proteins in the sample. After an overview of the feature extraction approach, specific methods for mass spectrometry (Cromwell) and 2D gel electrophoresis (Pinnacle) are described. The functional modeling approach involves modeling the proteomic data in their entirety as functions or images. A general discussion of the approach is followed by the presentation of a specific method that can be applied, wavelet-based functional mixed models, and its extensions. All methods are illustrated by application to two example proteomic data sets, one from mass spectrometry and one from 2D gel electrophoresis. While the specific methods presented are applied to two particular proteomic technologies, MALDI-TOF mass spectrometry and 2D gel electrophoresis, the methods and principles discussed in the paper apply much more broadly to other expression proteomics technologies.
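As a toy illustration of the feature-extraction step only, the sketch below flags local maxima of a synthetic spectrum that exceed a robust noise threshold. This is far simpler than the wavelet-based Cromwell pipeline the review describes (no denoising or baseline subtraction), and all names are invented:

```python
import numpy as np

def detect_peaks(intensity, min_snr=3.0):
    """Bare-bones peak detection: flag local maxima whose height
    exceeds `min_snr` times a robust noise estimate (MAD-based).
    Real pipelines add denoising and baseline correction first."""
    noise = 1.4826 * np.median(np.abs(intensity - np.median(intensity)))
    is_max = (intensity[1:-1] > intensity[:-2]) & (intensity[1:-1] > intensity[2:])
    idx = np.where(is_max & (intensity[1:-1] > min_snr * noise))[0] + 1
    return idx

x = np.linspace(0.0, 10.0, 1000)
rng = np.random.default_rng(4)
spectrum = (np.exp(-(x - 3.0) ** 2 / 0.01)
            + 0.6 * np.exp(-(x - 7.0) ** 2 / 0.01)
            + rng.normal(0.0, 0.02, x.size))
peaks = detect_peaks(spectrum)
print(x[peaks])   # locations near the two true peaks at 3 and 7
```

Each detected peak would then be quantified (e.g., by height or area) and carried into the downstream comparative analysis, where the multiple testing issues the review emphasizes arise.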