1
Morrison S, Gatsonis C, Eloyan A, Steingrimsson JA. Survival analysis using deep learning with medical imaging. Int J Biostat 2023; 0:ijb-2022-0113. [PMID: 37312249 PMCID: PMC11074924 DOI: 10.1515/ijb-2022-0113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 02/24/2023] [Indexed: 06/15/2023]
Abstract
There is widespread interest in using deep learning to build prediction models for medical imaging data. These deep learning methods capture the local structure of the image and require no manual feature extraction. Despite the importance of modeling survival in the context of medical data analysis, research on deep learning methods for modeling the relationship of imaging and time-to-event data is still under-developed. We provide an overview of deep learning methods for time-to-event outcomes and compare several deep learning methods to Cox model based methods through the analysis of a histology dataset of gliomas.
Affiliation(s)
- Samantha Morrison
- Department of Biostatistics, School of Public Health, Brown University, Providence, RI, USA
- Constantine Gatsonis
- Department of Biostatistics, School of Public Health, Brown University, Providence, RI, USA
- Ani Eloyan
- Department of Biostatistics, School of Public Health, Brown University, Providence, RI, USA
- Jon Arni Steingrimsson
- Department of Biostatistics, School of Public Health, Brown University, Providence, RI, USA
2
Abstract
The uncertainty in predictions from deep neural network analysis of medical imaging is challenging to assess but potentially important to include in subsequent decision-making. Using data from diabetic retinopathy detection, we present an empirical evaluation of the role of model calibration in uncertainty-based referral, an approach that prioritizes referral of observations based on the magnitude of a measure of uncertainty. We consider several configurations of network architecture, methods for uncertainty estimation, and training data size. We identify a strong relationship between the effectiveness of uncertainty-based referral and having a well-calibrated model. This is especially relevant as complex deep neural networks tend to have high calibration errors. Finally, we show that post-calibration of the neural network helps uncertainty-based referral with identifying hard-to-classify observations.
Affiliation(s)
- Ruotao Zhang
- Department of Biostatistics, Brown University, Providence, Rhode Island, USA
- Constantine Gatsonis
- Department of Biostatistics, Brown University, Providence, Rhode Island, USA
3
Abstract
Deep learning is a class of machine learning algorithms that are popular for building risk prediction models. When observations are censored, the outcomes are only partially observed and standard deep learning algorithms cannot be directly applied. We develop a new class of deep learning algorithms for outcomes that are potentially censored. To account for censoring, the unobservable loss function used in the absence of censoring is replaced by a censoring unbiased transformation. The resulting class of algorithms can be used to estimate both survival probabilities and restricted mean survival. We show how the deep learning algorithms can be implemented by adapting software for uncensored data by using a form of response transformation. We provide comparisons of the proposed deep learning algorithms to existing risk prediction algorithms for predicting survival probabilities and restricted mean survival through both simulated datasets and analysis of data from breast cancer patients.
Affiliation(s)
- Samantha Morrison
- Department of Biostatistics, Brown University, Providence, Rhode Island, USA
4
Sigurdardottir AK, Kristófersson GK, Gústafsdóttir SS, Sigurdsson SB, Arnadottir SA, Steingrimsson JA, Gunnarsdóttir ED. Self-rated health and socio-economic status among older adults in Northern Iceland. Int J Circumpolar Health 2020; 78:1697476. [PMID: 31783724 PMCID: PMC6896473 DOI: 10.1080/22423982.2019.1697476] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
Little is known about self-rated health (SRH) of older people living in more remote and Arctic areas. Iceland is a high-income country with one of the lowest rates of income inequality in the world, which may influence SRH. The research aim was to study factors affecting SRH, in such a population living in Northern Iceland. Stratified random sample according to the place of residency, age and gender was used and data collected via face-to-face interviews. Inclusion criteria included community-dwelling adults ≥65 years of age. Response rate was 57.9% (N = 175), average age 74.2 (sd 6.3) years, range 65–92 years and 57% were men. The average number of diagnosed diseases was 1.5 (sd 1.3) and prescribed medications 3.0 (sd 1.7). SRH ranged from 5 (excellent) to 1 (bad), with an average of 3.26 (sd 1.0) and no difference between the place of residency. Lower SRH was independently explained by depressed mood (OR = 0.88, 95% CI = 0.80–0.96), higher body mass index (OR = 0.93, 95% CI = 0.87–0.99), number of prescribed medications (OR = 0.88, 95% CI = 0.78–1.00) and perception of inadequate income (OR = 0.45, 95% CI = 0.21–0.98). The results highlight the importance of physical and mental health promotion for general health and for ageing in place and significance of economic factors as predictors of SRH.
Affiliation(s)
- Arun K Sigurdardottir
- School of Health Sciences, University of Akureyri, Solborg v/Nordursloð, Akureyri, Iceland; Department of Education and Science, Akureyri Hospital, Eyrarlandsvegi, Akureyri, Iceland
- Stefan B Sigurdsson
- School of Health Sciences, University of Akureyri, Solborg v/Nordursloð, Akureyri, Iceland
- Solveig A Arnadottir
- Department of Physical Therapy, School of Health Sciences, University of Iceland, Reykjavik, Iceland
5
Steingrimsson JA, Betz J, Qian T, Rosenblum M. Optimized adaptive enrichment designs for three-arm trials: learning which subpopulations benefit from different treatments. Biostatistics 2019; 22:283-297. [PMID: 31420983 DOI: 10.1093/biostatistics/kxz030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/25/2018] [Revised: 06/13/2019] [Accepted: 07/15/2019] [Indexed: 11/13/2022]
Abstract
We consider the problem of designing a confirmatory randomized trial for comparing two treatments versus a common control in two disjoint subpopulations. The subpopulations could be defined in terms of a biomarker or disease severity measured at baseline. The goal is to determine which treatments benefit which subpopulations. We develop a new class of adaptive enrichment designs tailored to solving this problem. Adaptive enrichment designs involve a preplanned rule for modifying enrollment based on accruing data in an ongoing trial. At the interim analysis after each stage, for each subpopulation, the preplanned rule may decide to stop enrollment or to stop randomizing participants to one or more study arms. The motivation for this adaptive feature is that interim data may indicate that a subpopulation, such as those with lower disease severity at baseline, is unlikely to benefit from a particular treatment while uncertainty remains for the other treatment and/or subpopulation. We optimize these adaptive designs to have the minimum expected sample size under power and Type I error constraints. We compare the performance of the optimized adaptive design versus an optimized nonadaptive (single stage) design. Our approach is demonstrated in simulation studies that mimic features of a completed trial of a medical device for treating heart failure. The optimized adaptive design has 25% smaller expected sample size compared to the optimized nonadaptive design; however, the cost is that the optimized adaptive design has 8% greater maximum sample size. Open-source software that implements the trial design optimization is provided, allowing users to investigate the tradeoffs in using the proposed adaptive versus standard designs.
Affiliation(s)
- Jon Arni Steingrimsson
- Department of Biostatistics, Brown University, 121 South Main Street, Providence, RI 02903, USA
- Joshua Betz
- Department of Biostatistics, Johns Hopkins University, 615 North Wolfe Street, Baltimore, MD 21205, USA
- Tianchen Qian
- Department of Statistics, Harvard University, 1 Oxford St, Cambridge, MA 02138, USA
- Michael Rosenblum
- Department of Biostatistics, Johns Hopkins University, 615 North Wolfe Street, Baltimore, MD 21205, USA
6
Steingrimsson JA, Yang J. Subgroup identification using covariate-adjusted interaction trees. Stat Med 2019; 38:3974-3984. [DOI: 10.1002/sim.8214] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/05/2018] [Revised: 03/18/2019] [Accepted: 05/08/2019] [Indexed: 12/28/2022]
Affiliation(s)
- Jiabei Yang
- Department of Biostatistics, Brown University, Providence, Rhode Island
7
Abstract
This paper proposes a novel paradigm for building regression trees and ensemble learning in survival analysis. Generalizations of the CART and Random Forests algorithms for general loss functions, and in the latter case more general bootstrap procedures, are both introduced. These results, in combination with an extension of the theory of censoring unbiased transformations applicable to loss functions, underpin the development of two new classes of algorithms for constructing survival trees and survival forests: Censoring Unbiased Regression Trees and Censoring Unbiased Regression Ensembles. For a certain "doubly robust" censoring unbiased transformation of squared error loss, we further show how these new algorithms can be implemented using existing software (e.g., CART, random forests). Comparisons of these methods to existing ensemble procedures for predicting survival probabilities are provided in both simulated settings and through applications to four datasets. It is shown that these new methods either improve upon, or remain competitive with, existing implementations of random survival forests, conditional inference forests, and recursively imputed survival trees.
Affiliation(s)
- Liqun Diao
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
- Robert L Strawderman
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY, USA
8
Hu C, Steingrimsson JA. Personalized Risk Prediction in Clinical Oncology Research: Applications and Practical Issues Using Survival Trees and Random Forests. J Biopharm Stat 2017; 28:333-349. [PMID: 29048993 DOI: 10.1080/10543406.2017.1377730] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022]
Abstract
A crucial component of making individualized treatment decisions is to accurately predict each patient's disease risk. In clinical oncology, disease risks are often measured through time-to-event data, such as overall survival and progression/recurrence-free survival, and are often subject to censoring. Risk prediction models based on recursive partitioning methods are becoming increasingly popular largely due to their ability to handle nonlinear relationships, higher-order interactions, and/or high-dimensional covariates. The most popular recursive partitioning methods are versions of the Classification and Regression Tree (CART) algorithm, which builds a simple interpretable tree structured model. With the aim of increasing prediction accuracy, the random forest algorithm averages multiple CART trees, creating a flexible risk prediction model. Risk prediction models used in clinical oncology commonly use both traditional demographic and tumor pathological factors as well as high-dimensional genetic markers and treatment parameters from multimodality treatments. In this article, we describe the most commonly used extensions of the CART and random forest algorithms to right-censored outcomes. We focus on how they differ from the methods for noncensored outcomes, and how the different splitting rules and methods for cost-complexity pruning impact these algorithms. We demonstrate these algorithms by analyzing a randomized Phase III clinical trial of breast cancer. We also conduct Monte Carlo simulations to compare the prediction accuracy of survival forests with more commonly used regression models under various scenarios. These simulation studies aim to evaluate how sensitive the prediction accuracy is to the underlying model specifications, the choice of tuning parameters, and the degrees of missing covariates.
Affiliation(s)
- Chen Hu
- Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Jon Arni Steingrimsson
- Department of Biostatistics, School of Public Health, Brown University, Providence, RI, USA
9
Steingrimsson JA, Strawderman RL. Estimation in the semiparametric accelerated failure time model with missing covariates: improving efficiency through augmentation. J Am Stat Assoc 2017; 112:1221-1235. [PMID: 33033419 DOI: 10.1080/01621459.2016.1205500] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Abstract
This paper considers linear regression with missing covariates and a right censored outcome. We first consider a general two-phase outcome sampling design, where full covariate information is only ascertained for subjects in phase two and sampling occurs under an independent Bernoulli sampling scheme with known subject-specific sampling probabilities that depend on phase one information (e.g., survival time, failure status and covariates). The semiparametric information bound is derived for estimating the regression parameter in this setting. We also introduce a more practical class of augmented estimators that is shown to improve asymptotic efficiency over simple but inefficient inverse probability of sampling weighted estimators. Estimation for known sampling weights and extensions to the case of estimated sampling weights are both considered. The allowance for estimated sampling weights permits covariates to be missing at random according to a monotone but unknown mechanism. The asymptotic properties of the augmented estimators are derived and simulation results demonstrate substantial efficiency improvements over simpler inverse probability of sampling weighted estimators in the indicated settings. With suitable modification, the proposed methodology can also be used to improve augmented estimators previously used for missing covariates in a Cox regression model.
Affiliation(s)
- Robert L Strawderman
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY 14642
10
Steingrimsson JA, Diao L, Molinaro AM, Strawderman RL. Doubly robust survival trees. Stat Med 2016; 35:3595-612. [PMID: 27037609 DOI: 10.1002/sim.6949] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 09/05/2015] [Revised: 02/06/2016] [Accepted: 03/01/2016] [Indexed: 11/09/2022]
Abstract
Estimating a patient's mortality risk is important in making treatment decisions. Survival trees are a useful tool and employ recursive partitioning to separate patients into different risk groups. Existing 'loss based' recursive partitioning procedures that would be used in the absence of censoring have previously been extended to the setting of right censored outcomes using inverse probability censoring weighted estimators of loss functions. In this paper, we propose new 'doubly robust' extensions of these loss estimators motivated by semiparametric efficiency theory for missing data that better utilize available data. Simulations and a data analysis demonstrate strong performance of the doubly robust survival trees compared with previously used methods. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Jon Arni Steingrimsson
- Department of Biostatistics, Johns Hopkins University, Baltimore, 14853, MD, 21205 U.S.A
- Liqun Diao
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, N2L 3G1, ON, CANADA
- Annette M Molinaro
- Department of Neurological Surgery, University of California, San Francisco, 94143-0372, CA, U.S.A
- Robert L Strawderman
- Department of Biostatistics and Computational Biology, University of Rochester, Rochester, 14642, NY, U.S.A