1
Meng R, Soper B, Lee HK, Nygård JF, Nygård M. Hierarchical continuous-time inhomogeneous hidden Markov model for cancer screening with extensive followup data. Stat Methods Med Res 2022; 31:2383-2399. [PMID: 36039541 DOI: 10.1177/09622802221122390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Continuous-time hidden Markov models are an attractive approach for disease modeling because they are explainable and capable of handling irregularly sampled, skewed, and sparse data arising from real-world medical practice, in particular screening data with extensive followup. Most applications in this context consider time-homogeneous models due to their relative computational simplicity. However, the time-homogeneous assumption is too strong to accurately model the natural history of many diseases, including cancer. Moreover, cancer risk across the population is not homogeneous either, since exposure to disease risk factors can vary considerably between individuals. This is important when analyzing longitudinal datasets and different birth cohorts. We model the heterogeneity of disease progression and regression using piece-wise constant intensity functions and model the heterogeneity of risks in the population using a latent mixture structure. Different submodels under the mixture structure employ the same types of Markov states, reflecting disease progression and allowing both clinical interpretation and model parsimony. We also consider flexible observational models dealing with model over-dispersion in real data. An efficient, scalable Expectation-Maximization algorithm for inference is proposed with a theoretically guaranteed convergence property. We demonstrate our method's superior performance compared to other state-of-the-art methods using synthetic data and a real-world cervical cancer screening dataset from the Cancer Registry of Norway. Moreover, we present two model-based risk stratification methods that identify the risk levels of individuals.
Affiliation(s)
- Rui Meng
- University of California, Santa Cruz, CA, USA
- Braden Soper
- Lawrence Livermore National Laboratory, Livermore, CA, USA
- Mari Nygård
- Cancer Registry of Norway, Oslo, Norway
2
Kwon BC, Achenbach P, Anand V, Frohnert BI, Hagopian W, Hu J, Koski E, Lernmark Å, Lou O, Martin F, Ng K, Toppari J, Veijola R. Islet Autoantibody Levels Differentiate Progression Trajectories in Individuals With Presymptomatic Type 1 Diabetes. Diabetes 2022; 71:2632-2641. [PMID: 36112006 PMCID: PMC9750947 DOI: 10.2337/db22-0360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Accepted: 08/29/2022] [Indexed: 01/24/2023]
Abstract
In our previous data-driven analysis of evolving patterns of islet autoantibodies (IAb) against insulin (IAA), GAD (GADA), and islet antigen 2 (IA-2A), we discovered three trajectories, characterized according to multiple IAb (TR1), IAA (TR2), or GADA (TR3) as the first appearing autoantibodies. Here we examined the evolution of IAb levels within these trajectories in 2,145 IAb-positive participants followed from early life and compared those who progressed to type 1 diabetes (n = 643) with those remaining undiagnosed (n = 1,502). With use of thresholds determined by 5-year diabetes risk, four levels were defined for each IAb and overlaid onto each visit. In diagnosed participants, high IAA levels were seen in TR1 and TR2 at ages <3 years, whereas IAA remained at lower levels in the undiagnosed. Proportions of dwell times (total duration of follow-up at a given level) at the four IAb levels differed between the diagnosed and undiagnosed for GADA and IA-2A in all three trajectories (P < 0.001), but for IAA dwell times differed only within TR2 (P < 0.05). Overall, undiagnosed participants more frequently had low IAb levels and later appearance of IAb than diagnosed participants. In conclusion, while it has long been appreciated that the number of autoantibodies is an important predictor of type 1 diabetes, consideration of autoantibody levels within the three autoimmune trajectories improved differentiation of IAb-positive children who progressed to type 1 diabetes from those who did not.
Affiliation(s)
- Bum Chul Kwon
- Center for Computational Health, IBM Research, Cambridge, MA
- Corresponding author: Bum Chul Kwon
- Peter Achenbach
- Institute of Diabetes Research, Helmholtz Zentrum München—German Research Center for Environmental Health, Munich-Neuherberg, Germany
- Vibha Anand
- Center for Computational Health, IBM Research, Cambridge, MA
- Jianying Hu
- Center for Computational Health, IBM Research, Yorktown Heights, NY
- Eileen Koski
- Center for Computational Health, IBM Research, Yorktown Heights, NY
- Åke Lernmark
- Department of Clinical Sciences Malmö, Lund University CRC, Skåne University Hospital, Malmö, Sweden
- Kenney Ng
- Center for Computational Health, IBM Research, Cambridge, MA
- Jorma Toppari
- Institute of Biomedicine and Centre for Population Health Research, University of Turku, and Department of Pediatrics, Turku University Hospital, Turku, Finland
- Riitta Veijola
- Medical Research Center, PEDEGO Research Unit, Department of Pediatrics, University of Oulu and Oulu University Hospital, Oulu, Finland
3
Progression of type 1 diabetes from latency to symptomatic disease is predicted by distinct autoimmune trajectories. Nat Commun 2022; 13:1514. [PMID: 35314671 PMCID: PMC8938551 DOI: 10.1038/s41467-022-28909-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 04/23/2021] [Accepted: 02/16/2022] [Indexed: 12/13/2022]
Abstract
Development of islet autoimmunity precedes the onset of type 1 diabetes in children; however, the presence of autoantibodies does not necessarily lead to manifest disease, and the onset of clinical symptoms is hard to predict. Here we show, by longitudinal sampling of islet autoantibodies (IAb) to insulin, glutamic acid decarboxylase and islet antigen-2, that disease progression follows distinct trajectories. Of the combined Type 1 Data Intelligence cohort of 24662 participants, 2172 individuals fulfill the criteria of two or more follow-up visits and IAb positivity at least once, with 652 progressing to type 1 diabetes during the 15-year course of the study. Our Continuous-Time Hidden Markov Models, which are developed to discover and visualize latent states based on the collected data and clinical characteristics of the patients, show that the health state of participants progresses from 11 distinct latent states as per three trajectories (TR1, TR2 and TR3), with associated 5-year cumulative diabetes-free survival of 40% (95% confidence interval [CI], 35% to 47%), 62% (95% CI, 57% to 67%), and 88% (95% CI, 85% to 91%), respectively (p < 0.0001). Age, sex, and HLA-DR status further refine the progression rates within trajectories, enabling clinically useful prediction of disease onset. Presence of islet autoantibodies precedes the onset of type 1 diabetes but it does not predict whether and how fast symptomatic disease appears. Here authors present a model to predict and visualize progression to diabetes by using a large longitudinal data set on autoantibodies and clinical parameters as input.
4
Eaton A, Sun Y, Neaton J, Luo X. Nonparametric estimation in an illness-death model with component-wise censoring. Biometrics 2021; 78:1168-1180. [PMID: 33914913 DOI: 10.1111/biom.13482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2020] [Revised: 03/06/2021] [Accepted: 04/14/2021] [Indexed: 11/28/2022]
Abstract
In disease settings where study participants are at risk for death and a serious nonfatal event, composite endpoints defined as the time until the earliest of death or the nonfatal event are often used as the primary endpoint in clinical trials. In practice, if the nonfatal event can only be detected at clinic visits and the death time is known exactly, the resulting composite endpoint exhibits "component-wise censoring." The standard method used to estimate event-free survival in this setting fails to account for component-wise censoring. We apply a kernel smoothing method previously proposed for a marker process in a novel way to produce a nonparametric estimator for event-free survival that accounts for component-wise censoring. The key insight that allows us to apply this kernel method is thinking of nonfatal event status as an intermittently observed binary time-dependent variable rather than thinking of time to the nonfatal event as interval-censored. We also propose estimators for the probability in state and restricted mean time in state for reversible or irreversible illness-death models, under component-wise censoring, and derive their large-sample properties. We perform a simulation study to compare our method to existing multistate survival methods and apply the methods on data from a large randomized trial studying a multifactor intervention for reducing morbidity and mortality among men at above average risk of coronary heart disease.
Affiliation(s)
- Anne Eaton
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Yifei Sun
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York, USA
- James Neaton
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Xianghua Luo
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
5
Williams JP, Storlie CB, Therneau TM, Jack CR Jr, Hannig J. A Bayesian Approach to Multistate Hidden Markov Models: Application to Dementia Progression. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1594831] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 10/27/2022]
Affiliation(s)
- Jonathan P. Williams
- Mayo Clinic, Rochester, MN
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Jan Hannig
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC
6
Bartolucci F, Farcomeni A. A shared-parameter continuous-time hidden Markov and survival model for longitudinal data with informative dropout. Stat Med 2018; 38:1056-1073. [PMID: 30324662 DOI: 10.1002/sim.7994] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/01/2018] [Revised: 08/24/2018] [Accepted: 09/13/2018] [Indexed: 12/19/2022]
Abstract
A shared-parameter approach for jointly modeling longitudinal and survival data is proposed. With respect to available approaches, it allows for time-varying random effects that affect both the longitudinal and the survival processes. The distribution of these random effects is modeled according to a continuous-time hidden Markov chain so that transitions may occur at any time point. For maximum likelihood estimation, we propose an algorithm based on a discretization of time until censoring in an arbitrary number of time windows. The observed information matrix is used to obtain standard errors. We illustrate the approach by simulation, even with respect to the effect of the number of time windows on the precision of the estimates, and by an application to data about patients suffering from mildly dilated cardiomyopathy.
Affiliation(s)
- Alessio Farcomeni
- Department of Public Health and Infectious Diseases, Sapienza University of Rome, Rome, Italy
7
Lange JM, Gulati R, Leonardson AS, Lin DW, Newcomb LF, Trock BJ, Carter HB, Cooperberg MR, Cowan JE, Klotz LH, Etzioni R. Estimating and comparing cancer progression risks under varying surveillance protocols. Ann Appl Stat 2018; 12:1773-1795. [PMID: 30627300 PMCID: PMC6322848 DOI: 10.1214/17-aoas1130] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/19/2022]
Abstract
Outcomes after cancer diagnosis and treatment are often observed at discrete times via doctor-patient encounters or specialized diagnostic examinations. Despite their ubiquity as endpoints in cancer studies, such outcomes pose challenges for analysis. In particular, comparisons between studies or patient populations with different surveillance schema may be confounded by differences in visit frequencies. We present a statistical framework based on multistate and hidden Markov models that represents events on a continuous time scale given data with discrete observation times. To demonstrate this framework, we consider the problem of comparing risks of prostate cancer progression across multiple active surveillance cohorts with different surveillance frequencies. We show that the different surveillance schedules partially explain observed differences in the progression risks between cohorts. Our application permits the conclusion that differences in underlying cancer progression risks across cohorts persist after accounting for different surveillance frequencies.
Affiliation(s)
- Ruth Etzioni
- Fred Hutchinson Cancer Research Center
- University of Washington
8
Moon NC, Zeng L, Cook RJ. Tracing studies in cohorts with attrition: Selection models for efficient sampling. Stat Med 2018; 37:2354-2366. [PMID: 29682774 DOI: 10.1002/sim.7646] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/15/2017] [Revised: 01/06/2018] [Accepted: 02/04/2018] [Indexed: 11/08/2022]
Abstract
Cohort studies of chronic diseases involve recruitment and longitudinal follow-up of affected individuals with a view to studying the effect of risk factors on disease progression and death. When the time to withdrawal from the cohort is conditionally independent of the disease process, the primary consequence is a loss of information on the parameters of interest. This loss can sometimes be mitigated through the conduct of tracing studies in which a subsample of those lost to follow-up are contacted and some information is obtained on their disease and survival status. We describe the use of selection models to sample individuals for tracing who will yield more efficient estimators than those obtained by simple random sampling. Efficient sampling schemes featuring cost constraints are also developed and shown to perform well. An application to data from the University of Toronto Psoriatic Arthritis Cohort illustrates how to apply the method in a real setting.
Affiliation(s)
- Nathalie C Moon
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
- Leilei Zeng
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
- Richard J Cook
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, Canada
9
Aralis H, Brookmeyer R. A stochastic estimation procedure for intermittently-observed semi-Markov multistate models with back transitions. Stat Methods Med Res 2017; 28:770-787. [PMID: 29117850 DOI: 10.1177/0962280217736342] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
Abstract
Multistate models provide an important method for analyzing a wide range of life history processes including disease progression and patient recovery following medical intervention. Panel data consisting of the states occupied by an individual at a series of discrete time points are often used to estimate transition intensities of the underlying continuous-time process. When transition intensities depend on the time elapsed in the current state and back transitions between states are possible, this intermittent observation process presents difficulties in estimation due to intractability of the likelihood function. In this manuscript, we present an iterative stochastic expectation-maximization algorithm that relies on a simulation-based approximation to the likelihood function and implement this algorithm using rejection sampling. In a simulation study, we demonstrate the feasibility and performance of the proposed procedure. We then demonstrate application of the algorithm to a study of dementia, the Nun Study, consisting of intermittently-observed elderly subjects in one of four possible states corresponding to intact cognition, impaired cognition, dementia, and death. We show that the proposed stochastic expectation-maximization algorithm substantially reduces bias in model parameter estimates compared to an alternative approach used in the literature, minimal path estimation. We conclude that in estimating intermittently observed semi-Markov models, the proposed approach is a computationally feasible and accurate estimation procedure that leads to substantial improvements in back transition estimates.
Affiliation(s)
- Hilary Aralis
- UCLA Department of Biostatistics, Fielding School of Public Health, Los Angeles, CA, USA
- Ron Brookmeyer
- UCLA Department of Biostatistics, Fielding School of Public Health, Los Angeles, CA, USA
10
Lu S. A continuous-time HMM approach to modeling the magnitude-frequency distribution of earthquakes. J Appl Stat 2016. [DOI: 10.1080/02664763.2016.1161736] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]
Affiliation(s)
- Shaochuan Lu
- School of Statistics, Beijing Normal University, Beijing, People's Republic of China
11
Steventon A, Roberts A. Estimating Lifetime Costs of Social Care: A Bayesian Approach Using Linked Administrative Datasets from Three Geographical Areas. Health Econ 2015; 24:1573-1587. [PMID: 25385010 DOI: 10.1002/hec.3110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2013] [Revised: 07/01/2014] [Accepted: 09/08/2014] [Indexed: 06/04/2023]
Abstract
We estimated lifetime costs of publicly funded social care, covering services such as residential and nursing care homes, domiciliary care and meals. Like previous studies, we constructed microsimulation models. However, our transition probabilities were estimated from longitudinal, linked administrative health and social care datasets, rather than from survey data. Administrative data were obtained from three geographical areas of England, and we estimated transition probabilities in each of these sites flexibly using Bayesian methods. This allowed us to quantify regional variation as well as the impact of structural and parameter uncertainty regarding the transition probabilities. Expected lifetime costs at age 65 were £20,200-27,000 for men and £38,700-49,000 for women, depending on which of the three areas was used to calibrate the model. Thus, patterns of social care spending differed markedly between areas, with mean costs varying by almost £10,000 (25%) across the lifetime for people of the same age and gender. Allowing for structural and parameter uncertainty had little impact on expected lifetime costs, but slightly increased the risk of very high costs, which will have implications for insurance products for social care through increasing requirements for capital reserves.
12
Xu J, Guttorp P, Kato-Maeda M, Minin VN. Likelihood-based inference for discretely observed birth-death-shift processes, with applications to evolution of mobile genetic elements. Biometrics 2015; 71:1009-1021. [PMID: 26148963 DOI: 10.1111/biom.12352] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 02/01/2015] [Revised: 05/01/2015] [Accepted: 05/01/2015] [Indexed: 11/28/2022]
Abstract
Continuous-time birth-death-shift (BDS) processes are frequently used in stochastic modeling, with many applications in ecology and epidemiology. In particular, such processes can model evolutionary dynamics of transposable elements, which are important genetic markers in molecular epidemiology. Estimation of the effects of individual covariates on the birth, death, and shift rates of the process can be accomplished by analyzing patient data, but inferring these rates in a discretely and unevenly observed setting presents computational challenges. We propose a multi-type branching process approximation to BDS processes and develop a corresponding expectation maximization algorithm, where we use spectral techniques to reduce calculation of expected sufficient statistics to low-dimensional integration. These techniques yield an efficient and robust optimization routine for inferring the rates of the BDS process, and apply broadly to multi-type branching processes whose rates can depend on many covariates. After rigorously testing our methodology in simulation studies, we apply our method to study intrapatient time evolution of the IS6110 transposable element, a genetic marker frequently used during estimation of epidemiological clusters of Mycobacterium tuberculosis infections.
Affiliation(s)
- Jason Xu
- Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Peter Guttorp
- Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Midori Kato-Maeda
- School of Medicine, University of California, San Francisco, CA, U.S.A
- Vladimir N Minin
- Department of Statistics, University of Washington, Seattle, WA, U.S.A., Department of Biology, University of Washington, Seattle, WA, U.S.A
13
14
Lange JM, Hubbard RA, Inoue LYT, Minin VN. A joint model for multistate disease processes and random informative observation times, with applications to electronic medical records data. Biometrics 2014; 71:90-101. [PMID: 25319319 DOI: 10.1111/biom.12252] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 01/01/2014] [Revised: 07/01/2014] [Accepted: 09/01/2014] [Indexed: 12/27/2022]
Abstract
Multistate models are used to characterize individuals' natural histories through diseases with discrete states. Observational data resources based on electronic medical records pose new opportunities for studying such diseases. However, these data consist of observations of the process at discrete sampling times, which may either be pre-scheduled and non-informative, or symptom-driven and informative about an individual's underlying disease status. We have developed a novel joint observation and disease transition model for this setting. The disease process is modeled according to a latent continuous-time Markov chain; and the observation process, according to a Markov-modulated Poisson process with observation rates that depend on the individual's underlying disease status. The disease process is observed at a combination of informative and non-informative sampling times, with possible misclassification error. We demonstrate that the model is computationally tractable and devise an expectation-maximization algorithm for parameter estimation. Using simulated data, we show how estimates from our joint observation and disease transition model lead to less biased and more precise estimates of the disease rate parameters. We apply the model to a study of secondary breast cancer events, utilizing mammography and biopsy records from a sample of women with a history of primary breast cancer.
Affiliation(s)
- Jane M Lange
- Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
- Rebecca A Hubbard
- Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A., Biostatistics Unit, Group Health Research Institute, Seattle, Washington, U.S.A
- Lurdes Y T Inoue
- Department of Biostatistics, University of Washington, Seattle, Washington, U.S.A
- Vladimir N Minin
- Departments of Statistics and Biology, University of Washington, Seattle, Washington, U.S.A