1
Ren B, Barnett I. Combining mixed effects hidden Markov models with latent alternating recurrent event processes to model diurnal active-rest cycles. Biometrics 2023; 79:3402-3417. [PMID: 37017074 DOI: 10.1111/biom.13865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2022] [Accepted: 03/22/2023] [Indexed: 04/06/2023]
Abstract
Data collected from wearable devices can shed light on an individual's pattern of behavioral and circadian routine. Phone use can be modeled as alternating processes, between the state of active use and the state of being idle. Markov chains and alternating recurrent event models are commonly used to model state transitions in cases such as these, and the incorporation of random effects can be used to introduce diurnal effects. While state labels can be derived prior to modeling dynamics, this approach omits informative regression covariates that can influence state memberships. We instead propose an alternating recurrent event proportional hazards (PH) regression to model the transitions between latent states. We propose an expectation-maximization algorithm for imputing latent state labels and estimating parameters. We show that our E-step simplifies to the hidden Markov model (HMM) forward-backward algorithm, allowing us to recover an HMM with logistic regression transition probabilities. In addition, we show that PH modeling of discrete-time transitions implicitly penalizes the logistic regression likelihood and results in shrinkage estimators for the relative risk. This new estimator favors an extended stay in a state and is useful for modeling diurnal rhythms. We derive asymptotic distributions for our parameter estimates and compare our approach against competing methods through simulation as well as in a digital phenotyping study that followed smartphone use in a cohort of adolescents with mood disorders.
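The forward-backward step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a two-state chain (rest, active) whose switching probabilities follow a logistic regression on a hypothetical hour-of-day covariate; the coefficient values, emission probabilities, and function names below are illustrative assumptions, not the authors' model or estimates.

```python
import math

def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def transition_matrix(hour, coef):
    """2x2 transition matrix with logistic-regression switching probabilities.
    The cosine hour-of-day covariate is an illustrative assumption."""
    x = math.cos(2.0 * math.pi * hour / 24.0)
    p01 = logistic(coef[0][0] + coef[0][1] * x)  # rest -> active
    p10 = logistic(coef[1][0] + coef[1][1] * x)  # active -> rest
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

def normalise(v):
    """Rescale a non-negative vector so that it sums to one."""
    s = sum(v)
    return [x / s for x in v]

def forward_backward(obs, hours, coef, emit, init):
    """Posterior state probabilities P(state_t | all observations)."""
    n = len(obs)
    mats = [transition_matrix(h, coef) for h in hours]  # mats[t] governs t-1 -> t
    # Forward pass, normalised at each step to avoid numerical underflow.
    alpha = [normalise([init[j] * emit[j][obs[0]] for j in range(2)])]
    for t in range(1, n):
        pred = [sum(alpha[t - 1][i] * mats[t][i][j] for i in range(2)) for j in range(2)]
        alpha.append(normalise([pred[j] * emit[j][obs[t]] for j in range(2)]))
    # Backward pass, also normalised at each step.
    back = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        back[t] = normalise([
            sum(mats[t + 1][i][j] * emit[j][obs[t + 1]] * back[t + 1][j] for j in range(2))
            for i in range(2)
        ])
    # Combine the two messages into smoothed posteriors.
    return [normalise([alpha[t][k] * back[t][k] for k in range(2)]) for t in range(n)]

# Hypothetical data: 0 = phone idle, 1 = phone in use, one observation per hour.
obs = [0, 0, 0, 1, 1, 1, 1, 0]
hours = list(range(len(obs)))
coef = [[-2.0, -1.0], [-2.0, 1.0]]  # assumed intercept and slope per transition
emit = [[0.9, 0.1], [0.2, 0.8]]     # assumed P(observation | state)
init = [0.5, 0.5]
posterior = forward_backward(obs, hours, coef, emit, init)
```

In an EM fit, posteriors of this kind would serve as the imputed state labels in the E-step; the M-step is omitted here.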
Affiliation(s)
- Benny Ren
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Ian Barnett
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, USA
2
Mews S, Surmann B, Hasemann L, Elkenkamp S. Markov-modulated marked Poisson processes for modeling disease dynamics based on medical claims data. Stat Med 2023; 42:3804-3815. [PMID: 37308135 DOI: 10.1002/sim.9832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 05/26/2023] [Accepted: 06/01/2023] [Indexed: 06/14/2023]
Abstract
We explore Markov-modulated marked Poisson processes (MMMPPs) as a natural framework for modeling patients' disease dynamics over time based on medical claims data. In claims data, observations do not only occur at random points in time but are also informative, that is, driven by unobserved disease levels, as poor health conditions usually lead to more frequent interactions with the health care system. Therefore, we model the observation process as a Markov-modulated Poisson process, where the rate of health care interactions is governed by a continuous-time Markov chain. Its states serve as proxies for the patients' latent disease levels and further determine the distribution of additional data collected at each observation time, the so-called marks. Overall, MMMPPs jointly model observations and their informative time points by comprising two state-dependent processes: the observation process (corresponding to the event times) and the mark process (corresponding to event-specific information), which both depend on the underlying states. The approach is illustrated using claims data from patients diagnosed with chronic obstructive pulmonary disease by modeling their drug use and the interval lengths between consecutive physician consultations. The results indicate that MMMPPs are able to detect distinct patterns of health care utilization related to disease processes and reveal interindividual differences in the state-switching dynamics.
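To make the structure of a Markov-modulated marked Poisson process concrete, the following sketch simulates one with two latent states. The switching rates, event rates, Gaussian mark distribution, and the function name `simulate_mmmpp` are hypothetical choices for illustration; the paper's fitted model and its marks (drug use) are not reproduced here.

```python
import random

def simulate_mmmpp(q, lam, mark_mean, t_end, rng):
    """Simulate a two-state Markov-modulated marked Poisson process on [0, t_end].

    q[s]         rate of leaving latent state s (continuous-time Markov chain)
    lam[s]       event (observation) rate while in state s
    mark_mean[s] mean of the assumed Gaussian mark attached to each event
    Returns a list of (time, state, mark) tuples.
    """
    t, state = 0.0, 0
    events = []
    while t < t_end:
        sojourn_end = min(t + rng.expovariate(q[state]), t_end)
        # Within one sojourn, events form a homogeneous Poisson process.
        s = t + rng.expovariate(lam[state])
        while s < sojourn_end:
            events.append((s, state, rng.gauss(mark_mean[state], 1.0)))
            s += rng.expovariate(lam[state])
        t = sojourn_end
        state = 1 - state  # with two states, the chain always switches
    return events

rng = random.Random(1)
events = simulate_mmmpp(q=[0.1, 0.5], lam=[0.5, 3.0], mark_mean=[1.0, 4.0],
                        t_end=50.0, rng=rng)
```

Because the event rate depends on the latent state, the observation times themselves carry information about the state, which is the sense in which the abstract calls them informative.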
Affiliation(s)
- Sina Mews
- Department of Business Administration and Economics, Bielefeld University, Bielefeld, Germany
- Bastian Surmann
- Department for Health Economics and Health Care Management, Bielefeld University, Bielefeld, Germany
- Lena Hasemann
- Department for Health Economics and Health Care Management, Bielefeld University, Bielefeld, Germany
- Svenja Elkenkamp
- Department for Health Economics and Health Care Management, Bielefeld University, Bielefeld, Germany
3
Meng R, Soper B, Lee HK, Nygård JF, Nygård M. Hierarchical continuous-time inhomogeneous hidden Markov model for cancer screening with extensive followup data. Stat Methods Med Res 2022; 31:2383-2399. [PMID: 36039541 DOI: 10.1177/09622802221122390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Continuous-time hidden Markov models are an attractive approach for disease modeling because they are explainable and capable of handling irregularly sampled, skewed, and sparse data arising from real-world medical practice, in particular screening data with extensive followup. Most applications in this context consider time-homogeneous models due to their relative computational simplicity. However, the time-homogeneous assumption is too strong to accurately model the natural history of many diseases, including cancer. Moreover, cancer risk across the population is not homogeneous either, since exposure to disease risk factors can vary considerably between individuals. This is important when analyzing longitudinal datasets and different birth cohorts. We model the heterogeneity of disease progression and regression using piece-wise constant intensity functions and model the heterogeneity of risks in the population using a latent mixture structure. Different submodels under the mixture structure employ the same types of Markov states reflecting disease progression and allowing both clinical interpretation and model parsimony. We also consider flexible observational models dealing with model over-dispersion in real data. An efficient, scalable Expectation-Maximization algorithm for inference is proposed with a theoretical convergence guarantee. We demonstrate our method's superior performance compared to other state-of-the-art methods using synthetic data and a real-world cervical cancer screening dataset from the Cancer Registry of Norway. Moreover, we present two model-based risk stratification methods that identify the risk levels of individuals.
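As a rough illustration of piece-wise constant intensity functions, the sketch below computes transition probabilities for a two-state continuous-time chain whose intensities change at fixed breakpoints. It is a simplification under stated assumptions: the paper's model has more states and a mixture structure, and the breakpoints, rates, and function names here are hypothetical.

```python
import math

def two_state_probs(a, b, dt):
    """Transition matrix over an interval dt for a two-state chain with constant
    intensities a (0 -> 1) and b (1 -> 0); closed form of the matrix exponential."""
    total = a + b
    decay = math.exp(-total * dt)
    p01 = a / total * (1.0 - decay)
    p10 = b / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def piecewise_probs(breaks, rates, t0, t1):
    """Transition matrix from t0 to t1 when the intensities are constant on each
    interval [breaks[k], breaks[k + 1]) with values rates[k] = (a_k, b_k)."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for k, (a, b) in enumerate(rates):
        lo = max(t0, breaks[k])
        hi = min(t1, breaks[k + 1])
        if hi > lo:
            # Chapman-Kolmogorov: multiply the per-interval transition matrices.
            result = matmul(result, two_state_probs(a, b, hi - lo))
    return result

# Hypothetical age bands (years) with different progression/regression rates.
breaks = [20.0, 40.0, 60.0]
rates = [(0.02, 0.30), (0.05, 0.10)]
P = piecewise_probs(breaks, rates, 30.0, 50.0)
```

When every interval shares the same rates, the product collapses to the time-homogeneous result, which gives a simple consistency check on the construction.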
Affiliation(s)
- Rui Meng
- University of California, Santa Cruz, CA, USA
- Braden Soper
- Lawrence Livermore National Laboratory, Livermore, CA, USA
- Mari Nygård
- Cancer Registry of Norway, Oslo, Norway
4
Jiang H, Li Q, Lin JT, Lin FC. Classification of disease recurrence using transition likelihoods with expectation-maximization algorithm. Stat Med 2022; 41:4697-4715. [PMID: 35908812 PMCID: PMC9489660 DOI: 10.1002/sim.9534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2021] [Revised: 05/17/2022] [Accepted: 07/10/2022] [Indexed: 11/09/2022]
Abstract
When an infectious disease recurs, it may be due to treatment failure or a new infection. Being able to distinguish and classify these two different outcomes is critical in effective disease control. A multi-state model based on Markov processes is a typical approach to estimating the transition probability between the disease states. However, it can perform poorly when the disease state is unknown. This article aims to demonstrate that the transition likelihoods of baseline covariates can distinguish one cause from another with high accuracy in infectious diseases such as malaria. A more general model for disease progression can be constructed to allow for additional disease outcomes. We start from a multinomial logit model to estimate the disease transition probabilities and then utilize the baseline covariate's transition information to provide a more accurate classification result. We apply the expectation-maximization (EM) algorithm to estimate unknown parameters, including the marginal probabilities of disease outcomes. A simulation study comparing our classifier to the existing two-stage method shows that our classifier has better accuracy, especially when the sample size is small. The proposed method is applied to determining relapse vs reinfection outcomes in two Plasmodium vivax treatment studies from Cambodia that used different genotyping approaches to demonstrate its practical use.
Affiliation(s)
- Huijun Jiang
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Quefeng Li
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Jessica T. Lin
- Division of Infectious Disease, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Feng-Chang Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
5
Ren B, Xia CH, Gehrman P, Barnett I, Satterthwaite T. Measuring Daily Activity Rhythms in Young Adults at Risk of Affective Instability Using Passively Collected Smartphone Data: Observational Study. JMIR Form Res 2022; 6:e33890. [PMID: 36103225 PMCID: PMC9520392 DOI: 10.2196/33890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2021] [Revised: 01/18/2022] [Accepted: 07/19/2022] [Indexed: 11/14/2022]
Abstract
BACKGROUND Irregularities in circadian rhythms have been associated with adverse health outcomes. The regularity of rhythms can be quantified using passively collected smartphone data to provide clinically relevant biomarkers of routine. OBJECTIVE This study aims to develop a metric to quantify the regularity of activity rhythms and explore the relationship between routine and mood, as well as demographic covariates, in an outpatient psychiatric cohort. METHODS Passively sensed smartphone data from a cohort of 38 young adults from the Penn or Children's Hospital of Philadelphia Lifespan Brain Institute and Outpatient Psychiatry Clinic at the University of Pennsylvania were fitted with 2-state continuous-time hidden Markov models representing active and resting states. The regularity of routine was modeled as the hour-of-the-day random effects on the probability of state transition (ie, the association between the hour-of-the-day and state membership). A regularity score, Activity Rhythm Metric, was calculated from the continuous-time hidden Markov models and regressed on clinical and demographic covariates. RESULTS Regular activity rhythms were associated with longer sleep durations (P=.009), older age (P=.001), and mood (P=.049). CONCLUSIONS Passively sensed Activity Rhythm Metrics are an alternative to existing metrics but do not require burdensome survey-based assessments. Low-burden, passively sensed metrics based on smartphone data are promising and scalable alternatives to traditional measurements.
Affiliation(s)
- Benny Ren
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Cedric Huchuan Xia
- Penn Lifespan Informatics and Neuroimaging Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Philip Gehrman
- Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Michael J Crescenz VA Medical Center, Philadelphia, PA, United States
- Ian Barnett
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Theodore Satterthwaite
- Penn Lifespan Informatics and Neuroimaging Center, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
- Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, United States
6
Glennie R, Adam T, Leos‐Barajas V, Michelot T, Photopoulou T, McClintock BT. Hidden Markov Models: Pitfalls and Opportunities in Ecology. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13801] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Affiliation(s)
- Richard Glennie
- Centre for Research into Ecological and Environmental Modelling, University of St Andrews, St Andrews KY16 9LZ, UK
- Timo Adam
- Centre for Research into Ecological and Environmental Modelling, University of St Andrews, St Andrews KY16 9LZ, UK
- Théo Michelot
- Centre for Research into Ecological and Environmental Modelling, University of St Andrews, St Andrews KY16 9LZ, UK
- Theoni Photopoulou
- Centre for Research into Ecological and Environmental Modelling, University of St Andrews, St Andrews KY16 9LZ, UK
- Brett T. McClintock
- Marine Mammal Laboratory, NOAA‐NMFS Alaska Fisheries Science Center, Seattle, USA
7
Liu Y, Lin FC, Lin JT, Li Q. Dynamic Classification of Plasmodium vivax Malaria Recurrence: An Application of Classifying Unknown Cause of Failure in Competing Risks. Journal of Data Science: JDS 2022; 20:51-78. [PMID: 35928784 PMCID: PMC9347664 DOI: 10.6339/21-jds1026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/16/2022]
Abstract
A standard competing risks set-up requires both time to event and cause of failure to be fully observable for all subjects. However, in application, the cause of failure may not always be observable, thus impeding the risk assessment. In some extreme cases, none of the causes of failure is observable. In the case of a recurrent episode of Plasmodium vivax malaria following treatment, the patient may have suffered a relapse from a previous infection or acquired a new infection from a mosquito bite. In this case, the time to relapse cannot be modeled when a competing risk, a new infection, is present. The efficacy of a treatment for preventing relapse from a previous infection may be underestimated when the true cause of infection cannot be classified. In this paper, we developed a novel method for classifying the latent cause of failure under a competing risks set-up, which uses not only time to event information but also transition likelihoods between covariates at the baseline and at the time of event occurrence. Our classifier shows superior performance under various scenarios in simulation experiments. The method was applied to Plasmodium vivax infection data to classify recurrent infections of malaria.
Affiliation(s)
- Yutong Liu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Feng-Chang Lin
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Jessica T Lin
- Institute of Global Health and Infectious Diseases, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
- Quefeng Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, U.S.A
8
Soper BC, Cadena J, Nguyen S, Chan KHR, Kiszka P, Womack L, Work M, Duggan JM, Haller ST, Hanrahan JA, Kennedy DJ, Mukundan D, Ray P. OUP accepted manuscript. J Am Med Inform Assoc 2022; 29:864-872. [PMID: 35137149 PMCID: PMC8903413 DOI: 10.1093/jamia/ocac012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/19/2021] [Revised: 12/15/2021] [Accepted: 01/28/2022] [Indexed: 11/12/2022]
Abstract
Objective The study sought to investigate the disease state–dependent risk profiles of patient demographics and medical comorbidities associated with adverse outcomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections. Materials and Methods A covariate-dependent, continuous-time hidden Markov model with 4 states (moderate, severe, discharged, and deceased) was used to model the dynamic progression of COVID-19 during the course of hospitalization. All model parameters were estimated using the electronic health records of 1362 patients from ProMedica Health System admitted between March 20, 2020 and December 29, 2020 with a positive nasopharyngeal PCR test for SARS-CoV-2. Demographic characteristics, comorbidities, vital signs, and laboratory test results were retrospectively evaluated to infer a patient’s clinical progression. Results The association between patient-level covariates and risk of progression was found to be disease state dependent. Specifically, while being male, being Black or having a medical comorbidity were all associated with an increased risk of progressing from the moderate disease state to the severe disease state, these same factors were associated with a decreased risk of progressing from the severe disease state to the deceased state. Discussion Recent studies have not included analyses of the temporal progression of COVID-19, making the current study a unique modeling-based approach to understand the dynamics of COVID-19 in hospitalized patients. Conclusion Dynamic risk stratification models have the potential to improve clinical outcomes not only in COVID-19, but also in a myriad of other acute and chronic diseases that, to date, have largely been assessed only by static modeling techniques.
Affiliation(s)
- Braden C Soper
- Corresponding Author: Braden C. Soper, PhD, Computing Directorate, Lawrence Livermore National Laboratory, 7000 East Ave, Livermore, CA 94550, USA
- Jose Cadena
- Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Sam Nguyen
- Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Kwan Ho Ryan Chan
- Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Paul Kiszka
- Information Technology Services, ProMedica Health System, Inc, Toledo, Ohio, USA
- Lucas Womack
- Information Technology Services, ProMedica Health System, Inc, Toledo, Ohio, USA
- Mark Work
- Information Technology Services, ProMedica Health System, Inc, Toledo, Ohio, USA
- Joan M Duggan
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Steven T Haller
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Jennifer A Hanrahan
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- David J Kennedy
- Department of Medicine, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Deepa Mukundan
- Department of Pediatrics, University of Toledo College of Medicine and Life Sciences, Toledo, Ohio, USA
- Priyadip Ray
- Engineering Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
9
Thindwa D, Wolter N, Pinsent A, Carrim M, Ojal J, Tempia S, Moyes J, McMorrow M, Kleynhans J, von Gottberg A, French N, Cohen C, Flasche S. Estimating the contribution of HIV-infected adults to household pneumococcal transmission in South Africa, 2016–2018: A hidden Markov modelling study. PLoS Comput Biol 2021; 17:e1009680. [PMID: 34941865 PMCID: PMC8699682 DOI: 10.1371/journal.pcbi.1009680] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/20/2021] [Accepted: 11/24/2021] [Indexed: 12/17/2022]
Abstract
Human immunodeficiency virus (HIV) infected adults are at a higher risk of pneumococcal colonisation and disease, even while receiving antiretroviral therapy (ART). To help evaluate potential indirect effects of vaccination of HIV-infected adults, we assessed whether HIV-infected adults disproportionately contribute to household transmission of pneumococci. We constructed a hidden Markov model to capture the dynamics of pneumococcal carriage acquisition and clearance observed during a longitudinal household-based nasopharyngeal swabbing study, while accounting for sample misclassifications. Households were followed-up twice weekly for approximately 10 months each year during a three-year study period for nasopharyngeal carriage detection via real-time PCR. We estimated the effect of participant’s age, HIV status, presence of a HIV-infected adult within the household and other covariates on pneumococcal acquisition and clearance probabilities. Of 1,684 individuals enrolled, 279 (16.6%) were younger children (<5 years-old) of whom 4 (1.5%) were HIV-infected and 726 (43.1%) were adults (≥18 years-old) of whom 214 (30.4%) were HIV-infected, most (173, 81.2%) with high CD4+ count. The observed range of pneumococcal carriage prevalence across visits was substantially higher in younger children (56.9–80.5%) than older children (5–17 years-old) (31.7–50.0%) or adults (11.5–23.5%). We estimate that 14.4% (95% Confidence Interval [CI]: 13.7–15.0) of pneumococcal-negative swabs were false negatives. Daily carriage acquisition probabilities among HIV-uninfected younger children were similar in households with and without HIV-infected adults (hazard ratio: 0.95, 95%CI: 0.91–1.01). Longer average carriage duration (11.4 days, 95%CI: 10.2–12.8 vs 6.0 days, 95%CI: 5.6–6.3) and higher median carriage density (622 genome equivalents per millilitre, 95%CI: 507–714 vs 389, 95%CI: 311.1–435.5) were estimated in HIV-infected vs HIV-uninfected adults. 
The use of ART and antibiotics substantially reduced carriage duration in all age groups, and acquisition rates increased with household size. Although South African HIV-infected adults on ART have longer carriage duration and density than their HIV-uninfected counterparts, they show similar patterns of pneumococcal acquisition and onward transmission. We assessed the contribution of HIV-infected adults to household pneumococcal transmission by applying a hidden Markov model to pneumococcal cohort data comprising 115,595 nasopharyngeal samples from 1,684 individuals in rural and urban settings in South Africa. We estimated 14.4% of sample misclassifications (false negatives), representing 85.6% sensitivity of a test that was used to detect pneumococcus. Pneumococcal carriage prevalence and acquisition rates, and average duration were usually higher in younger or older children than adults. The use of ART and antibiotics reduced the average carriage duration across all age and HIV groups, and carriage acquisition risks increased in larger household sizes. Despite the longer average carriage duration and higher median carriage density in HIV-infected than HIV-uninfected adults, we found similar carriage acquisition and onward transmission risks in the dual groups. These findings suggest that vaccinating HIV-infected adults on ART with PCV would reduce their risk for pneumococcal disease but may add little to the indirect protection against carriage of the rest of the population.
Affiliation(s)
- Deus Thindwa
- Centre for the Mathematical Modelling of Infectious Diseases, Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Malawi Liverpool Wellcome Trust Clinical Research Programme, Blantyre, Malawi
- Nicole Wolter
- Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa
- School of Pathology, University of the Witwatersrand, Johannesburg, South Africa
- Amy Pinsent
- Centre for the Mathematical Modelling of Infectious Diseases, Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Aquarius Population Health, London, United Kingdom
- Maimuna Carrim
- Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa
- John Ojal
- Centre for the Mathematical Modelling of Infectious Diseases, Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- KEMRI-Wellcome Trust Research Programme, Geographic Medicine Centre, Kilifi, Kenya
- Stefano Tempia
- Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa
- Influenza Program, Centers for Disease Control and Prevention, Pretoria, South Africa
- Influenza Division, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America
- MassGenics, Duluth, Georgia, United States of America
- Jocelyn Moyes
- Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa
- Meredith McMorrow
- Influenza Division, Centers for Disease Control and Prevention, Atlanta, Georgia, United States of America
- Jackie Kleynhans
- Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa
- School of Public Health, Faculty of Health Science, University of the Witwatersrand, Johannesburg, South Africa
- Anne von Gottberg
- Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa
- School of Pathology, University of the Witwatersrand, Johannesburg, South Africa
- Neil French
- Malawi Liverpool Wellcome Trust Clinical Research Programme, Blantyre, Malawi
- Institute of Infection, Veterinary and Ecological Science, Department of Clinical Infection, Microbiology, and Immunology, University of Liverpool, Liverpool, United Kingdom
- Cheryl Cohen
- Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases of the National Health Laboratory Service, Johannesburg, South Africa
- School of Public Health, Faculty of Health Science, University of the Witwatersrand, Johannesburg, South Africa
- Stefan Flasche
- Centre for the Mathematical Modelling of Infectious Diseases, Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
10
Ikesu R, Taguchi A, Hara K, Kawana K, Tsuruga T, Tomio J, Osuga Y. Prognosis of high-risk human papillomavirus-related cervical lesions: A hidden Markov model analysis of a single-center cohort in Japan. Cancer Med 2021; 11:664-675. [PMID: 34921517 PMCID: PMC8817087 DOI: 10.1002/cam4.4470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2021] [Revised: 10/14/2021] [Accepted: 11/16/2021] [Indexed: 11/11/2022]
Abstract
Introduction Previous studies have shown that individuals with human papillomavirus (HPV)‐related cervical lesions have different prognoses according to the HPV genotype. However, these studies failed to account for possible diagnostic misclassification. In this retrospective cohort study, we aimed to clarify the natural course of cervical lesions according to HPV genotype to account for any diagnostic misclassification. Materials and Methods Our cohort included 729 patients classified as having cervical intraepithelial neoplasia (CIN). HPV was genotyped in all patients, who were followed up or treated for cervical lesions at the University of Tokyo Hospital from October 1, 2008 to March 31, 2015. Hidden Markov models were applied to estimate the diagnostic misclassification probabilities of the current diagnostic practice (histology and cytology) and the transitions between true states. We then simulated two‐year transition probabilities between true cervical states according to HPV genotype. Results Compared with lesions in patients with other HPV genotypes, lesions in HPV 16‐positive patients were estimated to be more likely to increase in severity (i.e., CIN3/cancer); over 2 years, 17.7% (95% confidence interval [CI], 9.3%–29.3%) and 27.8% (95% CI, 16.6%–43.5%) of those with HPV 16 progressed to CIN3/cancer from the true states of CIN1 and CIN2, respectively, whereas 55%–70% of CIN1/2 patients infected with HPV 52/58 remained in the CIN1/2 category. Misclassification was estimated to occur at a rate of 3%–38% in the current diagnostic practice. Conclusion This study contributes robust evidence to current literature on cervical lesion prognosis according to HPV genotype and quantifies the diagnostic misclassification of true cervical lesions.
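One simple way to see how multi-year transition probabilities follow from a shorter-interval transition matrix is repeated matrix multiplication, sketched below. This is only an illustration under stated assumptions: the monthly matrix is entirely hypothetical, CIN3/cancer is treated as absorbing for simplicity, and the study itself reports simulating its two-year probabilities from a fitted hidden Markov model rather than using this calculation.

```python
def mat_mult(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(p, steps):
    """p raised to a non-negative integer power by repeated squaring."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    base = [row[:] for row in p]
    while steps > 0:
        if steps & 1:
            result = mat_mult(result, base)
        base = mat_mult(base, base)
        steps >>= 1
    return result

# Hypothetical one-month transition matrix between true states (not study estimates).
states = ["normal", "CIN1", "CIN2", "CIN3/cancer"]
monthly = [
    [0.97, 0.03, 0.000, 0.000],
    [0.05, 0.92, 0.025, 0.005],
    [0.01, 0.04, 0.930, 0.020],
    [0.00, 0.00, 0.000, 1.000],  # assumed absorbing for this illustration
]
two_year = mat_power(monthly, 24)  # 24 monthly steps
cin1_to_cin3 = two_year[1][3]      # probability of CIN1 -> CIN3/cancer within 2 years
```

Each row of `two_year` is a probability distribution over the state occupied two years later, given the starting state.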
Affiliation(s)
- Ryo Ikesu
- Department of Public Health, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Ayumi Taguchi
- Department of Obstetrics and Gynecology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Konan Hara
- Department of Public Health, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan; Department of Economics, University of Arizona, Tucson, Arizona, USA; Hematology Division, Tokyo Metropolitan Cancer and Infectious Diseases Center, Komagome Hospital, Bunkyo-ku, Tokyo, Japan
- Kei Kawana
- Department of Obstetrics and Gynecology, School of Medicine, Nihon University, Itabashi-ku, Tokyo, Japan
- Tetsushi Tsuruga
- Department of Obstetrics and Gynecology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Jun Tomio
- Department of Public Health, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Yutaka Osuga
- Department of Obstetrics and Gynecology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
11
Buchatskyi LP. Determining probability of cancer cell transformation at human papillomavirus infection. Biotechnologia Acta 2021. [DOI: 10.15407/biotech14.05.074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Aim. The purpose of the work was to assess the probability of cancerous transformation of cells for viruses of high and low oncogenic risk. Results. Using normalized squared error (NSE) for viruses of high (20 strains) and low (153 strains) oncogenic risk, a rank statistic of 2-exponential type was built. For productive papillomavirus infection, the NSE function was determined as the growing accurate 2-exponent of a cell layer basal to the epithelial surface. The logarithm of the NSE numerical values is proportional to the cell entropy that is connected with the availability of virus DNA. To calculate entropy, a generalized Hartley formula was used with the informational cell of dimension d: H = NdLOG(NSE), where N is the generalized cell coordinate. Conclusions. Using a statistical ensemble of E6 proteins separately for viruses of high and low oncogenic risk made it possible to assess the probability of cancerous transformation of cells, which was proportional to the ratio of the area of entropy of cancer transformation to the area of the entropy region of productive papillomavirus infection.
|
12
|
Aleshin-Guendel S, Lange J, Goodman P, Weiss NS, Etzioni R. A Latent Disease Model to Reduce Detection Bias in Cancer Risk Prediction Studies. Eval Health Prof 2021; 44:42-49. [PMID: 33506704 PMCID: PMC8279086 DOI: 10.1177/0163278720984203] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/17/2022]
Abstract
In studies of cancer risk, detection bias arises when risk factors are associated with screening patterns, affecting the likelihood and timing of diagnosis. To eliminate detection bias in a screened cohort, we propose modeling the latent onset of cancer and estimating the association between risk factors and onset rather than diagnosis. We apply this framework to estimate the increase in prostate cancer risk associated with black race and family history using data from the SELECT prostate cancer prevention trial, in which men were screened and biopsied according to community practices. A positive family history was associated with a hazard ratio (HR) of prostate cancer onset of 1.8, lower than the corresponding HR of prostate cancer diagnosis (HR = 2.2). This result comports with a finding that men in SELECT with a family history were more likely to be biopsied following a positive PSA test than men with no family history. For black race, the HRs for onset and diagnosis were similar, consistent with similar patterns of screening and biopsy by race. If individual screening and diagnosis histories are available, latent disease modeling can be used to decouple risk of disease from risk of disease diagnosis and reduce detection bias.
Affiliation(s)
- Jane Lange
- Fred Hutchinson Cancer Research Center, Seattle, WA
- Noel S Weiss
- Fred Hutchinson Cancer Research Center, Seattle, WA
- University of Washington, Department of Epidemiology
- Ruth Etzioni
- University of Washington, Department of Biostatistics, Seattle, WA
- Fred Hutchinson Cancer Research Center, Seattle, WA
|
13
|
Soper BC, Nygård M, Abdulla G, Meng R, Nygård JF. A hidden Markov model for population-level cervical cancer screening data. Stat Med 2020; 39:3569-3590. [PMID: 32854166 DOI: 10.1002/sim.8681] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/10/2019] [Revised: 06/02/2020] [Accepted: 06/11/2020] [Indexed: 12/19/2022]
Abstract
The Cancer Registry of Norway has been administrating a national cervical cancer screening program since 1992 by coordinating triennial cytology exam screenings for the female population between 25 and 69 years of age. Up to 80% of cancers are prevented through mass screening, but this comes at the expense of considerable screening activity and leads to overtreatment of clinically asymptomatic precancers. In this article, we present a continuous-time, time-inhomogeneous hidden Markov model which was developed to understand the screening process and cervical cancer carcinogenesis in detail. By leveraging 1.7 million individuals' multivariate time-series of medical exams performed over a 25-year period, we simultaneously estimate all model parameters. We show that an age-dependent model reflects the Norwegian screening program by comparing empirical survival curves from observed registry data and data simulated from the proposed model. The model can be generalized to include more detailed individual-level covariates as well as new types of screening exams. By utilizing individual screening histories and covariate data, the proposed model shows potential for improving strategies for cancer screening programs by personalizing recommended screening intervals.
Affiliation(s)
- Braden C Soper
- Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Mari Nygård
- Research Department, Cancer Registry of Norway, Oslo, Norway
- Ghaleb Abdulla
- Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California, USA
- Rui Meng
- Department of Statistics, University of California, Santa Cruz, California, USA
- Jan F Nygård
- Registry Informatics Department, Cancer Registry of Norway, Oslo, Norway
|
14
|
|
15
|
Williams JP, Storlie CB, Therneau TM, Jr CRJ, Hannig J. A Bayesian Approach to Multistate Hidden Markov Models: Application to Dementia Progression. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2019.1594831] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 10/27/2022]
Affiliation(s)
- Jonathan P. Williams
- Mayo Clinic, Rochester, MN
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC
- Jan Hannig
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC
|
16
|
Yi GY, He W, He F. Analysis of panel data under hidden mover-stayer models. Stat Med 2017; 36:3231-3243. [DOI: 10.1002/sim.7346] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/27/2015] [Revised: 03/02/2017] [Accepted: 04/29/2017] [Indexed: 11/11/2022]
Affiliation(s)
- Grace Y. Yi
- Department of Statistics and Actuarial Science; University of Waterloo; 200 University Avenue West Waterloo N2L 3G1 Ontario Canada
- Wenqing He
- Department of Statistical and Actuarial Sciences; University of Western Ontario; 1151 Richmond Street North London, Ontario N6A 5B7 Canada
- Feng He
- Department of Statistics and Actuarial Science; University of Waterloo; 200 University Avenue West Waterloo N2L 3G1 Ontario Canada
|
17
|
Oke JL, Stratton IM, Aldington SJ, Stevens RJ, Scanlon PH. The use of statistical methodology to determine the accuracy of grading within a diabetic retinopathy screening programme. Diabet Med 2016; 33:896-903. [PMID: 26666463 PMCID: PMC5019246 DOI: 10.1111/dme.13053] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Accepted: 12/07/2015] [Indexed: 01/06/2023]
Abstract
AIMS We aimed to use longitudinal data from an established screening programme with good quality assurance and quality control procedures and a stable well-trained workforce to determine the accuracy of grading in diabetic retinopathy screening. METHODS We used a continuous time-hidden Markov model with five states to estimate the probability of true progression or regression of retinopathy and the conditional probability of an observed grade given the true grade (misclassification). The true stage of retinopathy was modelled as a function of the duration of diabetes and HbA1c . RESULTS The modelling dataset consisted of 65 839 grades from 14 187 people. The median number [interquartile range (IQR)] of examinations was 5 (3, 6) and the median (IQR) interval between examinations was 1.04 (0.99, 1.17) years. In total, 14 227 grades (21.6%) were estimated as being misclassified, 10 592 (16.1%) represented over-grading and 3635 (5.5%) represented under-grading. There were 1935 (2.9%) misclassified referrals, 1305 were false-positive results (2.2%) and 630 were false-negative results (1.0%). Misclassification of background diabetic retinopathy as no detectable retinopathy was common (3.4% of all grades) but rarely preceded referable maculopathy or retinopathy. CONCLUSION Misclassification between lower grades of retinopathy is not uncommon but is unlikely to lead to significant delays in referring people for sight-threatening retinopathy.
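To make the idea of a hidden Markov model with misclassification concrete, here is a minimal sketch of the likelihood calculation. It is a simplification and not the authors' model: it assumes two true states instead of five, no covariates, and hypothetical names (`transition_matrix`, `likelihood`, `emit`). The two-state transition probabilities use the standard closed form for a continuous-time Markov chain, so unequal gaps between examinations are handled directly.

```python
import math

def transition_matrix(a, b, t):
    """Two-state CTMC transition probabilities over an interval of length t.

    a is the rate from state 0 to 1, b the rate from state 1 to 0
    (both assumed positive for this illustration).
    """
    s = a + b
    moved = 1.0 - math.exp(-s * t)
    p01 = a / s * moved
    p10 = b / s * moved
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

def likelihood(obs, times, init, a, b, emit):
    """Forward algorithm: probability of the observed grades.

    obs   : observed grades (0 or 1) at each examination
    times : examination times, increasing
    init  : initial distribution over the true states
    emit  : emit[i][k] = P(observed grade k | true state i), i.e. the
            misclassification matrix
    """
    alpha = [init[i] * emit[i][obs[0]] for i in range(2)]
    for k in range(1, len(obs)):
        P = transition_matrix(a, b, times[k] - times[k - 1])
        alpha = [
            sum(alpha[i] * P[i][j] for i in range(2)) * emit[j][obs[k]]
            for j in range(2)
        ]
    return sum(alpha)

# Illustrative values only: 10% over-grading, 20% under-grading.
emit = [[0.9, 0.1], [0.2, 0.8]]
print(likelihood([0, 0, 1], [0.0, 1.04, 2.1], [0.8, 0.2], 0.3, 0.1, emit))
```

Maximizing this likelihood over the rates and the entries of `emit` is what lets such a model separate true progression from grading error without a gold standard.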
Affiliation(s)
- J L Oke
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- I M Stratton
- Gloucestershire Retinal Research Group, Gloucester, UK
- S J Aldington
- Gloucestershire Retinal Research Group, Gloucester, UK
- R J Stevens
- Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK
- P H Scanlon
- Gloucestershire Retinal Research Group, Gloucester, UK
|
18
|
Lu S. A continuous-time HMM approach to modeling the magnitude-frequency distribution of earthquakes. J Appl Stat 2016. [DOI: 10.1080/02664763.2016.1161736] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]
Affiliation(s)
- Shaochuan Lu
- School of Statistics, Beijing Normal University, Beijing, People's Republic of China
|
19
|
Benoit JS, Chan W, Luo S, Yeh HW, Doody R. A hidden Markov model approach to analyze longitudinal ternary outcomes when some observed states are possibly misclassified. Stat Med 2016; 35:1549-57. [PMID: 26782946 DOI: 10.1002/sim.6861] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/09/2015] [Revised: 10/25/2015] [Accepted: 12/06/2015] [Indexed: 11/10/2022]
Abstract
Understanding the dynamic disease process is vital in early detection, diagnosis, and measuring progression. Continuous-time Markov chain (CTMC) methods have been used to estimate state-change intensities but challenges arise when stages are potentially misclassified. We present an analytical likelihood approach where the hidden state is modeled as a three-state CTMC model allowing for some observed states to be possibly misclassified. Covariate effects of the hidden process and misclassification probabilities of the hidden state are estimated without information from a 'gold standard' as comparison. Parameter estimates are obtained using a modified expectation-maximization (EM) algorithm, and identifiability of CTMC estimation is addressed. Simulation studies and an application studying Alzheimer's disease caregiver stress-levels are presented. The method was highly sensitive to detecting true misclassification and did not falsely identify error in the absence of misclassification. In conclusion, we have developed a robust longitudinal method for analyzing categorical outcome data when classification of disease severity stage is uncertain and the purpose is to study the process' transition behavior without a gold standard.
Affiliation(s)
- Julia S Benoit
- Texas Institute for Measurement, Evaluation, and Statistics and Department of Basic Vision Sciences, College of Optometry, The University of Houston, Houston, TX, 77204, U.S.A.; Department of Biostatistics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, U.S.A
- Wenyaw Chan
- Department of Biostatistics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, U.S.A
- Sheng Luo
- Department of Biostatistics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, U.S.A
- Hung-Wen Yeh
- Department of Biostatistics, The University of Kansas Medical Center, Kansas City, KS, 66160, U.S.A
- Rachelle Doody
- Alzheimer's Disease and Memory Disorders Center, Department of Neurology, Baylor College of Medicine, Houston, TX, 77030, U.S.A
|
20
|
Abstract
Calculating the probability of each possible outcome for a patient at any time in the future is currently possible only in the simplest cases: short-term prediction in acute diseases of otherwise healthy persons. This problem is to some extent analogous to predicting the concentrations of species in a reactor when knowing initial concentrations and after examining reaction rates at the individual molecule level. The existing theoretical framework behind predicting contagion and the immediate outcome of acute diseases in previously healthy individuals is largely analogous to deterministic kinetics of chemical systems consisting of one or a few reactions. We show that current statistical models commonly used in chronic disease epidemiology correspond to simple stochastic treatment of single reaction systems. The general problem corresponds to stochastic kinetics of complex reaction systems. We attempt to formulate epidemiologic problems related to chronic diseases in chemical kinetics terms. We review methods that may be adapted for use in epidemiology. We show that some reactions cannot fit into the mass-action law paradigm and solutions to these systems would frequently exhibit an antiportfolio effect. We provide a complete example application of stochastic kinetics modeling for a deductive meta-analysis of two papers on atrial fibrillation incidence, prevalence, and mortality.
|
21
|
Kong X, Wang MC, Gray R. Analysis of longitudinal multivariate outcome data from couples cohort studies: application to HPV transmission dynamics. J Am Stat Assoc 2015; 110:472-485. [PMID: 26195849 PMCID: PMC4505367 DOI: 10.1080/01621459.2014.991394] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]
Abstract
We consider a specific situation of correlated data where multiple outcomes are repeatedly measured on each member of a couple. Such multivariate longitudinal data from couples may exhibit multi-faceted correlations which can be further complicated if there are polygamous partnerships. An example is data from cohort studies on human papillomavirus (HPV) transmission dynamics in heterosexual couples. HPV is a common sexually transmitted disease with 14 known oncogenic types causing anogenital cancers. The binary outcomes on the multiple types measured in couples over time may introduce inter-type, intra-couple, and temporal correlations. Simple analysis using generalized estimating equations or random effects models lacks interpretability and cannot fully utilize the available information. We developed a hybrid modeling strategy using Markov transition models together with pairwise composite likelihood for analyzing such data. The method can be used to identify risk factors associated with HPV transmission and persistence, estimate difference in risks between male-to-female and female-to-male HPV transmission, compare type-specific transmission risks within couples, and characterize the inter-type and intra-couple associations. Applying the method to HPV couple data collected in a Ugandan male circumcision (MC) trial, we assessed the effect of MC and the role of gender on risks of HPV transmission and persistence.
Affiliation(s)
- Xiangrong Kong
- Department of Epidemiology and Department of Biostatistics
- Ronald Gray
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University
|
22
|
Teeple EA, Brown ER. Adjusting for time-dependent sensitivity in an illness-death model, with application to mother-to-child transmission of HIV. Stat Med 2014; 34:1277-92. [PMID: 25546029 DOI: 10.1002/sim.6402] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/28/2013] [Revised: 11/26/2014] [Accepted: 12/05/2014] [Indexed: 11/05/2022]
Abstract
In mother-to-child transmission of HIV, identifying infected infants relies on a diagnostic test with imperfect sensitivity that is administered at scheduled visits. Under this scenario, a participant's true state may be unknown at the start and end times of the study, and the detection of transitions into illness may be delayed or missed altogether. This could lead to biased estimates of the risk of transmission and covariate associations. When a test has imperfect sensitivity, but perfect specificity, the additional uncertainty can be captured as a random variable measuring delay in detection. The cumulative distribution then defines a time-dependent sensitivity function that increases over time. We present a maximum likelihood based illness-death model that accounts for imperfect sensitivity by including the delay as an exponential distribution. We specify transition rates as penalized B-splines to allow for nonhomogeneity of risk and discuss the model under Markov and semi-Markov assumptions. We apply this method to our motivating data set, a study of 1499 mother and infant pairs at three sites in Africa.
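The time-dependent sensitivity described in this abstract can be written out as a short sketch. The symbols are notation assumed here for illustration and are not taken from the paper: D is the random delay between infection and detectability, u is the time elapsed since infection, and θ is the rate of the exponential delay distribution.

```latex
\[
  \operatorname{sens}(u) \;=\; \Pr(D \le u) \;=\; 1 - e^{-\theta u}, \qquad u \ge 0 .
\]
```

Under this reading, sensitivity is 0 at the moment of infection and rises toward 1 as u grows, while specificity stays at 1 throughout.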
Affiliation(s)
- Elizabeth A Teeple
- Fred Hutchinson Cancer Research Center, 1100 Fairview Ave. N., M2-C200, Seattle, WA 98109, U.S.A
|
23
|
Infection transmission and chronic disease models in the study of infection-associated cancers. Br J Cancer 2013; 110:7-11. [PMID: 24300979 PMCID: PMC3887312 DOI: 10.1038/bjc.2013.740] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/12/2013] [Revised: 10/16/2013] [Accepted: 10/30/2013] [Indexed: 11/16/2022] Open
Abstract
In the last three decades, the appreciation of the role of infections in cancer aetiology has greatly expanded. Among the 13 million new cancer cases that occurred worldwide in 2008, around 2 million (16%) were attributable to infections. Concurrently, the approach to prevention of infection-related cancers is shifting from cancer control to infection control, for example, vaccination and the detection of infected individuals. In support of this change, the use of infection transmission models has entered the field of infection-related cancer epidemiology. These models are useful to understand the infection transmission processes, to estimate the key parameters that govern the spread of infection, and to project the potential impact of different preventive measures. However, the concepts, terminology, and methods used to study infection transmission are not yet well known in the domain of cancer epidemiology. This review aims to concisely illustrate the main principles of transmission dynamics, the basic structure of infection transmission models, and their use in combination with empirical data. We also briefly summarise models of carcinogenesis and discuss their specificities and possible integration with models of infection natural history.
|
24
|
Cui N, Chen Y, Small DS. Modeling parasite infection dynamics when there is heterogeneity and imperfect detectability. Biometrics 2013; 69:683-92. [PMID: 23848564 DOI: 10.1111/biom.12050] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/01/2012] [Revised: 03/01/2013] [Accepted: 04/01/2013] [Indexed: 11/29/2022]
Abstract
Understanding the infection and recovery rate from parasitic infections is valuable for public health planning. Two challenges in modeling these rates are (1) infection status is only observed at discrete times even though infection and recovery take place in continuous time and (2) detectability of infection is imperfect. We address these issues through a Bayesian hierarchical model based on a random effects Weibull distribution. The model incorporates heterogeneity of the infection and recovery rate among individuals and allows for imperfect detectability. We estimate the model by a Markov chain Monte Carlo algorithm with data augmentation. We present simulation studies and an application to an infection study about the parasite Giardia lamblia among children in Kenya.
Affiliation(s)
- Na Cui
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 S. Wright Street, Champaign, Illinois 61820, U.S.A
|
25
|
Lange JM, Minin VN. Fitting and interpreting continuous-time latent Markov models for panel data. Stat Med 2013; 32:4581-95. [PMID: 23740756 DOI: 10.1002/sim.5861] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 08/15/2012] [Accepted: 05/01/2013] [Indexed: 11/11/2022]
Abstract
Multistate models characterize disease processes within an individual. Clinical studies often observe the disease status of individuals at discrete time points, making exact times of transitions between disease states unknown. Such panel data pose considerable modeling challenges. Assuming the disease process progresses accordingly, a standard continuous-time Markov chain (CTMC) yields tractable likelihoods, but the assumption of exponential sojourn time distributions is typically unrealistic. More flexible semi-Markov models permit generic sojourn distributions yet yield intractable likelihoods for panel data in the presence of reversible transitions. One attractive alternative is to assume that the disease process is characterized by an underlying latent CTMC, with multiple latent states mapping to each disease state. These models retain analytic tractability due to the CTMC framework but allow for flexible, duration-dependent disease state sojourn distributions. We have developed a robust and efficient expectation-maximization algorithm in this context. Our complete data state space consists of the observed data and the underlying latent trajectory, yielding computationally efficient expectation and maximization steps. Our algorithm outperforms alternative methods measured in terms of time to convergence and robustness. We also examine the frequentist performance of latent CTMC point and interval estimates of disease process functionals based on simulated data. The performance of estimates depends on time, functional, and data-generating scenario. Finally, we illustrate the interpretive power of latent CTMC models for describing disease processes on a dataset of lung transplant patients. We hope our work will encourage wider use of these models in the biomedical setting.
Affiliation(s)
- Jane M Lange
- Department of Biostatistics, University of Washington, Seattle, WA, U.S.A
|
26
|
Cook RJ, Lawless JF. Statistical Issues in Modeling Chronic Disease in Cohort Studies. STATISTICS IN BIOSCIENCES 2013. [DOI: 10.1007/s12561-013-9087-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
|
27
|
Estimating incidence rates with misclassified disease status: a likelihood-based approach, with application to hepatitis C virus. Int J Infect Dis 2012; 16:e527-31. [PMID: 22543295 DOI: 10.1016/j.ijid.2012.02.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/28/2011] [Revised: 02/20/2012] [Accepted: 02/28/2012] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND In epidemiologic research, incidence is often estimated from data arising from an imperfect diagnostic test performed at unequally spaced intervals over time. METHODS We developed a likelihood-based method to estimate incidence when disease status is measured imperfectly and assays are performed at multiple unequally spaced visits. We assumed conditional independence, no remission, known constant levels of sensitivity and specificity, and constant incidence rates over time. The method performance was evaluated by examining its bias, accuracy (i.e., mean squared error (MSE)), and coverage probability in a simulation study of 4000 datasets, and then we applied the proposed method to a study of hepatitis C virus (HCV) infection in a cohort of pregnant women in the period 1997-2006. RESULTS The simulation revealed that our method has minimal bias and low MSE, as well as good coverage probability of the resulting confidence intervals. In the application to HCV study, the standard incidence rate estimate which ignores the imperfections of the diagnostic test (number of events/person-years), was 13.7 new HCV cases per 1000 person-years (95% confidence interval 10.1, 17.4). The adjusted incidence estimates (obtained using our proposed method) ranged from 0.4 cases per 1000 person-years (when sensitivity and specificity were assumed to both be 95%) to 13.7 cases per 1000 person-years (when sensitivity and specificity were both 100%). The magnitude of difference between standard and adjusted estimates varied depending on specificity and sensitivity assumptions. Specificity had the greatest impact on the magnitude of bias. CONCLUSIONS Scientists should be aware of the impact of misclassification on incidence estimates. Appropriate study design, proper selection of the diagnostic test, and adjustment for misclassification probabilities in the analysis is necessary to obtain the most accurate incidence estimates.
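The assumptions listed in the METHODS sentence (conditional independence, no remission, known constant sensitivity and specificity, constant incidence) are enough to write down a per-subject likelihood. The sketch below is one plausible reconstruction under those assumptions and is not the authors' published formula; it additionally assumes every subject is disease-free at time 0, and the function name `subject_likelihood` is hypothetical.

```python
import math

def subject_likelihood(times, results, rate, sens, spec):
    """Likelihood of one subject's test results under a constant incidence rate.

    times   : increasing visit times, measured from a disease-free baseline
    results : test result (0 = negative, 1 = positive) at each visit
    rate    : constant incidence rate (events per unit time)
    sens    : known test sensitivity
    spec    : known test specificity
    """
    def p_result(y, diseased):
        if diseased:
            return sens if y == 1 else 1.0 - sens
        return spec if y == 0 else 1.0 - spec

    k = len(times)
    total = 0.0
    # j indexes the interval in which onset occurs; j == k means no onset
    # by the last visit. With no remission, visits j, j+1, ... are diseased.
    for j in range(k + 1):
        start = 0.0 if j == 0 else times[j - 1]
        if j < k:
            p_onset = math.exp(-rate * start) - math.exp(-rate * times[j])
        else:
            p_onset = math.exp(-rate * start)
        p_obs = 1.0
        for i, y in enumerate(results):
            p_obs *= p_result(y, diseased=(i >= j))
        total += p_onset * p_obs
    return total

# Illustrative values only: a single late positive after two negatives.
print(subject_likelihood([0.5, 1.2, 2.0], [0, 0, 1], rate=0.0137, sens=0.95, spec=0.95))
```

With a rare outcome, most positive results from an imperfectly specific test are false positives, which is consistent with the abstract's finding that specificity drives the size of the bias.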
|
28
|
Yeh HW, Chan W, Symanski E. Intermittent Missing Observations in Discrete-Time Hidden Markov Models. COMMUN STAT-SIMUL C 2012. [DOI: 10.1080/03610918.2011.581778] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/17/2022]
|
29
|
Bartolomeo N, Trerotoli P, Serio G. Progression of liver cirrhosis to HCC: an application of hidden Markov model. BMC Med Res Methodol 2011; 11:38. [PMID: 21457586 PMCID: PMC3087702 DOI: 10.1186/1471-2288-11-38] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 11/26/2010] [Accepted: 04/04/2011] [Indexed: 01/05/2023] Open
Abstract
Background Health service databases of administrative type can be a useful tool for the study of progression of a disease, but the data reported in such sources could be affected by misclassifications of some patients' real disease states at the time. Aim of this work was to estimate the transition probabilities through the different degenerative phases of liver cirrhosis using health service databases. Methods We employed a hidden Markov model to determine the transition probabilities between two states, and of misclassification. The covariates inserted in the model were sex, age, the presence of comorbidities correlated with alcohol abuse, the presence of diagnosis codes indicating hepatitis C virus infection, and the Charlson Index. The analysis was conducted in patients presumed to have suffered the onset of cirrhosis in 2000, observing the disease evolution and, if applicable, death up to the end of the year 2006. Results The incidence of hepatocellular carcinoma (HCC) in cirrhotic patients was 1.5% per year. The probability of developing HCC is higher in males (OR = 2.217) and patients over 65 (OR = 1.547); over 65-year-olds have a greater probability of death both while still suffering from cirrhosis (OR = 2.379) and if they have developed HCC (OR = 1.410). A more severe casemix affects the transition from HCC to death (OR = 1.714). The probability of misclassifying subjects with HCC as exclusively affected by liver cirrhosis is 14.08%. Conclusions The hidden Markov model allowing for misclassification is well suited to analyses of health service databases, since it is able to capture bias due to the fact that the quality and accuracy of the available information are not always optimal. The probability of evolution of a cirrhotic subject to HCC depends on sex and age class, while hepatitis C virus infection and comorbidities correlated with alcohol abuse do not seem to have an influence.
Affiliation(s)
- Nicola Bartolomeo
- Department of Biomedical Science and Human Oncology, Chair of Medical Statistics, University of Bari, Bari, Italy.
|
30
|
Abstract
Continuous-time multistate models are widely used for categorical response data, particularly in the modeling of chronic diseases. However, inference is difficult when the process is only observed at discrete time points, with no information about the times or types of events between observation times, unless a Markov assumption is made. This assumption can be limiting as rates of transition between disease states might instead depend on the time since entry into the current state. Such a formulation results in a semi-Markov model. We show that the computational problems associated with fitting semi-Markov models to panel-observed data can be alleviated by considering a class of semi-Markov models with phase-type sojourn distributions. This allows methods for hidden Markov models to be applied. In addition, extensions to models where observed states are subject to classification error are given. The methodology is demonstrated on a dataset relating to development of bronchiolitis obliterans syndrome in post-lung-transplantation patients.
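A small sketch may help show why phase-type sojourn distributions give duration dependence while staying inside the Markov framework. It assumes the simplest case, a two-phase Coxian distribution, with hypothetical parameter names: `lam` is the rate of moving from hidden phase 1 to hidden phase 2, and `mu1`, `mu2` are the rates of leaving the observed state from each phase. This illustrates the general construction and is not the specific model fitted in the paper.

```python
import math

def coxian2(t, lam, mu1, mu2):
    """Survival and hazard of a two-phase Coxian sojourn time at duration t.

    Assumes lam + mu1 != mu2 so that the closed form below is valid.
    """
    r = lam + mu1
    # Probability of still being in phase 1, and in phase 2, at time t.
    p1 = math.exp(-r * t)
    p2 = lam / (r - mu2) * (math.exp(-mu2 * t) - math.exp(-r * t))
    surv = p1 + p2
    # Exit hazard is the phase-weighted average of the two exit rates.
    hazard = (mu1 * p1 + mu2 * p2) / surv
    return surv, hazard

# Illustrative values only: the exit hazard rises with time spent in the state.
for t in (0.0, 1.0, 5.0):
    print(t, coxian2(t, lam=1.0, mu1=0.1, mu2=0.5))
```

Because each hidden phase is itself a Markov state, the observed-state process becomes a hidden Markov model, which is what allows standard HMM likelihood methods to be reused.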
Affiliation(s)
- Andrew C Titman
- Department of Mathematics and Statistics, Lancaster University, Lancaster, UK.
|
31
|
Titman AC. Computation of the asymptotic null distribution of goodness-of-fit tests for multi-state models. LIFETIME DATA ANALYSIS 2009; 15:519-533. [PMID: 19882350 DOI: 10.1007/s10985-009-9133-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/10/2008] [Accepted: 10/14/2009] [Indexed: 05/28/2023]
Abstract
We develop an improved approximation to the asymptotic null distribution of the goodness-of-fit tests for panel observed multi-state Markov models (Aguirre-Hernandez and Farewell, Stat Med 21:1899-1911, 2002) and hidden Markov models (Titman and Sharples, Stat Med 27:2177-2195, 2008). By considering the joint distribution of the grouped observed transition counts and the maximum likelihood estimate of the parameter vector it is shown that the distribution can be expressed as a weighted sum of independent χ²₁ random variables, where the weights are dependent on the true parameters. The performance of this approximation for finite sample sizes and where the weights are calculated using the maximum likelihood estimates of the parameters is considered through simulation. In the scenarios considered, the approximation performs well and is a substantial improvement over the simple χ² approximation.
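Once the weights are available, a p-value from a weighted sum of independent one-degree-of-freedom chi-squared variables can be approximated by simulation. The sketch below is a generic Monte Carlo illustration, not the computation used in the paper; the function name `weighted_chisq_pvalue` and the example weights are hypothetical, and in practice the weights would come from the fitted model.

```python
import random

def weighted_chisq_pvalue(statistic, weights, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(sum_i w_i * Z_i**2 >= statistic).

    Each Z_i is an independent standard normal, so Z_i**2 is a
    chi-squared variable with one degree of freedom.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_draws):
        draw = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if draw >= statistic:
            exceed += 1
    return exceed / n_draws

# Illustrative values only: three weights below 1 shrink the null distribution
# relative to a plain chi-squared with 3 degrees of freedom.
print(weighted_chisq_pvalue(6.0, [1.0, 0.6, 0.3]))
```

When every weight equals 1 the sum is an ordinary chi-squared variable, so differences between the two p-values reflect how far the weights depart from 1.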
Affiliation(s)
- Andrew C Titman
- Department of Mathematics and Statistics, Lancaster University, Lancaster LA1 4YF, UK.
|
32
|
van den Hout A, Jagger C, Matthews FE. Estimating life expectancy in health and ill health by using a hidden Markov model. J R Stat Soc Ser C Appl Stat 2009. [DOI: 10.1111/j.1467-9876.2008.00659.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]
|
33
|
Rosychuk RJ, Shofiqul Islam. Parameter estimation in a model for misclassified Markov data — a Bayesian approach. Comput Stat Data Anal 2009. [DOI: 10.1016/j.csda.2009.04.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
|
34
|
Abstract
Multi-state models are a popular method of describing medical processes that can be represented as discrete states or stages. They have particular use when the data are panel-observed, meaning they consist of discrete snapshots of disease status at irregular time points which may be unique to each patient. However, due to the difficulty of inference in more complicated cases, strong assumptions such as the Markov property, patient homogeneity and time homogeneity are applied. It is important that the validity of these assumptions is tested. A review of methods for diagnosing model fit for panel-observed continuous-time Markov and misclassification-type hidden Markov models is given, with illustrative application to a dataset on cardiac allograft vasculopathy progression in post-heart transplant patients.
Affiliation(s)
- Andrew C Titman
- Department of Mathematics and Statistics, Lancaster University, UK.
|
35
|
Abstract
Markov models are a convenient and useful method of estimating transition rates between levels of a categorical response variable, such as a disease stage, which changes over time. In medical applications the response variable is typically observed at irregular intervals. A Pearson-type goodness-of-fit test for such models was proposed by Aguirre-Hernandez and Farewell (Statist. Med. 2002; 21:1899-1911), but this test is not applicable in the common situation where the process includes an absorbing state, such as death, for which the time of entry is known precisely nor when the data include censored state observations. This paper presents a modification to the Pearson-type test to allow for these cases. An extension of the method, to allow for the class of hidden Markov models where the response variable is subject to misclassification error, is given. The method is applied to data on cardiac allograft vasculopathy in post-heart-transplant patients.
Affiliation(s)
- Andrew C Titman
- Medical Research Council Biostatistics Unit, Cambridge, U.K.
|
36
|
Hubbard RA, Inoue LYT, Fann JR. Modeling Nonhomogeneous Markov Processes via Time Transformation. Biometrics 2007; 64:843-850. [DOI: 10.1111/j.1541-0420.2007.00932.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- R. A. Hubbard
- Department of Biostatistics, University of Washington, Box 357232, Seattle, Washington 98195, U.S.A
- L. Y. T. Inoue
- Department of Biostatistics, University of Washington, Box 357232, Seattle, Washington 98195, U.S.A
- J. R. Fann
- Department of Psychiatry and Behavioral Sciences, University of Washington, Box 356560, Seattle, Washington 98195, U.S.A
|
37
|
Drovandi CC, Pettitt AN. Multivariate Markov process models for the transmission of methicillin-resistant Staphylococcus aureus in a hospital ward. Biometrics 2007; 64:851-859. [PMID: 18047536 DOI: 10.1111/j.1541-0420.2007.00933.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Methicillin-resistant Staphylococcus aureus (MRSA) is a pathogen that continues to be of major concern in hospitals. We develop models and computational schemes based on observed weekly incidence data to estimate MRSA transmission parameters. We extend the deterministic model of McBryde, Pettitt, and McElwain (2007, Journal of Theoretical Biology 245, 470-481) involving an underlying population of MRSA colonized patients and health-care workers that describes, among other processes, transmission between uncolonized patients and colonized health-care workers and vice versa. We develop new bivariate and trivariate Markov models to include incidence so that estimated transmission rates can be based directly on new colonizations rather than indirectly on prevalence. Imperfect sensitivity of pathogen detection is modeled using a hidden Markov process. The advantages of our approach include (i) a discrete valued assumption for the number of colonized health-care workers, (ii) two transmission parameters can be incorporated into the likelihood, (iii) the likelihood depends on the number of new cases to improve precision of inference, (iv) individual patient records are not required, and (v) the possibility of imperfect detection of colonization is incorporated. We compare our approach with that used by McBryde et al. (2007) based on an approximation that eliminates the health-care workers from the model, uses Markov chain Monte Carlo and individual patient data. We apply these models to MRSA colonization data collected in a small intensive care unit at the Princess Alexandra Hospital, Brisbane, Australia.
Affiliation(s)
- C C Drovandi
- School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane 4001, Australia
- A N Pettitt
- School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane 4001, Australia
|
38
|
Anisimov VV, Maas HJ, Danhof M, Della Pasqua O. Analysis of responses in migraine modelling using hidden Markov models. Stat Med 2007; 26:4163-78. [PMID: 17385187 DOI: 10.1002/sim.2852] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Markov-type models have been used in the analysis of disease progression. Although standard errors of model parameters are usually estimated, available software often does not permit the construction of confidence intervals around predictions of the dependent or response variable. A method is presented to calculate means and confidence intervals of model-predicted responses in time governed by a non-homogeneous hidden Markov model in continuous time. The Kolmogorov equations serve as the basis for the calculations. The method is realised in S-Plus and is applied to the prediction of headache responses in clinical studies of anti-migraine treatment. Means and confidence intervals are calculated by numerically solving differential equations that are non-linear in the explanatory variable. Results indicate that uncertainty on predicted drug responses is larger than that on predicted placebo responses and that pain-free responses are less precisely predicted than pain-relief responses. This is due to the uncertainty in the drug-specific parameters which is not present in predicted placebo responses.
Affiliation(s)
- Vladimir V Anisimov
- Research Statistics Unit, Biomedical Data Sciences, GlaxoSmithKline, New Frontiers Science Park (South), Third Avenue, Harlow, Essex CM19 5AW, UK
|
39
|
Rosychuk RJ, Sheng X, Stuber JL. Comparison of variance estimation approaches in a two-state Markov model for longitudinal data with misclassification. Stat Med 2006; 25:1906-21. [PMID: 16220512 DOI: 10.1002/sim.2367] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
We examine the behaviour of the variance-covariance parameter estimates in an alternating binary Markov model with misclassification. Transition probabilities specify the state transitions for a process that is not directly observable. The state of an observable process, which may not correctly classify the state of the unobservable process, is obtained at discrete time points. Misclassification probabilities capture the two types of classification errors. Variance components of the estimated transition parameters are calculated with three estimation procedures: observed information, jackknife, and bootstrap techniques. Simulation studies are used to compare variance estimates and reveal the effect of misclassification on transition parameter estimation. The three approaches generally provide similar variance estimates for large samples and moderate misclassification. In these situations, the resampling methods are reasonable alternatives when programming partial derivatives is not appealing. With smaller chains or higher misclassification probabilities, the bootstrap method appears to be the best choice.
Affiliation(s)
- R J Rosychuk
- Department of Pediatrics, University of Alberta, Edmonton, Alberta, Canada T6G 2J3.
|
40
|
Moura MDG, Grossmann SDMC, Fonseca LMDS, Senna MIB, Mesquita RA. Risk factors for oral hairy leukoplakia in HIV-infected adults of Brazil. J Oral Pathol Med 2006; 35:321-6. [PMID: 16762011 DOI: 10.1111/j.1600-0714.2006.00428.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
BACKGROUND Oral hairy leukoplakia (OHL) may be an indicator of the progression of Human Immunodeficiency Virus (HIV)-induced immuno-depression, and the evaluation of risk factors leading to OHL is important in the management of these HIV-infected patients. However, there are few studies that analyze risk factors leading to OHL in the Brazilian population. The aim of this case-control study is to present data about prevalence rates and risk factors leading to OHL in a sample of HIV-infected adults in Brazil. METHODS This case-control study included 111 HIV-infected patients treated at a clinic for sexually transmitted diseases and HIV. In the initial examinations with dentists, variables were collected from all patients. Diagnosis of OHL was performed in accordance with the International Classification System and cytological features. The Fisher and the chi-squared tests were used for statistical analysis. The proportional prevalence and odds ratio were estimated. RESULTS The results showed a positive, statistically significant association between the presence of OHL and viral load of 3000 copies/μL or greater (P = 0.0001; odds ratio (OR) = 5.8), presence of oral candidiasis (P = 0.0000; OR = 11.1), previous use of fluconazole (P = 0.0000; OR = 24.6), and use of systemic acyclovir (P = 0.032; OR = 4.3). Antiretroviral medication presented a negative, statistically significant association with the presence of OHL (P = 0.002; OR = 8.4). CONCLUSIONS Prevalence of OHL was 28.8%. Viral load, oral candidiasis, previous use of fluconazole, and systemic acyclovir were determined to be risk factors for OHL. Antiretroviral medication proved to be protective against the development of OHL.
Affiliation(s)
- Mariela Dutra Gontijo Moura
- Oral Pathology, Medicine and Surgery Department, Dentistry School, Federal University of Minas Gerais, Av. Antônio Carlos 6627, Pampulha, 31270-901 Belo Horizonte, MG, Brazil
|
41
|
Moura MDG, Guimarães TRM, Fonseca LMS, de Almeida Pordeus I, Mesquita RA. A random clinical trial study to assess the efficiency of topical applications of podophyllin resin (25%) versus podophyllin resin (25%) together with acyclovir cream (5%) in the treatment of oral hairy leukoplakia. Oral Surg Oral Med Oral Pathol Oral Radiol Endod 2006; 103:64-71. [PMID: 17178496 DOI: 10.1016/j.tripleo.2006.02.016] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2005] [Revised: 02/05/2006] [Accepted: 02/13/2006] [Indexed: 12/13/2022]
Abstract
OBJECTIVE The objective of this study was to assess the efficiency of topical applications of podophyllin resin (25%) (P) versus podophyllin resin (25%) together with acyclovir cream (5%) (PA) in the treatment of oral hairy leukoplakia (OHL) in accordance with the following criteria: (1) number of applications necessary for the total clinical resolution of OHL; (2) correlation between the decrease of lesion size and the number of applications; (3) total clinical resolution of OHL; and (4) clinical reevaluation 12 months after the end of treatment. STUDY DESIGN Forty-six OHLs were treated with P (P group) or with PA (PA group). Applications were performed weekly. Student t, Fisher exact, and Pearson correlation tests were used for statistical analysis. RESULTS All 24 lesions from the PA group presented total clinical resolution while 4 lesions from the P group did not. The P group required up to 25 applications performed weekly while the PA group required up to 18. A significant negative association was observed between the size of the lesions and the number of applications performed weekly in the PA group. CONCLUSIONS The present study demonstrated the following: (1) P and PA topical treatments presented a similar average number of applications performed weekly; (2) both groups showed the same clinical response at 12 months post-therapy; and (3) PA presented a 100% clinical resolution and a continuous decrease in OHL size over the course of weekly applications.
Affiliation(s)
- Mariela Dutra Gontijo Moura
- Oral Surgery, Medicine and Pathology Department, School of Dentistry, Federal University of Minas Gerais and Orestes Diniz's Treatment Center of Parasitic and Infectious Diseases, Minas Gerais, Brazil
|
42
|
Kang M, Lagakos SW. Evaluation of log-rank tests for infrequent observations from a multi-state process, with application to HPV vaccine efficacy. Stat Med 2004; 23:3681-96. [PMID: 15534891 DOI: 10.1002/sim.1916] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Genital infection by human papillomavirus (HPV) is a common sexually transmitted disease, with over 25 per cent prevalence among young women in the US. Infections are usually without symptoms and transient (or reversible), but a small proportion of infections persist and are believed to be responsible for nearly all cervical cancers and precursor lesions such as cervical intraepithelial neoplasia (CIN). Therefore, successful vaccines against persistent HPV infections could have a great impact in preventing cervical cancers. In trials being planned, ongoing, and recently completed, a log-rank or a similar test may be employed to assess a vaccine effect in comparison to placebo, with an infection 'event' defined to capture persistent but not transient infections. However, it is not clear how best to define such an event, because (1) diagnostic tests cannot distinguish a persistent from a transient infection, (2) participants are only examined periodically, and (3) there can be misclassification errors in the detection of infections. This paper evaluates several definitions of persistent infection that are based on periodically observed infection statuses by postulating a multi-state model for persistent and transient infections. The type I error and the power of tests on vaccine efficacy based on these operational definitions are then examined under various scenarios of how a vaccine might affect the infection-disease process. We find that none of the candidates performs satisfactorily, thus raising concerns that clinical trials based only on infection endpoints will not be reliable.
Affiliation(s)
- Minhee Kang
- Department of Biostatistics, Harvard University, 655 Huntington Avenue, Boston, MA 02115, USA.
|