1
Delporte M, Fieuws S, Molenberghs G, Verbeke G, Situma Wanyama S, Hatziagorou E, De Boeck C. A joint normal‐binary (probit) model. Int Stat Rev 2022. [DOI: 10.1111/insr.12532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Geert Molenberghs
- I‐BioStat, KU Leuven, Leuven B‐3000, Belgium
- I‐BioStat, Universiteit Hasselt, Diepenbeek B‐3590, Belgium
- Geert Verbeke
- I‐BioStat, KU Leuven, Leuven B‐3000, Belgium
- I‐BioStat, Universiteit Hasselt, Diepenbeek B‐3590, Belgium
- Elpis Hatziagorou
- Paediatric Pulmonology and CF Unit, Hippokration Hospital of Thessaloniki, Aristotle University of Thessaloniki, Thessaloniki, Greece
2
Brobbey A, Wiebe S, Nettel-Aguirre A, Josephson CB, Williamson T, Lix LM, Sajobi TT. Repeated measures discriminant analysis using multivariate generalized estimation equations. Stat Methods Med Res 2021; 31:646-657. [PMID: 34898331 PMCID: PMC8961244 DOI: 10.1177/09622802211032705] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
Discriminant analysis procedures that assume parsimonious covariance and/or mean structures have been proposed for distinguishing between two or more populations in multivariate repeated measures designs. However, these procedures rely on the assumption of multivariate normality, which is not tenable in multivariate repeated measures designs characterized by binary, ordinal, or mixed types of response distributions. This study investigates the accuracy of repeated measures discriminant analysis (RMDA) based on the multivariate generalized estimating equations (GEE) framework for classification in multivariate repeated measures designs with the same or different types of responses repeatedly measured over time. Monte Carlo methods were used to compare the accuracy of RMDA procedures based on GEE and RMDA based on maximum likelihood estimators (MLE) under diverse simulation conditions, which included number of repeated measure occasions, number of responses, sample size, correlation structures, and type of response distribution. RMDA based on GEE exhibited higher average classification accuracy than RMDA based on MLE, especially in multivariate non-normal distributions. Three repeatedly measured responses, namely severity of epilepsy, current number of anti-epileptic drugs, and parent-reported quality of life in children with epilepsy, were used to demonstrate the application of these procedures.
Affiliation(s)
- Anita Brobbey
- Department of Community Health Sciences, University of Calgary, Calgary, Canada
- Samuel Wiebe
- Department of Community Health Sciences, University of Calgary, Calgary, Canada; Department of Clinical Neurosciences, University of Calgary, Calgary, Canada
- Alberto Nettel-Aguirre
- Centre for Health and Social Analytics, National Institute for Applied Statistics Research Australia, University of Wollongong, Wollongong, Australia
- Colin Bruce Josephson
- Department of Community Health Sciences, University of Calgary, Calgary, Canada; Department of Clinical Neurosciences, University of Calgary, Calgary, Canada
- Tyler Williamson
- Department of Community Health Sciences, University of Calgary, Calgary, Canada
- Lisa M Lix
- Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
- Tolulope T Sajobi
- Department of Community Health Sciences, University of Calgary, Calgary, Canada; Department of Clinical Neurosciences, University of Calgary, Calgary, Canada
3
Krishna Adithya V, Williams BM, Czanner S, Kavitha S, Friedman DS, Willoughby CE, Venkatesh R, Czanner G. EffUnet-SpaGen: An Efficient and Spatial Generative Approach to Glaucoma Detection. J Imaging 2021. [PMCID: PMC8321378 DOI: 10.3390/jimaging7060092] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/11/2022]
Abstract
Current research in automated disease detection focuses on making algorithms “slimmer”, reducing the need for large training datasets and accelerating recalibration for new data while achieving high accuracy. The development of slimmer models has become a hot research topic in medical imaging. In this work, we develop a two-phase model for glaucoma detection, identifying and exploiting a redundancy in fundus image data relating particularly to the geometry. We propose a novel algorithm for the cup and disc segmentation, “EffUnet”, with an efficient convolution block and combine this with an extended spatial generative approach for geometry modelling and classification, termed “SpaGen”. We demonstrate the high accuracy achievable by EffUnet in detecting the optic disc and cup boundaries and show how our algorithm can be quickly trained with new data by recalibrating the EffUnet layer only. Our resulting glaucoma detection algorithm, “EffUnet-SpaGen”, is optimized to significantly reduce the computational burden while at the same time surpassing the current state of the art in glaucoma detection algorithms with AUROC 0.997 and 0.969 in the benchmark online datasets ORIGA and DRISHTI, respectively. Our algorithm also allows deformed areas of the optic rim to be displayed and investigated, providing explainability, which is crucial to successful adoption and implementation in clinical settings.
Affiliation(s)
- Venkatesh Krishna Adithya
- Department of Glaucoma, Aravind Eye Care System, Thavalakuppam, Pondicherry 605007, India
- Bryan M. Williams
- School of Computing and Communications, Lancaster University, Bailrigg, Lancaster LA1 4WA, UK
- Silvester Czanner
- School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool L3 3AF, UK
- Srinivasan Kavitha
- Department of Glaucoma, Aravind Eye Care System, Thavalakuppam, Pondicherry 605007, India
- David S. Friedman
- Glaucoma Center of Excellence, Harvard Medical School, Boston, MA 02114, USA
- Colin E. Willoughby
- Biomedical Research Institute, Ulster University, Coleraine, Co. Londonderry BT52 1SA, UK
- Rengaraj Venkatesh
- Department of Glaucoma, Aravind Eye Care System, Thavalakuppam, Pondicherry 605007, India
- Gabriela Czanner
- School of Computer Science and Mathematics, Liverpool John Moores University, Liverpool L3 3AF, UK
4
Feely A, Lim LS, Jiang D, Lix LM. A population-based study to develop juvenile arthritis case definitions for administrative health data using model-based dynamic classification. BMC Med Res Methodol 2021; 21:105. [PMID: 33993875 PMCID: PMC8127203 DOI: 10.1186/s12874-021-01296-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/06/2020] [Accepted: 04/27/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Previous research has shown that chronic disease case definitions constructed using population-based administrative health data may have low accuracy for ascertaining cases of episodic diseases such as rheumatoid arthritis, which are characterized by periods of good health followed by periods of illness. No studies have considered a dynamic approach that uses statistical (i.e., probability) models for repeated measures data to classify individuals into disease, non-disease, and indeterminate categories as an alternative to deterministic (i.e., non-probability) methods that use summary data for case ascertainment. The research objectives were to validate a model-based dynamic classification approach for ascertaining cases of juvenile arthritis (JA) from administrative data, and compare its performance with a deterministic approach for case ascertainment. METHODS The study cohort comprised JA cases and non-JA controls 16 years or younger identified from a pediatric clinical registry in the Canadian province of Manitoba and born between 1980 and 2002. Registry data were linked to hospital records and physician billing claims up to 2018. Longitudinal discriminant analysis (LoDA) models and dynamic classification were applied to annual healthcare utilization measures. The deterministic case definition was based on JA diagnoses in healthcare use data anytime between birth and age 16 years; it required one hospitalization ever or two physician visits. Case definitions based on model-based dynamic classification and deterministic approaches were assessed on sensitivity, specificity, and positive and negative predictive values (PPV, NPV). Mean time to classification was also measured for the former. RESULTS The cohort included 797 individuals; 386 (48.4%) were JA cases. A model-based dynamic classification approach using an annual measure of any JA-related healthcare contact had sensitivity = 0.70 and PPV = 0.82. Mean classification time was 9.21 years. The deterministic case definition had sensitivity = 0.91 and PPV = 0.92. CONCLUSIONS A model-based dynamic classification approach had lower accuracy for ascertaining JA cases than a deterministic approach. However, the dynamic approach required a shorter duration of time to produce a case definition with acceptable PPV. The choice of methods to construct case definitions and their performance may depend on the characteristics of the chronic disease under investigation.
Affiliation(s)
- Allison Feely
- Department of Epidemiology and Cancer Registry, CancerCare Manitoba, Winnipeg, Canada
- Lily Sh Lim
- Department of Paediatrics, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Canada
- Depeng Jiang
- Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, S113-750 Bannatyne Avenue, R3E 0W3, Winnipeg, Canada
- Lisa M Lix
- Department of Community Health Sciences, Rady Faculty of Health Sciences, University of Manitoba, S113-750 Bannatyne Avenue, R3E 0W3, Winnipeg, Canada
5
Watson C, Renehan AG, Geifman N. Associations of specific-age and decade recall body mass index trajectories with obesity-related cancer. BMC Cancer 2021; 21:502. [PMID: 33952200 PMCID: PMC8097878 DOI: 10.1186/s12885-021-08226-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2020] [Accepted: 04/20/2021] [Indexed: 11/26/2022]
Abstract
Background Excess body fatness, commonly approximated by a one-off determination of body mass index (BMI), is associated with increased risk of at least 13 cancers. Modelling of longitudinal BMI data may be more informative for incident cancer associations; for example, latent class trajectory modelling (LCTM) may offer advantages in capturing changes in patterns with time. Here, we evaluated the variation in cancer risk with LCTMs using specific age recall versus decade recall BMI. Methods We obtained BMI profiles for participants from the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial. We developed gender-specific LCTMs using recall data from specific ages 20 and 50 years (72,513 M; 74,837 W); decade data from 30s to 70s (42,113 M; 47,352 W) and a combination of both (74,106 M; 76,245 W). Using an established methodological framework, we tested 1:7 classes for linear, quadratic, cubic and natural spline shapes, and modelled associations for obesity-related cancer (ORC) incidence using LCTM class membership. Results Different models were selected depending on the data type used. In specific age recall trajectories, only the two heaviest classes were associated with increased risk of ORC. For the decade recall data, the shapes appeared skewed by outliers in the heavier classes but an increase in ORC risk was observed. In the combined models, at older ages the BMI values were more extreme. Conclusions Specific age recall models supported the existing literature that changes in BMI over time are associated with increased ORC risk. Modelling of decade recall data might yield spurious associations. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-021-08226-4.
Affiliation(s)
- Charlotte Watson
- Manchester Cancer Research Centre and NIHR Manchester Biomedical Research Centre, Manchester, UK; Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK; Centre for Health Informatics, Division of Informatics, Imaging and Data Sciences, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Andrew G Renehan
- Manchester Cancer Research Centre and NIHR Manchester Biomedical Research Centre, Manchester, UK; Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Nophar Geifman
- Centre for Health Informatics, Division of Informatics, Imaging and Data Sciences, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
6
El Saeiti R, García-Fiñana M, Hughes DM. The effect of random-effects misspecification on classification accuracy. Int J Biostat 2021; 18:279-292. [PMID: 33770823 PMCID: PMC9156334 DOI: 10.1515/ijb-2019-0159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2019] [Revised: 01/21/2021] [Accepted: 02/17/2021] [Indexed: 11/15/2022]
Abstract
Mixed models are a useful way of analysing longitudinal data. Random effects terms allow modelling of patient-specific deviations from the overall trend over time. Correlations between repeated measurements are captured by specifying a joint distribution for all random effects in a model. Typically, this joint distribution is assumed to be a multivariate normal distribution. For Gaussian outcomes, misspecification of the random effects distribution usually has little impact. However, when the outcome is discrete (e.g. counts or binary outcomes), generalised linear mixed models (GLMMs) are used to analyse longitudinal trends. Opinion is divided about how robust GLMMs are to misspecification of the random effects. Previous work explored the impact of random effects misspecification on the bias of model parameters in single outcome GLMMs. Accepting that these model parameters may be biased, we investigate whether this affects our ability to classify patients into clinical groups using a longitudinal discriminant analysis. We also consider multiple outcomes, which can significantly increase the dimensions of the random effects distribution when modelled simultaneously. We show that when there is severe departure from normality, more flexible mixture distributions can give better classification accuracy. However, in many cases, wrongly assuming a single multivariate normal distribution has little impact on classification accuracy.
Affiliation(s)
- Riham El Saeiti
- Health Data Science, University of Liverpool Faculty of Health and Life Sciences, Liverpool, UK
- Marta García-Fiñana
- Health Data Science, University of Liverpool Faculty of Health and Life Sciences, Liverpool, UK
- David M Hughes
- Health Data Science, University of Liverpool Faculty of Health and Life Sciences, Liverpool, UK
7
MacCormick IJC, Williams BM, Zheng Y, Li K, Al-Bander B, Czanner S, Cheeseman R, Willoughby CE, Brown EN, Spaeth GL, Czanner G. Accurate, fast, data efficient and interpretable glaucoma diagnosis with automated spatial analysis of the whole cup to disc profile. PLoS One 2019; 14:e0209409. [PMID: 30629635 PMCID: PMC6328156 DOI: 10.1371/journal.pone.0209409] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/10/2018] [Accepted: 12/05/2018] [Indexed: 11/25/2022]
Abstract
Background Glaucoma is the leading cause of irreversible blindness worldwide. It is a heterogeneous group of conditions with a common optic neuropathy and associated loss of peripheral vision. Both over- and under-diagnosis carry high costs in terms of healthcare spending and preventable blindness. The characteristic clinical feature of glaucoma is asymmetrical optic nerve rim narrowing, which is difficult for humans to quantify reliably. Strategies to improve and automate optic disc assessment are therefore needed to prevent sight loss. Methods We developed a novel glaucoma detection algorithm that segments and analyses colour photographs to quantify optic nerve rim consistency around the whole disc at 15-degree intervals. This provides a profile of the cup/disc ratio, in contrast to the vertical cup/disc ratio in common use. We introduce a spatial probabilistic model to account for the optic nerve shape; we then use this model to derive a disc deformation index and a decision rule for glaucoma. We tested our algorithm on two separate image datasets (ORIGA and RIM-ONE). Results The spatial algorithm accurately distinguished glaucomatous and healthy discs on internal and external validation (AUROC 99.6% and 91.0% respectively). It achieves this using a dataset 100 times smaller than that required for deep learning algorithms, is flexible to the type of cup and disc segmentation (automated or semi-automated), utilises images with missing data, and is correlated with the disc size (p = 0.02) and the rim-to-disc at the narrowest rim (p < 0.001, in external validation). Discussion The spatial probabilistic algorithm is highly accurate and highly data efficient, and it extends to any imaging hardware in which the boundaries of cup and disc can be segmented, thus making the algorithm particularly applicable to research into disease mechanisms and to glaucoma screening in low resource settings.
Affiliation(s)
- Ian J. C. MacCormick
- Department of Eye & Vision Science, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom
- Centre for Clinical Brain Sciences, University of Edinburgh, Chancellor's Building, Edinburgh, United Kingdom
- Bryan M. Williams
- Department of Eye & Vision Science, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom
- Yalin Zheng
- Department of Eye & Vision Science, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom
- St Paul’s Eye Unit, Royal Liverpool University Hospitals NHS Trust, Liverpool, United Kingdom
- Kun Li
- Medical Information Engineering Department, Taishan Medical School, TaiAn City, ShanDong Province, China
- Baidaa Al-Bander
- Department of Electrical Engineering and Electronics, University of Liverpool, Brownlow Hill, Liverpool, United Kingdom
- Silvester Czanner
- School of Computing, Mathematics and Digital Technology, Faculty of Science and Engineering, Manchester Metropolitan University, Manchester, United Kingdom
- Rob Cheeseman
- St Paul’s Eye Unit, Royal Liverpool University Hospitals NHS Trust, Liverpool, United Kingdom
- Colin E. Willoughby
- Biomedical Sciences Research Institute, Faculty of Life & Health Sciences, Ulster University, Coleraine, Northern Ireland
- Department of Ophthalmology, Royal Victoria Hospital, Belfast, Northern Ireland
- Emery N. Brown
- Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Anesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- George L. Spaeth
- Glaucoma Research Center, Wills Eye Hospital, Philadelphia, Pennsylvania, United States of America
- Gabriela Czanner
- Department of Eye & Vision Science, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom
- St Paul’s Eye Unit, Royal Liverpool University Hospitals NHS Trust, Liverpool, United Kingdom
- Department of Applied Mathematics, Faculty of Engineering and Technology, Liverpool John Moores University, Liverpool, United Kingdom
8
Hughes DM, Komárek A, Czanner G, Garcia-Fiñana M. Dynamic longitudinal discriminant analysis using multiple longitudinal markers of different types. Stat Methods Med Res 2018; 27:2060-2080. [PMID: 27789653 PMCID: PMC5985589 DOI: 10.1177/0962280216674496] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/16/2022]
Abstract
There is an emerging need in clinical research to accurately predict patients' disease status and disease progression by optimally integrating multivariate clinical information. Clinical data are often collected over time for multiple biomarkers of different types (e.g. continuous, binary and counts). In this paper, we present a flexible and dynamic (time-dependent) discriminant analysis approach in which multiple biomarkers of various types are jointly modelled for classification purposes by the multivariate generalized linear mixed model. We propose a mixture of normal distributions for the random effects to allow additional flexibility when modelling the complex correlation between longitudinal biomarkers and to robustify the model and the classification procedure against misspecification of the random effects distribution. These longitudinal models are subsequently used in a multivariate time-dependent discriminant scheme to predict, at any time point, the probability of belonging to a particular risk group. The methodology is illustrated using clinical data from patients with epilepsy, where the aim is to identify patients who will not achieve remission of seizures within a five-year follow-up period.
Affiliation(s)
- David M Hughes
- Department of Biostatistics, University of Liverpool, UK
- Arnošt Komárek
- Charles University, Faculty of Mathematics and Physics, Department of Probability and Mathematical Statistics, Prague, Czech Republic
- Gabriela Czanner
- Department of Biostatistics, University of Liverpool, UK
- Department of Eye and Vision Science, University of Liverpool, UK
9
Hughes DM, Komárek A, Bonnett LJ, Czanner G, García‐Fiñana M. Dynamic classification using credible intervals in longitudinal discriminant analysis. Stat Med 2017; 36:3858-3874. [PMID: 28762546 PMCID: PMC5655752 DOI: 10.1002/sim.7397] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 12/15/2016] [Accepted: 06/14/2017] [Indexed: 11/08/2022]
Abstract
Recently developed methods of longitudinal discriminant analysis allow for classification of subjects into prespecified prognostic groups using longitudinal history of both continuous and discrete biomarkers. The classification uses Bayesian estimates of the group membership probabilities for each prognostic group. These estimates are derived from a multivariate generalised linear mixed model of the biomarkers' longitudinal evolution in each of the groups and can be updated each time new data is available for a patient, providing a dynamic (over time) allocation scheme. However, the precision of the estimated group probabilities differs for each patient and also over time. This precision can be assessed by looking at credible intervals for the group membership probabilities. In this paper, we propose a new allocation rule that incorporates credible intervals for use in the context of a dynamic longitudinal discriminant analysis and show that this can decrease the number of false positives in a prognostic test, improving the positive predictive value. We also establish that by leaving some patients unclassified for a certain period, the classification accuracy of those patients who are classified can be improved, giving increased confidence to clinicians in their decision making. Finally, we show that determining a stopping rule dynamically can be more accurate than specifying a set time point at which to decide on a patient's status. We illustrate our methodology using data from patients with epilepsy and show how patients who fail to achieve adequate seizure control are more accurately identified using credible intervals compared to existing methods.
Affiliation(s)
- David M. Hughes
- Department of Biostatistics, University of Liverpool, Liverpool, U.K.
- Arnošt Komárek
- Department of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
- Gabriela Czanner
- Department of Biostatistics, University of Liverpool, Liverpool, U.K.
- Department of Eye and Vision Science, University of Liverpool, Liverpool, U.K.
10
Hughes DM, El Saeiti R, García-Fiñana M. A comparison of group prediction approaches in longitudinal discriminant analysis. Biom J 2017; 60:307-322. [PMID: 28833412 PMCID: PMC5873537 DOI: 10.1002/bimj.201700013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/01/2017] [Revised: 05/18/2017] [Accepted: 05/25/2017] [Indexed: 01/10/2023]
Abstract
Longitudinal discriminant analysis (LoDA) can be used to classify patients into prognostic groups based on their clinical history, which often involves longitudinal measurements of various clinically relevant markers. Patients' longitudinal data is first modelled using multivariate generalised linear mixed models, allowing markers of different types (e.g. continuous, binary, counts) to be modelled simultaneously. We describe three approaches to calculating a patient's posterior group membership probabilities which have been outlined in previous studies, based on the marginal distribution of the longitudinal markers, the conditional distribution and the distribution of the random effects. Here we compare the three approaches, first using data from the Mayo Primary Biliary Cirrhosis study and then by way of a simulation study to explore in which situations each of the three approaches is expected to give the best prediction. We demonstrate situations in which the marginal or random‐effects approaches perform well, but find that the conditional approach offers little extra information to the random‐effects and marginal approaches.
Affiliation(s)
- David M Hughes
- Department of Biostatistics, University of Liverpool, Liverpool, UK
- Riham El Saeiti
- Department of Biostatistics, University of Liverpool, Liverpool, UK; Department of Statistics, University of Benghazi, Benghazi, Libya
11
Ivanova A, Molenberghs G, Verbeke G. Fast and highly efficient pseudo-likelihood methodology for large and complex ordinal data. Stat Methods Med Res 2015; 26:2758-2779. [PMID: 26446001 DOI: 10.1177/0962280215608213] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/16/2022]
Abstract
In longitudinal studies, continuous, binary, categorical, and survival outcomes are often jointly collected, possibly with some observations missing. However, when it comes to modeling responses, the ordinal ones have received less attention in the literature. In a longitudinal or hierarchical context, the univariate proportional odds mixed model (POMM) can be regarded as an instance of the generalized linear mixed model (GLMM). When the responses of the joint multivariate model encompass ordinal responses, the complexity further increases. An additional problem of model fitting is the size of the collected data. Pseudo-likelihood based methods for pairwise fitting, for partitioned samples and, as introduced in this paper, pairwise fitting within partitioned samples allow joint modeling of even larger numbers of responses. We show that pseudo-likelihood methodology allows for highly efficient and fast inferences in high-dimensional large datasets.
Affiliation(s)
- Anna Ivanova
- I-BioStat, KU Leuven, University of Leuven, Leuven, Belgium
- Geert Molenberghs
- I-BioStat, KU Leuven, University of Leuven, Leuven, Belgium; I-BioStat, Universiteit Hasselt, Hasselt, Belgium
- Geert Verbeke
- I-BioStat, KU Leuven, University of Leuven, Leuven, Belgium; I-BioStat, Universiteit Hasselt, Hasselt, Belgium
12
Ivanova A, Molenberghs G, Verbeke G. Mixed models approaches for joint modeling of different types of responses. J Biopharm Stat 2015; 26:601-18. [PMID: 26098411 DOI: 10.1080/10543406.2015.1052487] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 10/23/2022]
Abstract
In many biomedical studies, one jointly collects longitudinal continuous, binary, and survival outcomes, possibly with some observations missing. Random-effects models, sometimes called shared-parameter models or frailty models, have received a lot of attention. In such models, the corresponding variance components can be employed to capture the association between the various sequences. In some cases, random effects are considered common to various sequences, perhaps up to a scaling factor; in others, there are different but correlated random effects. Even though a variety of data types has been considered in the literature, less attention has been devoted to ordinal data. For univariate longitudinal or hierarchical data, the proportional odds mixed model (POMM) is an instance of the generalized linear mixed model (GLMM; Breslow and Clayton, 1993). Ordinal data are conveniently replaced by a parsimonious set of dummies, which in the longitudinal setting leads to a repeated set of dummies. When ordinal longitudinal data are part of a joint model, the complexity increases further. This is the setting considered in this paper. We formulate a random-effects based model that, in addition, allows for overdispersion. Using two case studies, it is shown that the combination of random effects to capture association with further correction for overdispersion can improve the model's fit considerably and that the resulting models make it possible to answer research questions that could not be addressed otherwise. Parameters can be estimated in a fairly straightforward way, using the SAS procedure NLMIXED.
Affiliation(s)
- Anna Ivanova
- Leuven Statistics Research Centre, KU Leuven, Leuven, Belgium; I-BioStat, KU Leuven, Leuven, Belgium
- Geert Molenberghs
- I-BioStat, KU Leuven, Leuven, Belgium; I-BioStat, Universiteit Hasselt, Hasselt, Belgium
- Geert Verbeke
- I-BioStat, KU Leuven, Leuven, Belgium; I-BioStat, Universiteit Hasselt, Hasselt, Belgium
13
Serrat C, Rué M, Armero C, Piulachs X, Perpiñán H, Forte A, Páez Á, Gómez G. Frequentist and Bayesian approaches for a joint model for prostate cancer risk and longitudinal prostate-specific antigen data. J Appl Stat 2015. [DOI: 10.1080/02664763.2014.999032] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]