1. Iacovino ML, Celant S, Tomassini L, Arenare L, Caglio A, Canciello A, Salerno F, Olimpieri PP, Di Segni S, Sferrazza A, Piccirillo MC, Beretta GD, Pinto C, Blasi L, Cinieri S, Cavanna L, Di Maio M, Russo P, Perrone F. Comparison of baseline patient characteristics in Italian oncology drug monitoring registries and clinical trials: a real-world cross-sectional study. Lancet Reg Health Eur 2024; 41:100912. [PMID: 38665620 PMCID: PMC11041834 DOI: 10.1016/j.lanepe.2024.100912] [Received: 01/30/2024] [Revised: 04/05/2024] [Accepted: 04/08/2024] [Indexed: 04/28/2024]
Abstract
Background: Generalizability of registrative clinical trials to real-world clinical practice is influenced by comparability of patients in the two settings. We compared characteristics of cancer patients in registrative trials with real-world clinical practice in Italy. Methods: Data on age, sex and performance status (PS) were derived from web-based monitoring registries developed by the Italian Medicines Agency (AIFA) and corresponding registrative trials reported in the European Public Assessment Reports (EPAR) of the European Medicines Agency (EMA). Weighted means were calculated in registries and trials and differences were described. Multivariate analysis was performed using Principal Component Analysis and Cluster Analysis. Findings: From January, 2013 to April, 2023, 419,461 unique pairs of patients and therapeutic indications were recorded in 129 AIFA registries. Within 140 related trials, 87,452 patients had been enrolled. Median age and rate of elderly (≥65 years old) patients were higher in monitoring registries than in clinical trials [mean difference of median age 5.3 years, p < 0.001; mean difference of elderly rate 17.17% (95% CI 1.06, 1.48)]. Overall, rate of female patients was not different between registries and trials [mean difference -0.55% (95% CI -1.06, -0.05)]. Mean rate of patients with deteriorated PS was low both in trials (3.1%) and in registries (4.3%) with a mean difference of 1.27% (95% CI 1.06, 1.48). Two clusters were identified with multivariate analysis: one including more registries (higher median age and elderly rate, lower female rate, higher rate of deteriorated patients), the other more trials (lower median age and elderly rate, higher female rate, lower rate of deteriorated patients). Interpretation: This study supports that cancer patients enrolled in trials only partially represent those who have been treated in Italy in clinical practice. Inclusiveness of registrative trials should be increased to ensure generalizability of results to the real-world population. Funding: Partially supported by the Italian Ministry of Health.
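The weighted-mean comparison mentioned in the Methods can be illustrated with a minimal sketch. It assumes the weights are the number of patients per registry or trial, which the abstract does not state. All numbers and the helper name `weighted_mean` are hypothetical and are not taken from the study.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of `values` using non-negative `weights`."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical (made-up) median ages and sample sizes, NOT data from the study.
registry_median_age = [68.0, 71.0, 66.0]
registry_n = [1000, 3000, 1000]
trial_median_age = [62.0, 64.0, 60.0]
trial_n = [500, 300, 200]

# Assumption: each registry or trial is weighted by its number of patients.
registry_mean = weighted_mean(registry_median_age, registry_n)
trial_mean = weighted_mean(trial_median_age, trial_n)

# A positive difference means registry patients are older on average.
difference = registry_mean - trial_mean
```

Weighting by size keeps a small registry or trial from counting as much as a large one in the comparison.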
Affiliation(s)
- Laura Arenare: National Cancer Institute, IRCCS Fondazione G. Pascale, Naples, Italy
- Andrea Caglio: Department of Oncology, University, Ordine Mauriziano Hospital Umberto I, Turin, Italy
- Andrea Canciello: National Cancer Institute, IRCCS Fondazione G. Pascale, Naples, Italy
- Flavio Salerno: Department of Oncology, University, Ordine Mauriziano Hospital Umberto I, Turin, Italy
- Carmine Pinto: Medical Oncology, Comprehensive Cancer Centre, AUSL-IRCCS di Reggio Emilia, Italy
- Livio Blasi: Medical Oncology, Civic Hospital Cristina Benfratelli, Palermo, Italy
- Saverio Cinieri: Medical Oncology and Breast Unit, Perrino Hospital, Brindisi, Italy
- Luigi Cavanna: Medical Oncology and Hematology, Civil Hospital, Piacenza, Italy
- Massimo Di Maio: Department of Oncology, University of Turin, AOU Città della Salute e della Scienza di Torino, Turin, Italy
- Francesco Perrone: National Cancer Institute, IRCCS Fondazione G. Pascale, Naples, Italy

2. Steingrimsson JA, Barker DH, Bie R, Dahabreh IJ. Systematically missing data in causally interpretable meta-analysis. Biostatistics 2024; 25:289-305. [PMID: 36977366 PMCID: PMC11017122 DOI: 10.1093/biostatistics/kxad006] [Received: 05/02/2022] [Revised: 02/15/2023] [Accepted: 03/13/2023] [Indexed: 03/30/2023] Open
Abstract
Causally interpretable meta-analysis combines information from a collection of randomized controlled trials to estimate treatment effects in a target population in which experimentation may not be possible but from which covariate information can be obtained. In such analyses, a key practical challenge is the presence of systematically missing data when some trials have collected data on one or more baseline covariates, but other trials have not, such that the covariate information is missing for all participants in the latter. In this article, we provide identification results for potential (counterfactual) outcome means and average treatment effects in the target population when covariate data are systematically missing from some of the trials in the meta-analysis. We propose three estimators for the average treatment effect in the target population, examine their asymptotic properties, and show that they have good finite-sample performance in simulation studies. We use the estimators to analyze data from two large lung cancer screening trials and target population data from the National Health and Nutrition Examination Survey (NHANES). To accommodate the complex survey design of the NHANES, we modify the methods to incorporate survey sampling weights and allow for clustering.
Affiliation(s)
- Jon A Steingrimsson: Department of Biostatistics, Brown University, 121 South Main Street, Providence, RI 02903, USA
- David H Barker: Department of Psychiatry, Rhode Island Hospital, Providence, RI 02904, USA
- Ruofan Bie: Department of Biostatistics, Brown University, 121 South Main Street, Providence, RI 02903, USA
- Issa J Dahabreh: Departments of Epidemiology and Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA and CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA

3. Keyes KM, Pakserian D, Rudolph KE, Salum G, Stuart EA. Population Neuroscience: Understanding Concepts of Generalizability and Transportability and Their Application to Improving the Public's Health. Curr Top Behav Neurosci 2024. [PMID: 38589636 DOI: 10.1007/7854_2024_465] [Indexed: 04/10/2024]
Abstract
In population neuroscience, samples are not often selected with equal or known probability from an underlying population of interest; in other words, samples are not often formally representative of a specified underlying population. This chapter provides an overview of an epidemiological approach to considering the implications of selective participation on the value of our results for population health. We discuss definitions of generalizability and transportability, given the growing recognition that generalizability and transportability are central for interpreting data that are aiming to be population-based. We provide evidence that differences in the prevalence of effect measure modifiers between a study sample and a target population will lead to a lack of generalizability and transportability. We provide an example of an association between a poly-genetic risk score and depression, showing how an internally valid association can differ based on the prevalence of effect measure modifiers. We show that when estimating associations, inferences from a study sample to a population can depend on clearly defining a target population. Given that representative sampling from explicitly defined target populations may not be feasible or realistic in many situations, especially given the sample sizes needed for statistical power for many exposures of interest (and especially when interactions are being tested), researchers should be well versed in tools available to enhance the interpretability of samples regarding target populations.
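The claim that an internally valid association can differ with the prevalence of an effect measure modifier can be made concrete with a small worked sketch. It assumes the effect is a risk difference, that stratum-specific effects are the same in the study sample and the target population, and that only the modifier's prevalence differs. The function name and all numbers are hypothetical and do not come from the chapter.

```python
def standardized_effect(stratum_effects, modifier_prevalence):
    """Average stratum-specific risk differences over a modifier distribution.

    stratum_effects: dict mapping stratum label -> risk difference in that stratum
    modifier_prevalence: dict mapping stratum label -> proportion of the population
    """
    if abs(sum(modifier_prevalence.values()) - 1.0) > 1e-9:
        raise ValueError("prevalences must sum to 1")
    return sum(stratum_effects[s] * p for s, p in modifier_prevalence.items())


# Hypothetical stratum-specific risk differences (assumed equal in both populations).
effects = {"modifier_present": 0.20, "modifier_absent": 0.05}

# Hypothetical modifier distributions in the study sample and the target population.
sample_prevalence = {"modifier_present": 0.5, "modifier_absent": 0.5}
target_prevalence = {"modifier_present": 0.2, "modifier_absent": 0.8}

sample_effect = standardized_effect(effects, sample_prevalence)
target_effect = standardized_effect(effects, target_prevalence)
```

With these made-up inputs the marginal effect is larger in the sample than in the target, even though every stratum-specific effect is identical. That gap is the lack of generalizability the abstract describes.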
Affiliation(s)
- Katherine M Keyes: Department of Epidemiology, Columbia University Mailman School of Public Health, New York, NY, USA
- Kara E Rudolph: Department of Epidemiology, Columbia University Mailman School of Public Health, New York, NY, USA
- Giovanni Salum: Child and Adolescent Mental Health Initiative, Child Mind Institute & Stavros Niarchos Foundation, New York, NY, USA
- Elizabeth A Stuart: Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA

4. Rossi L, Fiorentino MC, Mancini A, Paolanti M, Rosati R, Zingaretti P. Generalizability and robustness evaluation of attribute-based zero-shot learning. Neural Netw 2024; 175:106278. [PMID: 38581809 DOI: 10.1016/j.neunet.2024.106278] [Received: 09/06/2023] [Revised: 02/15/2024] [Accepted: 03/26/2024] [Indexed: 04/08/2024]
Abstract
In the field of deep learning, large quantities of data are typically required to effectively train models. This challenge has given rise to techniques like zero-shot learning (ZSL), which trains models on a set of "seen" classes and evaluates them on a set of "unseen" classes. Although ZSL has shown considerable potential, particularly with the employment of generative methods, its generalizability to real-world scenarios remains uncertain. The hypothesis of this work is that the performance of ZSL models is systematically influenced by the chosen "splits"; in particular, the statistical properties of the classes and attributes used in training. In this paper, we test this hypothesis by introducing the concepts of generalizability and robustness in attribute-based ZSL and carry out a variety of experiments to stress-test ZSL models against different splits. Our aim is to lay the groundwork for future research on ZSL models' generalizability, robustness, and practical applications. We evaluate the accuracy of state-of-the-art models on benchmark datasets and identify consistent trends in generalizability and robustness. We analyze how these properties vary based on the dataset type, differentiating between coarse- and fine-grained datasets, and our findings indicate significant room for improvement in both generalizability and robustness. Furthermore, our results demonstrate the effectiveness of dimensionality reduction techniques in improving the performance of state-of-the-art models in fine-grained datasets.
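The idea of stress-testing a ZSL model against different "splits" can be sketched as generating several disjoint seen/unseen class partitions. This is only an illustration of what a split is, not the authors' protocol. The function name `make_class_split`, the class labels, and the split sizes are hypothetical.

```python
import random


def make_class_split(classes, n_unseen, seed):
    """Partition class labels into disjoint 'seen' and 'unseen' sets."""
    if not 0 < n_unseen < len(classes):
        raise ValueError("n_unseen must be between 1 and len(classes) - 1")
    rng = random.Random(seed)      # local generator so each split is reproducible
    shuffled = sorted(classes)     # sort first so the result depends only on the seed
    rng.shuffle(shuffled)
    unseen = set(shuffled[:n_unseen])
    seen = set(shuffled[n_unseen:])
    return seen, unseen


# Hypothetical class labels; a real benchmark would supply its own.
classes = [f"class_{i}" for i in range(10)]

# Five alternative splits, one per seed, each holding out three classes.
splits = [make_class_split(classes, n_unseen=3, seed=s) for s in range(5)]
```

A model would then be trained on each `seen` set and evaluated on the matching `unseen` set. The spread of accuracy across splits gives a rough view of how sensitive the model is to the choice of split.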
Affiliation(s)
- Luca Rossi: Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Via Brecce Bianche 12, 60131, Ancona, Italy
- Maria Chiara Fiorentino: Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Via Brecce Bianche 12, 60131, Ancona, Italy
- Adriano Mancini: Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Via Brecce Bianche 12, 60131, Ancona, Italy
- Marina Paolanti: Dipartimento di Scienze politiche, della Comunicazione e delle Relazioni Internazionali, Università di Macerata, 62100, Macerata, Italy
- Riccardo Rosati: Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Via Brecce Bianche 12, 60131, Ancona, Italy
- Primo Zingaretti: Dipartimento di Ingegneria dell'Informazione, Università Politecnica delle Marche, Via Brecce Bianche 12, 60131, Ancona, Italy

5. Samadi ME, Guzman-Maldonado J, Nikulina K, Mirzaieazar H, Sharafutdinov K, Fritsch SJ, Schuppert A. A hybrid modeling framework for generalizable and interpretable predictions of ICU mortality across multiple hospitals. Sci Rep 2024; 14:5725. [PMID: 38459085 PMCID: PMC10923850 DOI: 10.1038/s41598-024-55577-6] [Received: 08/05/2023] [Accepted: 02/26/2024] [Indexed: 03/10/2024] Open
Abstract
The development of reliable mortality risk stratification models is an active research area in computational healthcare. Mortality risk stratification provides a standard to assist physicians in evaluating a patient's condition or prognosis objectively. Particular interest lies in methods that are transparent to clinical interpretation and that retain predictive power once validated across diverse datasets they were not trained on. This study addresses the challenge of consolidating numerous ICD codes for predictive modeling of ICU mortality, employing a hybrid modeling approach that integrates mechanistic, clinical knowledge with mathematical and machine learning models. A tree-structured network connecting independent modules that carry clinical meaning is implemented for interpretability. Our training strategy utilizes graph-theoretic methods for data analysis, aiming to identify the functions of individual black-box modules within the tree-structured network by harnessing solutions from specific max-cut problems. The trained model is then validated on external datasets from different hospitals, demonstrating successful generalization capabilities, particularly in binary-feature datasets where label assessment involves extrapolation.
Affiliation(s)
- Moein E Samadi: Institute for Computational Biomedicine, RWTH Aachen University, Aachen, Germany
- Kateryna Nikulina: Institute for Computational Biomedicine, RWTH Aachen University, Aachen, Germany
- Hedieh Mirzaieazar: Institute for Computational Biomedicine, RWTH Aachen University, Aachen, Germany
- Sebastian Johannes Fritsch: Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany; Jülich Supercomputing Centre, Forschungszentrum Jülich, Jülich, Germany; Center for Advanced Simulation and Analytics (CASA), Forschungszentrum Jülich, Jülich, Germany
- Andreas Schuppert: Institute for Computational Biomedicine, RWTH Aachen University, Aachen, Germany

6. Tervo-Clemmens B, Karim ZA, Khan SZ, Ravindranath O, Somerville LH, Schuster RM, Gilman JM, Evins AE. The Developmental Timing but Not Magnitude of Adolescent Risk-Taking Propensity Is Consistent Across Social, Environmental, and Psychological Factors. J Adolesc Health 2024; 74:613-616. [PMID: 38085210 DOI: 10.1016/j.jadohealth.2023.11.001] [Received: 04/14/2023] [Revised: 10/03/2023] [Accepted: 11/02/2023] [Indexed: 02/05/2024]
Abstract
PURPOSE Risk-taking is thought to peak during adolescence, but most prior studies have relied on small convenience samples lacking participant diversity. This study tested the generalizability of adolescent self-reported risk-taking propensity across a comprehensive set of participant-level social, environmental, and psychological factors. METHODS Data (N = 1,005,421) from the National Survey on Drug Use and Health were used to test the developmental timing and magnitude of risk-taking propensity and its link to alcohol and cannabis use across 19 subgroups defined via sex, race/ethnicity, socioeconomic status, population density, religious affiliation, and mental health. RESULTS The developmental timing of a lifespan peak in risk-taking propensity during adolescence (15-18 years old) generalized across nearly all levels of social, environmental, and psychological factors, whereas the magnitude of this peak widely varied. Nearly all adolescents with regular substance use reported higher levels of risk-taking propensity. DISCUSSION Results support a broad generalizability of adolescence as the peak lifespan period of self-reported risk-taking but emphasize the importance of participant-level factors in determining the specific magnitude of reported risk-taking.
Affiliation(s)
- Brenden Tervo-Clemmens: Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts; Department of Psychiatry & Behavioral Sciences, University of Minnesota, Minneapolis, Minnesota
- Orma Ravindranath: Department of Psychology, University of Pittsburgh, Pittsburgh, Pennsylvania
- Leah H Somerville: Department of Psychology and Center for Brain Science, Harvard University, Cambridge, Massachusetts
- Randi M Schuster: Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- Jodi M Gilman: Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
- A Eden Evins: Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts

7. Davidashvilly S, Cardei M, Hssayeni M, Chi C, Ghoraani B. Deep neural networks for wearable sensor-based activity recognition in Parkinson's disease: investigating generalizability and model complexity. Biomed Eng Online 2024; 23:17. [PMID: 38336781 PMCID: PMC10858599 DOI: 10.1186/s12938-024-01214-2] [Received: 05/24/2023] [Accepted: 01/25/2024] [Indexed: 02/12/2024] Open
Abstract
BACKGROUND The research gap addressed in this study is the applicability of deep neural network (NN) models on wearable sensor data to recognize different activities performed by patients with Parkinson's Disease (PwPD) and the generalizability of these models to PwPD using labeled healthy data. METHODS The experiments were carried out utilizing three datasets containing wearable motion sensor readings on common activities of daily living. The collected readings were from two accelerometer sensors. PAMAP2 and MHEALTH are publicly available datasets collected from 10 and 9 healthy, young subjects, respectively. A private dataset of a similar nature collected from 14 PwPD was utilized as well. Deep NN models were implemented with varying levels of complexity to investigate the impact of data augmentation, manual axis reorientation, model complexity, and domain adaptation on activity recognition performance. RESULTS A moderately complex model trained on the augmented PAMAP2 dataset and adapted to the Parkinson domain using domain adaptation achieved the best activity recognition performance with an accuracy of 73.02%, which was significantly higher than the accuracy of 63% reported in previous studies. The model's F1 score of 49.79% significantly improved compared to the best cross-testing of 33.66% F1 score with only data augmentation and 2.88% F1 score without data augmentation or domain adaptation. CONCLUSION These findings suggest that deep NN models trained on healthy data have the potential to recognize activities performed by PwPD accurately and that data augmentation and domain adaptation can improve the generalizability of models in the healthy-to-PwPD transfer scenario. The simple/moderately complex architectures tested in this study could generalize better to the PwPD domain when trained on a healthy dataset compared to the most complex architectures used. The findings of this study could contribute to the development of accurate wearable-based activity monitoring solutions for PwPD, improving clinical decision-making and patient outcomes based on patient activity levels.
Affiliation(s)
- Shelly Davidashvilly: Electrical and Computer Engineering, Florida Atlantic University, Boca Raton, FL, 33431, US
- Maria Cardei: Electrical and Computer Engineering, Florida Atlantic University, Boca Raton, FL, 33431, US; Biomedical Engineering, University of Florida, Gainesville, FL, US
- Murtadha Hssayeni: Electrical and Computer Engineering, Florida Atlantic University, Boca Raton, FL, 33431, US; Computer Engineering, University of Technology, Baghdad, Iraq
- Christopher Chi: Electrical and Computer Engineering, Florida Atlantic University, Boca Raton, FL, 33431, US
- Behnaz Ghoraani: Electrical and Computer Engineering, Florida Atlantic University, Boca Raton, FL, 33431, US

8. Tian W, Zhang Z, Bouffard D, Wu H, Xin K, Gu X, Liao Z. Enhancing interpretability and generalizability of deep learning-based emulator in three-dimensional lake hydrodynamics using Koopman operator and transfer learning: Demonstrated on the example of lake Zurich. Water Res 2024; 249:120996. [PMID: 38103441 DOI: 10.1016/j.watres.2023.120996] [Received: 06/02/2023] [Revised: 11/02/2023] [Accepted: 12/07/2023] [Indexed: 12/19/2023]
Abstract
Three-dimensional lake hydrodynamic models are powerful tools widely used to assess changes in the hydrological condition of lakes. However, their computational cost becomes problematic when forecasting the state of large lakes or using high-resolution simulation in small-to-medium size lakes. One possible solution is to employ a data-driven emulator, such as a deep learning (DL) based emulator, to replace the original model for fast computing. However, existing DL-based emulators are often black-box and data-dependent models, causing poor interpretability and generalizability in practical applications. In this study, a data-driven emulator is established using a deep neural network (DNN) to replace the original model for fast computing of three-dimensional lake hydrodynamics. Then, the Koopman operator and transfer learning (TL) are employed to enhance the interpretability and generalizability of the emulator. Finally, the generalizability of DL-based emulators is comprehensively analyzed through linear regression and correlation analysis. These methods are tested against an existing hydrodynamic model of Lake Zurich (Switzerland) whose data was provided by an open-source web-based platform called Meteolakes/Alplakes. According to the results, (1) the DLEDMD offers better interpretability than the DNN because its Koopman operator reveals the linear structure behind the hydrodynamics; (2) the generalization of the DL-based emulators in three-dimensional lake hydrodynamics is influenced by the similarity between the training and testing data; (3) TL effectively improves the generalizability of the DL-based emulators.
Affiliation(s)
- Wenchong Tian: College of Environmental Science and Engineering, Tongji University, 200092 Shanghai, China; Key Laboratory of Yangtze River Water Environment, Ministry of Education, Tongji University, 200092 Shanghai, China; Key Laboratory of Urban Water Supply, Water Saving and Water Environment Governance in the Yangtze River Delta of Ministry of Water Resources, Shanghai 200092, P.R. China
- Zhiyu Zhang: College of Environmental Science and Engineering, Tongji University, 200092 Shanghai, China; Key Laboratory of Yangtze River Water Environment, Ministry of Education, Tongji University, 200092 Shanghai, China
- Damien Bouffard: Eawag, Swiss Federal Institute of Aquatic Science and Technology, Surface Waters Research and Management, 6047 Kastanienbaum, Switzerland; Faculty of Geosciences and Environment, Institute of Earth Surface Dynamics, University of Lausanne, Geopolis, Mouline, CH-1015 Lausanne, Switzerland
- Hao Wu: School of Mathematical Sciences, Institute of Natural Sciences, and MOE-LSC, Shanghai Jiao Tong University, Shanghai, China; School of Mathematical Sciences, Tongji University, Shanghai, China
- Kunlun Xin: College of Environmental Science and Engineering, Tongji University, 200092 Shanghai, China
- Xianyong Gu: National Engineering Research Center of Dredging Technology and Equipment Co., Ltd., 201208 Shanghai, China
- Zhenliang Liao: College of Environmental Science and Engineering, Tongji University, 200092 Shanghai, China; Key Laboratory of Yangtze River Water Environment, Ministry of Education, Tongji University, 200092 Shanghai, China

9. Ingrasciotta Y, Spini A, L'Abbate L, Fiore ES, Carollo M, Ientile V, Isgrò V, Cavazzana A, Biasi V, Rossi P, Ejlli L, Belleudi V, Poggi F, Sapigni E, Puccini A, Ancona D, Stella P, Pollina Addario S, Allotta A, Leoni O, Zanforlini M, Tuccori M, Gini R, Trifirò G. Comparing clinical trial population representativeness to real-world users of 17 biologics approved for immune-mediated inflammatory diseases: An external validity analysis of 66,639 biologic users from the Italian VALORE project. Pharmacol Res 2024; 200:107074. [PMID: 38232909 DOI: 10.1016/j.phrs.2024.107074] [Received: 09/11/2023] [Revised: 01/11/2024] [Accepted: 01/11/2024] [Indexed: 01/19/2024]
Abstract
To date, no population-based studies have specifically explored the external validity of pivotal randomized clinical trials (RCTs) of biologics simultaneously for a broad spectrum of immuno-mediated inflammatory diseases (IMIDs). The aims of this study were, firstly, to compare the patients' characteristics and median treatment duration of biologics approved for IMIDs between RCTs and the real-world (RW) setting; secondly, to assess the extent of biologic users treated for IMIDs in the real-world setting who would not have been eligible for inclusion in the pivotal RCTs for each indication of use. Using the Italian VALORE distributed database (66,639 incident biologic users), adult patients with IMIDs treated with biologics in the Italian real-world setting were substantially older (mean age ± SD: 50 ± 15 years) compared to those enrolled in pivotal RCTs (45 ± 15 years). In the real-world setting, certolizumab pegol was more commonly used by adult women with psoriasis/ankylosing spondylitis (F/M ratio: 1.8-1.9) compared to RCTs (F/M ratio: 0.5-0.6). The median treatment duration (weeks) of incident biologic users in RW was significantly higher than the duration of pivotal RCTs in almost all indications for use and most biologics (4-100 vs. 6-167). Furthermore, almost half (46.4%) of biologic users from RW settings would have been ineligible for inclusion in the respective indication-specific pivotal RCTs. The main reasons were advanced age, recent history of cancer and presence of other concomitant IMIDs. These findings suggest that post-marketing surveillance of biologics should be prioritized for those patients.
Affiliation(s)
- Ylenia Ingrasciotta: University of Verona, Department of Diagnostics and Public Health, Verona, Italy
- Andrea Spini: University of Verona, Department of Diagnostics and Public Health, Verona, Italy
- Luca L'Abbate: University of Messina, Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Messina, Italy
- Elena Sofia Fiore: University of Verona, Department of Diagnostics and Public Health, Verona, Italy
- Massimo Carollo: University of Verona, Department of Diagnostics and Public Health, Verona, Italy
- Valentina Ientile: University of Verona, Department of Diagnostics and Public Health, Verona, Italy
- Valentina Isgrò: University of Verona, Department of Diagnostics and Public Health, Verona, Italy
- Paola Rossi: Direzione Centrale Salute Regione Friuli-Venezia Giulia, Trieste, Italy
- Lucian Ejlli: Direzione Centrale Salute Regione Friuli-Venezia Giulia, Trieste, Italy
- Valeria Belleudi: Lazio Regional Health Service, Department of Epidemiology, Rome, Italy
- Francesca Poggi: Lazio Regional Health Service, Department of Epidemiology, Rome, Italy
- Ester Sapigni: Emilia-Romagna Health Department, Hospital Assistance Service, Drug and Medical Device Area, Bologna, Italy
- Aurora Puccini: Emilia-Romagna Health Department, Hospital Assistance Service, Drug and Medical Device Area, Bologna, Italy
- Alessandra Allotta: Epidemiologic Observatory of the Sicily Regional Health Service, Palermo, Italy
- Olivia Leoni: Lombardy Regional Centre of Pharmacovigilance and Regional Epidemiologic Observatory, Milan, Italy
- Marco Tuccori: University Hospital of Pisa, Unit of Adverse Drug Reaction Monitoring, Italy
- Rosa Gini: Agenzia Regionale di Sanità Toscana, Florence, Italy
- Gianluca Trifirò: University of Verona, Department of Diagnostics and Public Health, Verona, Italy

10. Mansolf M, Blackwell CK, Cella D, Lai JS. Assessing the interchangeability of linked scores in multivariable statistical analyses. Qual Life Res 2024:10.1007/s11136-023-03592-x. [PMID: 38294666 DOI: 10.1007/s11136-023-03592-x] [Accepted: 12/18/2023] [Indexed: 02/01/2024]
Abstract
PURPOSE Using the lens of classical test theory, we examine a linkage's generalizability with respect to use in multivariable analyses, including multiple regression and structural equation modeling, rather than comparison of established subpopulations as is most common in the literature. METHODS To aid in this evaluation, we present a structural-equation-modeling based statistical method to examine the suitability of a given linkage for use cases involving continuous and categorical variables external to the linkage itself. RESULTS Using the PROMIS® Parent Proxy and Early Childhood Global Health measures, we show that, although a high correlation between the scores (here, r = .829) may imply a general suitability for linking, a more detailed investigation of content, measurement structure, and results of the proposed methodology reveal important differences between the measures which can compromise interchangeability in certain use cases. CONCLUSION In addition to the statistical quality of a linkage, users of linking methodology should also assess the question of whether the linkage is appropriate to apply to particular use cases of interest.
Collapse
Affiliation(s)
- Maxwell Mansolf
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, 625 N. Michigan Ave Fl 27, Chicago, IL, 60611, USA.
| | - Courtney K Blackwell
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, 625 N. Michigan Ave Fl 27, Chicago, IL, 60611, USA
| | - David Cella
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, 625 N. Michigan Ave Fl 27, Chicago, IL, 60611, USA
| | - Jin-Shei Lai
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, 625 N. Michigan Ave Fl 27, Chicago, IL, 60611, USA
| |
|
11
|
Ghaderi H, Foreman B, Reddy CK, Subbian V. Discovery of Generalizable TBI Phenotypes Using Multivariate Time-Series Clustering. ArXiv 2024:arXiv:2401.08002v1. [PMID: 38313201 PMCID: PMC10836078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2024]
Abstract
Traumatic Brain Injury (TBI) presents a broad spectrum of clinical presentations and outcomes due to its inherent heterogeneity, leading to diverse recovery trajectories and varied therapeutic responses. While many studies have delved into TBI phenotyping for distinct patient populations, identifying TBI phenotypes that consistently generalize across various settings and populations remains a critical research gap. Our research addresses this by employing multivariate time-series clustering to unveil TBI's dynamic intricacies. Utilizing a self-supervised learning-based approach to clustering multivariate time-series data with missing values (SLAC-Time), we analyzed both the research-centric TRACK-TBI and the real-world MIMIC-IV datasets. Remarkably, the optimal hyperparameters of SLAC-Time and the ideal number of clusters remained consistent across these datasets, underscoring SLAC-Time's stability across heterogeneous datasets. Our analysis revealed three generalizable TBI phenotypes (α, β, and γ), each exhibiting distinct non-temporal features during emergency department visits, and temporal feature profiles throughout ICU stays. Specifically, phenotype α represents mild TBI with a remarkably consistent clinical presentation. In contrast, phenotype β signifies severe TBI with diverse clinical manifestations, and phenotype γ represents a moderate TBI profile in terms of severity and clinical diversity. Age is a significant determinant of TBI outcomes, with older cohorts recording higher mortality rates. Importantly, while certain features varied by age, the core characteristics of TBI manifestations tied to each phenotype remain consistent across diverse populations.
Affiliation(s)
- Hamid Ghaderi
- Department of Systems and Industrial Engineering, University of Arizona, Tucson, AZ, USA
| | - Brandon Foreman
- College of Medicine, University of Cincinnati, Cincinnati, OH, USA
| | - Chandan K. Reddy
- Department of Computer Science, Virginia Tech, Arlington, VA, USA
| | - Vignesh Subbian
- Department of Systems and Industrial Engineering, University of Arizona, Tucson, AZ, USA
- Department of Biomedical Engineering, University of Arizona, Tucson, AZ, USA
| |
|
12
|
Luo Y, Chen W, Zhan L, Qiu J, Jia T. Multi-feature concatenation and multi-classifier stacking: An interpretable and generalizable machine learning method for MDD discrimination with rsfMRI. Neuroimage 2024; 285:120497. [PMID: 38142755 DOI: 10.1016/j.neuroimage.2023.120497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 11/21/2023] [Accepted: 12/11/2023] [Indexed: 12/26/2023] Open
Abstract
Major depressive disorder (MDD) is a serious and heterogeneous psychiatric disorder that needs accurate diagnosis. Resting-state functional MRI (rsfMRI), which captures multiple perspectives on brain structure, function, and connectivity, is increasingly applied in the diagnosis and pathological research of MDD. Different machine learning algorithms are then developed to exploit the rich information in rsfMRI and discriminate MDD patients from normal controls. Despite recent advances reported, the MDD discrimination accuracy has room for further improvement. The generalizability and interpretability of the discrimination method are not sufficiently addressed either. Here, we propose a machine learning method (MFMC) for MDD discrimination by concatenating multiple features and stacking multiple classifiers. MFMC is tested on the REST-meta-MDD data set that contains 2428 subjects collected from 25 different sites. MFMC yields 96.9% MDD discrimination accuracy, demonstrating a significant improvement over existing methods. In addition, the generalizability of MFMC is validated by the good performance when the training and testing subjects are from independent sites. The use of XGBoost as the meta classifier allows us to probe the decision process of MFMC. We identify 13 feature values related to 9 brain regions including the posterior cingulate gyrus, superior frontal gyrus orbital part, and angular gyrus, which contribute most to the classification and also demonstrate significant differences at the group level. The use of these 13 feature values alone can reach 87% of MFMC's full performance when taking all feature values. These features may serve as clinically useful diagnostic and prognostic biomarkers for MDD in the future.
Affiliation(s)
- Yunsong Luo
- College of Computer and Information Science, Southwest University, Chongqing, 400715, PR China.
| | - Wenyu Chen
- College of Computer and Information Science, Southwest University, Chongqing, 400715, PR China.
| | - Ling Zhan
- College of Computer and Information Science, Southwest University, Chongqing, 400715, PR China.
| | - Jiang Qiu
- Key Laboratory of Cognition and Personality (SWU), Ministry of Education, Chongqing, 400715, PR China; School of Psychology, Southwest University (SWU), Chongqing, 400715, PR China; Southwest University Branch, Collaborative Innovation Center of Assessment Toward Basic Education Quality at Beijing Normal University, Chongqing, 400715, PR China.
| | - Tao Jia
- College of Computer and Information Science, Southwest University, Chongqing, 400715, PR China.
| |
|
13
|
Elliott MR, Carroll O, Grieve R, Carpenter J. Improving transportability of randomized controlled trial inference using robust prediction methods. Stat Methods Med Res 2023; 32:2365-2385. [PMID: 37936293 DOI: 10.1177/09622802231210944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2023]
Abstract
Randomized trials have been the gold standard for assessing causal effects since their introduction by Fisher in the 1920s, since they can eliminate both observed and unobserved confounding. Estimates of causal effects at the population level from randomized controlled trials can still be biased if there are both effect modification and systematic differences between the trial sample and the ultimate population of inference with respect to these modifiers. Recent advances in the survey statistics literature to improve inference in nonprobability samples by using information from probability samples can provide an avenue for improving population causal inference in randomized controlled trials when relevant probability samples of the patient population are available. We review some recent work in "transporting" causal effect estimates from trials to populations, focusing on the setting where there is a "benchmark" or population-representative sample along with the RCT sample. We then propose estimators using either inverse probability weighting (IPWT) or prediction that can accommodate unequal probability of selection in the "benchmark" or population, and use Bayesian additive regression trees for both inverse probability of treatment weighting and prediction estimation that do not require specification of functional form or interaction. We also consider how the assumption of ignorability may be assessed from observed data and propose a sensitivity analysis under the failure of this assumption. We compare our proposed approach with existing methods in simulation and apply these alternative approaches to a study of pulmonary artery catheterization in critically ill patients. We also suggest next steps for future work.
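As a rough illustration of the transport problem described in this abstract, the sketch below reweights stratum-specific trial effects to a target population's covariate mix. It uses simple post-stratification on a single discrete effect modifier, which is a much simpler stand-in for the weighting and Bayesian additive regression tree estimators the abstract mentions, and is not the authors' method. The function name, the strata, and all numbers are hypothetical.

```python
def transported_effect(stratum_effects, target_shares):
    """Average stratum-specific trial effects using target-population shares.

    stratum_effects: dict mapping stratum -> treatment effect estimated in the trial
    target_shares:   dict mapping stratum -> share of that stratum in the
                     population of interest (shares must sum to 1)
    """
    if set(stratum_effects) != set(target_shares):
        raise ValueError("effects and shares must cover the same strata")
    if abs(sum(target_shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(stratum_effects[s] * target_shares[s] for s in stratum_effects)


# Hypothetical inputs: the effect is larger in younger patients (effect
# modification), and the trial over-represents them relative to the population.
effects = {"under_65": 2.0, "65_plus": 1.0}
trial_shares = {"under_65": 0.75, "65_plus": 0.25}
population_shares = {"under_65": 0.25, "65_plus": 0.75}

trial_average = transported_effect(effects, trial_shares)
population_average = transported_effect(effects, population_shares)
print(trial_average, population_average)
```

With these made-up inputs the two averages differ only because the effect varies by stratum and the strata are unevenly represented, which is the combination of conditions the abstract identifies as the source of bias.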
Affiliation(s)
- Michael R Elliott
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
| | - Orlagh Carroll
- Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, London, UK
| | - Richard Grieve
- Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, London, UK
| | - James Carpenter
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
| |
|
14
|
O'Hare KJM, Linscott RJ. Measurement invariance of brief forms of the Schizotypal Personality Questionnaire across convenience versus random samples. Schizophr Res 2023; 262:76-83. [PMID: 37931562 DOI: 10.1016/j.schres.2023.10.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Revised: 09/25/2023] [Accepted: 10/28/2023] [Indexed: 11/08/2023]
Abstract
Schizotypy, a multifaceted personality construct that represents liability for schizophrenia, is generally measured with self-report questionnaires that have been developed and validated in samples of undergraduate students. Given that understanding schizotypy in non-clinical samples is essential for furthering our understanding of schizophrenia-spectrum psychopathologies, it is critical to test whether non-clinically identified undergraduate and other convenience samples respond to schizotypy scales in the same way as random samples of the general population. Here, 651 undergraduates, 350 MTurk workers, and two randomly selected high school samples (n = 177, n = 551) completed brief versions of the Schizotypal Personality Questionnaire (SPQ-BR or SPQ-BRU). Multigroup confirmatory factor analysis was used to test whether measurement invariance was present across samples. Tests were made for all samples together and for each pair of samples. Results showed that a first-order nine-factor model fit the data well, and this factor structure displayed configural and metric invariance across the four samples. This suggests that schizotypy has the same factor structure, and the SPQ-BR/BRU is measuring the same construct across the different groups. However, when all groups were compared, results indicated a lack of scalar invariance across these samples, suggesting mean comparisons may be inappropriate across different sample types. However, when randomly selected high school students were compared with undergraduate students, scalar invariance was present. This suggests that factors such as culture and form type may be driving invariance, rather than sampling method (convenience vs general population).
Affiliation(s)
- Kirstie J M O'Hare
- Department of Psychology, University of Otago, Dunedin, New Zealand; Discipline of Psychiatry and Mental Health, University of New South Wales, Sydney, New South Wales, Australia
| | | |
|
15
|
Sudarshan NJ, Bowden SC. Common Factor Structure of the Ten Subtest Wechsler Adult Intelligence Scale-Fourth Edition in a Clinical Sample and 15 Subtest Version in the Standardization Sample. Arch Clin Neuropsychol 2023; 38:1646-1658. [PMID: 37222085 PMCID: PMC10681435 DOI: 10.1093/arclin/acad035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/16/2023] [Indexed: 05/25/2023] Open
Abstract
OBJECTIVE The 10 core subtests of the Wechsler Adult Intelligence Scale-IV (WAIS-IV) suffice to produce the 4 index scores for clinical assessments. Factor analytic studies with the full complement of 15 subtests reveal a 5-factor structure that aligns with the Cattell-Horn-Carroll taxonomy of cognitive abilities. The current study investigates the validity of the 5-factor structure in a clinical setting with a reduced set of 10 subtests. METHOD Confirmatory factor analytic models were fitted to a clinical neurosciences archival data set (n_Male = 166, n_Female = 155) and to 9 age-group samples of the WAIS-IV standardization data (n = 200 for each group). The clinical and the standardization samples differed in that (a) the former comprised scores from patients, aged 16 to 91, with disparate neurological diagnoses whereas the latter was demographically stratified, (b) only the 10 core subtests were administered in the former but all 15 subtests in the latter, and (c) the former had missing data, but the latter was complete. RESULT Despite empirical constraints to eliciting 5 factors with only 10 indicators, the well-fitting, 5-factor (acquired knowledge, fluid intelligence, short-term memory, visual processing, and processing speed) measurement model evinced metric invariance between the clinical and standardization samples. CONCLUSION The same cognitive constructs are measured on the same metrics in every sample examined, and the results provide no reason to reject the assumption that the 5 underlying latent abilities of the 15 subtest version in the standardization samples can also be inferred from the 10 subtest version in clinical populations.
Affiliation(s)
- Navaneetham J Sudarshan
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, Victoria, Australia
| | - Stephen C Bowden
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, Victoria, Australia
- Department of Clinical Neurosciences, St. Vincent’s Hospital, Fitzroy, Victoria, Australia
| |
|
16
|
Zsidai B, Hilkert AS, Kaarre J, Narup E, Senorski EH, Grassi A, Ley C, Longo UG, Herbst E, Hirschmann MT, Kopf S, Seil R, Tischer T, Samuelsson K, Feldt R. A practical guide to the implementation of AI in orthopaedic research - part 1: opportunities in clinical application and overcoming existing challenges. J Exp Orthop 2023; 10:117. [PMID: 37968370 PMCID: PMC10651597 DOI: 10.1186/s40634-023-00683-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 10/21/2023] [Indexed: 11/17/2023] Open
Abstract
Artificial intelligence (AI) has the potential to transform medical research by improving disease diagnosis, clinical decision-making, and outcome prediction. Despite the rapid adoption of AI and machine learning (ML) in other domains and industry, deployment in medical research and clinical practice poses several challenges due to the inherent characteristics and barriers of the healthcare sector. Therefore, researchers aiming to perform AI-intensive studies require a fundamental understanding of the key concepts, biases, and clinical safety concerns associated with the use of AI. Through the analysis of large, multimodal datasets, AI has the potential to revolutionize orthopaedic research, with new insights regarding the optimal diagnosis and management of patients affected by musculoskeletal injury and disease. The article is the first in a series introducing fundamental concepts and best practices to guide healthcare professionals and researchers interested in performing AI-intensive orthopaedic research studies. The vast potential of AI in orthopaedics is illustrated through examples involving disease- or injury-specific outcome prediction, medical image analysis, clinical decision support systems and digital twin technology. Furthermore, it is essential to address the role of human involvement in training unbiased, generalizable AI models, their explainability in high-risk clinical settings and the implementation of expert oversight and clinical safety measures for failure. In conclusion, the opportunities and challenges of AI in medicine are presented to ensure the safe and ethical deployment of AI models for orthopaedic research and clinical application. Level of evidence IV.
Affiliation(s)
- Bálint Zsidai
- Sahlgrenska Sports Medicine Center, Gothenburg, Sweden.
- Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.
| | - Ann-Sophie Hilkert
- Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Medfield Diagnostics AB, Gothenburg, Sweden
| | - Janina Kaarre
- Sahlgrenska Sports Medicine Center, Gothenburg, Sweden
- Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Department of Orthopaedic Surgery, UPMC Freddie Fu Sports Medicine Center, University of Pittsburgh, Pittsburgh, USA
| | - Eric Narup
- Sahlgrenska Sports Medicine Center, Gothenburg, Sweden
- Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Eric Hamrin Senorski
- Sahlgrenska Sports Medicine Center, Gothenburg, Sweden
- Department of Health and Rehabilitation, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Sportrehab Sports Medicine Clinic, Gothenburg, Sweden
| | - Alberto Grassi
- Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- IIa Clinica Ortopedica E Traumatologica, IRCCS Istituto Ortopedico Rizzoli, Bologna, Italy
| | - Christophe Ley
- Department of Mathematics, University of Luxembourg, Esch-Sur-Alzette, Luxembourg
| | - Umile Giuseppe Longo
- Department of Orthopaedic and Trauma Surgery, Campus Bio-Medico University, Rome, Italy
| | - Elmar Herbst
- Department of Trauma, Hand and Reconstructive Surgery, University Hospital Münster, Münster, Germany
| | - Michael T Hirschmann
- Department of Orthopedic Surgery and Traumatology, Head Knee Surgery and DKF Head of Research, Kantonsspital Baselland, 4101, Bruderholz, Switzerland
| | - Sebastian Kopf
- Center of Orthopaedics and Traumatology, University Hospital Brandenburg a.d.H., Brandenburg Medical School Theodor Fontane, 14770, Brandenburg a.d.H., Germany
- Faculty of Health Sciences Brandenburg, Brandenburg Medical School Theodor Fontane, 14770, Brandenburg a.d.H., Germany
| | - Romain Seil
- Department of Orthopaedic Surgery, Centre Hospitalier Luxembourg and Luxembourg Institute of Health, Luxembourg, Luxembourg
| | - Thomas Tischer
- Clinic for Orthopaedics and Trauma Surgery, Malteser Waldkrankenhaus St. Marien, Erlangen, Germany
| | - Kristian Samuelsson
- Sahlgrenska Sports Medicine Center, Gothenburg, Sweden
- Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Department of Orthopaedics, Sahlgrenska University Hospital, Mölndal, Sweden
| | - Robert Feldt
- Department of Orthopaedics, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| |
|
17
|
Rajaraman S, Yang F, Zamzmi G, Xue Z, Antani S. Can Deep Adult Lung Segmentation Models Generalize to the Pediatric Population? Expert Syst Appl 2023; 229:120531. [PMID: 37397242 PMCID: PMC10310063 DOI: 10.1016/j.eswa.2023.120531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Lung segmentation in chest X-rays (CXRs) is an important prerequisite for improving the specificity of diagnoses of cardiopulmonary diseases in a clinical decision support system. Current deep learning models for lung segmentation are trained and evaluated on CXR datasets in which the radiographic projections are captured predominantly from the adult population. However, the shape of the lungs is reported to be significantly different across the developmental stages from infancy to adulthood. This might result in age-related data domain shifts that would adversely impact lung segmentation performance when the models trained on the adult population are deployed for pediatric lung segmentation. In this work, our goal is to (i) analyze the generalizability of deep adult lung segmentation models to the pediatric population and (ii) improve performance through a stage-wise, systematic approach consisting of CXR modality-specific weight initializations, stacked ensembles, and an ensemble of stacked ensembles. To evaluate segmentation performance and generalizability, novel evaluation metrics consisting of mean lung contour distance (MLCD) and average hash score (AHS) are proposed in addition to the multi-scale structural similarity index measure (MS-SSIM), the intersection over union (IoU), Dice score, 95% Hausdorff distance (HD95), and average symmetric surface distance (ASSD). Our results showed a significant improvement (p < 0.05) in cross-domain generalization through our approach. This study could serve as a paradigm to analyze the cross-domain generalizability of deep segmentation models for other medical imaging modalities and applications.
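Two of the overlap metrics named in this abstract, IoU and the Dice score, have standard definitions that can be sketched in a few lines. The code below is a minimal illustration on flattened binary masks and is not taken from the paper; the function name and the toy masks are hypothetical, and the convention of returning 1.0 when both masks are empty is an assumption.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two equal-length flat sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is an assumption.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


# Toy 2x2 masks, flattened row by row (hypothetical data).
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
dice, iou = dice_and_iou(predicted, reference)
print(dice, iou)
```

For a single pair of masks the two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so they rank individual predictions identically; they can diverge once averaged over a dataset.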
Affiliation(s)
- Sivaramakrishnan Rajaraman
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Feng Yang
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Ghada Zamzmi
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Zhiyun Xue
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Sameer Antani
- Computational Health Research Branch, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| |
|
18
|
Nilsson A, Björk J, Strömberg U, Bonander C. Can non-participants in a follow-up be used to draw conclusions about incidences and prevalences in the full population invited at baseline? An investigation based on the Swedish MDC cohort. BMC Med Res Methodol 2023; 23:228. [PMID: 37821822 PMCID: PMC10568880 DOI: 10.1186/s12874-023-02053-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 10/01/2023] [Indexed: 10/13/2023] Open
Abstract
BACKGROUND Participants in epidemiological cohorts may not be representative of the full invited population, limiting the generalizability of prevalence and incidence estimates. We propose that this problem can be remedied by exploiting data on baseline participants who refused to participate in a re-examination, as such participants may be more similar to baseline non-participants than what baseline participants who agree to participate in the re-examination are. METHODS We compared background characteristics, mortality, and disease incidences across the full population invited to the Malmö Diet and Cancer (MDC) study, the baseline participants, the baseline non-participants, the baseline participants who participated in a re-examination, and the baseline participants who did not participate in the re-examination. We then considered two models for estimating characteristics and outcomes in the full population: one ("the substitution model") assuming that the baseline non-participants were similar to the baseline participants who refused to participate in the re-examination, and one ("the extrapolation model") assuming that differences between the full group of baseline participants and the baseline participants who participated in the re-examination could be extended to infer results in the full population. Finally, we compared prevalences of baseline risk factors including smoking, risky drinking, overweight, and obesity across baseline participants, baseline participants who participated in the re-examination, and baseline participants who did not participate in the re-examination, and used the above models to estimate the prevalences of these factors in the full invited population. RESULTS Compared to baseline non-participants, baseline participants were less likely to be immigrants, had higher socioeconomic status, and lower mortality and disease incidences. Baseline participants not participating in the re-examination generally resembled the full population. The extrapolation model often generated characteristics and incidences even more similar to the full population. The prevalences of risk factors, particularly smoking, were estimated to be substantially higher in the full population than among the baseline participants. CONCLUSIONS Participants in epidemiological cohorts such as the MDC study are unlikely to be representative of the full invited population. Exploiting data on baseline participants who did not participate in a re-examination can be a simple and useful way to improve the generalizability of prevalence and incidence estimates.
Affiliation(s)
- Anton Nilsson
- Epidemiology, Population Studies and Infrastructures (EPI@LUND), Tornblad Institute, Lund University, Biskopsgatan 9, Hämtställe 21, 22362, Lund, Sweden.
| | - Jonas Björk
- Epidemiology, Population Studies and Infrastructures (EPI@LUND), Tornblad Institute, Lund University, Biskopsgatan 9, Hämtställe 21, 22362, Lund, Sweden
- Clinical Studies Sweden, Forum South, Skåne University Hospital, Lund, Sweden
| | - Ulf Strömberg
- Health Economics and Policy, School of Public Health & Community Medicine, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden
| | - Carl Bonander
- Health Economics and Policy, School of Public Health & Community Medicine, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden
- Centre for Societal Risk Research, Karlstad University, Karlstad, Sweden
| |
|
19
|
Buckley PR, Murry VM, Gust CJ, Ladika A, Pampel FC. Racial and Ethnic Representation in Preventive Intervention Research: a Methodological Study. Prev Sci 2023; 24:1261-1274. [PMID: 37386352 DOI: 10.1007/s11121-023-01564-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/07/2023] [Indexed: 07/01/2023]
Abstract
Individuals who are Asian or Asian American, Black or African American, Native American or American Indian or Alaska Native, Native Hawaiian or Pacific Islander, and Hispanic or Latino (i.e., presently considered racial ethnic minoritized groups in the USA) lacked equal access to resources for mitigating risk during COVID-19, which highlighted public health disparities and exacerbated inequities rooted in structural racism that have contributed to many injustices, such as failing public school systems and unsafe neighborhoods. Minoritized groups are also vulnerable to climate change wherein the most severe harms disproportionately fall upon underserved communities. While systemic changes are needed to address these pervasive syndemic conditions, immediate efforts involve examining strategies to promote equitable health and well-being, which served as the impetus for this study. We conducted a descriptive analysis on the prevalence of culturally tailored interventions and reporting of sample characteristics among 885 programs with evaluations published from 2010 to 2021 and recorded in the Blueprints for Healthy Youth Development registry. Inferential analyses also examined (1) reporting time trends and (2) the relationship between study quality (i.e., strong methods, beneficial effects) and culturally tailored programs and racial ethnic enrollment. Two percent of programs were developed for Black or African American youth, and 4% targeted Hispanic or Latino populations. For the 77% of studies that reported race, most enrollees were White (35%) followed by Black or African American (28%), and 31% collapsed across race or categorized race with ethnicity. In the 64% of studies that reported ethnicity, 32% of enrollees were Hispanic or Latino. Reporting has not improved, and there was no relationship between high-quality studies and programs developed for racial ethnic youth, or samples with high proportions of racial ethnic enrollees. Research gaps on racial ethnic groups call for clear reporting and better representation to reduce disparities and improve the utility of interventions.
Affiliation(s)
- Pamela R Buckley
- Institute of Behavioral Science, University of Colorado Boulder, Boulder, USA.
| | - Velma McBride Murry
- Departments of Health Policy & Human and Organizational Development, Vanderbilt University Medical Center and Vanderbilt University, Nashville, USA
| | - Charleen J Gust
- Institute of Behavioral Science, University of Colorado Boulder, Boulder, USA
| | - Amanda Ladika
- Institute of Behavioral Science, University of Colorado Boulder, Boulder, USA
| | - Fred C Pampel
- Institute of Behavioral Science, University of Colorado Boulder, Boulder, USA
| |
|
20
|
Nilsson A, Strömberg U, Björk J, Forsberg A, Fritzell K, Kemp Gudmundsdottir KR, Engdahl J, Bonander C. Examining the continuum of resistance model in two population-based screening studies in Sweden. Prev Med Rep 2023; 35:102317. [PMID: 37519442 PMCID: PMC10372382 DOI: 10.1016/j.pmedr.2023.102317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 06/20/2023] [Accepted: 07/08/2023] [Indexed: 08/01/2023] Open
Abstract
In studies recruited on a voluntary basis, lack of representativity may impair the ability to generalize findings to the target population. Previous studies, primarily based on surveys, have suggested that generalizability may be improved by exploiting data on individuals who agreed to participate only after receiving one or several reminders, as such individuals may be more similar to non-participants than what early participants are. Assessing this idea in the context of screenings, we compared sociodemographic characteristics and health across early, late, and non-participants in two large population-based screening studies in Sweden: STROKESTOP II (screening for atrial fibrillation; 6,867 participants) and SCREESCO (screening for colorectal cancer; 39,363 participants). We also explored the opportunities to reproduce the distributions of characteristics in the full invited populations, either by assuming that the non-participants were similar to the late participants, or by applying a linear extrapolation model based on both early and late participants. Findings showed that early and late participants exhibited similar characteristics along most dimensions, including civil status, education, income, and health examination results. Both these types of participants in turn differed from the non-participants, with fewer married, lower educational attainments, and lower incomes. Compared to early participants, late participants were more likely to be born outside of Sweden and to have comorbidities, with non-participants similar or even more so. The two empirical models improved representativity in some cases, but not always. Overall, we found mixed support that data on late participation may be useful for improving representativeness of screening studies.
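The two ways of reconstructing the full invited population described in this abstract can be made concrete with a small arithmetic sketch. The abstract does not give the exact form of the linear extrapolation, so the version below assumes that non-participants sit one equal step beyond late participants on the early-to-late trend; that assumption, the function name, and all numbers are hypothetical and not the authors' specification.

```python
def full_population_estimate(n_early, p_early, n_late, p_late, n_non,
                             model="substitution"):
    """Estimate the prevalence of a characteristic in the full invited population.

    n_*: group sizes; p_*: observed prevalence among early and late participants.
    The prevalence among non-participants is unobserved and is imputed by `model`.
    """
    if model == "substitution":
        # Assume non-participants resemble late participants.
        p_non = p_late
    elif model == "extrapolation":
        # Assumed linear trend: continue the early-to-late change by one more step.
        p_non = p_late + (p_late - p_early)
        p_non = min(1.0, max(0.0, p_non))  # keep the imputed value a valid proportion
    else:
        raise ValueError("model must be 'substitution' or 'extrapolation'")
    total = n_early + n_late + n_non
    return (n_early * p_early + n_late * p_late + n_non * p_non) / total


# Hypothetical screening study: 600 early, 200 late, 200 non-participants.
substitution = full_population_estimate(600, 0.10, 200, 0.20, 200, "substitution")
extrapolation = full_population_estimate(600, 0.10, 200, 0.20, 200, "extrapolation")
print(substitution, extrapolation)
```

Both imputations move the estimate away from the participant-only prevalence only when early and late participants differ. Where they look alike, as the abstract reports for most characteristics, neither model changes the estimate, which is consistent with the mixed support the authors describe.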
Affiliation(s)
- Anton Nilsson
  - Epidemiology, Population Studies and Infrastructures (EPI@LUND), Lund University, Lund, Sweden
- Ulf Strömberg
  - Region Halland, Halmstad, Sweden
  - Health Economics and Policy, School of Public Health & Community Medicine, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden
- Jonas Björk
  - Epidemiology, Population Studies and Infrastructures (EPI@LUND), Lund University, Lund, Sweden
  - Clinical Studies Sweden, Forum South, Skåne University Hospital, Lund, Sweden
- Anna Forsberg
  - Department of Medicine, Solna, Karolinska Institutet, Stockholm, Sweden
- Kaisa Fritzell
  - Department of Neurobiology, Care Sciences and Society, Division of Nursing, Karolinska Institutet, Stockholm, Sweden
  - The Hereditary Cancer Clinic, Theme Cancer, Karolinska University Hospital, Stockholm, Sweden
- Johan Engdahl
  - Department of Clinical Sciences, Danderyd Hospital, Karolinska Institutet, Stockholm, Sweden
- Carl Bonander
  - Health Economics and Policy, School of Public Health & Community Medicine, Institute of Medicine, University of Gothenburg, Gothenburg, Sweden
  - Centre for Societal Risk Research, Karlstad University, Sweden
21
Lin C, Bulls LS, Tepfer LJ, Vyas AD, Thornton MA. Advancing Naturalistic Affective Science with Deep Learning. Affect Sci 2023; 4:550-562. [PMID: 37744976 PMCID: PMC10514024 DOI: 10.1007/s42761-023-00215-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/11/2023] [Accepted: 08/03/2023] [Indexed: 09/26/2023]
Abstract
People express their own emotions and perceive others' emotions via a variety of channels, including facial movements, body gestures, vocal prosody, and language. Studying these channels of affective behavior offers insight into both the experience and perception of emotion. Prior research has predominantly focused on studying individual channels of affective behavior in isolation using tightly controlled, non-naturalistic experiments. This approach limits our understanding of emotion in more naturalistic contexts where different channels of information tend to interact. Traditional methods struggle to address this limitation: manually annotating behavior is time-consuming, making it infeasible to do at large scale; manually selecting and manipulating stimuli based on hypotheses may neglect unanticipated features, potentially generating biased conclusions; and common linear modeling approaches cannot fully capture the complex, nonlinear, and interactive nature of real-life affective processes. In this methodology review, we describe how deep learning can be applied to address these challenges to advance a more naturalistic affective science. First, we describe current practices in affective research and explain why existing methods face challenges in revealing a more naturalistic understanding of emotion. Second, we introduce deep learning approaches and explain how they can be applied to tackle three main challenges: quantifying naturalistic behaviors, selecting and manipulating naturalistic stimuli, and modeling naturalistic affective processes. Finally, we describe the limitations of these deep learning methods, and how these limitations might be avoided or mitigated. By detailing the promise and the peril of deep learning, this review aims to pave the way for a more naturalistic affective science.
Affiliation(s)
- Chujun Lin
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
- Landry S. Bulls
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
- Lindsey J. Tepfer
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
- Amisha D. Vyas
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
- Mark A. Thornton
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH USA
22
Magoc T, Allen KS, McDonnell C, Russo JP, Cummins J, Vest JR, Harle CA. Generalizability and portability of natural language processing system to extract individual social risk factors. Int J Med Inform 2023; 177:105115. [PMID: 37302362 DOI: 10.1016/j.ijmedinf.2023.105115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2023] [Revised: 05/15/2023] [Accepted: 05/30/2023] [Indexed: 06/13/2023]
Abstract
OBJECTIVE The objective of this study is to validate and report on the portability and generalizability of a Natural Language Processing (NLP) method, originally developed at a different institution, for extracting individual social factors from clinical notes. MATERIALS AND METHODS A rule-based deterministic state machine NLP model was developed to extract financial insecurity and housing instability using notes from one institution and was applied to all notes written during 6 months at another institution. Ten percent of the notes classified as positive by the NLP model, and the same number of notes classified as negative, were manually annotated. The NLP model was adjusted to accommodate notes at the new site. Accuracy, positive predictive value, sensitivity, and specificity were calculated. RESULTS More than 6 million notes were processed at the receiving site by the NLP model, of which about 13,000 and 19,000 were classified as positive for financial insecurity and housing instability, respectively. The NLP model showed excellent performance on the validation dataset, with all measures over 0.87 for both social factors. DISCUSSION Our study illustrated the need to accommodate institution-specific note-writing templates as well as clinical terminology of emergent diseases when applying an NLP model for social factors. A state machine is relatively simple to port effectively across institutions. Our study showed superior performance to similar generalizability studies for extracting social factors. CONCLUSION A rule-based NLP model for extracting social factors from clinical notes showed strong portability and generalizability across organizationally and geographically distinct institutions. With only relatively simple modifications, we obtained promising performance from an NLP-based model.
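The four validation measures named in this abstract follow directly from a 2x2 table of model output against manual annotation. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, PPV, sensitivity, and specificity from 2x2 counts.

    tp/fp: notes the model flagged as positive that annotators judged
           positive/negative; fn/tn: notes the model flagged as negative
           that annotators judged positive/negative.
    """
    def ratio(numerator, denominator):
        # Return NaN instead of raising when a margin of the table is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "sensitivity": ratio(tp, tp + fn),  # recall among true positives
        "specificity": ratio(tn, tn + fp),  # recall among true negatives
    }


if __name__ == "__main__":
    # Hypothetical annotation counts for one social factor.
    for name, value in classification_metrics(tp=90, fp=10, fn=5, tn=95).items():
        print(f"{name}: {value:.3f}")
```

When the annotated sample is balanced by model output, as described in the methods, sensitivity and specificity computed this way describe that sample rather than the full set of notes.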
Affiliation(s)
- Tanja Magoc
  - College of Medicine, University of Florida, Gainesville, FL, USA
- Katie S Allen
  - Regenstrief Institute, Inc., Indianapolis, IN, USA; Richard M. Fairbanks School of Public Health, IUPUI, Indianapolis, IN, USA
- Cara McDonnell
  - College of Medicine, University of Florida, Gainesville, FL, USA
- Jean-Paul Russo
  - College of Medicine, University of Florida, Gainesville, FL, USA; Miller School of Medicine, University of Miami, Miami, FL, USA
- Joshua R Vest
  - Regenstrief Institute, Inc., Indianapolis, IN, USA; Richard M. Fairbanks School of Public Health, IUPUI, Indianapolis, IN, USA
- Christopher A Harle
  - Regenstrief Institute, Inc., Indianapolis, IN, USA; Richard M. Fairbanks School of Public Health, IUPUI, Indianapolis, IN, USA
23
Ong SWX, Tong SYC, Daneman N. Are we enrolling the right patients? A scoping review of external validity and generalizability of clinical trials in bloodstream infections. Clin Microbiol Infect 2023:S1198-743X(23)00402-0. [PMID: 37633330 DOI: 10.1016/j.cmi.2023.08.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/19/2023] [Revised: 08/15/2023] [Accepted: 08/20/2023] [Indexed: 08/28/2023]
Abstract
BACKGROUND Having a representative population in randomized clinical trials (RCTs) improves external validity and generalizability of trial results. There are limited data examining differences between RCT-enrolled and real-world populations in bloodstream infections (BSI). OBJECTIVES We conducted a scoping review aiming to review studies assessing generalizability of BSI RCT populations, to identify sub-groups that have been systematically under-represented and to explore approaches to improve external validity of future RCTs. SOURCES MEDLINE, Embase, and Cochrane Library databases were searched for terms related to external validity or generalizability, BSI, and clinical trials in papers published up to 1 August 2023. Studies comparing enrolled versus nonenrolled patients, or papers discussing external validity or generalizability in the context of BSI RCTs were included. CONTENT Sixteen papers were included in the final review. Five compared RCT-enrolled and nonenrolled participants from the same source population. There were significant differences between the two groups in all studies, with nonenrolled patients having a greater comorbidity burden and consistently worse outcomes including mortality. We identified several barriers to improving generalizability of RCT populations and outlined potential approaches to reduce these barriers, such as alternative/simplified consent processes, streamlining eligibility criteria and follow-up procedures, quota-based sampling techniques, and ensuring diversity in site and study team selection. IMPLICATIONS Study cohorts in BSI RCTs are not representative of the general BSI patient population. As we increasingly adopt large pragmatic trials in infectious diseases, it is important to recognize the importance of maximizing generalizability to ensure that our research findings are of direct relevance to our patients.
Affiliation(s)
- Sean W X Ong
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada; Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, Australia; Sunnybrook Health Sciences Centre, Toronto, Canada
- Steven Y C Tong
  - Department of Infectious Diseases, University of Melbourne, Peter Doherty Institute for Infection and Immunity, Melbourne, Australia; Victorian Infectious Diseases Service, Royal Melbourne Hospital, Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Nick Daneman
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada; Sunnybrook Health Sciences Centre, Toronto, Canada
24
Zhou S, Wang N, Wang L, Sun J, Blaes A, Liu H, Zhang R. A cross-institutional evaluation on breast cancer phenotyping NLP algorithms on electronic health records. Comput Struct Biotechnol J 2023; 22:32-40. [PMID: 37680211 PMCID: PMC10480628 DOI: 10.1016/j.csbj.2023.08.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 08/15/2023] [Accepted: 08/21/2023] [Indexed: 09/09/2023]
Abstract
Objective Transformer-based language models are prevalent in the clinical domain due to their excellent performance on clinical NLP tasks. The generalizability of those models is usually ignored during the model development process. This study evaluated the generalizability of CancerBERT, a Transformer-based clinical NLP model, along with classic machine learning models, i.e., conditional random field (CRF) and bi-directional long short-term memory CRF (BiLSTM-CRF), across different clinical institutes through a breast cancer phenotype extraction task. Materials and methods Two clinical corpora of breast cancer patients were collected from the electronic health records of the University of Minnesota (UMN) and Mayo Clinic (MC), and annotated following the same guideline. We developed three types of NLP models (i.e., CRF, BiLSTM-CRF and CancerBERT) to extract cancer phenotypes from clinical texts. We evaluated the generalizability of the models on different test sets with different learning strategies (model transfer vs locally trained). The association between the entity coverage score and model performance was assessed. Results We manually annotated 200 and 161 clinical documents at UMN and MC, respectively. The corpora of the two institutes were found to be more similar in their target entities than in their overall content. The CancerBERT models obtained the best performance on the independent test sets from the two clinical institutes and on the permutation test set. The CancerBERT model developed in one institute and further fine-tuned in another institute achieved reasonable performance compared to the model developed on local data (micro-F1: 0.925 vs 0.932). Conclusions The results indicate the CancerBERT model has superior learning ability and generalizability among the three types of clinical NLP models for our named entity recognition task. It has an advantage in recognizing complex entities, e.g., entities with different labels.
Affiliation(s)
- Sicheng Zhou
  - Institute for Health Informatics, University of Minnesota, Minneapolis, MN, USA
- Nan Wang
  - School of Statistics, University of Minnesota, Minneapolis, MN, USA
- Liwei Wang
  - Department of AI and Informatics, Mayo Clinic, Rochester, MN, USA
- Ju Sun
  - Department of Computer Science & Engineering, University of Minnesota, Minneapolis, MN, USA
- Anne Blaes
  - Department of Medicine, University of Minnesota, Minneapolis, MN, USA
- Hongfang Liu
  - Department of AI and Informatics, Mayo Clinic, Rochester, MN, USA
- Rui Zhang
  - Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
25
Nakayama LF, Zago Ribeiro L, de Oliveira JAE, de Matos JCRG, Mitchell WG, Malerbi FK, Celi LA, Regatieri CVS. Fairness and generalizability of OCT normative databases: a comparative analysis. Int J Retina Vitreous 2023; 9:48. [PMID: 37605208 PMCID: PMC10440930 DOI: 10.1186/s40942-023-00459-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 03/26/2023] [Indexed: 08/23/2023]
Abstract
PURPOSE In supervised Machine Learning algorithms, labels and reports are important in model development. To provide a normality assessment, the OCT has a built-in normative database that provides a color-coded scale based on comparison of the measurement with the database. This article aims to evaluate and compare the normative databases of different OCT machines, analyzing patient demographics, inclusion and exclusion criteria, diversity index, and statistical approach to assess their fairness and generalizability. METHODS Data were retrieved from the FDA approval documents and equipment manuals of Cirrus, Avanti, Spectralis, and Triton. The following variables were compared: number of eyes and patients, inclusion and exclusion criteria, statistical approach, sex, race and ethnicity, age, participant country, and diversity index. RESULTS Avanti OCT has the largest normative database (640 eyes). In every database, the inclusion and exclusion criteria were similar, including adult patients and excluding pathological eyes. Spectralis has the largest proportional representation of White patients (79.7%), Cirrus of Asian patients (24%), and Triton of Black patients (22%). In all databases, the statistical analysis applied was regression modeling. The sex diversity index is similar in all datasets, and comparable to that of the ten most populous countries. The Avanti dataset has the highest diversity index in terms of race, followed by Cirrus, Triton, and Spectralis. CONCLUSION In all analyzed databases, the data framework is static, with limited upgrade options and lacking normative databases for new modules. As a result, caution in OCT normality interpretation is warranted. To address these limitations, there is a need for more diverse, representative, and open-access datasets that take into account patient demographics, especially considering the development of supervised Machine Learning algorithms in healthcare.
Affiliation(s)
- Luis Filipe Nakayama
  - Laboratory of Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA, 02139, United States of America
  - Department of Ophthalmology, São Paulo Federal University, Sao Paulo, SP, Brazil
- Lucas Zago Ribeiro
  - Department of Ophthalmology, São Paulo Federal University, Sao Paulo, SP, Brazil
- João Carlos Ramos Gonçalves de Matos
  - Laboratory of Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA, 02139, United States of America
  - University of Porto, Porto, Portugal
- Leo Anthony Celi
  - Laboratory of Computational Physiology, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA, 02139, United States of America
  - Department of Biostatistics, Harvard TH Chan School of Public Health, Boston, MA, United States of America
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, United States of America
26
Tik N, Gal S, Madar A, Ben-David T, Bernstein-Eliav M, Tavor I. Generalizing prediction of task-evoked brain activity across datasets and populations. Neuroimage 2023; 276:120213. [PMID: 37268097 DOI: 10.1016/j.neuroimage.2023.120213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 05/28/2023] [Accepted: 05/30/2023] [Indexed: 06/04/2023]
Abstract
Predictions of task-based functional magnetic resonance imaging (fMRI) from task-free resting-state (rs) fMRI have gained popularity over the past decade. This method holds great promise for studying individual variability in brain function without the need to perform highly demanding tasks. However, in order to be broadly used, prediction models must be shown to generalize beyond the dataset they were trained on. In this work, we test the generalizability of prediction of task-fMRI from rs-fMRI across sites, MRI vendors and age groups. Moreover, we investigate the data requirements for successful prediction. We use the Human Connectome Project (HCP) dataset to explore how different combinations of training sample size and number of fMRI datapoints affect prediction success in various cognitive tasks. We then apply models trained on HCP data to predict brain activations in data from a different site, a different MRI vendor (Philips vs. Siemens scanners) and a different age group (children from the HCP-development project). We demonstrate that, depending on the task, a training set of approximately 20 participants with 100 fMRI timepoints each yields the largest gain in model performance. Nevertheless, further increasing sample size and number of timepoints results in significantly improved predictions, until reaching approximately 450-600 training participants and 800-1000 timepoints. Overall, the number of fMRI timepoints influences prediction success more than the sample size. We further show that models trained on adequate amounts of data successfully generalize across sites, vendors and age groups and provide predictions that are both accurate and individual-specific. These findings suggest that large-scale publicly available datasets may be utilized to study brain function in smaller, unique samples.
Affiliation(s)
- Niv Tik
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Shachar Gal
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Asaf Madar
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Tamar Ben-David
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Michal Bernstein-Eliav
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Ido Tavor
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel; Strauss Center for Computational Neuroimaging, Tel Aviv University, Tel Aviv, Israel
27
Missiou A, Ntalaouti E, Lionis C, Evangelou E, Tatsioni A. Underreporting contextual factors preclude the applicability appraisal in primary care randomized controlled trials. J Clin Epidemiol 2023; 160:24-32. [PMID: 37311513 DOI: 10.1016/j.jclinepi.2023.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Revised: 05/21/2023] [Accepted: 06/06/2023] [Indexed: 06/15/2023]
Abstract
OBJECTIVES To assess applicability reporting in randomized controlled trials (RCTs) conducted in primary care (PC). STUDY DESIGN AND SETTING We used a random sample of PC RCTs published between 2000 and 2020 to assess applicability. We extracted data related to setting, population, intervention (including implementation), comparator, outcomes, and context. Based on data availability, we assessed whether the five predefined applicability questions were adequately addressed by each PC RCT. RESULTS Adequately described elements that were reported frequently (>50%) included the responsible organization for intervention provision (97, 93.3%), study population characteristics (94, 90.4%), intervention implementation including monitoring and evaluation (92, 88.5%), intervention components (89, 85.6%), time frame (82, 78.8%), baseline prevalence (58, 55.8%), and the type of setting and location (53, 51%). Elements that were often underreported included contextual factors, that is, evidence of differential effects across sociodemographic or other groupings (2, 1.9%), intervention components tailored for specific settings (7, 6.7%), health system structure (32, 30.8%), factors affecting implementation (40, 38.5%) and organization structure (50, 48.1%). The proportion of trials that adequately addressed each applicability question ranged between 1% and 20.2%, and no RCT addressed all of them. CONCLUSION Underreporting of contextual factors jeopardizes the appraisal of applicability in PC RCTs.
Affiliation(s)
- Aristea Missiou
  - Research Unit for General Medicine and Primary Health Care, Faculty of Medicine, School of Health Sciences, University of Ioannina, Ioannina, Greece
- Eleni Ntalaouti
  - Research Unit for General Medicine and Primary Health Care, Faculty of Medicine, School of Health Sciences, University of Ioannina, Ioannina, Greece
- Christos Lionis
  - Clinic of Social and Family Medicine, School of Medicine, University of Crete, Crete, Greece; Department of Health, Medicine and Care, General Practice, Linköping University, Linköping, Sweden
- Evangelos Evangelou
  - Department of Hygiene and Epidemiology, Faculty of Medicine, School of Health Sciences, University of Ioannina, Ioannina, Greece; Department of Epidemiology and Biostatistics, Imperial College London, London, UK
- Athina Tatsioni
  - Research Unit for General Medicine and Primary Health Care, Faculty of Medicine, School of Health Sciences, University of Ioannina, Ioannina, Greece
28
Steingrimsson JA. Extending prediction models for use in a new target population with failure time outcomes. Biostatistics 2023; 24:728-742. [PMID: 35389429 DOI: 10.1093/biostatistics/kxac011] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/13/2021] [Revised: 03/14/2022] [Accepted: 03/21/2022] [Indexed: 07/20/2023]
Abstract
Prediction models are often built and evaluated using data from a population that differs from the target population where model-derived predictions are intended to be used. In this article, we present methods for evaluating model performance in the target population when some observations are right censored. The methods assume that outcome and covariate data are available from a source population used for model development, and that covariates, but no outcome data, are available from the target population. We evaluate the finite sample performance of the proposed estimators using simulations and apply the methods to transport a prediction model built using data from a lung cancer screening trial to a nationally representative population of participants eligible for lung cancer screening.
Affiliation(s)
- Jon A Steingrimsson
  - Department of Biostatistics, Brown University, 121 South Main Street, Providence, RI 02903, USA
29
Hong H, Liu L, Mojtabai R, Stuart EA. Calibrated meta-analysis to estimate the efficacy of mental health treatments in target populations: an application to paliperidone trials for treatment of schizophrenia. BMC Med Res Methodol 2023; 23:150. [PMID: 37365521 DOI: 10.1186/s12874-023-01958-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 05/25/2023] [Indexed: 06/28/2023]
Abstract
BACKGROUND Meta-analyses can be a powerful tool but need to be calibrated for potential unrepresentativeness of the included trials relative to a target population. Estimating target population average treatment effects (TATE) in meta-analyses is important to understand how treatments perform in well-defined target populations. This study estimated the TATE of paliperidone palmitate in patients with schizophrenia using meta-analysis with individual patient trial data and target population data. METHODS We conducted a meta-analysis with data from four randomized clinical trials and target population data from the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) study. Efficacy was measured using the Positive and Negative Syndrome Scale (PANSS). Weights to equate the trial participants and target population were calculated by comparing baseline characteristics between the trials and CATIE. A calibrated weighted meta-analysis with random effects was performed to estimate the TATE of paliperidone compared to placebo. RESULTS A total of 1,738 patients were included in the meta-analysis along with 1,458 patients in CATIE. After weighting, the covariate distributions of the trial participants and target population were similar. Compared to placebo, paliperidone palmitate was associated with a significant reduction of the PANSS total score under both unweighted (mean difference 9.07 [4.43, 13.71]) and calibrated weighted (mean difference 6.15 [2.22, 10.08]) meta-analysis. CONCLUSIONS The effect of paliperidone palmitate compared with placebo is slightly smaller in the target population than that estimated directly from the unweighted meta-analysis. The representativeness of trial samples included in a meta-analysis relative to a target population should be assessed and incorporated properly to obtain the most reliable evidence of treatment effects in target populations.
Affiliation(s)
- Hwanhee Hong
  - Department of Biostatistics and Bioinformatics, School of Medicine, Duke University, 2424 Erwin Road, Ste 1105, Durham, NC, 27705, USA
- Lu Liu
  - Department of Biostatistics and Bioinformatics, School of Medicine, Duke University, 2424 Erwin Road, Ste 1105, Durham, NC, 27705, USA
- Ramin Mojtabai
  - Department of Mental Health, Bloomberg School of Public Health, Johns Hopkins University, 615 N. Wolfe Street, Baltimore, MD, 21205, USA
- Elizabeth A Stuart
  - Department of Mental Health, Bloomberg School of Public Health, Johns Hopkins University, 615 N. Wolfe Street, Baltimore, MD, 21205, USA
30
Huo T, Glueck DH, Shenkman EA, Muller KE. Stratified split sampling of electronic health records. BMC Med Res Methodol 2023; 23:128. [PMID: 37231360 DOI: 10.1186/s12874-023-01938-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 05/04/2023] [Indexed: 05/27/2023]
Abstract
Although superficially similar to data from clinical research, data extracted from electronic health records may require fundamentally different approaches for model building and analysis. Because electronic health record data is designed for clinical, rather than scientific use, researchers must first provide clear definitions of outcome and predictor variables. Yet an iterative process of defining outcomes and predictors, assessing association, and then repeating the process may increase Type I error rates, and thus decrease the chance of replicability, defined by the National Academy of Sciences as the chance of "obtaining consistent results across studies aimed at answering the same scientific question, each of which has obtained its own data."[1] In addition, failure to account for subgroups may mask heterogeneous associations between predictor and outcome by subgroups, and decrease the generalizability of the findings. To increase chances of replicability and generalizability, we recommend using a stratified split sample approach for studies using electronic health records. A split sample approach divides the data randomly into an exploratory set for iterative variable definition, iterative analyses of association, and consideration of subgroups. The confirmatory set is used only to replicate results found in the first set. The addition of the word 'stratified' indicates that rare subgroups are oversampled randomly by including them in the exploratory sample at higher rates than appear in the population. The stratified sampling provides a sufficient sample size for assessing heterogeneity of association by testing for effect modification by group membership. 
An electronic health record study of the associations between socio-demographic factors and uptake of hepatic cancer screening, and potential heterogeneity of association in subgroups defined by gender, self-identified race and ethnicity, census-tract level poverty and insurance type illustrates the recommended approach.
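The stratified split sample approach described in this abstract can be sketched as follows. Records are grouped by subgroup, and each subgroup contributes a chosen fraction of its members to the exploratory set, with rare subgroups given a higher fraction. The function name, the record layout, and the sampling rates are hypothetical; the abstract does not specify how the rates are chosen.

```python
import random
from collections import defaultdict


def stratified_split(records, group_of, exploratory_rate, seed=0):
    """Randomly split records into exploratory and confirmatory sets.

    group_of:         function mapping a record to its subgroup label.
    exploratory_rate: dict mapping each subgroup label to the fraction of
                      that subgroup placed in the exploratory set; rare
                      subgroups can be oversampled by giving them a higher rate.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for record in records:
        by_group[group_of(record)].append(record)

    exploratory, confirmatory = [], []
    for label, members in by_group.items():
        shuffled = members[:]
        rng.shuffle(shuffled)  # random assignment within the subgroup
        k = round(exploratory_rate[label] * len(shuffled))
        exploratory.extend(shuffled[:k])
        confirmatory.extend(shuffled[k:])  # held out for replication only
    return exploratory, confirmatory


if __name__ == "__main__":
    # Hypothetical data: 100 records in a rare subgroup, 900 in a common one.
    data = [{"id": i, "group": "rare" if i < 100 else "common"} for i in range(1000)]
    rates = {"rare": 0.8, "common": 0.5}
    explore, confirm = stratified_split(data, lambda r: r["group"], rates)
    print(len(explore), len(confirm))
```

Because the rare subgroup is overrepresented in the exploratory set, analyses of that set that target the whole population would need to reweight records by the inverse of their sampling rate.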
Affiliation(s)
- Tianyao Huo
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, 2004 Mowry Road; Room 2236-5, PO Box 100177, Gainesville, FL, 32608, USA
- Deborah H Glueck
  - Department of Pediatrics, School of Medicine, University of Colorado, 12474 E. 19th Avenue, Building 402, Room 219 Main Stop F426, Aurora, CO, 80045, USA
- Elizabeth A Shenkman
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, 2004 Mowry Road; Room 2245, PO Box 100177, Gainesville, FL, 32608, USA
- Keith E Muller
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, 2004 Mowry Road; Room 2244, PO Box 100177, Gainesville, FL, 32608, USA
31
Malik HB, Norman JB. Best Practices and Methodological Strategies for Addressing Generalizability in Neuropsychological Assessment. J Pediatr Neuropsychol 2023; 9:47-63. [PMID: 37250805 PMCID: PMC10182845 DOI: 10.1007/s40817-023-00145-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 04/11/2023] [Accepted: 04/15/2023] [Indexed: 05/31/2023]
Abstract
Generalizability considerations are widely discussed and a core foundation for understanding when and why treatment effects will replicate across sample demographics. However, guidelines on assessing and reporting generalizability-related factors differ across fields and are inconsistently applied. This paper synthesizes obstacles and best practices to apply recent work on measurement and sample diversity. We present a brief history of how knowledge in psychology has been constructed, with implications for who has been historically prioritized in research. We then review how generalizability remains a contemporary threat to neuropsychological assessment and outline best practices for researchers and clinical neuropsychologists. In doing so, we provide concrete tools to evaluate whether a given assessment is generalizable across populations and assist researchers in effectively testing and reporting treatment differences across sample demographics.
Affiliation(s)
- Hinza B. Malik
  - Department of Psychology, University of North Carolina Wilmington, 601 South College Road, Wilmington, NC 28403-5612 USA
- Jasmine B. Norman
  - Department of Psychology, University of North Carolina Wilmington, 601 South College Road, Wilmington, NC 28403-5612 USA

32
Zhang J, Ma X, Zhang J, Sun D, Zhou X, Mi C, Wen H. Insights into geospatial heterogeneity of landslide susceptibility based on the SHAP-XGBoost model. J Environ Manage 2023; 332:117357. [PMID: 36731409 DOI: 10.1016/j.jenvman.2023.117357] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 08/19/2022] [Revised: 01/05/2023] [Accepted: 01/22/2023] [Indexed: 06/18/2023]
Abstract
The spatial heterogeneity of landslide influencing factors is the main reason for the poor generalizability of the susceptibility evaluation model. This study aimed to construct a comprehensive explanatory framework for landslide susceptibility evaluation models based on the SHAP (SHapley Additive exPlanations)-XGBoost (eXtreme Gradient Boosting) algorithm, analyze the regional characteristics and spatial heterogeneity of landslide influencing factors, and discuss the heterogeneity of the generalizability of the models under different landscapes. Firstly, we selected several areas within a typical mountainous, hilly region and constructed a geospatial database containing 12 landslide influencing factors such as elevation, annual average rainfall, slope, lithology, and NDVI through field surveys, satellite images, and a literature review. Subsequently, the landslide susceptibility evaluation model was constructed based on the XGBoost algorithm and spatial database, and the prediction results of the landslide susceptibility evaluation model were explained based on regional topography, geology, and hydrology using the SHAP algorithm. Finally, the model was generalized and applied to regions with both similar and very different topography, geology, meteorology, and vegetation, to explore the spatial heterogeneity of the generalizability of the model. The following conclusions were drawn: the spatial distribution of landslides is heterogeneous and complex, and the contribution of each influencing factor to the occurrence of landslides has obvious regional characteristics and spatial heterogeneity. The generalizability of the landslide susceptibility evaluation model is spatially heterogeneous, and the model generalizes better to regions with similar regional characteristics.
Further explanation of the XGBoost landslide susceptibility evaluation model using the SHAP method allows quantitative analysis of the differences in how much various factors contribute to disasters due to spatial heterogeneity, from the perspective of global and local evaluation units. In summary, the integrated explanatory framework based on the SHAP-XGBoost model can quantify the contribution of influencing factors on landslide occurrence at both global and local levels, which is conducive to the construction and improvement of the influencing factor system of landslide susceptibility in different regions. It can also provide a reference for predicting potential landslide hazard-prone areas and for Explainable Artificial Intelligence (XAI) research.
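The idea behind the SHAP attributions mentioned in this abstract can be shown with an exact Shapley computation on a toy model. This sketch deliberately uses neither the `shap` nor the `xgboost` package: `toy_model`, its coefficients, and the feature values are hypothetical, and brute-force enumeration over feature coalitions is only feasible for a handful of features. It attributes the difference between one unit's prediction and a baseline prediction to each feature (a local explanation).

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).

    Features outside a coalition are held at their baseline value.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical susceptibility score from slope, rainfall and NDVI (all scaled 0-1)
def toy_model(z):
    slope, rain, ndvi = z
    return 0.5 * slope + 0.3 * rain + 0.4 * slope * rain - 0.2 * ndvi

x = [0.8, 0.9, 0.3]          # one evaluation unit
baseline = [0.4, 0.5, 0.6]   # reference unit
phi = shapley_values(toy_model, x, baseline)
```

The attributions in `phi` sum to `toy_model(x) - toy_model(baseline)`. Averaging the absolute attributions over many units would give a global importance ranking of the kind the abstract contrasts with local explanations.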
Affiliation(s)
- Junyi Zhang
  - Chongqing Normal University, Chongqing Key Laboratory of Surface Process and Environment Remote Sensing in the Three Gorges Reservoir Area, Chongqing, 401331, China
- Xianglong Ma
  - Chongqing Normal University, Chongqing University Key Laboratory of GIS Application Research; Chongqing, 401331, China
- Jialan Zhang
  - Chongqing University, Key Laboratory of New Technology for Construction of Cities in Mountain Area; Chongqing, 400045, China
- Deliang Sun
  - Chongqing Normal University, Chongqing University Key Laboratory of GIS Application Research; Chongqing, 401331, China
- Xinzhi Zhou
  - Chongqing University, Key Laboratory of New Technology for Construction of Cities in Mountain Area; Chongqing, 400045, China
- Changlin Mi
  - Linyi Natural Resources Development Service Center of Shandong Province; Shandong, 276007, China
- Haijia Wen
  - Chongqing University, Key Laboratory of New Technology for Construction of Cities in Mountain Area; Chongqing, 400045, China

33
Yang Y, Sánchez-Tójar A, O'Dea RE, Noble DWA, Koricheva J, Jennions MD, Parker TH, Lagisz M, Nakagawa S. Publication bias impacts on effect size, statistical power, and magnitude (Type M) and sign (Type S) errors in ecology and evolutionary biology. BMC Biol 2023; 21:71. [PMID: 37013585 PMCID: PMC10071700 DOI: 10.1186/s12915-022-01485-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 09/21/2021] [Accepted: 11/29/2022] [Indexed: 04/05/2023] Open
Abstract
Collaborative efforts to directly replicate empirical studies in the medical and social sciences have revealed alarmingly low rates of replicability, a phenomenon dubbed the 'replication crisis'. Poor replicability has spurred cultural changes targeted at improving reliability in these disciplines. Given the absence of equivalent replication projects in ecology and evolutionary biology, two inter-related indicators offer the opportunity to retrospectively assess replicability: publication bias and statistical power. This registered report assesses the prevalence and severity of small-study (i.e., smaller studies reporting larger effect sizes) and decline effects (i.e., effect sizes decreasing over time) across ecology and evolutionary biology using 87 meta-analyses comprising 4,250 primary studies and 17,638 effect sizes. Further, we estimate how publication bias might distort the estimation of effect sizes, statistical power, and errors in magnitude (Type M or exaggeration ratio) and sign (Type S). We show strong evidence for the pervasiveness of both small-study and decline effects in ecology and evolution. There was widespread prevalence of publication bias that resulted in meta-analytic means being over-estimated by (at least) 0.12 standard deviations. The prevalence of publication bias distorted confidence in meta-analytic results, with 66% of initially statistically significant meta-analytic means becoming non-significant after correcting for publication bias. Ecological and evolutionary studies consistently had low statistical power (15%) with a 4-fold exaggeration of effects on average (Type M error rates = 4.4). Notably, publication bias reduced power from 23% to 15% and increased type M error rates from 2.7 to 4.4 because it creates a non-random sample of effect size evidence. The sign errors of effect sizes (Type S error) increased from 5% to 8% because of publication bias. 
Our research provides clear evidence that many published ecological and evolutionary findings are inflated. Our results highlight the importance of designing high-power empirical studies (e.g., via collaborative team science), promoting and encouraging replication studies, testing and correcting for publication bias in meta-analyses, and adopting open and transparent research practices, such as (pre)registration, data- and code-sharing, and transparent reporting.
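The three quantities this abstract reports (power, Type S rate, and Type M exaggeration ratio) can be illustrated with a generic design-analysis calculation for a single study. This is a sketch under a normal approximation, not the meta-analytic procedure used in the paper; the function name `retrodesign` and the inputs `true_effect=0.1` and `se=0.1` are assumptions chosen for the example. Power and the Type S rate come from normal tail areas, and the Type M ratio is estimated by simulating estimates and keeping only the statistically significant ones.

```python
import random
from statistics import NormalDist

def retrodesign(true_effect, se, alpha=0.05, n_sims=100_000, seed=1):
    """Power, Type S rate and Type M ratio for a positive true_effect."""
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    lam = true_effect / se
    p_hi = 1 - std.cdf(z_crit - lam)   # significant with the correct sign
    p_lo = std.cdf(-z_crit - lam)      # significant with the wrong sign
    power = p_hi + p_lo
    type_s = p_lo / power              # share of significant results with the wrong sign

    # Type M: mean absolute significant estimate relative to the true effect
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        est = rng.gauss(true_effect, se)
        if abs(est) > z_crit * se:
            significant.append(abs(est))
    type_m = sum(significant) / len(significant) / true_effect
    return power, type_s, type_m

power, type_s, type_m = retrodesign(true_effect=0.1, se=0.1)
```

With a true effect equal to one standard error, power is low and the significant estimates overstate the true effect severalfold, which is the mechanism by which selective publication inflates reported effect sizes.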
Affiliation(s)
- Yefeng Yang
  - Evolution & Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
  - Department of Biosystems Engineering, Zhejiang University, Hangzhou, 310058, China
- Rose E O'Dea
  - School of Ecosystem and Forest Sciences, University of Melbourne, Parkville, Australia
- Daniel W A Noble
  - Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT, Australia
- Julia Koricheva
  - Department of Biological Sciences, Royal Holloway University of London, Egham, Surrey, TW20 0EX, UK
- Michael D Jennions
  - Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra, ACT, Australia
- Timothy H Parker
  - Department of Biology, Whitman College, Walla Walla, WA, 99362, USA
- Malgorzata Lagisz
  - Evolution & Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
- Shinichi Nakagawa
  - Evolution & Ecology Research Centre and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW, 2052, Australia

34
Okada G, Yoshioka T, Yamashita A, Itai E, Yokoyama S, Kamishikiryo T, Shinzato H, Masuda Y, Mitsuyama Y, Kan S, Kurata A, Takamura M, Yoshino A, Mantani A, Yamamoto O, Yokota N, Tamura T, Jitsuiki H, Kawato M, Yamashita O, Sakai Y, Okamoto Y. Verification of the brain network marker of major depressive disorder: Test-retest reliability and anterograde generalization performance for newly acquired data. J Affect Disord 2023; 326:262-6. [PMID: 36717028 DOI: 10.1016/j.jad.2023.01.087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Revised: 12/23/2022] [Accepted: 01/04/2023] [Indexed: 02/01/2023]
Abstract
BACKGROUND Recently, we developed a generalizable brain network marker for the diagnosis of major depressive disorder (MDD) across multiple imaging sites using resting-state functional magnetic resonance imaging. Here, we applied this brain network marker to newly acquired data to verify its test-retest reliability and anterograde generalization performance for new patients. METHODS We tested the sensitivity and specificity of our brain network marker of MDD using data acquired from 43 new patients with MDD as well as new data from 33 healthy controls (HCs) who participated in our previous study. To examine the test-retest reliability of our brain network marker, we evaluated the intraclass correlation coefficients (ICCs) between the brain network marker-based classifier's output (probability of MDD) in two sets of HC data obtained at an interval of approximately 1 year. RESULTS Test-retest correlation between the two sets of the classifier's output (probability of MDD) from HCs exhibited moderate reliability with an ICC of 0.45 (95% confidence interval, 0.13-0.68). The classifier distinguished patients with MDD and HCs with an accuracy of 69.7% (sensitivity, 72.1%; specificity, 66.7%). LIMITATIONS The data of patients with MDD in this study were cross-sectional, and the clinical significance of the marker, such as whether it is a state or trait marker of MDD and its association with treatment responsiveness, remains unclear. CONCLUSIONS The results of this study reaffirmed the test-retest reliability and generalization performance of our brain network marker for the diagnosis of MDD.
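The three performance figures in this abstract are linked by simple arithmetic, which the following sketch makes explicit. The group sizes (43 patients, 33 controls) are from the abstract; the individual cell counts (31 true positives, 22 true negatives) are not reported there and are assumptions chosen because they reproduce the published percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# 43 patients with MDD and 33 healthy controls (from the abstract);
# 31 and 22 correct classifications are assumed, not reported.
sens, spec, acc = classification_metrics(tp=31, fn=12, tn=22, fp=11)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # prints "72.1% 66.7% 69.7%"
```

Accuracy is the sample-size-weighted combination of sensitivity and specificity, so it lies between the two and sits closer to the sensitivity because patients outnumber controls.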

35
Basnight-Brown D, Janssen SMJ, Thomas AK. Exploration of human cognitive universals and human cognitive diversity. Mem Cognit 2023; 51:505-8. [PMID: 36859524 DOI: 10.3758/s13421-023-01410-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/21/2023] [Indexed: 03/03/2023]
Abstract
In this editorial, the editors briefly introduce the aims of the Special Issue. If the goal of the scientific field of Cognitive Psychology is to improve our understanding of human cognition, then research needs to be conducted on a much broader slice of humanity than it has mostly been doing. The first aim of this Special Issue was to examine cognitive processes in populations that are different from the typical Western young adult samples often used in previously published studies. Studies in this issue therefore included both non-WEIRD participants as well as WEIRD participants who process information using different sensory experiences (e.g., individuals who are deaf). The second aim was to amplify - where possible - the research of scholars from less well-represented regions. The authors of the studies were affiliated with a diverse range of academic institutes and frequently included partnerships between Western and non-Western investigators.

36
Seaborn K, Barbareschi G, Chandra S. Not Only WEIRD but "Uncanny"? A Systematic Review of Diversity in Human-Robot Interaction Research. Int J Soc Robot 2023:1-30. [PMID: 37359427 PMCID: PMC9993363 DOI: 10.1007/s12369-023-00968-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/19/2023] [Indexed: 03/29/2023]
Abstract
Critical voices within and beyond the scientific community have pointed to a grave matter of concern regarding who is included in research and who is not. Subsequent investigations have revealed an extensive form of sampling bias across a broad range of disciplines that conduct human subjects research called "WEIRD": Western, Educated, Industrial, Rich, and Democratic. Recent work has indicated that this pattern exists within human-computer interaction (HCI) research, as well. How then does human-robot interaction (HRI) fare? And could there be other patterns of sampling bias at play, perhaps those especially relevant to this field of study? We conducted a systematic review of the premier ACM/IEEE International Conference on Human-Robot Interaction (2006-2022) to discover whether and how WEIRD HRI research is. Importantly, we expanded our purview to other factors of representation highlighted by critical work on inclusion and intersectionality as potentially underreported, overlooked, and even marginalized factors of human diversity. Findings from 827 studies across 749 papers confirm that participants in HRI research also tend to be drawn from WEIRD populations. Moreover, we find evidence of limited, obscured, and possible misrepresentation in participant sampling and reporting along key axes of diversity: sex and gender, race and ethnicity, age, sexuality and family configuration, disability, body type, ideology, and domain expertise. We discuss methodological and ethical implications for recruitment, analysis, and reporting, as well as the significance for HRI as a base of knowledge.

37
Cook RR, Foot C, Arah OA, Humphreys K, Rudolph KE, Luo SX, Tsui JI, Levander XA, Korthuis PT. Estimating the impact of stimulant use on initiation of buprenorphine and extended-release naltrexone in two clinical trials and real-world populations. Addict Sci Clin Pract 2023; 18:11. [PMID: 36788634 PMCID: PMC9930351 DOI: 10.1186/s13722-023-00364-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Accepted: 02/01/2023] [Indexed: 02/16/2023] Open
Abstract
BACKGROUND Co-use of stimulants and opioids is rapidly increasing. Randomized clinical trials (RCTs) have established the efficacy of medications for opioid use disorder (MOUD), but stimulant use may decrease the likelihood of initiating MOUD treatment. Furthermore, trial participants may not represent "real-world" populations who would benefit from treatment. METHODS We conducted a two-stage analysis. First, associations between stimulant use (time-varying urine drug screens for cocaine, methamphetamine, or amphetamines) and initiation of buprenorphine or extended-release naltrexone (XR-NTX) were estimated across two RCTs (CTN-0051 X:BOT and CTN-0067 CHOICES) using adjusted Cox regression models. Second, results were generalized to three target populations who would benefit from MOUD: Housed adults identifying the need for OUD treatment, as characterized by the National Survey on Drug Use and Health (NSDUH); adults entering OUD treatment, as characterized by Treatment Episodes Dataset (TEDS); and adults living in rural regions of the U.S. with high rates of injection drug use, as characterized by the Rural Opioids Initiative (ROI). Generalizability analyses adjusted for differences in demographic characteristics, substance use, housing status, and depression between RCT and target populations using inverse probability of selection weighting. RESULTS Analyses included 673 clinical trial participants, 139 NSDUH respondents (weighted to represent 661,650 people), 71,751 TEDS treatment episodes, and 1,933 ROI participants. The majority were aged 30-49 years, male, and non-Hispanic White. In RCTs, stimulant use reduced the likelihood of MOUD initiation by 32% (adjusted HR [aHR] = 0.68, 95% CI 0.49-0.94, p = 0.019). Stimulant use associations were slightly attenuated and non-significant among housed adults needing treatment (25% reduction, aHR = 0.75, 0.48-1.18, p = 0.215) and adults entering OUD treatment (28% reduction, aHR = 0.72, 0.51-1.01, p = 0.061). 
The association was more pronounced, but still non-significant among rural people injecting drugs (39% reduction, aHR = 0.61, 0.35-1.06, p = 0.081). Stimulant use had a larger negative impact on XR-NTX initiation compared to buprenorphine, especially in the rural population (76% reduction, aHR = 0.24, 0.08-0.69, p = 0.008). CONCLUSIONS Stimulant use is a barrier to buprenorphine or XR-NTX initiation in clinical trials and real-world populations that would benefit from OUD treatment. Interventions to address stimulant use among patients with OUD are urgently needed, especially among rural people injecting drugs, who already suffer from limited access to MOUD.
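The generalizability step in this abstract reweights trial participants so that their covariate mix resembles a target population. The sketch below is a simplified stratum-level version of that idea, with weights proportional to the inverse odds of trial participation within each stratum. It reweights a simple proportion rather than the Cox models used in the study, and the strata, counts, and function names are hypothetical.

```python
from collections import Counter

def selection_weights(trial_strata, target_strata):
    """Weight each trial participant by target share / trial share of their stratum."""
    trial_share = {s: c / len(trial_strata) for s, c in Counter(trial_strata).items()}
    target_share = {s: c / len(target_strata) for s, c in Counter(target_strata).items()}
    return [target_share.get(s, 0.0) / trial_share[s] for s in trial_strata]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical data: stratum = housing status, outcome = 1 if treatment was initiated
trial_strata = ["housed"] * 80 + ["unhoused"] * 20
trial_outcome = [1] * 56 + [0] * 24 + [1] * 8 + [0] * 12
target_strata = ["housed"] * 50 + ["unhoused"] * 50

weights = selection_weights(trial_strata, target_strata)
trial_rate = sum(trial_outcome) / len(trial_outcome)   # 0.64 in the trial sample
target_rate = weighted_mean(trial_outcome, weights)    # 0.55 after reweighting
```

Because the hypothetical target population contains more unhoused adults, who initiate treatment less often in this toy data, the reweighted estimate is lower than the raw trial estimate. With several covariates, the stratum shares would normally be replaced by a fitted selection model.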
Affiliation(s)
- R R Cook
  - Section of Addiction Medicine, Department of Medicine, Oregon Health & Science University, Sam Jackson Hall, Suite 3370, 3245 SW Pavilion Loop, Portland, OR, 97239, USA
- C Foot
  - Section of Addiction Medicine, Department of Medicine, Oregon Health & Science University, Sam Jackson Hall, Suite 3370, 3245 SW Pavilion Loop, Portland, OR, 97239, USA
- O A Arah
  - Department of Epidemiology, Fielding School of Public Health, University of California, Los Angeles (UCLA), Los Angeles, CA, USA
  - Division of Physical Sciences, Department of Statistics, UCLA College, Los Angeles, CA, USA
  - Research Unit for Epidemiology, Department of Public Health, Aarhus University, Aarhus, Denmark
- K Humphreys
  - Center for Innovation to Implementation, VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Palo Alto, CA, USA
- K E Rudolph
  - Department of Epidemiology, School of Public Health, Columbia University, New York, NY, USA
- S X Luo
  - Division on Substance Use Disorders, Department of Psychiatry, Columbia University, New York, USA
- J I Tsui
  - Department of Medicine, University of Washington, Seattle, WA, USA
- X A Levander
  - Section of Addiction Medicine, Department of Medicine, Oregon Health & Science University, Sam Jackson Hall, Suite 3370, 3245 SW Pavilion Loop, Portland, OR, 97239, USA
- P T Korthuis
  - Section of Addiction Medicine, Department of Medicine, Oregon Health & Science University, Sam Jackson Hall, Suite 3370, 3245 SW Pavilion Loop, Portland, OR, 97239, USA

38
Robertson SE, Steingrimsson JA, Dahabreh IJ. Regression-based estimation of heterogeneous treatment effects when extending inferences from a randomized trial to a target population. Eur J Epidemiol 2023; 38:123-133. [PMID: 36626100 PMCID: PMC10986821 DOI: 10.1007/s10654-022-00901-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/04/2021] [Accepted: 07/11/2022] [Indexed: 01/11/2023]
Abstract
Most work on extending (generalizing or transporting) inferences from a randomized trial to a target population has focused on estimating average treatment effects (i.e., averaged over the target population's covariate distribution). Yet, in the presence of strong effect modification by baseline covariates, the average treatment effect in the target population may be less relevant for guiding treatment decisions. Instead, the conditional average treatment effect (CATE) as a function of key effect modifiers may be a more useful estimand. Recent work on estimating target population CATEs using baseline covariate, treatment, and outcome data from the trial and covariate data from the target population only allows for the examination of heterogeneity over distinct subgroups. We describe flexible pseudo-outcome regression modeling methods for estimating target population CATEs conditional on discrete or continuous baseline covariates when the trial is embedded in a sample from the target population (i.e., in nested trial designs). We construct pointwise confidence intervals for the CATE at a specific value of the effect modifiers and uniform confidence bands for the CATE function. Last, we illustrate the methods using data from the Coronary Artery Surgery Study (CASS) to estimate CATEs given history of myocardial infarction and baseline ejection fraction value in the target population of all trial-eligible patients with stable ischemic heart disease.
Affiliation(s)
- Sarah E Robertson
  - CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Jon A Steingrimsson
  - Department of Biostatistics, Brown University School of Public Health, Providence, RI, USA
- Issa J Dahabreh
  - CAUSALab, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA

39
Gard AM, Hyde LW, Heeringa SG, West BT, Mitchell C. Why weight? Analytic approaches for large-scale population neuroscience data. Dev Cogn Neurosci 2023; 59:101196. [PMID: 36630774 PMCID: PMC9843279 DOI: 10.1016/j.dcn.2023.101196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Revised: 12/30/2022] [Accepted: 01/05/2023] [Indexed: 01/09/2023] Open
Abstract
Population-based neuroimaging studies that feature complex sampling designs enable researchers to generalize their results more widely. However, several theoretical and analytical questions pose challenges to researchers interested in these data. The following is a resource for researchers interested in using population-based neuroimaging data. We provide an overview of sampling designs and describe the differences between traditional model-based analyses and survey-oriented design-based analyses. To elucidate key concepts, we leverage data from the Adolescent Brain Cognitive Development℠ Study (ABCD Study®), a population-based sample of 11,878 9-10-year-olds in the United States. Analyses revealed modest sociodemographic discrepancies between the target population of 9-10-year-olds in the U.S. and both the recruited ABCD sample and the analytic sample with usable structural and functional imaging data. In evaluating the associations between socioeconomic resources (i.e., constructs that are tightly linked to recruitment biases) and several metrics of brain development, we show that model-based approaches over-estimated the associations of household income and under-estimated the associations of caregiver education with total cortical volume and surface area. Comparable results were found in models predicting neural function during two fMRI task paradigms. We conclude with recommendations for ABCD Study® users and users of population-based neuroimaging cohorts more broadly.
Affiliation(s)
- Arianna M Gard
  - Institute for Social Research, University of Michigan, Ann Arbor, MI, USA; Department of Psychology, Neuroscience and Cognitive Neuroscience Program, University of Maryland, College Park, MD, USA
- Luke W Hyde
  - Institute for Social Research, University of Michigan, Ann Arbor, MI, USA; Department of Psychology, University of Michigan, Ann Arbor, MI, USA
- Steven G Heeringa
  - Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
- Brady T West
  - Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
- Colter Mitchell
  - Institute for Social Research, University of Michigan, Ann Arbor, MI, USA

40
Jujjavarapu C, Suri P, Pejaver V, Friedly J, Gold LS, Meier E, Cohen T, Mooney SD, Heagerty PJ, Jarvik JG. Predicting decompression surgery by applying multimodal deep learning to patients' structured and unstructured health data. BMC Med Inform Decis Mak 2023; 23:2. [PMID: 36609379 PMCID: PMC9824905 DOI: 10.1186/s12911-022-02096-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/30/2022] [Accepted: 12/29/2022] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Low back pain (LBP) is a common condition made up of a variety of anatomic and clinical subtypes. Lumbar disc herniation (LDH) and lumbar spinal stenosis (LSS) are two subtypes highly associated with LBP. Patients with LDH/LSS are often started with non-surgical treatments and if those are not effective then go on to have decompression surgery. However, recommendation of surgery is complicated as the outcome may depend on the patient's health characteristics. We developed a deep learning (DL) model to predict decompression surgery for patients with LDH/LSS. MATERIALS AND METHOD We used datasets of 8387 and 8620 patients from a prospective study that collected data from four healthcare systems to predict early (within 2 months) and late surgery (within 12 months after a 2 month gap), respectively. We developed a DL model to use patients' demographics, diagnosis and procedure codes, drug names, and diagnostic imaging reports to predict surgery. For each prediction task, we evaluated the model's performance using classical and generalizability evaluation. For classical evaluation, we split the data into training (80%) and testing (20%). For generalizability evaluation, we split the data based on the healthcare system. We used the area under the curve (AUC) to assess performance for each evaluation. We compared results to a benchmark model (i.e. LASSO logistic regression). RESULTS For classical performance, the DL model outperformed the benchmark model for early surgery with an AUC of 0.725 compared to 0.597. For late surgery, the DL model outperformed the benchmark model with an AUC of 0.655 compared to 0.635. For generalizability performance, the DL model outperformed the benchmark model for early surgery. For late surgery, the benchmark model outperformed the DL model. CONCLUSIONS For early surgery, the DL model was preferred for classical and generalizability evaluation. However, for late surgery, the benchmark and DL model had comparable performance. 
Depending on the prediction task, the balance of performance may shift between DL and a conventional ML method. As a result, thorough assessment is needed to quantify the value of DL, a relatively computationally expensive, time-consuming and less interpretable method.
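The "generalizability evaluation" in this abstract differs from the classical random split only in how test cases are chosen: an entire healthcare system is held out. The sketch below shows that split together with a rank-based AUC. It trains no model; the site labels, scores, and outcomes are hypothetical stand-ins for predictions a model would produce, and `auc` and `split_by_site` are illustrative names.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def split_by_site(rows, held_out_site):
    """Train on every site except one; test on the held-out site."""
    train = [r for r in rows if r["site"] != held_out_site]
    test = [r for r in rows if r["site"] == held_out_site]
    return train, test

# Hypothetical rows: healthcare system, model score, outcome (1 = surgery)
rows = [
    {"site": "A", "score": 0.9, "surgery": 1},
    {"site": "A", "score": 0.2, "surgery": 0},
    {"site": "B", "score": 0.7, "surgery": 1},
    {"site": "B", "score": 0.4, "surgery": 0},
    {"site": "C", "score": 0.3, "surgery": 1},
    {"site": "C", "score": 0.6, "surgery": 0},
    {"site": "C", "score": 0.8, "surgery": 1},
]
train, test = split_by_site(rows, held_out_site="C")
test_auc = auc([r["score"] for r in test], [r["surgery"] for r in test])
```

Repeating the split with each system held out in turn gives one AUC per system, which shows whether performance depends on where the data came from.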
Collapse
Affiliation(s)
- Chethan Jujjavarapu
- Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Box 358047, Seattle, WA, 98195, USA
| | - Pradeep Suri
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Rehabilitation Medicine, University of Washington, 1959 NE Pacific St, Seattle, WA, 98195, USA
| | - Vikas Pejaver
- Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Janna Friedly
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Rehabilitation Medicine, University of Washington, 1959 NE Pacific St, Seattle, WA, 98195, USA
| | - Laura S Gold
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Radiology, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, USA
| | - Eric Meier
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA
- Department of Biostatistics, University of Washington, Box 357232, Seattle, WA, 98195-7232, USA
- Center for Biomedical Statistics, University of Washington, Seattle, WA, USA
| | - Trevor Cohen
- Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Box 358047, Seattle, WA, 98195, USA
| | - Sean D Mooney
- Department of Biomedical Informatics and Medical Education, School of Medicine, University of Washington, Box 358047, Seattle, WA, 98195, USA
| | - Patrick J Heagerty
- Department of Biostatistics, University of Washington, Box 357232, Seattle, WA, 98195-7232, USA
- Center for Biomedical Statistics, University of Washington, Seattle, WA, USA
| | - Jeffrey G Jarvik
- Clinical Learning, Evidence and Research Center, University of Washington, 4333 Brooklyn Ave NE, Seattle, WA, 98105, USA.
- Department of Radiology, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, USA.
- Department of Neurological Surgery, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195, USA.
- Department of Health Services, University of Washington, Box 357660, Seattle, WA, 98195-7660, USA.
41
Wang P, Giovannucci EL. Are exposure-disease relationships assessed in cohorts of health professionals generalizable?: a comparative analysis based on WCRF/AICR systematic literature reviews. Cancer Causes Control 2023; 34:39-45. [PMID: 36197566 DOI: 10.1007/s10552-022-01633-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/17/2022] [Accepted: 09/15/2022] [Indexed: 01/10/2023]
Abstract
PURPOSE Although cohorts of health professionals are not representative of the general US population, the generalizability of exposure-disease relationships identified in these cohorts has not been extensively evaluated. Our objective was to compare the associations of risk factors with cancer risk obtained in the Nurses' Health Study (NHS), Nurses' Health Study II (NHSII), and the Health Professionals Follow-Up Study (HPFS) with those from meta-analyses of cohort studies. METHODS Data were extracted from the most recent systematic literature reviews conducted by the World Cancer Research Fund/American Institute for Cancer Research (WCRF/AICR). We examined risk factors with "convincing," "probable," or "limited-suggestive" evidence for 17 cancer types. Cohort-specific results for NHS, NHSII, and HPFS and corresponding sex-specific pooled meta-analysis results were obtained when available. We compared associations for continuous variables and inspected potential non-linearity in the dose-response meta-analyses. RESULTS Data for 88 comparisons across 11 cancer types were available. For most risk factors, we observed a close resemblance between the cohort-specific and corresponding sex-specific pooled associations. The 45 comparisons for factors considered as "convincing" or "probable" invariably exhibited similar associations in direction and magnitude. In 44 of the 45, the 95% CI from the NHS, NHSII, or HPFS captured the pooled estimate. In the one exception, the difference was 0.01. CONCLUSION The NHS, NHSII, and HPFS studies are not representative of the general US population concerning sociodemographic and behavioral factors. However, the generalizability of the exposure-disease relationship assessed in these cohorts is not impaired by these factors.
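The comparison reported in the RESULTS above (does the cohort-specific 95% CI capture the pooled estimate?) can be sketched in a few lines of Python. This is only an illustration of the check as the abstract describes it: the `Comparison` class, the function names, and every number in `EXAMPLE` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Comparison:
    """One risk factor/cancer pair. All values used below are made up."""
    label: str
    cohort_ci_low: float   # lower 95% CI bound of the cohort-specific estimate
    cohort_ci_high: float  # upper 95% CI bound of the cohort-specific estimate
    pooled: float          # pooled meta-analysis point estimate


def ci_captures_pooled(c: Comparison) -> bool:
    """True when the pooled estimate lies inside the cohort-specific CI."""
    return c.cohort_ci_low <= c.pooled <= c.cohort_ci_high


def count_captured(comparisons):
    """Return (number of CIs capturing the pooled estimate, total comparisons)."""
    captured = sum(ci_captures_pooled(c) for c in comparisons)
    return captured, len(comparisons)


# Hypothetical relative risks, for illustration only.
EXAMPLE = [
    Comparison("factor A", 1.02, 1.20, 1.10),
    Comparison("factor B", 0.85, 0.97, 0.90),
    Comparison("factor C", 1.05, 1.30, 1.04),  # pooled estimate falls outside the CI
]

captured, total = count_captured(EXAMPLE)
print(f"{captured} of {total} cohort CIs capture the pooled estimate")
```

A count like this says nothing about how far outside the interval a pooled estimate falls, which is why the abstract reports the size of the single discrepancy separately.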
Affiliation(s)
- Peilu Wang
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
| | - Edward L Giovannucci
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA; Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
42
Jalusic KO, Ellenberger D, Stahmann A, Berger K. Adverse events in MS patients fulfilling or not inclusion criteria of the respective clinical trial - The problem of generalizability. Mult Scler Relat Disord 2023; 69:104422. [PMID: 36455503 DOI: 10.1016/j.msard.2022.104422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Revised: 11/16/2022] [Accepted: 11/18/2022] [Indexed: 11/21/2022]
Abstract
BACKGROUND The aim of this study was to evaluate how many MS patients treated with an approved DMD in routine care would have fulfilled the inclusion and exclusion criteria of the phase III clinical trial for that drug and would therefore have been eligible for it. Further, adverse events and disease progression were compared between eligible and ineligible patients. METHODS We compared patients who fulfilled phase III clinical trial inclusion and exclusion criteria with those who did not with regard to sociodemographic and clinical characteristics, adverse events and disease progression. The database was the REGIMS register, a national, prospective, observational, clinical multicentre registry. A total of 1248 MS patients were included. RESULTS 27.2% of patients would have been eligible for inclusion in a phase III clinical trial of their indication. Patients who did not meet the age criterion were more likely to have a serious adverse event (SAE), whereas patients who did not fulfil the relapse criterion had a significantly lower occurrence of adverse events (AE). Non-fulfilment of other inclusion criteria (EDSS score, medication history and MS type) was not associated with any significant differences in the drug safety variables AE and SAE. CONCLUSION Our results suggest that the low transferability of phase III clinical trial criteria to patients in routine care does not, with the exception of age, imply a higher risk of adverse and serious adverse events.
Affiliation(s)
- K O Jalusic
- University of Muenster, Institute of Epidemiology and Social Medicine, Muenster, Germany.
| | - D Ellenberger
- MS Forschungs- und Projektentwicklungs-gGmbH, German MS Register, Hannover, Germany
| | - A Stahmann
- MS Forschungs- und Projektentwicklungs-gGmbH, German MS Register, Hannover, Germany
| | - K Berger
- University of Muenster, Institute of Epidemiology and Social Medicine, Muenster, Germany
43
Krantz MF, Hjorthøj C, Ellersgaard D, Hemager N, Christiani C, Spang KS, Burton BK, Gregersen M, Søndergaard A, Greve A, Ohland J, Mortensen PB, Plessen KJ, Bliksted V, Jepsen JRM, Thorup AAE, Mors O, Nordentoft M. Examining selection bias in a population-based cohort study of 522 children with familial high risk of schizophrenia or bipolar disorder, and controls: The Danish High Risk and Resilience Study VIA 7. Soc Psychiatry Psychiatr Epidemiol 2023; 58:113-140. [PMID: 36087138 DOI: 10.1007/s00127-022-02338-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2021] [Accepted: 07/08/2022] [Indexed: 01/20/2023]
Abstract
PURPOSE Knowledge about representativity of familial high-risk studies of schizophrenia and bipolar disorder is essential to generalize study conclusions. The Danish High Risk and Resilience Study (VIA 7), a population-based case-control familial high-risk study, creates a unique opportunity for combining assessment and register data to examine cohort representativity. METHODS Through national registers, we identified the population of 11,959 children of parents with schizophrenia (FHR-SZ) or bipolar disorder (FHR-BP) and controls from which the 522 children participating in The VIA 7 Study (202 FHR-SZ, 120 FHR-BP and 200 controls) were selected. Socio-economic and health data were obtained to compare high-risk groups and controls, and participants versus non-participants. Selection bias impact on results was analyzed through inverse probability weights. RESULTS In the total sample of 11,959 children, FHR-SZ and FHR-BP children had more socio-economic and health disadvantages than controls (p < 0.001 for most). VIA 7 non-participants had a poorer function, e.g. more paternal somatic and mental illness (p = 0.02 and p = 0.04 for FHR-SZ), notifications of concern (FHR-BP and PBC p < 0.001), placements out of home (p = 0.03 for FHR-SZ), and lower level of education (p ≤ 0.01 for maternal FHR-SZ and FHR-BP, p = 0.001 for paternal FHR-BP). Inverse probability weighted analyses of results generated from the VIA Study showed minor changes in study findings after adjustment for the found selection bias. CONCLUSIONS Familial high-risk families have multiple socio-economic and health disadvantages. In The VIA 7 Study, although comparable regarding mental illness severity after their child's birth, socioeconomic and health disadvantages are more profound amongst non-participants than amongst participants.
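The inverse probability weighting mentioned in the METHODS above can be illustrated with a minimal Python sketch. It assumes the simplest possible setting, a single categorical stratum with participation probability estimated as the observed participation fraction per stratum; the study itself may well have used a richer model. The function names, the strata, and all data are hypothetical.

```python
from collections import Counter


def inverse_probability_weights(population_strata, participant_strata):
    """Weight per stratum = 1 / P(participation | stratum).

    P is estimated as participants / population within each stratum, so the
    weight reduces to population count / participant count.
    """
    population = Counter(population_strata)
    participants = Counter(participant_strata)
    return {s: population[s] / participants[s] for s in participants}


def weighted_mean(values, strata, weights):
    """Mean of participant values, each weighted by its stratum weight."""
    w = [weights[s] for s in strata]
    return sum(wi * v for wi, v in zip(w, values)) / sum(w)


# Hypothetical source population: 60 "low" and 40 "high" education families.
population_strata = ["low"] * 60 + ["high"] * 40
# Hypothetical participants: the "high" stratum is over-represented.
participant_strata = ["low", "low", "high", "high", "high", "high"]
outcome = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # made-up binary outcome

weights = inverse_probability_weights(population_strata, participant_strata)
naive = sum(outcome) / len(outcome)
adjusted = weighted_mean(outcome, participant_strata, weights)
print(f"naive mean {naive:.2f}, weighted mean {adjusted:.2f}")
```

In this toy case the weighted mean moves toward the value expected in the source population because under-represented participants count for more. A small shift between the naive and weighted estimates is what the abstract describes as "minor changes in study findings".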
Affiliation(s)
- Mette Falkenberg Krantz
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Faculty of Health and Medical Sciences, Institute of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Carsten Hjorthøj
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark.,Department of Public Health, Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark
| | - Ditte Ellersgaard
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Nicoline Hemager
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Faculty of Health and Medical Sciences, Institute of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Camilla Christiani
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Faculty of Health and Medical Sciences, Institute of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Katrine Søborg Spang
- Research Unit at Child and Adolescent Mental Health Center Copenhagen, Gentofte Hospitalsvej 3A, opg. 3A, 1. sal, 2900, Hellerup, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Birgitte Klee Burton
- Research Unit at Child and Adolescent Mental Health Center Copenhagen, Gentofte Hospitalsvej 3A, opg. 3A, 1. sal, 2900, Hellerup, Denmark
| | - Maja Gregersen
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Faculty of Health and Medical Sciences, Institute of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Anne Søndergaard
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Faculty of Health and Medical Sciences, Institute of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Aja Greve
- iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark.,The Psychosis Research Unit, Aarhus University Hospital, Psychiatry, Palle Juul-Jensens Boulevard 175, Aarhus N, 8200, Arhus, Denmark.,Department of Clinical Medicine, Faculty of Health and Medical Services, Aarhus University, Palle Juul-Jensens Boulevard 82, Aarhus N, 8200, Arhus, Denmark
| | - Jessica Ohland
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Preben Bo Mortensen
- iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark.,Department of Economics and Business Economics, National Centre for Register-Based Research, Aarhus University, Fuglesangs Allé 26, Bygning R2640-R2641, Aarhus V, 8210, Arhus, Denmark
| | - Kerstin Jessica Plessen
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark.,Division of Child and Adolescent Psychiatry, Department of Psychiatry, Lausanne University Hospital, Avenue d'Echallens 9, 1004, Lausanne, Switzerland
| | - Vibeke Bliksted
- iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark.,The Psychosis Research Unit, Aarhus University Hospital, Psychiatry, Palle Juul-Jensens Boulevard 175, Aarhus N, 8200, Arhus, Denmark.,Department of Clinical Medicine, Faculty of Health and Medical Services, Aarhus University, Palle Juul-Jensens Boulevard 82, Aarhus N, 8200, Arhus, Denmark
| | - Jens Richardt Møllegaard Jepsen
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Research Unit at Child and Adolescent Mental Health Center Copenhagen, Gentofte Hospitalsvej 3A, opg. 3A, 1. sal, 2900, Hellerup, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark.,Mental Health Services, Capital Region of Denmark, Center for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Center Glostrup, Nordstjernevej 41, 2600, Glostrup, Denmark
| | - Anne A E Thorup
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Faculty of Health and Medical Sciences, Institute of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark.,Research Unit at Child and Adolescent Mental Health Center Copenhagen, Gentofte Hospitalsvej 3A, opg. 3A, 1. sal, 2900, Hellerup, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
| | - Ole Mors
- iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark.,The Psychosis Research Unit, Aarhus University Hospital, Psychiatry, Palle Juul-Jensens Boulevard 175, Aarhus N, 8200, Arhus, Denmark.,Department of Clinical Medicine, Faculty of Health and Medical Services, Aarhus University, Palle Juul-Jensens Boulevard 82, Aarhus N, 8200, Arhus, Denmark
| | - Merete Nordentoft
- CORE- Copenhagen Research Center for Mental Health, Mental Health Center Copenhagen, The Danish High Risk and Resilience Study VIA 7 and VIA 11, Capital Region of Denmark, Copenhagen University Hospital, Gentofte Hospitalsvej 15, opg. 15, 1. Sal., 2900, Hellerup, Denmark.,Faculty of Health and Medical Sciences, Institute of Clinical Medicine, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen, Denmark.,iPSYCH -The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Fuglesangs Allé 26, Aarhus N, 8210, Arhus, Denmark
44
Khan KS, Bueno Cavanillas A, Zamora J. [Systematic reviews in five steps: V. Interpreting the findings]. Semergen 2023; 49:101854. [PMID: 36410229 DOI: 10.1016/j.semerg.2022.101854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 09/10/2022] [Accepted: 09/17/2022] [Indexed: 11/19/2022]
Abstract
The last step in a systematic review is the interpretation of the findings. The important findings need to be explicitly identified. A level of strength of evidence should be assigned to support each key finding, based on factors such as study design, methodological quality and risk of publication bias. Variations in the magnitude of associations observed also need to be explored. The aim of this analysis is to determine in which clinical groups the intervention is more or less effective, the impact of exposure is greater or lesser, or a diagnostic test is more useful. At this stage, for better interpretation of the findings, the magnitude of the association can be estimated either globally or stratified according to the characteristics of the participants. All this is helpful in formulating recommendations for clinical practice and policy.
Affiliation(s)
- K S Khan
- Departamento de Medicina Preventiva y Salud Pública, Universidad de Granada, Granada, España; CIBER de Epidemiología y Salud Pública (CIBERESP), Madrid, España
| | - A Bueno Cavanillas
- Departamento de Medicina Preventiva y Salud Pública, Universidad de Granada, Granada, España; CIBER de Epidemiología y Salud Pública (CIBERESP), Madrid, España.
| | - J Zamora
- CIBER de Epidemiología y Salud Pública (CIBERESP), Madrid, España; Unidad de Bioestadística Clínica, Hospital Ramón y Cajal, Madrid, España; Institute of Metabolism and Systems Research, Universidad de Birmingham, Birmingham, Reino Unido
45
Müller L, Kloeckner R, Mildenberger P, Pinto Dos Santos D. [Validation and implementation of artificial intelligence in radiology : Quo vadis in 2022?]. Radiologie (Heidelb) 2022; 63:381-386. [PMID: 36510007 DOI: 10.1007/s00117-022-01097-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/17/2022] [Indexed: 12/14/2022]
Abstract
BACKGROUND The hype around artificial intelligence (AI) in radiology continues and the number of approved AI tools is growing steadily. Despite the great potential, integration into clinical routine in radiology remains limited. In addition, the large number of individual applications poses a challenge for clinical routine, as individual applications have to be selected for different questions and organ systems, which increases the complexity and time required. OBJECTIVES This review will discuss the current status of validation and implementation of AI tools in clinical routine, and identify possible approaches for an improved assessment of the generalizability of results of AI tools. MATERIALS AND METHODS A literature search in various literature and product databases as well as publications, position papers, and reports from various stakeholders was conducted for this review. RESULTS Scientific evidence and independent validation studies are available for only a few commercial AI tools and the generalizability of the results often remains questionable. CONCLUSIONS One challenge is the multitude of offerings for individual, specific application areas by a large number of manufacturers, making integration into the existing site-specific IT infrastructure more difficult. Furthermore, remuneration for the use of AI tools in clinical routine by health insurance companies in Germany is lacking. But in order for reimbursement to be granted, the clinical utility of new applications must first be proven. Such proof, however, is lacking for most applications.
Affiliation(s)
- Lukas Müller
- Klinik und Poliklinik für Diagnostische und Interventionelle Radiologie, Universitätsmedizin Mainz, Langenbeckstr. 1, 55131, Mainz, Deutschland.
| | - Roman Kloeckner
- Institut für Interventionelle Radiologie, Universitätsklinikum Schleswig-Holstein - Campus Lübeck, Lübeck, Deutschland
| | - Peter Mildenberger
- Klinik und Poliklinik für Diagnostische und Interventionelle Radiologie, Universitätsmedizin Mainz, Langenbeckstr. 1, 55131, Mainz, Deutschland
| | - Daniel Pinto Dos Santos
- Institut für Diagnostische und Interventionelle Radiologie, Uniklinik Köln, Köln, Deutschland.,Institut für Diagnostische und Interventionelle Radiologie, Universitätsklinikum Frankfurt, Frankfurt am Main, Deutschland
46
Fink DS, Stohl M, Mannes ZL, Shmulewitz D, Wall M, Gutkind S, Olfson M, Gradus J, Keyhani S, Maynard C, Keyes KM, Sherman S, Martins S, Saxon AJ, Hasin DS. Comparing mental and physical health of U.S. veterans by VA healthcare use: implications for generalizability of research in the VA electronic health records. BMC Health Serv Res 2022; 22:1500. [PMID: 36494829 PMCID: PMC9733218 DOI: 10.1186/s12913-022-08899-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 07/13/2022] [Accepted: 11/28/2022] [Indexed: 12/13/2022] Open
Abstract
OBJECTIVE The Department of Veterans Affairs' (VA) electronic health records (EHR) offer a rich source of big data to study medical and health care questions, but patient eligibility and preferences may limit generalizability of findings. We therefore examined the representativeness of VA veterans by comparing veterans using VA healthcare services to those who do not. METHODS We analyzed data on 3051 veteran participants age ≥ 18 years in the 2019 National Health Interview Survey. Weighted logistic regression was used to model participant characteristics, health conditions, pain, and self-reported health by past year VA healthcare use and generate predicted marginal prevalences, which were used to calculate Cohen's d of group differences in absolute risk by past-year VA healthcare use. RESULTS Among veterans, 30.4% had past-year VA healthcare use. Veterans with lower income and members of racial/ethnic minority groups were more likely to report past-year VA healthcare use. Health conditions overrepresented in past-year VA healthcare users included chronic medical conditions (80.6% vs. 69.4%, d = 0.36), pain (78.9% vs. 65.9%; d = 0.35), mental distress (11.6% vs. 5.9%; d = 0.47), anxiety (10.8% vs. 4.1%; d = 0.67), and fair/poor self-reported health (27.9% vs. 18.0%; d = 0.40). CONCLUSIONS Heterogeneity in veteran sociodemographic and health characteristics was observed by past-year VA healthcare use. Researchers working with VA EHR data should consider how the patient selection process may relate to the exposures and outcomes under study. Statistical reweighting may be needed to generalize risk estimates from the VA EHR data to the overall veteran population.
Affiliation(s)
- David S. Fink
- New York State Psychiatric Institute, New York, NY USA
- Malka Stohl
- New York State Psychiatric Institute, New York, NY USA
- Zachary L. Mannes
- Columbia University Mailman School of Public Health, New York, NY USA
- Dvora Shmulewitz
- New York State Psychiatric Institute, New York, NY USA; Columbia University Mailman School of Public Health, New York, NY USA
- Melanie Wall
- New York State Psychiatric Institute, New York, NY USA; Columbia University Mailman School of Public Health, New York, NY USA
- Sarah Gutkind
- Columbia University Mailman School of Public Health, New York, NY USA
- Mark Olfson
- New York State Psychiatric Institute, New York, NY USA; Columbia University Mailman School of Public Health, New York, NY USA
- Jaimie Gradus
- Boston University School of Public Health, Boston, MA USA
- Salomeh Keyhani
- Veteran Affairs, San Francisco, VA USA; University of California, San Francisco, CA USA
- Charles Maynard
- Veteran Affairs, Puget Sound Health Care System, Seattle, WA USA; University of Washington, Seattle, WA USA
- Katherine M. Keyes
- Columbia University Mailman School of Public Health, New York, NY USA
- Scott Sherman
- New York University, New York, NY USA
- Silvia Martins
- Columbia University Mailman School of Public Health, New York, NY USA
- Andrew J. Saxon
- Veteran Affairs, Puget Sound Health Care System, Seattle, WA USA; University of Washington, Seattle, WA USA
- Deborah S. Hasin
- New York State Psychiatric Institute, New York, NY USA; Columbia University Mailman School of Public Health, New York, NY USA; Department of Psychiatry, Columbia University Medical Center, 1051 Riverside Dr., Unit 123, New York, NY 10032 USA
47
Nadeem SA, Comellas AP, Hoffman EA, Saha PK. Airway Detection in COPD at Low-Dose CT Using Deep Learning and Multiparametric Freeze and Grow. Radiol Cardiothorac Imaging 2022; 4:e210311. [PMID: 36601453 PMCID: PMC9806731 DOI: 10.1148/ryct.210311] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/12/2021] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 06/17/2023]
Abstract
PURPOSE To present and validate a fully automated airway detection method at low-dose CT in patients with chronic obstructive pulmonary disease (COPD). MATERIALS AND METHODS In this retrospective study, deep learning (DL) and freeze-and-grow (FG) methods were optimized and applied to automatically detect airways at low-dose CT. Four data sets were used: two data sets consisting of matching standard- and low-dose CT scans from the Genetic Epidemiology of COPD (COPDGene) phase II (2014-2017) cohort (n = 2 × 236; mean age ± SD, 70 years ± 9; 123 women); one data set consisting of low-dose CT scans from the COPDGene phase III (2018-2020) cohort (n = 335; mean age ± SD, 73 years ± 8; 173 women); and one data set consisting of low-dose, anonymized CT scans from the 2003 Dutch-Belgian Randomized Lung Cancer Screening trial (n = 55) acquired by using different CT scanners. Performance measures for different methods were computed and compared by using the Wilcoxon signed rank test. RESULTS At low-dose CT, 56 294 of 62 480 (90.1%) airways of the reference total airway count (TAC) and 32 109 of 37 864 (84.8%) airways of the peripheral TAC (TACp), detected at standard-dose CT, were detected. Significant losses (P < .001) of 14 526 of 76 453 (19.0%) airways and 884 of 6908 (12.8%) airways in the TAC and 12 256 of 43 462 (28.2%) airways and 699 of 3882 (18.0%) airways in the TACp were observed, respectively, for the multiprotocol and multiscanner data without retraining. When using the automated low-dose CT method, TAC values of 347, 342, 323, and 266 and TACp values of 205, 202, 289, and 141 were observed for those who have never smoked and participants at Global Initiative for Chronic Obstructive Lung Disease stages 0, 1, and 2, respectively, which were superior to the respective values previously reported for matching groups when using a semiautomated method at standard-dose CT. 
CONCLUSION A low-cost, automated CT-based airway detection method was suitable for investigation of airway phenotypes at low-dose CT. Keywords: Airway, Airway Count, Airway Detection, Chronic Obstructive Pulmonary Disease, CT, Deep Learning, Generalizability, Low-Dose CT, Segmentation, Thorax, Lung. Clinical trial registration no. NCT00608764. Supplemental material is available for this article. © RSNA, 2022.
48
Patel AU, Mohanty SK, Parwani AV. Applications of Digital and Computational Pathology and Artificial Intelligence in Genitourinary Pathology Diagnostics. Surg Pathol Clin 2022; 15:759-785. [PMID: 36344188 DOI: 10.1016/j.path.2022.08.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
As machine learning (ML) solutions for genitourinary pathology image analysis are fostered by a progressively digitized laboratory landscape, these integrable modalities usher in a revolution in histopathological diagnosis. As technology advances, limitations stymying clinical artificial intelligence (AI) will not be extinguished without thorough validation and interrogation of ML tools by pathologists and regulatory bodies alike. ML solutions deployed in clinical settings for applications in prostate pathology yield promising results. Recent breakthroughs in clinical artificial intelligence for genitourinary pathology demonstrate unprecedented generalizability, heralding prospects for a future in which AI-driven assistive solutions may be seen as laboratory faculty, rather than novelty.
Affiliation(s)
- Ankush Uresh Patel
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA
- Sambit K Mohanty
  - Surgical and Molecular Pathology, Advanced Medical Research Institute, Plot No. 1, Near Jayadev Vatika Park, Khandagiri, Bhubaneswar, Odisha 751019. https://twitter.com/SAMBITKMohanty1
- Anil V Parwani
  - Department of Pathology, The Ohio State University, Cooperative Human Tissue Network (CHTN) Midwestern Division Polaris Innovation Centre, 2001 Polaris Parkway Suite 1000, Columbus, OH 43240, USA
|
49
|
van Klaveren D, Zanos TP, Nelson J, Levy TJ, Park JG, Retel Helmrich IRA, Rietjens JAC, Basile MJ, Hajizadeh N, Lingsma HF, Kent DM. Prognostic models for COVID-19 needed updating to warrant transportability over time and space. BMC Med 2022; 20:456. [PMID: 36424619 PMCID: PMC9686462 DOI: 10.1186/s12916-022-02651-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 11/04/2022] [Indexed: 11/25/2022]
Abstract
BACKGROUND Supporting decisions for patients who present to the emergency department (ED) with COVID-19 requires accurate prognostication. We aimed to evaluate prognostic models for predicting outcomes in hospitalized patients with COVID-19, in different locations and across time. METHODS We included patients who presented to the ED with suspected COVID-19 and were admitted to 12 hospitals in the New York City (NYC) area and 4 large Dutch hospitals. We used second-wave patients who presented between September and December 2020 (2137 and 3252 in NYC and the Netherlands, respectively) to evaluate models that were developed on first-wave patients who presented between March and August 2020 (12,163 and 5831). We evaluated two prognostic models for in-hospital death: The Northwell COVID-19 Survival (NOCOS) model was developed on NYC data and the COVID Outcome Prediction in the Emergency Department (COPE) model was developed on Dutch data. These models were validated on subsequent second-wave data at the same site (temporal validation) and at the other site (geographic validation). We assessed model performance by the Area Under the receiver operating characteristic Curve (AUC), by the E-statistic, and by net benefit. RESULTS Twenty-eight-day mortality was considerably higher in the NYC first-wave data (21.0%), compared to the second-wave (10.1%) and the Dutch data (first wave 10.8%; second wave 10.0%). COPE discriminated well at temporal validation (AUC 0.82), with excellent calibration (E-statistic 0.8%). At geographic validation, discrimination was satisfactory (AUC 0.78), but with moderate over-prediction of mortality risk, particularly in higher-risk patients (E-statistic 2.9%). While discrimination was adequate when NOCOS was tested on second-wave NYC data (AUC 0.77), NOCOS systematically overestimated the mortality risk (E-statistic 5.1%). 
Discrimination in the Dutch data was good (AUC 0.81), but with over-prediction of risk, particularly in lower-risk patients (E-statistic 4.0%). Recalibration of COPE and NOCOS led to limited net benefit improvement in Dutch data, but to substantial net benefit improvement in NYC data. CONCLUSIONS NOCOS performed moderately worse than COPE, probably reflecting unique aspects of the early pandemic in NYC. Frequent updating of prognostic models is likely to be required for transportability over time and space during a dynamic pandemic.
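The discrimination and calibration ideas named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the toy outcomes and predicted risks are invented, the function names are hypothetical, and the calibration measure shown is a simple mean predicted-minus-observed gap, not the E-statistic reported in the paper.

```python
def auc(y_true, y_score):
    """Concordance probability: chance that a randomly chosen patient who died
    received a higher predicted risk than a randomly chosen survivor.
    Tied scores count as half a concordant pair."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_calibration_gap(y_true, y_score):
    """Mean predicted risk minus observed event rate.
    Positive values indicate over-prediction on average."""
    return sum(y_score) / len(y_score) - sum(y_true) / len(y_true)


# Invented validation data: 1 = in-hospital death, 0 = survival.
outcomes = [0, 0, 1, 0, 1, 1]
predicted_risk = [0.10, 0.30, 0.35, 0.40, 0.80, 0.90]

validation_auc = auc(outcomes, predicted_risk)
validation_gap = mean_calibration_gap(outcomes, predicted_risk)
print(f"AUC = {validation_auc:.3f}, mean calibration gap = {validation_gap:+.3f}")
```

In a temporal or geographic validation, the same two functions would be applied to predictions made for a later wave or a different site by a model that was fitted elsewhere. A model can keep a good AUC while its calibration gap drifts, which is the pattern the abstract reports for NOCOS.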
Affiliation(s)
- David van Klaveren
  - Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands; Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Theodoros P Zanos
  - Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Jason Nelson
  - Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Todd J Levy
  - Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Jinny G Park
  - Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Isabel R A Retel Helmrich
  - Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Judith A C Rietjens
  - Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Melissa J Basile
  - Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
- Negin Hajizadeh
  - Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
- Hester F Lingsma
  - Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- David M Kent
  - Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
|
50
|
Maleki F, Ovens K, Gupta R, Reinhold C, Spatz A, Forghani R. Generalizability of Machine Learning Models: Quantitative Evaluation of Three Methodological Pitfalls. Radiol Artif Intell 2022; 5:e220028. [PMID: 36721408 PMCID: PMC9885377 DOI: 10.1148/ryai.220028] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/13/2022] [Revised: 10/10/2022] [Accepted: 10/24/2022] [Indexed: 11/17/2022]
Abstract
Purpose To investigate the impact of the following three methodological pitfalls on model generalizability: (a) violation of the independence assumption, (b) model evaluation with an inappropriate performance indicator or baseline for comparison, and (c) batch effect. Materials and Methods The authors used retrospective CT, histopathologic analysis, and radiography datasets to develop machine learning models with and without the three methodological pitfalls to quantitatively illustrate their effect on model performance and generalizability. F1 score was used to measure performance, and differences in performance between models developed with and without errors were assessed using the Wilcoxon rank sum test when applicable. Results Violation of the independence assumption by applying oversampling, feature selection, and data augmentation before splitting data into training, validation, and test sets seemingly improved model F1 scores by 71.2% for predicting local recurrence and 5.0% for predicting 3-year overall survival in head and neck cancer and by 46.0% for distinguishing histopathologic patterns in lung cancer. Randomly distributing data points for a patient across datasets superficially improved the F1 score by 21.8%. High model performance metrics did not indicate high-quality lung segmentation. In the presence of a batch effect, a model built for pneumonia detection had an F1 score of 98.7% but correctly classified only 3.86% of samples from a new dataset of healthy patients. Conclusion Machine learning models developed with these methodological pitfalls, which are undetectable during internal evaluation, produce inaccurate predictions; thus, understanding and avoiding these pitfalls is necessary for developing generalizable models. Keywords: Random Forest, Diagnosis, Prognosis, Convolutional Neural Network (CNN), Medical Image Analysis, Generalizability, Machine Learning, Deep Learning, Model Evaluation. Supplemental material is available for this article.
Published under a CC BY 4.0 license.
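One of the pitfalls described in this abstract, spreading data points from the same patient across training and test sets, can be avoided by splitting at the patient level. The following is a minimal Python sketch of that idea and not the authors' code; the function name `split_by_patient`, the tuple layout `(patient_id, features, label)`, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_patient(samples, test_fraction=0.25, seed=0):
    """Split samples so that every patient appears in exactly one set.

    Each sample is a (patient_id, features, label) tuple. Whole patients,
    not individual samples, are assigned to the training or test set.
    """
    by_patient = defaultdict(list)
    for sample in samples:
        by_patient[sample[0]].append(sample)

    # Shuffle patient IDs reproducibly, then hold out a fraction of patients.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)
    n_test = max(1, round(len(patient_ids) * test_fraction))

    test = [s for pid in patient_ids[:n_test] for s in by_patient[pid]]
    train = [s for pid in patient_ids[n_test:] for s in by_patient[pid]]
    return train, test


# Invented toy data: 8 patients with 3 samples each.
samples = [(f"P{p}", [p, i], p % 2) for p in range(8) for i in range(3)]
train, test = split_by_patient(samples)

train_ids = {s[0] for s in train}
test_ids = {s[0] for s in test}
print(len(train), len(test), train_ids & test_ids)
```

Steps that learn from the data, such as oversampling, feature selection, and augmentation, would then be fitted on `train` only, so that no information from held-out patients reaches the model before evaluation.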
|