1
Kim J, Cai ZR, Chen ML, Onyeka S, Ko JM, Linos E. Telehealth Utilization and Associations in the United States During the Third Year of the COVID-19 Pandemic: Population-Based Survey Study in 2022. JMIR Public Health Surveill 2024; 10:e51279. [PMID: 38669075 PMCID: PMC11087857 DOI: 10.2196/51279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Revised: 12/28/2023] [Accepted: 03/05/2024] [Indexed: 04/28/2024] Open
Abstract
BACKGROUND The COVID-19 pandemic rapidly changed the landscape of clinical practice in the United States; telehealth became an essential mode of health care delivery, yet many components of telehealth use remain unknown years after the disease's emergence. OBJECTIVE We aim to comprehensively assess telehealth use and its associated factors in the United States. METHODS This cross-sectional study used a nationally representative survey (Health Information National Trends Survey) administered to US adults (≥18 years) from March 2022 through November 2022. To assess telehealth adoption, perceptions of telehealth, satisfaction with telehealth, and the telehealth care purpose, we conducted weighted descriptive analyses. To identify the subpopulations with low adoption of telehealth, we developed a weighted multivariable logistic regression model. RESULTS Among a total of 6252 survey participants, 39.3% (2517/6252) reported telehealth use in the past 12 months (video: 1110/6252, 17.8%; audio: 876/6252, 11.6%). The most prominent reason for not using telehealth was that providers did not offer this option (2200/3529, 63%). The most common reason for respondents not using offered telehealth services was a preference for in-person care (527/578, 84.4%). Primary motivations to use telehealth were providers' recommendations (1716/2517, 72.7%) and convenience (1516/2517, 65.6%), mainly for acute minor illness (600/2397, 29.7%) and chronic condition management (583/2397, 21.4%), yet care purposes differed by age, race/ethnicity, and income. The satisfaction rate was predominantly high, with no technical problems (1829/2517, 80.5%), comparable care quality to that of in-person care (1779/2517, 75%), and no privacy concerns (1958/2517, 83.7%). Younger individuals (odds ratios [ORs] 1.48-2.23; 18-64 years vs ≥75 years), women (OR 1.33, 95% CI 1.09-1.61), Hispanic individuals (OR 1.37, 95% CI 1.05-1.80; vs non-Hispanic White), those with more education (OR 1.72, 95% CI 1.03-2.87; at least a college graduate vs less than high school), unemployed individuals (OR 1.25, 95% CI 1.02-1.54), insured individuals (OR 1.83, 95% CI 1.25-2.69), or those with poor general health status (OR 1.66, 95% CI 1.30-2.13) had higher odds of using telehealth. CONCLUSIONS To the best of our knowledge, this is among the first studies to examine patient factors around telehealth use, including motivations to use, perceptions of, satisfaction with, and care purpose of telehealth, as well as sociodemographic factors associated with telehealth adoption using a nationally representative survey. The wide array of descriptive findings and identified associations will help providers and health systems understand the factors that drive patients toward or away from telehealth visits as the technology becomes more routinely available across the United States, providing future directions for telehealth use and telehealth research.
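Illustrative note (not part of the cited abstract): the odds ratios above come from a logistic regression, where an odds ratio is the exponential of a coefficient and a Wald-type 95% interval is symmetric on the log scale. The sketch below shows that relationship under stated assumptions: the function names are hypothetical, z = 1.96 is assumed, and the survey weighting used in the study is ignored. The only figures taken from the abstract are OR 1.33 (95% CI 1.09-1.61) for women.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error into
    an odds ratio with a Wald confidence interval."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(odds_ratio, lower, upper, z=1.96):
    """Recover an approximate coefficient and standard error from a
    reported odds ratio and its interval (assumes a Wald interval)."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported estimate for women: OR 1.33, 95% CI 1.09-1.61.
beta, se = beta_se_from_ci(1.33, 1.09, 1.61)
or_, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f}, se={se:.3f}, OR={or_:.2f} ({lower:.2f}-{upper:.2f})")
```

The round trip reproduces the reported interval only approximately, because the published bounds are rounded and the study's interval comes from a weighted model.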
Affiliation(s)
- Jiyeong Kim
  - Stanford Center for Digital Health, School of Medicine, Stanford University, Stanford, CA, United States
- Zhuo Ran Cai
  - Stanford Center for Digital Health, School of Medicine, Stanford University, Stanford, CA, United States
- Michael L Chen
  - Stanford Center for Digital Health, School of Medicine, Stanford University, Stanford, CA, United States
- Sonia Onyeka
  - Stanford Center for Digital Health, School of Medicine, Stanford University, Stanford, CA, United States
- Justin M Ko
  - Stanford Center for Digital Health, School of Medicine, Stanford University, Stanford, CA, United States
- Eleni Linos
  - Stanford Center for Digital Health, School of Medicine, Stanford University, Stanford, CA, United States
2
Balaban N, Mohyuddin GR, Kashi A, Massarweh A, Markel G, Bomze D, Goldstein DA, Meirson T. Projecting complete redaction of clinical trial protocols (RAPTURE): redacted cross sectional study. BMJ 2023; 383:e077329. [PMID: 38097263 PMCID: PMC10719744 DOI: 10.1136/bmj-2023-077329] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 11/07/2023] [Indexed: 12/18/2023]
Abstract
OBJECTIVES To characterise redactions in clinical trials and estimate a time when all protocols are fully removed (RAPTURE). DESIGN Redacted cross sectional study. SETTING Published phase 3 randomised controlled trials from 1 January 2010 to ██████████████. PARTICIPANTS New England Journal of Medicine, ██████████, and Journal of the American Medical Association. MAIN OUTCOME MEASURES █████ ████████ ██████████████ ██████ ██████████ ████████ ████████ ██████████ ███████████ ████████████ ████████████ ████████████████████████ ██████████████████ RESULTS: ████████████████████ met the inclusion criteria, with 268 (56.7%) research protocols available and accessible. The rate of redactions in protocols has increased from 0 in 2010 to 60.8% in 2021 (P<0.001). The degree of data redaction has also increased, with the average cumulative redactions among industry funded trials rising from 0 in 2010 to 3.5 pages in 2021 (P<0.001). Modelling predicts that RAPTURE is expected to occur between 2073 and 2136. Redactions featured predominantly in ████████ sponsored trials and mostly occurred in the statistical design. CONCLUSIONS This study highlights the rise in protocol redactions and predicts that, ██████████████████████████████████████████ will be entirely redacted between 2073 and 2136. A legitimate rationale for the redactions could ███ be found. A multipronged strategy against protocol redactions is required to maintain the integrity of science. AVAILABILITY This paper is partially redacted, but for the sake of ███████████, a version without any redactions can be found in the supplementary material.
Affiliation(s)
- Nir Balaban
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Ghulam Rehman Mohyuddin
  - Division of Hematology, Huntsman Cancer Institute, University of Utah, Salt Lake City, UT, USA
- Adi Kashi
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Amir Massarweh
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
- Gal Markel
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
  - Samueli Integrative Cancer Pioneering Institute, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
- David Bomze
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Daniel A Goldstein
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
- Tomer Meirson
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Davidoff Cancer Center, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
  - Samueli Integrative Cancer Pioneering Institute, Rabin Medical Center-Beilinson Hospital, Petah Tikva, Israel
3
Montero O, Hedeland M, Balgoma D. Trials and tribulations of statistical significance in biochemistry and omics. Trends Biochem Sci 2023; 48:503-512. [PMID: 36842858 DOI: 10.1016/j.tibs.2023.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 01/12/2023] [Accepted: 01/31/2023] [Indexed: 02/26/2023]
Abstract
Over recent years many statisticians and researchers have highlighted that statistical inference would benefit from a better use and understanding of hypothesis testing, p-values, and statistical significance. We highlight three recommendations in the context of biochemical sciences. First recommendation: to improve the biological interpretation of biochemical data, do not use p-values (or similar test statistics) as thresholded values to select biomolecules. Second recommendation: to improve comparison among studies and to achieve robust knowledge, perform complete reporting of data. Third recommendation: statistical analyses should be reported completely with exact numbers (not as asterisks or inequalities). Owing to the high number of variables, a better use of statistics is of special importance in omic studies.
Affiliation(s)
- Olimpio Montero
  - Unidad de Excelencia, Instituto de Biología y Genética Molecular (IBGM), Universidad de Valladolid, Consejo Superior de Investigaciones Científicas (CSIC), Valladolid, Spain
- Mikael Hedeland
  - Analytical Pharmaceutical Chemistry, Department of Medicinal Chemistry, Uppsala University, Sweden
- David Balgoma
  - Unidad de Excelencia, Instituto de Biología y Genética Molecular (IBGM), Universidad de Valladolid, Consejo Superior de Investigaciones Científicas (CSIC), Valladolid, Spain
  - Analytical Pharmaceutical Chemistry, Department of Medicinal Chemistry, Uppsala University, Sweden
4
Ciubotariu II, Bosch G. Teaching students to R3eason, not merely to solve problem sets: The role of philosophy and visual data communication in accessible data science education. PLoS Comput Biol 2023; 19:e1011160. [PMID: 37289659 PMCID: PMC10249832 DOI: 10.1371/journal.pcbi.1011160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023] Open
Abstract
Much guidance on statistical training in STEM fields has been focused largely on the undergraduate cohort, with graduate education often being absent from the equation. Training in quantitative methods and reasoning is critical for graduate students in biomedical and science programs to foster reproducible and responsible research practices. We argue that graduate student education should center more on fundamental reasoning and integration skills than on listing one statistical test method after another without conveying the bigger picture or the critical argumentation skills that will enable students to improve research integrity through rigorous practice. Herein, we describe the approach we take in a quantitative reasoning course in the R3 program at the Johns Hopkins Bloomberg School of Public Health, with an error-focused lens, based on visualization and communication competencies. We take this perspective, which stems from the discussed causes of irreproducibility, and apply it to the many aspects of good statistical practice in science, ranging from experimental design to data collection and analysis to the conclusions drawn from the data. We also provide tips and guidelines for the implementation and adaptation of our course material to various graduate biomedical and STEM science programs.
Affiliation(s)
- Ilinca I. Ciubotariu
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, United States of America
  - Department of Molecular Microbiology and Immunology, Johns Hopkins Bloomberg School of Public Health, R Center for Innovation in Science Education, Baltimore, Maryland, United States of America
- Gundula Bosch
  - Department of Molecular Microbiology and Immunology, Johns Hopkins Bloomberg School of Public Health, R Center for Innovation in Science Education, Baltimore, Maryland, United States of America
5
Palella M, Giustolisi FM, Modica Fiascaro A, Fichera M, Palmieri A, Cannarella R, Calogero AE, Ferrante M, Fiore M. Risk and Prognosis of Thyroid Cancer in Patients with Graves' Disease: An Umbrella Review. Cancers (Basel) 2023; 15:2724. [PMID: 37345061 DOI: 10.3390/cancers15102724] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/07/2023] [Revised: 05/08/2023] [Accepted: 05/10/2023] [Indexed: 06/23/2023] Open
Abstract
Graves' disease (GD) is an autoimmune disease considered the most common cause of hyperthyroidism. Some studies have investigated its relationship with the risk and prognosis of developing thyroid cancer. Considering that there is no consensus on the relationship between GD and thyroid cancer risk, this umbrella review aimed to summarize the epidemiologic evidence and evaluate its strength and validity on the associations of GD with thyroid cancer risk and its prognosis. This umbrella review was performed using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We systematically searched PubMed and Scopus from January 2012 to December 2022. The strength of the epidemiological evidence was graded as high, moderate, or weak by the Measurement Tool to Assess Systematic Reviews (AMSTAR-2). "Strong" evidence was found for the risk of thyroid cancer in GD patients with thyroid nodular disease (OR: 5.30; 95% CI 2.43-12) and for the risk of mortality from thyroid cancer in these patients (OR 2.93, 95% CI 1.17-7.37, p = 0.02), particularly in Europe (OR 4.89; 95% CI 1.52-16). The results of this umbrella review should be interpreted with caution; as the evidence comes mostly from retrospective studies, potential concerns are selection and recall bias, and whether the empirically observed association reflects a causal relationship remains an open question.
Affiliation(s)
- Marco Palella
  - Department of Medical, Medical Specialization School in Hygiene and Preventive Medicine, Surgical Sciences and Advanced Technologies "G.F. Ingrassia", University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
- Francesca Maria Giustolisi
  - Department of Medical, Medical Specialization School in Hygiene and Preventive Medicine, Surgical Sciences and Advanced Technologies "G.F. Ingrassia", University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
- Adriana Modica Fiascaro
  - Department of Medical, Medical Specialization School in Hygiene and Preventive Medicine, Surgical Sciences and Advanced Technologies "G.F. Ingrassia", University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
- Martina Fichera
  - Department of Medical, Medical Specialization School in Hygiene and Preventive Medicine, Surgical Sciences and Advanced Technologies "G.F. Ingrassia", University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
- Antonella Palmieri
  - Department of Medical, Medical Specialization School in Hygiene and Preventive Medicine, Surgical Sciences and Advanced Technologies "G.F. Ingrassia", University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
- Rossella Cannarella
  - Department of Clinical and Experimental Medicine, University of Catania, 95123 Catania, Italy
  - Glickman Urological & Kidney Institute, Cleveland Clinic Foundation, Cleveland, OH 44195, USA
- Aldo E Calogero
  - Department of Clinical and Experimental Medicine, University of Catania, 95123 Catania, Italy
- Margherita Ferrante
  - Department of Medical, Surgical and Advanced Technologies "G.F. Ingrassia", University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
- Maria Fiore
  - Department of Medical, Surgical and Advanced Technologies "G.F. Ingrassia", University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
6
On p-Values and Statistical Significance. J Clin Med 2023; 12:jcm12030900. [PMID: 36769547 PMCID: PMC9917591 DOI: 10.3390/jcm12030900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 01/20/2023] [Indexed: 01/24/2023] Open
Abstract
At the beginning of our research training, we learned about hypothesis testing, p-values, and statistical inference [...].
7
Kim HE, Wallace J, Sohn W. Factors Affecting Masticatory Performance of Older Adults Are Sex-Dependent: A Cross-Sectional Study. International Journal of Environmental Research and Public Health 2022; 19:15742. [PMID: 36497815 PMCID: PMC9735781 DOI: 10.3390/ijerph192315742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 11/22/2022] [Accepted: 11/24/2022] [Indexed: 06/17/2023]
Abstract
This cross-sectional study assessed the oral and physical factors contributing to improvement of the masticatory performance of community-dwelling older adults in South Korea. We enrolled 84 healthy older adults (38 men, 46 women; age, 71.40 ± 5.15 years) and assessed their skeletal muscle mass index (SMI), functional tooth units (FTUs), and mixing ability index (MAI). Associations between variables were analyzed using Spearman's correlation coefficient, and the effects of SMI and FTUs on the MAI were evaluated through multiple linear regression. FTUs were positively associated with the MAI in men and women (r = 0.339, p = 0.038 and r = 0.461, p = 0.001, respectively). SMI and FTUs were moderately associated in men (r = 0.459, p = 0.004). In men, the MAI increased by approximately 4.4 units for each additional FTU (B = 4.442, p = 0.037); however, after the SMI was added, this effect was no longer significant. In women, the MAI increased by about 6.7 units with each additional FTU (B = 6.685, p = 0.004). FTUs had a significant effect on the MAI only in women with low muscle mass. While there was no significant effect of the SMI on the MAI, its influence should not be overlooked.
Affiliation(s)
- Hee-Eun Kim
  - Department of Dental Hygiene, Gachon University College of Health Science, Incheon 21936, Republic of Korea
- Janet Wallace
  - Faculty of Medicine and Health, The University of Sydney School of Dentistry, Sydney, NSW 2010, Australia
- Woosung Sohn
  - Faculty of Medicine and Health, The University of Sydney School of Dentistry, Sydney, NSW 2010, Australia
8
Balbim GM, Erickson KI, Ajilore OA, Aguiñaga S, Bustamante EE, Lamar M, Marquez DX. Association of physical activity levels and brain white matter in older Latino adults. Ethnicity & Health 2022; 27:1599-1615. [PMID: 33853442 PMCID: PMC8514578 DOI: 10.1080/13557858.2021.1913484] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/31/2020] [Accepted: 04/01/2021] [Indexed: 06/12/2023]
Abstract
OBJECTIVE Investigate the associations between self-reported physical activity (PA) engagement and white matter (WM) health (i.e. volume, integrity, and hyperintensities) in older Latinos. DESIGN Cross-sectional study with community-dwelling older adults from predominantly Latino neighborhoods. Participants: Thirty-four cognitively healthy older Latinos from two different cohorts. Measurements: Participants self-reported demographic information and PA engagement [Community Healthy Activities Model Program for Seniors (CHAMPS) Physical Activity Questionnaire for Older Adults] and underwent magnetic resonance imaging (MRI). We used high-resolution three-dimensional T1- and T2-FLAIR weighted images and diffusion tensor imaging acquired via 3 T MRI. We performed a series of hierarchical linear regression models with the addition of relevant covariates to examine the associations between self-reported PA levels and WM volume, integrity, and hyperintensities (separately). We adjusted p-values using the Benjamini-Hochberg false discovery rate procedure. RESULTS Higher reported levels of leisure-time moderate-to-vigorous PA were significantly associated with higher WM volume of the posterior cingulate (β = 0.220, SE = 0.125, 95% CI 0.009-0.431, p = 0.047) and isthmus cingulate (β = 0.212, SE = 0.110, 95% CI 0.001-0.443, p = 0.044) after controlling for intracranial volume. Higher levels of total PA were significantly associated with higher overall WM volume of these same regions (posterior cingulate: β = 0.220, SE = 0.125, CI 0.024-0.421, p = 0.046; isthmus cingulate: β = 0.220, SE = 0.125, 95% CI 0.003-0.393; p = 0.040). Significant p-values did not withstand the Benjamini-Hochberg adjustment. PA was not significantly associated with WM integrity or WM hyperintensities.
CONCLUSION Higher levels of PA, particularly higher leisure-time moderate-to-vigorous PA, might be associated with greater WM volume in select white matter regions key to brain network integration for physical and cognitive functioning in older Latinos. More research is needed to further confirm these associations.
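Illustrative note (not part of the cited abstract): the Benjamini-Hochberg adjustment mentioned above can be sketched in a few lines. This is a minimal pure-Python version of the standard step-up procedure, not the authors' code. The first four p-values are the ones reported in the abstract; the remaining four are hypothetical placeholders standing in for the other tests in the family, whose values the abstract does not give.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


reported = [0.047, 0.044, 0.046, 0.040]   # from the abstract
hypothetical = [0.30, 0.52, 0.71, 0.88]   # assumed, for illustration only
adjusted = benjamini_hochberg(reported + hypothetical)
for raw, adj in zip(reported + hypothetical, adjusted):
    print(f"raw p = {raw:.3f}, BH-adjusted p = {adj:.3f}")
```

With this assumed family of eight tests, the four nominally significant p-values no longer fall below 0.05 after adjustment, which is the pattern the abstract describes. The actual adjusted values depend on the full set of tests the authors ran.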
Affiliation(s)
- Guilherme M Balbim
  - Department of Kinesiology and Nutrition, University of Illinois at Chicago, Chicago, United States
- Kirk I Erickson
  - Department of Psychology, University of Pittsburgh, Pittsburgh, United States
- Olusola A Ajilore
  - Department of Psychiatry, University of Illinois at Chicago, Chicago, United States
- Susan Aguiñaga
  - Department of Kinesiology and Community Health, University of Illinois at Urbana-Champaign, Champaign, United States
- Eduardo E Bustamante
  - Department of Kinesiology and Nutrition, University of Illinois at Chicago, Chicago, Illinois, United States
- Melissa Lamar
  - Division of Behavioral Sciences, Rush University, Chicago, Illinois, United States
- David X Marquez
  - Department of Kinesiology and Nutrition, University of Illinois at Chicago, Chicago, United States
9
Respiratory Subsets in Patients with Moderate to Severe Acute Respiratory Distress Syndrome for Early Prediction of Death. J Clin Med 2022; 11:jcm11195724. [PMID: 36233592 PMCID: PMC9570540 DOI: 10.3390/jcm11195724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 09/19/2022] [Accepted: 09/24/2022] [Indexed: 12/16/2022] Open
Abstract
Introduction: In patients with acute respiratory distress syndrome (ARDS), the PaO2/FiO2 ratio at the time of ARDS diagnosis is weakly associated with mortality. We hypothesized that setting a PaO2/FiO2 threshold of 150 mm Hg at 24 h from moderate/severe ARDS diagnosis would improve predictions of death in the intensive care unit (ICU). Methods: We conducted an ancillary study in 1303 patients with moderate to severe ARDS managed with lung-protective ventilation enrolled consecutively in four prospective multicenter cohorts in a network of ICUs. The first three cohorts were pooled (n = 1000) as a testing cohort; the fourth cohort (n = 303) served as a confirmatory cohort. Based on the thresholds for PaO2/FiO2 (150 mm Hg) and positive end-expiratory pressure (PEEP) (10 cm H2O), the patients were classified into four possible subsets at baseline and at 24 h using a standardized PEEP-FiO2 approach: (I) PaO2/FiO2 ≥ 150 at PEEP < 10, (II) PaO2/FiO2 ≥ 150 at PEEP ≥ 10, (III) PaO2/FiO2 < 150 at PEEP < 10, and (IV) PaO2/FiO2 < 150 at PEEP ≥ 10. Primary outcome was death in the ICU. Results: ICU mortalities were similar in the testing and confirmatory cohorts (375/1000, 37.5% vs. 112/303, 37.0%, respectively). At baseline, most patients from the testing cohort (n = 792/1000, 79.2%) had a PaO2/FiO2 < 150, with similar mortality among the four subsets (p = 0.23). When assessed at 24 h, ICU mortality increased with an advance in the subset: 17.9%, 22.8%, 40.0%, and 49.3% (p < 0.0001). The findings were replicated in the confirmatory cohort (p < 0.0001). However, independent of the PEEP levels, patients with PaO2/FiO2 < 150 at 24 h followed a distinct 30-day ICU survival compared with patients with PaO2/FiO2 ≥ 150 (hazard ratio 2.8, 95% CI 2.2-3.5, p < 0.0001).
Conclusions: Subsets based on PaO2/FiO2 thresholds of 150 mm Hg assessed after 24 h of moderate/severe ARDS diagnosis are clinically relevant for establishing prognosis, and are helpful for selecting adjunctive therapies for hypoxemia and for enrolling patients into therapeutic trials.
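Illustrative note (not part of the cited abstract): the four subsets defined in the Methods can be expressed as a small classification rule. The sketch below encodes only the thresholds stated in the abstract (PaO2/FiO2 of 150 mm Hg, PEEP of 10 cm H2O); the function and parameter names are hypothetical, and the standardized PEEP-FiO2 settings under which the measurements are taken are outside its scope.

```python
def respiratory_subset(pao2_fio2, peep, pf_threshold=150.0, peep_threshold=10.0):
    """Assign subset I-IV from PaO2/FiO2 (mm Hg) and PEEP (cm H2O)."""
    high_pf = pao2_fio2 >= pf_threshold
    high_peep = peep >= peep_threshold
    if high_pf and not high_peep:
        return "I"    # PaO2/FiO2 >= 150 at PEEP < 10
    if high_pf and high_peep:
        return "II"   # PaO2/FiO2 >= 150 at PEEP >= 10
    if not high_peep:
        return "III"  # PaO2/FiO2 < 150 at PEEP < 10
    return "IV"       # PaO2/FiO2 < 150 at PEEP >= 10


# Hypothetical patients, for illustration only.
for pf, peep in [(180, 8), (160, 12), (120, 5), (100, 14)]:
    print(pf, peep, respiratory_subset(pf, peep))
```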
10
Yu T, Lin L, Furuya-Kanamori L, Xu C. Synthesizing evidence from the earliest studies to support decision-making: To what extent could the evidence be reliable? Res Synth Methods 2022; 13:632-644. [PMID: 35799334 PMCID: PMC9585992 DOI: 10.1002/jrsm.1587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2021] [Revised: 05/31/2022] [Accepted: 07/03/2022] [Indexed: 02/05/2023]
Abstract
In evidence-based practice, new topics generally only have a few studies available for synthesis. As a result, the evidence from such meta-analyses raises substantial concerns. We investigated the robustness of the evidence from these earliest studies. Real-world data from the Cochrane Database of Systematic Reviews (CDSR) were collected. We emulated meta-analyses with the earliest 1 to 10 studies through cumulative meta-analysis from eligible meta-analyses. The magnitude and the direction of meta-analyses with the earliest few studies were compared to the full meta-analyses. From the CDSR, we identified 20,227 meta-analyses of binary outcomes and 7683 meta-analyses of continuous outcomes. Under the tolerable difference of 20% on the magnitude of the effects, the convergence proportion ranged from 24.24% (earliest 1 study) to 77.45% (earliest 10 studies) for meta-analyses of few earliest studies with binary outcomes. For meta-analyses of continuous outcomes, the convergence proportion ranged from 13.86% to 56.52%. In terms of the direction of the effects, even when only three studies were available at the earliest stage, the majority had the same direction as full meta-analyses; only 19% for binary outcomes and 12% for continuous outcomes changed the direction as further evidence accumulated. Synthesizing evidence from the earliest studies is feasible to support urgent decision-making, and in most cases, the decisions would be reasonable. Considering the potential uncertainties, it is essential to evaluate the confidence of the evidence of these meta-analyses and update the evidence when necessary.
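Illustrative note (not part of the cited abstract): a cumulative meta-analysis pools the studies in chronological order and records the pooled estimate after each one. The sketch below uses fixed-effect inverse-variance weighting, which is an assumption made for brevity; the abstract does not state the pooling model, and the 20% convergence check shown here is one plausible reading of "tolerable difference of 20% on the magnitude". All names and data are hypothetical.

```python
def cumulative_pooled_effects(effects, variances):
    """Inverse-variance pooled estimate after each successive study."""
    pooled = []
    sum_w = 0.0
    sum_wy = 0.0
    for y, v in zip(effects, variances):
        w = 1.0 / v          # weight of the newly added study
        sum_w += w
        sum_wy += w * y
        pooled.append(sum_wy / sum_w)
    return pooled


def within_tolerance(early, full, tolerance=0.20):
    """True if the early estimate is within `tolerance` of the full one."""
    return abs(early - full) <= tolerance * abs(full)


def same_direction(early, full):
    """True if both estimates fall on the same side of zero."""
    return (early > 0) == (full > 0)


# Hypothetical log odds ratios and variances, ordered by publication date.
effects = [-0.60, -0.20, -0.35, -0.30]
variances = [0.09, 0.04, 0.05, 0.02]
pooled = cumulative_pooled_effects(effects, variances)
full = pooled[-1]
for k, est in enumerate(pooled, start=1):
    print(k, round(est, 3), within_tolerance(est, full), same_direction(est, full))
```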
Affiliation(s)
- Tianqi Yu
  - Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, China
- Lifeng Lin
  - Department of Statistics, Florida State University, Tallahassee, Florida, USA
- Luis Furuya-Kanamori
  - UQ Centre for Clinical Research, Faculty of Medicine, University of Queensland, Herston, Australia
- Chang Xu
  - Ministry of Education Key Laboratory for Population Health Across-life Cycle & Anhui Provincial Key Laboratory of Population Health and Aristogenics & School of Public Health, Anhui Medical University, Anhui, China
11
Mayo DG, Hand D. Statistical significance and its critics: practicing damaging science, or damaging scientific practice? Synthese 2022; 200:220. [PMID: 35578622 PMCID: PMC9096069 DOI: 10.1007/s11229-022-03692-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/24/2020] [Accepted: 04/05/2022] [Indexed: 05/27/2023]
Abstract
While the common procedure of statistical significance testing and its accompanying concept of p-values have long been surrounded by controversy, renewed concern has been triggered by the replication crisis in science. Many blame statistical significance tests themselves, and some regard them as sufficiently damaging to scientific practice as to warrant being abandoned. We take a contrary position, arguing that the central criticisms arise from misunderstanding and misusing the statistical tools, and that in fact the purported remedies themselves risk damaging science. We argue that banning the use of p-value thresholds in interpreting data does not diminish but rather exacerbates data-dredging and biasing selection effects. If an account cannot specify outcomes that will not be allowed to count as evidence for a claim (if all thresholds are abandoned), then there is no test of that claim. The contributions of this paper are: To explain the rival statistical philosophies underlying the ongoing controversy; To elucidate and reinterpret statistical significance tests, and explain how this reinterpretation ameliorates common misuses and misinterpretations; To argue why recent recommendations to replace, abandon, or retire statistical significance undermine a central function of statistics in science: to test whether observed patterns in the data are genuine or due to background variability.
12
Bonkhoff AK, Grefkes C. Precision medicine in stroke: towards personalized outcome predictions using artificial intelligence. Brain 2022; 145:457-475. [PMID: 34918041 PMCID: PMC9014757 DOI: 10.1093/brain/awab439] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 03/13/2021] [Revised: 11/02/2021] [Accepted: 11/21/2021] [Indexed: 11/16/2022] Open
Abstract
Stroke ranks among the leading causes of morbidity and mortality worldwide. New and continuously improving treatment options such as thrombolysis and thrombectomy have revolutionized acute stroke treatment in recent years. Following modern rhythms, the next revolution might well be the strategic use of the steadily increasing amounts of patient-related data for generating models enabling individualized outcome predictions. Milestones have already been achieved in several health care domains, as big data and artificial intelligence have entered everyday life. The aim of this review is to synoptically illustrate and discuss how artificial intelligence approaches may help to compute single-patient predictions in stroke outcome research in the acute, subacute and chronic stage. We will present approaches considering demographic, clinical and electrophysiological data, as well as data originating from various imaging modalities and combinations thereof. We will outline their advantages, disadvantages, their potential pitfalls and the promises they hold with a special focus on a clinical audience. Throughout the review we will highlight methodological aspects of novel machine-learning approaches as they are particularly crucial to realize precision medicine. We will finally provide an outlook on how artificial intelligence approaches might contribute to enhancing favourable outcomes after stroke.
Affiliation(s)
- Anna K Bonkhoff
  - J. Philip Kistler Stroke Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Christian Grefkes
  - Cognitive Neuroscience, Institute of Neuroscience and Medicine (INM-3), Research Centre Juelich, Juelich, Germany
  - Department of Neurology, University Hospital Cologne, Cologne, Germany
  - Medical Faculty, University of Cologne, Cologne, Germany
13
Jankowski S, Boutron I, Clarke M. Influence of the statistical significance of results and spin on readers' interpretation of the results in an abstract for a hypothetical clinical trial: a randomised trial. BMJ Open 2022; 12:e056503. [PMID: 35396295 PMCID: PMC8996040 DOI: 10.1136/bmjopen-2021-056503] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/27/2021] [Accepted: 02/28/2022] [Indexed: 11/03/2022] Open
Abstract
OBJECTIVES To assess the impact on readers' interpretation of the results reported in an abstract for a hypothetical clinical trial with (1) a statistically significant result (SSR), (2) spin, (3) both an SSR and spin compared with (4) no spin and no SSR. PARTICIPANTS Health students and professionals from universities and health institutions in France and the UK. INTERVENTIONS Participants completed an online questionnaire using Likert scales and free text, after reading one of the four versions of an abstract about a hypothetical randomised trial evaluating 'Naranex' and 'Bulofil' (two hypothetical drugs) for chronic low back pain. The abstracts differed in (1) reported result of 'mean difference of 1.31 points (95% CI 0.08 to 2.54, p= 0.04)' or 'mean difference of 1.31 points (95% CI -0.08 to 2.70, p= 0.06)' and (2) presence or absence of spin. The effect size for the trial's primary outcome (pain disability score) was the same in each abstract, slightly in favour of Naranex. PRIMARY OUTCOME The reader's interpretation of the trial's results, based on their answer (1, disagree; 4, neutral; 7, agree) to the following statement: 'About the main findings of the study, what is your opinion about the following statement: 'Naranex is better than Bulofil'?' RESULTS Two hundred and ninety-seven of the 404 people randomised to receive one of the four abstracts completed the study. Respondents were more likely to favour Naranex when the abstract reported an SSR without spin, a statistically significant result with spin, a non-statistically significant result with spin, compared with when it reported a non-SSR without spin. CONCLUSION Statistical significance appears to have influenced readers' perception whatever the level of spin, while spin influenced readers' perception when the results were not statistically significant but did not appear to have an impact when results were statistically significant.
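The two hypothetical results quoted in the abstract share the same point estimate (1.31) and differ only in interval width. As a rough illustration of how closely the two versions sit on either side of the 0.05 threshold, the sketch below recovers an approximate two-sided p value from each 95% CI. It assumes a normal sampling distribution and a symmetric interval; the function name `p_from_ci` is hypothetical and is not taken from the study.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided p value from a point estimate and its 95% CI.

    Assumes a normal sampling distribution and a symmetric interval, so the
    standard error is the CI width divided by 2 * 1.96.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se, z, p

# The two intervals quoted in the abstract, both centred on 1.31.
for lower, upper in [(0.08, 2.54), (-0.08, 2.70)]:
    se, z, p = p_from_ci(1.31, lower, upper)
    print(f"95% CI {lower} to {upper}: SE={se:.3f}, z={z:.2f}, p={p:.2f}")
```

Under these assumptions the recovered values round to the p values given in the abstract (0.04 and 0.06), which shows that the "significant" and "non-significant" versions describe nearly the same evidence.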
Affiliation(s)
- Sofyan Jankowski
  - Université Paris Cité, INSERM, INRAE, CNAM, Centre for Research in Epidemiology and Statistics (CRESS), F-75004, Paris, France
  - Centre for Public Health, Queen's University Belfast, Belfast, UK
- Isabelle Boutron
  - Université Paris Cité, INSERM, INRAE, CNAM, Centre for Research in Epidemiology and Statistics (CRESS), F-75004, Paris, France
- Mike Clarke
  - Centre for Public Health, Queen's University Belfast, Belfast, UK
|
14
|
Science with or without statistics: Discover-generalize-replicate? Discover-replicate-generalize? Behav Brain Sci 2022; 45:e23. [PMID: 35139936 DOI: 10.1017/s0140525x21000054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Overstated generalizability (external validity) is common in research. It may coexist with inflation of the magnitude and statistical support for effects and dismissal of internal validity problems. Generalizability may be secured before attempting replication of proposed discoveries or replication may precede efforts to generalize. These opposite approaches may decrease or increase, respectively, the use of inferential statistics with advantages and disadvantages.
|
15
|
The Presentation of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome Is Not Influenced by the Presence or Absence of Joint Hypermobility. J Pediatr 2022; 240:186-191.e2. [PMID: 34537220 DOI: 10.1016/j.jpeds.2021.09.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/08/2021] [Revised: 09/06/2021] [Accepted: 09/08/2021] [Indexed: 11/23/2022]
Abstract
OBJECTIVE To examine demographic and clinical characteristics of individuals with myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) with and without joint hypermobility. We hypothesized that patients who were joint hypermobility-positive would have an earlier onset of ME/CFS symptoms as well as increased severity, a greater number of comorbid conditions, and a lower health-related quality of life. STUDY DESIGN From an observational cohort study of 55 individuals meeting the Fukuda criteria for ME/CFS, we compared groups using a Beighton score cutoff of 4 or higher to indicate joint hypermobility. Chart data were collected to examine the age and type of onset of ME/CFS and the presence of comorbid conditions. The impact on quality of life was assessed through questionnaires that included the Peds QL, Functional Disability Inventory, Peds QL Multidimensional Fatigue Scale, and Anxiety Subscale of the Symptom Checklist 90. RESULTS There was no significant difference between groups in mean ± SD age at onset of ME/CFS (13.3 ± 3.3 years vs 13.3 ± 2.3 years; P = .92), sex, frequency, and severity of ME/CFS symptoms, orthostatic intolerance symptoms, or comorbid conditions. There was no significant difference between the groups in measures of health-related quality of life using a Beighton score cutoff of 4 or a cutoff of 5 to define joint hypermobility. CONCLUSIONS Despite being a risk factor for the development of ME/CFS, joint hypermobility as defined in this study was not associated with other clinical characteristics of the illness.
|
16
|
Pain in Patients with Post Paralytic Hemifacial Spasm: Before, during and after Botulinum Toxin Injections. Toxins (Basel) 2021; 14:toxins14010020. [PMID: 35050997 PMCID: PMC8779244 DOI: 10.3390/toxins14010020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 11/24/2021] [Accepted: 12/22/2021] [Indexed: 12/02/2022] Open
Abstract
It is well established that botulinum toxin (BT) injections improve quality of life in patients with postparalytic hemifacial spasm. Nevertheless, injection-related pain and contracture-related pain have not yet been studied. The primary objective of our study was to evaluate injection-related pain in patients with facial palsy sequelae, and to compare the standard technique (syringe) with the Juvapen device. The secondary objective was to evaluate the improvement of contracture-related pain one month after BT injection. Methods: We conducted an observational, prospective, monocentric study based on 60 patients with facial palsy sequelae who received BT injections in our university ENT (ear, nose, and throat) department. There were 30 patients in the Juvapen group (J) and 30 in the standard technique group (ST). All patients completed Numerical Rating Scale (NRS) questionnaires immediately after the injections and one month later. Results: The average NRS score was 1.33/10 with Juvapen and 2.24/10 with the standard technique (p = 0.0058; Z = 2.75). In patients with contracture-related pain, the average NRS score was 3.53 before BT injection, and 0.41 one month after BT injection (p = 0.0001). Conclusions: Juvapen is a less painful injection technique than the standard one. BT reduces contracture-related pain one month after injection.
|
17
|
Stunt J, van Grootel L, Bouter L, Trafimow D, Hoekstra T, de Boer M. Why we habitually engage in null-hypothesis significance testing: A qualitative study. PLoS One 2021; 16:e0258330. [PMID: 34653185 PMCID: PMC8519469 DOI: 10.1371/journal.pone.0258330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2020] [Accepted: 09/24/2021] [Indexed: 11/28/2022] Open
Abstract
Background Null Hypothesis Significance Testing (NHST) is the most familiar statistical procedure for making inferences about population effects. Important problems associated with this method have been addressed and various alternatives that overcome these problems have been developed. Despite its many well-documented drawbacks, NHST remains the prevailing method for drawing conclusions from data. Reasons for this have been insufficiently investigated. Therefore, the aim of our study was to explore the perceived barriers and facilitators related to the use of NHST and alternative statistical procedures among relevant stakeholders in the scientific system. Methods Individual semi-structured interviews and focus groups were conducted with junior and senior researchers, lecturers in statistics, editors of scientific journals and program leaders of funding agencies. During the focus groups, important themes that emerged from the interviews were discussed. Data analysis was performed using the constant comparison method, allowing emerging (sub)themes to be fully explored. A theory substantiating the prevailing use of NHST was developed based on the main themes and subthemes we identified. Results Twenty-nine interviews and six focus groups were conducted. Several interrelated facilitators and barriers associated with the use of NHST and alternative statistical procedures were identified. These factors were subsumed under three main themes: the scientific climate, scientific duty, and reactivity. As a result of the factors, most participants feel dependent in their actions upon others, have become reactive, and await action and initiatives from others. This may explain why NHST is still the standard and ubiquitously used by almost everyone involved. Conclusion Our findings demonstrate how perceived barriers to shift away from NHST set a high threshold for actual behavioral change and create a circle of interdependency between stakeholders. 
By taking small steps it should be possible to decrease the scientific community’s strong dependence on NHST and p-values.
Affiliation(s)
- Jonah Stunt
  - Department of Health Sciences, Section of Methodology and Applied Statistics, Vrije Universiteit, Amsterdam, The Netherlands
  - Department of Radiation Oncology, Erasmus Medical Center, Rotterdam, The Netherlands
- Leonie van Grootel
  - Department of Health Sciences, Section of Methodology and Applied Statistics, Vrije Universiteit, Amsterdam, The Netherlands
  - Rathenau Institute, The Hague, The Netherlands
- Lex Bouter
  - Department of Philosophy, Vrije Universiteit, Amsterdam, The Netherlands
  - Department of Epidemiology and Data Science, Amsterdam University Medical Centers, Amsterdam, The Netherlands
- David Trafimow
  - Psychology Department, New Mexico State University, Las Cruces, New Mexico, United States of America
- Trynke Hoekstra
  - Department of Health Sciences, Section of Methodology and Applied Statistics, Vrije Universiteit, Amsterdam, The Netherlands
- Michiel de Boer
  - Department of Health Sciences, Section of Methodology and Applied Statistics, Vrije Universiteit, Amsterdam, The Netherlands
  - Department of General Practice and Elderly Care, University Medical Center Groningen, Groningen, The Netherlands
|
18
|
Tian S, Wang F, Zhang R, Chen G. Global Pattern of CD8+ T-Cell Infiltration and Exhaustion in Colorectal Cancer Predicts Cancer Immunotherapy Response. Front Pharmacol 2021; 12:715721. [PMID: 34594218 PMCID: PMC8477790 DOI: 10.3389/fphar.2021.715721] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/27/2021] [Accepted: 07/22/2021] [Indexed: 01/22/2023] Open
Abstract
Background: The MSI/MSS status does not fully explain cancer immunotherapy response in colorectal cancer. Thus, we developed a colorectal cancer-specific method that predicts cancer immunotherapy response. Methods: We used gene expression data of 454 samples (MSI = 131, MSI-L = 23, MSS = 284, and Unknown = 16) and developed the TMEPRE method, which models signatures of CD8+ T-cell infiltration and CD8+ T-cell exhaustion states in the tumor microenvironment of colorectal cancer. The TMEPRE model was validated on three RNAseq datasets of melanoma patients who received pembrolizumab or nivolumab and one RNAseq dataset of purified CD8+ T cells in different exhaustion states. Results: TMEPRE showed predictive power in three datasets of anti-PD1-treated patients (p = 0.056, 0.115, 0.003). The CD8+ T-cell exhaustion component of the TMEPRE model correlates with anti-PD1-responding progenitor exhausted CD8+ T cells in both tumor and viral infection (p = 0.048, 0.001). The global pattern of TMEPRE on 454 colorectal cancer samples indicated that 10.6% of MSS patients and 67.2% of MSI patients show biological characteristics that can potentially benefit from anti-PD1 treatment. Within MSI nonresponders, approximately 50% showed insufficient tumor-infiltrating CD8+ T cells and 50% showed terminal exhaustion of CD8+ T cells. These terminally exhausted CD8+ T cells coexisted with signatures of myeloid-derived suppressor cells in colorectal cancer. Conclusion: TMEPRE is a colorectal cancer-specific method. It captures characteristics of CD8+ T-cell infiltration and CD8+ T-cell exhaustion state and predicts cancer immunotherapy response. A subset of MSS patients could potentially benefit from anti-PD1 treatment. Anti-PD1-resistant MSI patients with insufficient infiltration of CD8+ T cells or terminal exhaustion of CD8+ T cells need different treatment strategies.
Affiliation(s)
- Sun Tian
  - Carbon Logic Biotech (HK) Limited, Hongkong, China
- Fulong Wang
  - State Key Laboratory of Oncology in South China, Department of Colorectal Surgery, Sun Yat-sen University Cancer Center, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
- Rongxin Zhang
  - State Key Laboratory of Oncology in South China, Department of Colorectal Surgery, Sun Yat-sen University Cancer Center, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
- Gong Chen
  - State Key Laboratory of Oncology in South China, Department of Colorectal Surgery, Sun Yat-sen University Cancer Center, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
|
19
|
Pocock SJ, Rossello X, Owen R, Collier TJ, Stone GW, Rockhold FW. Primary and Secondary Outcome Reporting in Randomized Trials: JACC State-of-the-Art Review. J Am Coll Cardiol 2021; 78:827-839. [PMID: 34412817 DOI: 10.1016/j.jacc.2021.06.024] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 04/06/2021] [Revised: 05/25/2021] [Accepted: 06/15/2021] [Indexed: 01/18/2023]
Abstract
Consensus as to best practices for the selection, reporting, and interpretation of primary and secondary outcomes of randomized controlled trials is lacking. We reviewed the strategies adopted in publications of randomized controlled trials (RCTs) for the analysis, presentation, and interpretation of efficacy outcomes from a survey of all cardiovascular RCTs published in the New England Journal of Medicine, Lancet, and the Journal of the American Medical Association during 2019. We focus on the choice of primary outcomes, the variety of approaches to selecting secondary outcomes, the options sometimes used to control type I error, and the common practice to not correct for multiple testing in reporting secondary outcomes. We comment on current practice across journals in the reporting of P values and also how conclusions in trial reports frequently adhere to an undue reliance on P < 0.05 as a basis for positive claims of treatment efficacy. We conclude with recommendations for how future RCT reports could best select, report, and interpret their findings on primary and secondary outcomes.
Affiliation(s)
- Stuart J Pocock
  - Medical Statistics Department, London School of Hygiene & Tropical Medicine, London, United Kingdom; Centro Nacional Investigaciones Cardiovasculares, Madrid, Spain
- Xavier Rossello
  - Medical Statistics Department, London School of Hygiene & Tropical Medicine, London, United Kingdom; Centro Nacional Investigaciones Cardiovasculares, Madrid, Spain
- Ruth Owen
  - Medical Statistics Department, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Tim J Collier
  - Medical Statistics Department, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Gregg W Stone
  - The Zena and Michael A Wiener Cardiovascular Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Frank W Rockhold
  - Duke Clinical Research Institute, Duke University Medical Center, Durham, North Carolina, USA
|
20
|
Ruberg SJ. Détente: A Practical Understanding of P values and Bayesian Posterior Probabilities. Clin Pharmacol Ther 2021; 109:1489-1498. [PMID: 32748400 PMCID: PMC8246739 DOI: 10.1002/cpt.2004] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/21/2020] [Accepted: 06/27/2020] [Indexed: 11/18/2022]
Abstract
Null hypothesis significance testing (NHST) with its benchmark P value < 0.05 has long been a stalwart of scientific reporting and such statistically significant findings have been used to imply scientifically or clinically significant findings. Challenges to this approach have arisen over the past 6 decades, but they have largely been unheeded. There is a growing movement for using Bayesian statistical inference to quantify the probability that a scientific finding is credible. There have been differences of opinion between the frequentist (i.e., NHST) and Bayesian schools of inference, and warnings about the use or misuse of P values have come from both schools of thought spanning many decades. Controversies in this arena have been heightened by the American Statistical Association statement on P values and the further denouncement of the term "statistical significance" by others. My experience has been that many scientists, including many statisticians, do not have a sound conceptual grasp of the fundamental differences in these approaches, thereby creating even greater confusion and acrimony. If we let A represent the observed data, and B represent the hypothesis of interest, then the fundamental distinction between these two approaches can be described as the frequentist approach using the conditional probability pr(A | B) (i.e., the P value), and the Bayesian approach using pr(B | A) (the posterior probability). This paper will further explain the fundamental differences in NHST and Bayesian approaches and demonstrate how they can co-exist harmoniously to guide clinical trial design and inference.
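The distinction between pr(A | B) and pr(B | A) drawn in the abstract can be made concrete with Bayes' rule. The sketch below is a simplified illustration, not the paper's own worked example: it takes B to be the null hypothesis and A to be the event "the trial result is statistically significant", and every number (the prior, alpha, and power) is an assumed value chosen for illustration.

```python
def posterior_null(prior_null, alpha=0.05, power=0.80):
    """pr(H0 | significant result) via Bayes' rule.

    pr(significant | H0) is the type I error rate alpha, and
    pr(significant | H1) is the power; both are illustrative assumptions.
    """
    p_sig_given_null = alpha
    p_sig_given_alt = power
    # Total probability of observing a significant result.
    p_sig = p_sig_given_null * prior_null + p_sig_given_alt * (1 - prior_null)
    return p_sig_given_null * prior_null / p_sig

# Same frequentist error rate, different priors, different posteriors.
for prior in (0.5, 0.9):
    post = posterior_null(prior)
    print(f"prior pr(H0)={prior:.1f} -> pr(H0 | significant)={post:.3f}")
```

With pr(A | B) fixed at 0.05, the posterior pr(B | A) in this sketch still depends on the prior, which is why the two quantities cannot be read as interchangeable.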
|
21
|
|
22
|
Hudson R. Should We Strive to Make Science Bias-Free? A Philosophical Assessment of the Reproducibility Crisis. JOURNAL FOR GENERAL PHILOSOPHY OF SCIENCE = ZEITSCHRIFT FUR ALLGEMEINE WISSENSCHAFTSTHEORIE 2021; 52:389-405. [PMID: 34720421 PMCID: PMC8550477 DOI: 10.1007/s10838-020-09548-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 12/16/2020] [Indexed: 06/13/2023]
Abstract
Recently, many scientists have become concerned about an excessive number of failures to reproduce statistically significant effects. The situation has become dire enough that it has been named the 'reproducibility crisis'. After reviewing the relevant literature to confirm the observation that scientists do indeed view replication as currently problematic, I explain in philosophical terms why the replication of empirical phenomena, such as statistically significant effects, is important for scientific progress. Following that explanation, I examine various diagnoses of the reproducibility crisis, and argue that for the majority of scientists the crisis is due, at least in part, to a form of publication bias. This conclusion sets the stage for an assessment of the view that evidential relations in science are inherently value-laden, a view championed by Heather Douglas and Kevin Elliott. I argue, in response to Douglas and Elliott, and as motivated by the meta-scientific resistance scientists harbour to a publication bias, that if we advocate the value-ladenness of science the result would be a deepening of the reproducibility crisis.
Affiliation(s)
- Robert Hudson
  - Department of Philosophy, University of Saskatchewan, 9 Campus Drive, Saskatoon, SK S7N 5A5 Canada
|
23
|
Affiliation(s)
- Erik W. Zwet
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Eric A. Cator
  - Faculty of Science, Radboud University, Nijmegen, The Netherlands
|
24
|
Amiri M, Deckert M, Michel MC, Poole C, Stang A. Statistical inference in abstracts of three influential clinical pharmacology journals analyzed using a text-mining algorithm. Br J Clin Pharmacol 2021; 87:4173-4182. [PMID: 33769597 DOI: 10.1111/bcp.14836] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/02/2020] [Revised: 03/08/2021] [Accepted: 03/19/2021] [Indexed: 11/30/2022] Open
Abstract
AIM To describe the trend in the prevalence of statistical inference in three influential clinical pharmacology journals. METHODS We applied a computer-based algorithm to abstracts of three clinical pharmacology journals published in 1976 to 2016 to identify statistical inference and its subtypes. Furthermore, we manually reviewed a random sample of 300 articles to assess the algorithm's performance in finding statistical inference in abstracts and as a screening tool for presence and absence of statistical inference in full text. RESULT The algorithm identified 59% (13,375/22,516 [mid p 95% CI, 59%-60%]) article abstracts with statistical inference. The percentage of abstracts with statistical inference was similar in 1976 and 2016, 48% (179/377 [mid p 95%CI, 42%-52%]) versus 49% (386/791 [mid p 95%CI, 45%-52%]). Statistical reporting pattern varied among journals. Among abstracts containing any statistical inference in the publications from 1976 to 2016, null-hypothesis significance testing was the most prevalent reported statistical inference. The algorithm had high sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for finding statistical inferences in abstracts. While PPV for predicting the statistical inference in full text (including abstract, text, tables and figures) was high, NPV was low. CONCLUSION Despite journals' editorials and statistical associations' guidelines, most authors focused on testing rather than estimation. In future, better statistical reporting might be ensured by improving the statistical knowledge of authors and by adding statistical guides to journals' instructions to authors to the extent that editors would like their statistical inference preferences to be incorporated into submitted manuscripts.
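The abstract does not describe how its text-mining algorithm works, so the following is only a minimal sketch of the general idea of flagging statistical inference in abstract text with pattern matching. The regular expressions, the two category labels, and the function name `classify` are all assumptions made for illustration and are not the authors' algorithm.

```python
import re

# Hypothetical patterns, chosen for illustration only.
P_VALUE = re.compile(r"\bp\s*(?:value)?\s*[=<>≤]\s*0?\.\d+", re.IGNORECASE)
CONF_INT = re.compile(r"\b\d{2}\s*%\s*(?:CI|confidence interval)", re.IGNORECASE)
SIGNIF = re.compile(r"\b(?:non-?)?significan(?:t|tly|ce)\b", re.IGNORECASE)

def classify(abstract):
    """Flag testing-style and estimation-style inference in one abstract."""
    return {
        # p values or significance language suggest significance testing.
        "nhst": bool(P_VALUE.search(abstract) or SIGNIF.search(abstract)),
        # Confidence intervals suggest estimation.
        "estimation": bool(CONF_INT.search(abstract)),
    }

print(classify("mean difference of 1.31 points (95% CI 0.08 to 2.54, p= 0.04)"))
print(classify("We describe the design of a new registry."))
```

A rule-based screen like this is cheap to run over tens of thousands of abstracts, but it can only see what the abstract states, which is consistent with the abstract's remark that performance on full text differs from performance on abstracts.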
Affiliation(s)
- Marjan Amiri
  - Institute of Medical Informatics, Biometry and Epidemiology, University Hospital Essen, University of Duisburg-Essen, Essen, North Rhine-Westphalia, Germany
  - Centre for Clinical Trials Essen (ZKSE), University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Markus Deckert
  - Center of Clinical Epidemiology; c/o Institute of Medical Informatics, Biometry and Epidemiology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
- Martin C Michel
  - Department of Pharmacology, Johannes Gutenberg University, Mainz, Germany
  - Partnership for the Assessment and Accreditation of Scientific Practice, Heidelberg, Germany
- Charles Poole
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
- Andreas Stang
  - Institute of Medical Informatics, Biometry and Epidemiology, University Hospital Essen, University of Duisburg-Essen, Essen, North Rhine-Westphalia, Germany
  - Center of Clinical Epidemiology; c/o Institute of Medical Informatics, Biometry and Epidemiology, University Hospital Essen, University of Duisburg-Essen, Essen, Germany
  - School of Public Health, Department of Epidemiology, Boston University, Boston, USA
|
25
|
|
26
|
Kleppe A, Skrede OJ, De Raedt S, Liestøl K, Kerr DJ, Danielsen HE. Designing deep learning studies in cancer diagnostics. Nat Rev Cancer 2021; 21:199-211. [PMID: 33514930 DOI: 10.1038/s41568-020-00327-9] [Citation(s) in RCA: 141] [Impact Index Per Article: 47.0] [Accepted: 12/09/2020] [Indexed: 12/16/2022]
Abstract
The number of publications on deep learning for cancer diagnostics is rapidly increasing, and systems are frequently claimed to perform comparable with or better than clinicians. However, few systems have yet demonstrated real-world medical utility. In this Perspective, we discuss reasons for the moderate progress and describe remedies designed to facilitate transition to the clinic. Recent, presumably influential, deep learning studies in cancer diagnostics, of which the vast majority used images as input to the system, are evaluated to reveal the status of the field. By manipulating real data, we then exemplify that much and varied training data facilitate the generalizability of neural networks and thus the ability to use them clinically. To reduce the risk of biased performance estimation of deep learning systems, we advocate evaluation in external cohorts and strongly advise that the planned analyses, including a predefined primary analysis, are described in a protocol preferentially stored in an online repository. Recommended protocol items should be established for the field, and we present our suggestions.
Affiliation(s)
- Andreas Kleppe
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, Oslo, Norway
  - Department of Informatics, University of Oslo, Oslo, Norway
- Ole-Johan Skrede
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, Oslo, Norway
  - Department of Informatics, University of Oslo, Oslo, Norway
- Sepp De Raedt
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, Oslo, Norway
  - Department of Informatics, University of Oslo, Oslo, Norway
- Knut Liestøl
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, Oslo, Norway
  - Department of Informatics, University of Oslo, Oslo, Norway
- David J Kerr
  - Nuffield Division of Clinical Laboratory Sciences, University of Oxford, Oxford, UK
- Håvard E Danielsen
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, Oslo, Norway
  - Department of Informatics, University of Oslo, Oslo, Norway
  - Nuffield Division of Clinical Laboratory Sciences, University of Oxford, Oxford, UK
|
27
|
Emmerich CH, Gamboa LM, Hofmann MCJ, Bonin-Andresen M, Arbach O, Schendel P, Gerlach B, Hempel K, Bespalov A, Dirnagl U, Parnham MJ. Improving target assessment in biomedical research: the GOT-IT recommendations. Nat Rev Drug Discov 2021; 20:64-81. [PMID: 33199880 PMCID: PMC7667479 DOI: 10.1038/s41573-020-0087-3] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Accepted: 09/25/2020] [Indexed: 02/06/2023]
Abstract
Academic research plays a key role in identifying new drug targets, including understanding target biology and links between targets and disease states. To lead to new drugs, however, research must progress from purely academic exploration to the initiation of efforts to identify and test a drug candidate in clinical trials, which are typically conducted by the biopharma industry. This transition can be facilitated by a timely focus on target assessment aspects such as target-related safety issues, druggability and assayability, as well as the potential for target modulation to achieve differentiation from established therapies. Here, we present recommendations from the GOT-IT working group, which have been designed to support academic scientists and funders of translational research in identifying and prioritizing target assessment activities and in defining a critical path to reach scientific goals as well as goals related to licensing, partnering with industry or initiating clinical development programmes. Based on sets of guiding questions for different areas of target assessment, the GOT-IT framework is intended to stimulate academic scientists' awareness of factors that make translational research more robust and efficient, and to facilitate academia-industry collaboration.
Affiliation(s)
- Lorena Martinez Gamboa
  - Department of Experimental Neurology, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - QUEST Center for Transforming Biomedical Research, Berlin Institute of Health, Berlin, Germany
- Martine C J Hofmann
  - Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Branch for Translational Medicine & Pharmacology TMP, Frankfurt am Main, Germany
- Marc Bonin-Andresen
  - Department of Experimental Neurology, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Olga Arbach
  - Department of Experimental Neurology, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - SPARK-Validation Fund, Berlin Institute of Health, Berlin, Germany
- Pascal Schendel
  - Department of Experimental Neurology, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Katja Hempel
  - Boehringer-Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
- Anton Bespalov
  - PAASP GmbH, Heidelberg, Germany
  - Valdman Institute of Pharmacology, Pavlov Medical University, St. Petersburg, Russia
- Ulrich Dirnagl
  - Department of Experimental Neurology, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - QUEST Center for Transforming Biomedical Research, Berlin Institute of Health, Berlin, Germany
- Michael J Parnham
  - Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Branch for Translational Medicine & Pharmacology TMP, Frankfurt am Main, Germany
  - Faculty of Biochemistry, Chemistry & Pharmacy, J.W. Goethe University Frankfurt, Frankfurt am Main, Germany
|
28
|
Piovani D, Pansieri C, Bonovas S. Evaluating Non-Statistically Significant Results From Trials in Practice. JAMA 2020; 324:1679. [PMID: 33107932 DOI: 10.1001/jama.2020.15645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Affiliation(s)
- Daniele Piovani
  - Humanitas University: Humanitas Clinical and Research Center, IRCCS, Milan, Italy
- Claudia Pansieri
  - Humanitas University: Humanitas Clinical and Research Center, IRCCS, Milan, Italy
- Stefanos Bonovas
  - Humanitas University: Humanitas Clinical and Research Center, IRCCS, Milan, Italy
|
29
|
Bomze D, Asher N, Hasan Ali O, Flatz L, Azoulay D, Markel G, Meirson T. Survival-Inferred Fragility Index of Phase 3 Clinical Trials Evaluating Immune Checkpoint Inhibitors. JAMA Netw Open 2020; 3:e2017675. [PMID: 33095247 PMCID: PMC7584930 DOI: 10.1001/jamanetworkopen.2020.17675] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/23/2022] Open
Abstract
IMPORTANCE In science and medical research, extreme and dichotomous conclusions may be drawn based on whether the P value falls above or below the threshold. The fragility index (ie, the minimum number of changes from nonevents to events resulting in loss of statistical significance) captures the vulnerability of statistics in trials with binary outcomes. There are a growing number of clinical trials of immune checkpoint inhibitors (ICIs), as well as expanding eligibility for patients to receive them. The robustness of survival outcomes in randomized clinical trials (RCTs) should be evaluated using the fragility index extended to time-to-event data. OBJECTIVE To calculate the fragility of survival data in RCTs evaluating ICIs. DESIGN, SETTING, AND PARTICIPANTS In this cross-sectional study, data on phase 3 prospective RCTs investigating ICIs included in PubMed from inception until January 1, 2020, were extracted. Two- or three-group studies reporting results for overall survival were eligible for the survival-inferred fragility index (SIFI) calculation, which is the minimum number of reassignments of the best survivors from the interventional group to the control group resulting in loss of significance (defined as P < .05 by log-rank test). For nonsignificant results, a negative SIFI was calculated by reversing the direction of reassignment (from the control group to the interventional group). MAIN OUTCOMES AND MEASURES Survival-inferred fragility index. RESULTS A total of 45 phase 3 prospective RCTs (4 of which had 3 groups, for a total of 49 groups) were identified, of which 6 (13%) investigated anti-cytotoxic T-lymphocyte-associated protein 4 (CTLA-4) agents, 25 (56%) investigated anti-programmed cell death 1 (PD-1) agents, 12 (27%) investigated anti-programmed cell death 1 ligand 1 agents, and 3 (7%) investigated the combination of anti-CTLA-4 and anti-PD-1 agents. 
The median SIFI was 5 (interquartile range, -4 to 12) for the intention-to-treat analysis; for these trials, the SIFI was 1% or less of the total sample size in 17 of 49 populations (35%). In 25 of the 49 intention-to-treat populations (51%), the SIFI was less than the number of censored patients in the intervention group shortly after randomization (defined as <5% of the follow-up time). CONCLUSIONS AND RELEVANCE This study suggests that many phase 3 RCTs evaluating ICI therapies have a low SIFI for overall survival, resulting in uncertainty regarding their potential clinical benefit. Although not a definitive solution for the problems arising from dichotomization, SIFI provides an additional means of assessing and communicating the strength of statistical conclusions.
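The abstract above defines the classical fragility index as the minimum number of changes from nonevents to events that removes statistical significance. The following is a minimal Python sketch of that classical index for a two-arm trial with a binary outcome; it is not the survival-inferred variant (SIFI) that the study computes from time-to-event data. It assumes a two-sided Fisher exact test, a 0.05 threshold, and that outcomes are flipped in the arm with the lower event proportion. The function names are illustrative and do not come from the paper.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Hypergeometric probability of every table that shares the observed margins.
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / denom
        for x in range(lo, hi + 1)
    }
    p_obs = probs[a]
    # Sum the tables that are no more probable than the observed one.
    return sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Smallest number of nonevent-to-event flips that gives p >= alpha.

    Returns None when the original comparison is not significant.
    """
    def p_value(e1, e2):
        return fisher_exact_two_sided(e1, n_1 - e1, e2, n_2 - e2)

    if p_value(events_1, events_2) >= alpha:
        return None

    events = [events_1, events_2]
    sizes = [n_1, n_2]
    # Assumption: flip outcomes in the arm with the lower event proportion.
    arm = 0 if events_1 * n_2 <= events_2 * n_1 else 1
    flips = 0
    while p_value(*events) < alpha and events[arm] < sizes[arm]:
        events[arm] += 1
        flips += 1
    return flips
```

For example, with 0 of 10 events in one arm and 5 of 10 in the other, the two-sided Fisher p value is about 0.033; a single flipped outcome raises it to about 0.14, so `fragility_index(0, 10, 5, 10)` returns 1.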
Affiliation(s)
- David Bomze
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Institute for Immunobiology, Kantonsspital St Gallen, St Gallen, Switzerland
- Nethanel Asher
  - Ella Lemelbaum Institute for Immuno-Oncology, Sheba Medical Center, Ramat-Gan, Israel
- Omar Hasan Ali
  - Institute for Immunobiology, Kantonsspital St Gallen, St Gallen, Switzerland
  - Department of Dermatology, University Hospital of Zurich, Zurich, Switzerland
- Lukas Flatz
  - Institute for Immunobiology, Kantonsspital St Gallen, St Gallen, Switzerland
  - Department of Dermatology, University Hospital of Zurich, Zurich, Switzerland
  - Department of Oncology, Kantonsspital St Gallen, St Gallen, Switzerland
- Daniel Azoulay
  - Center for Liver Diseases, Sheba Medical Center, Ramat-Gan, Israel
- Gal Markel
  - Ella Lemelbaum Institute for Immuno-Oncology, Sheba Medical Center, Ramat-Gan, Israel
  - Department of Clinical Microbiology and Immunology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Tomer Meirson
  - Ella Lemelbaum Institute for Immuno-Oncology, Sheba Medical Center, Ramat-Gan, Israel
  - Azrieli Faculty of Medicine, Bar-Ilan University, Safed, Israel
|
30
|
Affiliation(s)
- Ulrich Dirnagl
  - Berlin Institute of Health, QUEST Center for Transforming Biomedical Research, Berlin, Germany
  - Department of Experimental Neurology, Charité - Universitätsmedizin Berlin, Berlin, Germany
|
31
|
Putman MS, Harrison Ragle A, Ruderman EM. The Quality of Randomized Controlled Trials in High-impact Rheumatology Journals, 1998-2018. J Rheumatol Suppl 2020; 47:1446-1449. [PMID: 32238517 DOI: 10.3899/jrheum.191306] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 03/11/2020] [Indexed: 01/02/2023]
Abstract
OBJECTIVE Well-designed randomized controlled trials (RCT) mitigate bias and confounding, but previous evaluations of rheumatology trials found high rates of methodological flaws. Outside of rheumatoid arthritis, no studies in the modern era have assessed the quality of rheumatology RCT over time or regarding industry funding. METHODS We identified all RCT published in 3 high-impact rheumatology journals from 1998, 2008, and 2018. Quality metrics derived from a modified Jadad scale were analyzed by year of publication and by funding source. RESULTS Ninety-six publications met inclusion criteria; 82 of these described the primary analysis of an RCT. Over time (1998-2008-2018), trials were less likely to adequately report dropouts and withdrawals (100% vs 82% vs 60%; p < 0.01) or include an active comparator (44% vs 12% vs 13%; p = 0.01). Later trials were more likely to evaluate biologic therapy (11% vs 38% vs 83%; p < 0.01) and report adequate randomization procedures (39% vs 29% vs 60%; p = 0.04). Seventy-nine percent of trials received industry funding. Industry-funded trials were more likely to report double-blinding (86% vs 53%; p < 0.01), patient-reported outcome measures (77% vs 41%; p < 0.01), and intention-to-treat analyses (86% vs 65%; p = 0.04). CONCLUSION Industry-funded trials comprise the majority of RCT published in high-impact rheumatology journals and more frequently report metrics associated with RCT quality. RCT assessing active comparators and nonbiologic therapies have become less common in high-impact rheumatology journals.
Affiliation(s)
- Michael S Putman, MD
  - Division of Rheumatology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Ashley Harrison Ragle, MD
  - Division of Rheumatology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
- Eric M Ruderman, MD
  - Division of Rheumatology, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
|
32
|
Xing A, Chu H, Lin L. Fragility index of network meta-analysis with application to smoking cessation data. J Clin Epidemiol 2020; 127:29-39. [PMID: 32659361 DOI: 10.1016/j.jclinepi.2020.07.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/18/2020] [Revised: 06/11/2020] [Accepted: 07/08/2020] [Indexed: 12/11/2022]
Abstract
BACKGROUND AND OBJECTIVES The network meta-analysis (NMA) is frequently used to synthesize evidence for multiple treatment comparisons, but its complexity may affect the robustness (or fragility) of the results. The fragility index (FI) is recently proposed to assess the fragility of the results from clinical studies and from pairwise meta-analyses. We extend the FI to NMAs with binary outcomes. METHODS We define the FI for each treatment comparison in NMAs. It quantifies the minimal number of events necessary to be modified for altering the comparison's statistical significance. We introduce an algorithm to derive the FI and visualizations of the process. A worked example of smoking cessation data is used to illustrate the proposed methods. RESULTS Some treatment comparisons had small FIs; their significance (or nonsignificance) could be altered by modifying a few events' status. They were related to various factors, such as P-values, event counts, and sample sizes, in the original NMA. After modifying event status, treatment ranking measures were also changed to different extents. CONCLUSION Many NMAs include insufficiently compared treatments, small event counts, or small sample sizes; their results are potentially fragile. The FI offers a useful tool to evaluate treatment comparisons' robustness and reliability.
Affiliation(s)
- Aiwen Xing
  - Department of Statistics, Florida State University, Tallahassee, FL, USA
- Haitao Chu
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Lifeng Lin
  - Department of Statistics, Florida State University, Tallahassee, FL, USA
|
33
|
Abstract
The development of basal insulin analogues has reduced the risk of hypoglycaemia in insulin-treated individuals with type 2 diabetes. Insulin degludec and insulin glargine 300 U/ml (glargine U300) represent an evolution of basal insulin analogues, both of them reducing the risk of hypoglycaemia as compared with that associated with glargine U100. However, whether degludec and glargine U300 are equivalent with respect to glycaemic control and risk of hypoglycaemia remains to be fully ascertained. In the CONCLUDE trial, 1609 individuals with type 2 diabetes were randomised to either degludec 200 U/ml (degludec U200) or glargine U300. In this issue of Diabetologia (https://doi.org/10.1007/s00125-019-05080-9) the investigators report that during the maintenance period, HbA1c improved to a similar extent in the two groups with no significant difference in the rate of overall hypoglycaemia (the primary endpoint of the study), while rates of nocturnal symptomatic and severe hypoglycaemia (secondary endpoints) were lower with degludec U200 than with glargine U300. These results, although of great interest to the clinician, need to be carefully interpreted as they cannot be considered as conclusive. First, the primary endpoint was not met and, therefore, analyses of secondary endpoints remain exploratory. Even assuming that degludec is superior to glargine in reducing the risk of hypoglycaemia, the mechanism(s) accounting for such an advantage remain elusive and potential differences in pharmacokinetics and pharmacodynamics difficult to appreciate because of methodological issues. The study design had to be amended because of lack of reliability of the glucometers initially used in the trial, particularly in the low blood glucose ranges, so the potential implications of these changes in the subsequent conduct of the trial cannot be excluded. 
Finally, comparison with the BRIGHT trial, the only other available head-to-head study, is complicated by differences between the two studies in the primary endpoint (HbA1c reduction vs reduction of the risk of hypoglycaemia), study population (insulin-experienced vs insulin-naive) and concomitant glucose-lowering medications. In spite of all this, CONCLUDE teaches us an important lesson regarding the need, particularly in the clinical setting, to monitor the reliability of the glucometers the diabetic individual uses to adjust his/her insulin dose. Insufficient precision or inappropriate use of the glucometer can easily offset any minute advantage a new insulin can offer with respect to glycaemic control and risk of hypoglycaemia.
Affiliation(s)
- Stefano Del Prato
  - Department of Clinical & Experimental Medicine, Section of Diabetes, University of Pisa, Nuovo Ospedale Santa Chiara, Via Paradisa, 2, 56124, Pisa, Italy
|
34
|
Williams M, Trist D. Editorial. Curr Opin Pharmacol 2020; 51:66-67. [DOI: 10.1016/j.coph.2019.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
35
|
Late weaning and maternal closeness, associated with advanced motor and visual maturation, reinforce autonomy in healthy, 2-year-old children. Sci Rep 2020; 10:5251. [PMID: 32251309 PMCID: PMC7090084 DOI: 10.1038/s41598-020-61917-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/14/2019] [Accepted: 02/27/2020] [Indexed: 11/09/2022] Open
Abstract
We studied neurodevelopmental outcomes and behaviours in healthy 2-year old children (N = 1306) from Brazil, India, Italy, Kenya and the UK participating in the INTERGROWTH-21st Project. There was a positive independent relationship of duration of exclusive breastfeeding (EBF) and age at weaning with gross motor development, vision and autonomic physical activities, most evident if children were exclusively breastfed for ≥7 months or weaned at ≥7 months. There was no association with cognition, language or behaviour. Children exclusively breastfed from birth to <5 months or weaned at >6 months had, in a dose-effect pattern, adjusting for confounding factors, higher scores for "emotional reactivity". The positive effect of EBF and age at weaning on gross motor, running and climbing scores was strongest among children with the highest scores in maternal closeness proxy indicators. EBF, late weaning and maternal closeness, associated with advanced motor and vision maturation, independently influence autonomous behaviours in healthy children.
|
36
|
Di Leo G, Sardanelli F. Statistical significance: p value, 0.05 threshold, and applications to radiomics-reasons for a conservative approach. Eur Radiol Exp 2020; 4:18. [PMID: 32157489 PMCID: PMC7064671 DOI: 10.1186/s41747-020-0145-y] [Citation(s) in RCA: 120] [Impact Index Per Article: 30.0] [Received: 08/27/2019] [Accepted: 01/23/2020] [Indexed: 12/17/2022] Open
Abstract
Here, we summarise the unresolved debate about p value and its dichotomisation. We present the statement of the American Statistical Association against the misuse of statistical significance as well as the proposals to abandon the use of p value and to reduce the significance threshold from 0.05 to 0.005. We highlight reasons for a conservative approach, as clinical research needs dichotomic answers to guide decision-making, in particular in the case of diagnostic imaging and interventional radiology. With a reduced p value threshold, the cost of research could increase while spontaneous research could be reduced. Secondary evidence from systematic reviews/meta-analyses, data sharing, and cost-effective analyses are better ways to mitigate the false discovery rate and lack of reproducibility associated with the use of the 0.05 threshold. Importantly, when reporting p values, authors should always provide the actual value, not only statements of "p < 0.05" or "p ≥ 0.05", because p values give a measure of the degree of data compatibility with the null hypothesis. Notably, radiomics and big data, fuelled by the application of artificial intelligence, involve hundreds/thousands of tested features similarly to other "omics" such as genomics, where a reduction in the significance threshold, based on well-known corrections for multiple testing, has been already adopted.
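The abstract above refers to lowering the significance threshold through corrections for multiple testing when hundreds or thousands of features are tested. As a minimal sketch, assuming the Bonferroni correction (one well-known option; the abstract does not name a specific method), the per-test threshold is the overall significance level divided by the number of tests. The function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p value that falls below the Bonferroni-adjusted threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]
```

With 1000 tested features and an overall level of 0.05, the per-test threshold under this assumption is 0.00005.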
Affiliation(s)
- Giovanni Di Leo
  - Radiology Unit, IRCCS Policlinico San Donato, Via Morandi 30, 20097, San Donato Milanese, Italy
- Francesco Sardanelli
  - Radiology Unit, IRCCS Policlinico San Donato, Via Morandi 30, 20097, San Donato Milanese, Italy
  - Dipartimento di Scienze Biomediche per la Salute, Università degli Studi di Milano, Via Morandi 30, 20097, San Donato Milanese, Italy
|
37
|
Bell RJ. Don’t skip the methods section! Randomized controlled trials are not all the same. Climacteric 2020; 23:224-225. [DOI: 10.1080/13697137.2020.1732916] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/22/2023]
Affiliation(s)
- R. J. Bell
  - Women’s Health Research Program, School of Public Health and Preventive Medicine, Monash University, Melbourne, VIC, Australia
|
38
|
Prasad V, Booth CM. Statistical significance and clinical evidence - Authors' reply. Lancet Oncol 2020; 21:e119. [PMID: 32135103 DOI: 10.1016/s1470-2045(20)30092-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2020] [Revised: 02/10/2020] [Accepted: 02/10/2020] [Indexed: 10/24/2022]
Affiliation(s)
- Vinay Prasad
  - Department of Medicine and Center for Health Care Ethics, Oregon Health & Science University, Portland, OR 97202 USA
- Christopher M Booth
  - Department of Oncology, Queen's University, Kingston, ON, Canada
  - Division of Cancer Care and Epidemiology, Queen's Cancer Research Institute, Kingston, ON, Canada
|
39
|
Breheny K, Hollingworth W, Kandiyali R, Dixon P, Loose A, Craggs P, Grzeda M, Sparrow J. Assessing the construct validity and responsiveness of Preference-Based Measures (PBMs) in cataract surgery patients. Qual Life Res 2020; 29:1935-1946. [PMID: 32080789 PMCID: PMC7295830 DOI: 10.1007/s11136-020-02443-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Accepted: 02/08/2020] [Indexed: 01/07/2023]
Abstract
PURPOSE The validity and responsiveness of the EQ-5D-3L in visual conditions has been questioned, inspiring development of a vision 'bolt-on' domain (EQ-5D-3L + VIS). Developments in preference-based measures (PBM) also includes the EQ-5D-5L and the ICECAP-O capability wellbeing measure. This study aimed to examine the construct validity and responsiveness of the EQ-5D-3L, EQ-5D-5L, EQ-5D-3L + VIS and ICECAP-O in cataract surgery patients for the first time, to inform choice of PBM for economic evaluation in this population. METHODS The analyses used data from the UK Predict-CAT cataract surgery cohort study. PBMs and the Cat-PROM5 [a validated measure of cataract quality of life (QOL)] were completed before surgery and 4-8 weeks after. Construct validity was assessed using correlations and known-group differences evaluated using regression. Responsiveness was evaluated using effect sizes and analysis of variance to compare change scores between groups, defined by patient-reported and clinical outcomes. RESULTS The sample comprised 1315 patients at baseline. No PBMs were associated with visual acuity and only the ICECAP-O (Spearman's rs = - 0.35), EQ-5D-3L + VIS (rs = - 0.42) and EQ-5D-5L (Value Set for England rs = - 0.31) correlated at least moderately with the Cat-PROM5. Effect sizes of change were consistently largest for the EQ-5D-3L + VIS (range 0.34-0.41), followed by the ICECAP-O (range 0.20-0.34). Results indicated no improvement in responsiveness using the EQ-5D-5L (range 0.13-0.16) compared to the EQ-5D-3L (range 0.17-0.20). CONCLUSIONS Whilst no PBMs comprehensively demonstrated evidence of construct validity and responsiveness in cataract surgery patients, the ICECAP-O was the most responsive generic PBM to improvements in QOL. Surprisingly the EQ-5D-5L was not more responsive than the EQ-5D-3L in this setting.
Affiliation(s)
- Katie Breheny
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- William Hollingworth
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Rebecca Kandiyali
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Padraig Dixon
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Abi Loose
  - Department of Ophthalmology, Bristol Eye Hospital, Bristol, UK
- Pippa Craggs
  - Department of Ophthalmology, Bristol Eye Hospital, Bristol, UK
- Mariusz Grzeda
  - Department of Ophthalmology, Bristol Eye Hospital, Bristol, UK
- John Sparrow
  - Department of Ophthalmology, Bristol Eye Hospital, Bristol, UK
|
40
|
Null hypothesis significance testing and effect sizes: can we 'effect' everything … or … anything? Curr Opin Pharmacol 2020; 51:68-77. [PMID: 31948894 DOI: 10.1016/j.coph.2019.12.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/29/2019] [Accepted: 12/06/2019] [Indexed: 11/23/2022]
Abstract
The Null Hypothesis Significance Testing (NHST) paradigm is increasingly criticized. Estimation approaches such as point estimates and confidence intervals, while having limitations, provide better descriptions of results than P-values and statements about significance levels. Their use is supported by many statisticians. The effect size approach is an important part of power and sample size calculations at the experimental design stage and in meta-analysis and in the interpretation of the biological importance of study results. Care is needed, however, to ensure that such effect sizes are relevant for the endpoint. Effect sizes should not be used to interpret results without accompanying limits, such as confidence intervals. New methods, especially Bayesian approaches, are being developed; however, no single method provides a simple answer. Rather there is a need to improve researchers understanding of the complex issues underlying experimental design, statistical analysis and interpretation of results.
|
41
|
Amrhein V, Greenland S, McShane BB. Statistical significance gives bias a free pass. Eur J Clin Invest 2019; 49:e13176. [PMID: 31610012 DOI: 10.1111/eci.13176] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/08/2019] [Accepted: 10/08/2019] [Indexed: 12/21/2022]
Affiliation(s)
- Valentin Amrhein
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
- Sander Greenland
  - Department of Epidemiology and Department of Statistics, University of California, Los Angeles, CA, USA
- Blakeley B McShane
  - Kellogg School of Management, Northwestern University, Evanston, IL, USA
|
42
|
Koletsi D, Solmi M, Pandis N, Fleming PS, Correll CU, Ioannidis JPA. Most recommended medical interventions reach P < 0.005 for their primary outcomes in meta-analyses. Int J Epidemiol 2019; 49:885-893. [DOI: 10.1093/ije/dyz241] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Accepted: 11/01/2019] [Indexed: 11/14/2022] Open
Abstract
Background
It has been proposed that the threshold of statistical significance should shift from P-value < 0.05 to P-value < 0.005, but there is concern that this move may dismiss effective, useful interventions. We aimed to assess how often medical interventions are recommended although their evidence in meta-analyses of randomized trials lies between P-value = 0.05 and P-value = 0.005.
Methods
We included Cochrane systematic reviews (SRs) published from 1 January 2013 to 30 June 2014 that had at least one meta-analysis with GRADE (Grading of Recommendations Assessment, Development and Evaluation) assessment and at least one primary outcome having favourable results for efficacy at P-value < 0.05. Only comparisons of randomized trials between active versus no treatment/placebo were included. We then assessed the respective UpToDate recommendations for clinical practice from 22 May 2018 to 5 October 2018 and recorded how many treatments were recommended and what were the P-values in their meta-analysis evidence. The primary analysis was based on the first-listed outcomes.
Results
Of 608 screened SRs with GRADE assessment, 113 SRs were eligible, including 143 comparisons of which 128 comparisons had first-listed primary outcomes with UpToDate coverage. Altogether, 60% (58/97) of interventions with P-values < 0.005 for their evidence were recommended versus 32% (10/31) of those with P-value 0.005–0.05. Therefore, most (58/68, 85.2%) of the recommended interventions had P-values < 0.005 for the first-listed primary outcome. Of the 10 exceptions, 4 had other primary outcomes with P-values < 0.005 and another 4 had additional extensive evidence for similar indications that would allow extrapolation for practice recommendations.
Conclusions
Few interventions are recommended without their evidence from meta-analyses of randomized trials reaching P-value < 0.005.
Affiliation(s)
- Despina Koletsi
  - Department of Orthodontics, School of Dentistry, National and Kapodistrian University of Athens, Athens, Greece
  - Clinic of Orthodontics and Pediatric Dentistry, Center of Dental Medicine, University of Zurich, Zurich, Switzerland
- Marco Solmi
  - Department of Neuroscience, University of Padua, Padua, Italy
  - Padua Neuroscience Center, University of Padua, Padua, Italy
- Nikolaos Pandis
  - Department of Orthodontics and Dentofacial Orthopedics, School of Dental Medicine, Medical Faculty, University of Bern, Bern, Switzerland
- Padhraig S Fleming
  - Department of Oral Bioengineering, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- Christoph U Correll
  - Department of Psychiatry, The Zucker Hillside Hospital, Northwell Health, Glen Oaks, NY, USA
  - Department of Psychiatry and Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, USA
  - The Feinstein Institute for Medical Research, Center for Psychiatric Neuroscience, Manhasset, NY, USA
  - Department of Child and Adolescent Psychiatry, Charité Universitätsmedizin Berlin, Berlin, Germany
- John P A Ioannidis
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
  - Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, USA
  - Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA, USA
|
43
|
Hardwicke TE, Ioannidis JPA. Petitions in scientific argumentation: Dissecting the request to retire statistical significance. Eur J Clin Invest 2019; 49:e13162. [PMID: 31380567 DOI: 10.1111/eci.13162] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/31/2019] [Accepted: 07/31/2019] [Indexed: 12/16/2022]
Affiliation(s)
- Tom E Hardwicke
  - Meta-Research Innovation Center Berlin (METRIC-B), Berlin Institute of Health, Charité-Universitätsmedizin Berlin, Berlin, Germany
- John P A Ioannidis
  - Meta-Research Innovation Center Berlin (METRIC-B), Berlin Institute of Health, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California
  - Department of Medicine, Department of Health Research and Policy, Department of Biomedical Data Science, Department of Statistics, Stanford University, Stanford, California
|
44
|
Mayo DG. P-value thresholds: Forfeit at your peril. Eur J Clin Invest 2019; 49:e13170. [PMID: 31514242 DOI: 10.1111/eci.13170] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 09/08/2019] [Accepted: 09/08/2019] [Indexed: 12/19/2022]
Affiliation(s)
- Deborah G Mayo
  - Department of Philosophy, Virginia Tech, Blacksburg, VA, USA
|
45
|
Importance and Significance: Synonyms Sometimes But Not Specifically in Statistics. J Neurol Phys Ther 2019; 43:195-196. [DOI: 10.1097/npt.0000000000000294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
46
|
Story DA. Feasibility and pilot studies: dropping the fig leaf. Anaesthesia 2019; 75:152-154. [DOI: 10.1111/anae.14865] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 08/29/2019] [Indexed: 12/22/2022]
Affiliation(s)
- D. A. Story
  - Centre for Integrated Critical Care, The University of Melbourne, Victoria, Australia
|
47
|
Do low-carbohydrate diets increase energy expenditure? Int J Obes (Lond) 2019; 43:2350-2354. [PMID: 31548574 PMCID: PMC8076039 DOI: 10.1038/s41366-019-0456-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 04/17/2019] [Revised: 05/31/2019] [Accepted: 06/30/2019] [Indexed: 01/15/2023]
|
48
|
Bresee L. Do We Give Too Much Significance to Statistical Significance? Can J Hosp Pharm 2019; 72:339-340. [PMID: 31692571 PMCID: PMC6799965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Affiliation(s)
- Lauren Bresee
  - Lauren Bresee, BScPharm, ACPR, MSc, PhD, is a Scientific Advisor with the Canadian Agency for Drugs and Technologies in Health (CADTH), Ottawa, Ontario; an Adjunct Assistant Professor with the Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta; and a member of the O'Brien Institute for Public Health, University of Calgary. She is also an Associate Editor with the Canadian Journal of Hospital Pharmacy
|
49
|
Bresee L. [Not Available]. Can J Hosp Pharm 2019; 72:341-342. [PMID: 31692586 PMCID: PMC6799959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Affiliation(s)
- Lauren Bresee
  - Lauren Bresee, B. Sc. Pharm., ACPR, M. Sc., Ph. D., is a Scientific Advisor with the Agence canadienne des médicaments et des technologies de la santé (ACMTS), Ottawa, Ontario; an Adjunct Assistant Professor with the Department of Community Health Sciences, Faculty of Medicine, University of Calgary, Alberta; and a member of the O'Brien Institute for Public Health, University of Calgary. She is also an Associate Editor with the Journal canadien de la pharmacie hospitalière
|
50
|
Affiliation(s)
- Howard Bauchner
  - Editor
- Robert M Golub
  - Deputy Editor
- Phil B Fontanarosa
  - Executive Editor
|