1
Pape M, Miyagi M, Ritz SA, Boulicault M, Richardson SS, Maney DL. Sex contextualism in laboratory research: Enhancing rigor and precision in the study of sex-related variables. Cell 2024; 187:1316-1326. [PMID: 38490173 PMCID: PMC11219044 DOI: 10.1016/j.cell.2024.02.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 12/21/2023] [Accepted: 02/08/2024] [Indexed: 03/17/2024]
Abstract
Understanding sex-related variation in health and illness requires rigorous and precise approaches to revealing underlying mechanisms. A first step is to recognize that sex is not in and of itself a causal mechanism; rather, it is a classification system comprising a set of categories, usually assigned according to a range of varying traits. Moving beyond sex as a system of classification to working with concrete and measurable sex-related variables is necessary for precision. Whether and how these sex-related variables matter, and what patterns of difference they contribute to, will vary in context-specific ways. Second, when researchers incorporate these sex-related variables into research designs, rigorous analytical methods are needed to allow strongly supported conclusions. Third, the interpretation and reporting of sex-related variation require care to ensure that basic and preclinical research advance health equity for all.
Affiliation(s)
- Madeleine Pape
  - Institute of Social Sciences, University of Lausanne, Lausanne, Switzerland
- Miriam Miyagi
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
- Stacey A Ritz
  - Department of Pathology & Molecular Medicine, Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Marion Boulicault
  - Department of Philosophy, University of Edinburgh, Edinburgh, Scotland
- Sarah S Richardson
  - Department of the History of Science, Harvard University, Cambridge, MA, USA; Committee on Degrees in Studies of Women, Gender, and Sexuality, Harvard University, Cambridge, MA, USA
- Donna L Maney
  - Department of Psychology, Emory University, Atlanta, GA, USA; Harvard-Radcliffe Institute, Harvard University, Cambridge, MA, USA
2
Wallach JD, Glick L, Gueorguieva R, O’Malley SS. Evidence of subgroup differences in meta-analyses evaluating medications for alcohol use disorder: An umbrella review. Alcohol, Clinical & Experimental Research 2024; 48:5-15. [PMID: 38102794 PMCID: PMC10841726 DOI: 10.1111/acer.15229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 10/13/2023] [Accepted: 11/13/2023] [Indexed: 12/17/2023]
Abstract
Randomized controlled trials (RCTs) evaluating medications for alcohol use disorder (AUD) often examine heterogeneity of treatment effects through subgroup analyses that contrast effect estimates in groups of patients across individual demographic, clinical, and study design-related characteristics. However, these analyses are often not prespecified or adequately powered, highlighting the potential role of subgroup analyses in meta-analysis. Here, we conducted an umbrella review (i.e., a systematic review of meta-analyses) to determine the range and characteristics of reported subgroup analyses in meta-analyses of AUD medications. We searched PubMed to identify meta-analyses of RCTs evaluating medications for the management of AUD, alcohol abuse, or alcohol dependence in adults. We sought studies that measured drinking-related outcomes; quality of life, function, and rates of mortality; adverse events; and dropout. We considered meta-analyses that reported the results from formal subgroup analyses (comparing the summary effects across subgroup levels); summary effect estimates stratified across subgroup levels; and meta-regression, regression, or correlation-based subgroup analyses. We analyzed nine meta-analyses that included 61 formal subgroup analyses (median = 6 per meta-analysis), of which 33 (54%) were based on baseline participant-level and 28 (46%) were based on trial-level characteristics. Of the 58 subgroup analyses with either a p-value from a subgroup test or a statement by the authors that the subgroup analyses were not statistically significant, eight (14%) were statistically significant at the p < 0.05 level. Twelve meta-analyses reported the results of 102 meta-regression analyses, of which 25 (25%) identified statistically significant predictors of the relevant outcome of interest; nine (9%) were based on baseline participant-level and 93 (91%) were based on trial characteristics. 
Subgroup analyses across meta-analyses of AUD medications often focus on study-level characteristics, which may not be as clinically informative as subgroup analyses based on participant-level characteristics. Opportunities exist for future meta-analyses to standardize their subgroup methodology, focus on more clinically informative participant-level characteristics, and use predictive approaches to account for multiple relevant variables.
Affiliation(s)
- Joshua D. Wallach
  - Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
- Laura Glick
  - Department of Internal Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
3
Eliot L, Beery AK, Jacobs EG, LeBlanc HF, Maney DL, McCarthy MM. Why and How to Account for Sex and Gender in Brain and Behavioral Research. J Neurosci 2023; 43:6344-6356. [PMID: 37704386 PMCID: PMC10500996 DOI: 10.1523/jneurosci.0020-23.2023] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 04/18/2023] [Revised: 07/14/2023] [Accepted: 07/18/2023] [Indexed: 09/15/2023]
Abstract
Long overlooked in neuroscience research, sex and gender are increasingly included as key variables potentially impacting all levels of neurobehavioral analysis. Still, many neuroscientists do not understand the difference between the terms "sex" and "gender," the complexity and nuance of each, or how to best include them as variables in research designs. This TechSights article outlines rationales for considering the influence of sex and gender across taxa, and provides technical guidance for strengthening the rigor and reproducibility of such analyses. This guidance includes the use of appropriate statistical methods for comparing groups as well as controls for key covariates of sex (e.g., total intracranial volume) and gender (e.g., income, caregiver stress, bias). We also recommend approaches for interpreting and communicating sex- and gender-related findings about the brain, which have often been misconstrued by neuroscientists and the lay public alike.
Affiliation(s)
- Lise Eliot
  - Stanson Toshok Center for Brain Function and Repair, Chicago Medical School, Rosalind Franklin University of Medicine & Science, North Chicago, Illinois 60064
- Annaliese K Beery
  - Department of Integrative Biology, University of California-Berkeley, Berkeley, California 94720
- Emily G Jacobs
  - Department of Psychological & Brain Sciences, University of California-Santa Barbara, Santa Barbara, California 93106
- Hannah F LeBlanc
  - Division of the Humanities & Social Sciences, California Institute of Technology, Pasadena, California 91125
- Donna L Maney
  - Department of Psychology, Emory University, Atlanta, Georgia 30322
- Margaret M McCarthy
  - Department of Pharmacology, University of Maryland School of Medicine, Baltimore, Maryland 21201
4
Maney DL, Rich-Edwards JW. Sex-Inclusive Biomedicine: Are New Policies Increasing Rigor and Reproducibility? Womens Health Issues 2023; 33:461-464. [PMID: 37087311 DOI: 10.1016/j.whi.2023.03.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/04/2023] [Revised: 03/20/2023] [Accepted: 03/20/2023] [Indexed: 04/24/2023]
Affiliation(s)
- Donna L Maney
  - Department of Psychology, Emory University, Atlanta, Georgia
- Janet W Rich-Edwards
  - Division of Women's Health, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
5
Maney DL, Karkazis K, Hagen KBS. Considering Sex as a Variable at a Research University: Knowledge, Attitudes, and Practices. J Womens Health (Larchmt) 2023; 32:843-851. [PMID: 37585517 PMCID: PMC10457618 DOI: 10.1089/jwh.2022.0522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 06/07/2023] [Indexed: 08/18/2023]
Abstract
Biomedical research has a history of excluding females as research subjects, which threatens rigor, reproducibility, and inclusivity. In 2016, to redress this bias, the U.S. National Institutes of Health (NIH) implemented a policy requiring the consideration of sex as a biological variable (SABV) in all studies involving vertebrate animals, including humans. Unless strongly justified, females and males must be included in all studies and results reported disaggregated by sex. Recent evidence indicates, however, that misunderstandings of the policy and other significant barriers impede its implementation. To shed light on those barriers at our home institution, we conducted a study funded by the Emory University Specialized Center of Research Excellence on Sex Differences (SCORE). In semistructured interviews of Emory principal investigators in the biological sciences, we noted their knowledge of what the policy entails and why it was implemented, their attitudes toward it, and the extent to which it has or has not changed their research practices. Although attitudes toward SABV were generally positive, most researchers face challenges with respect to its implementation. We suggest interventions that can be mounted at the level of home institutions, such as raising awareness of locally available core facilities, to help address these challenges. More training is needed on what the policy asks of researchers, how sex is defined, the nonhormonal ways that sex differences can manifest, and best practices for statistical analysis of sex-based data. Home institutions may also want to explore ways to lessen the stress associated with rollout of SABV policy.
Affiliation(s)
- Donna L. Maney
  - Department of Psychology, Emory University, Atlanta, Georgia, USA
- Katrina Karkazis
  - Department of Women's, Gender, and Sexuality Studies, Emory University, Atlanta, Georgia, USA
- Kimberly B. Sessions Hagen
  - Department of Behavioral, Social, and Health Education Sciences, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
6
Mason WA, Cuttance EL, Müller KR, Huxley JN, Laven RA. Graduate Student Literature Review: A systematic review on the associations between nonsteroidal anti-inflammatory drug use at the time of diagnosis and treatment of claw horn lameness in dairy cattle and lameness scores, algometer readings, and lying times. J Dairy Sci 2022; 105:9021-9037. [PMID: 36114054 DOI: 10.3168/jds.2022-22127] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/27/2022] [Accepted: 06/19/2022] [Indexed: 01/21/2023]
Abstract
The objectives of this systematic review were to investigate the association between nonsteroidal anti-inflammatory drug (NSAID) use during the treatment of claw horn lameness in dairy cattle and locomotion score (LS), nociceptive threshold, and lying times. A total of 229 studies were initially identified and had their title and abstract screened. From this, we screened the full text of 23 articles, identifying 6 articles for inclusion in the systematic review. Of these 6, 5 reported LS, 2 reported nociceptor thresholds, and 1 reported lying times. The quality of evidence was assessed using a Cochrane risk-of-bias tool and CONSORT items reported for each included study. Due to heterogeneity between the studies, data were reported following Cochrane's Synthesis without meta-analysis guidelines. Identified heterogeneity between the studies included differences in LS systems and statistical analyses, length of time from enrollment to outcome reported, the NSAID used, concomitant treatments administered, and severity and chronicity of lameness. Recommendations are made with respect to consistency of LS reporting and analysis, along with improvements that may be noted with compulsory reporting guidelines. There were at least some concerns over the risk of bias in 4 of the studies, with risks of bias present in missing outcome data between the study groups. Within the 5 studies included with LS outcomes, there were 22 different pairwise comparisons with either NSAID or NSAID + block as the intervention, with measures of association with presence or absence of lameness as the outcome available for 20 of these comparisons. Animals in the NSAID intervention groups had a lower point estimate lameness risk than animals in the comparison groups in 3 of 8 and 9 of 14 analyses for LS outcomes <10 and ≥10 d post-treatment, respectively. 
However, there was no difference identified between animals in the NSAID intervention groups compared with the animals in the control group in any of these pairwise comparisons with lameness as the outcome. Twelve pairwise comparisons were reported in the 2 studies with nociceptor threshold as an outcome. Animals in the NSAID intervention groups had a greater nociceptor threshold point estimate compared with animals in the comparison groups in 6 of 6 and 1 of 6 analyses for outcomes <10 and ≥10 d post-treatment, respectively. However, no differences were identified between animals in the NSAID intervention groups and those in the comparison groups. All 4 pairwise comparisons reported in the study with lying times as an outcome found no differences between animals in the NSAID groups and those in the comparison groups. Despite the widespread use of NSAID in the treatment of claw horn lameness, there is a lack of studies of NSAID association with LS, nociceptive thresholds, or lying times. The limited evidence is consistent with no association with NSAID use and those parameters, but comparability across studies was limited by heterogeneity.
Affiliation(s)
- W A Mason
  - EpiVets Limited, Mahoe St., Te Awamutu, 3800 New Zealand
- E L Cuttance
  - EpiVets Limited, Mahoe St., Te Awamutu, 3800 New Zealand
- K R Müller
  - Massey University, School of Veterinary Science, Private Bag 11 222, Palmerston North, 4474 New Zealand
- J N Huxley
  - Massey University, School of Veterinary Science, Private Bag 11 222, Palmerston North, 4474 New Zealand
- R A Laven
  - Massey University, School of Veterinary Science, Private Bag 11 222, Palmerston North, 4474 New Zealand
7
Rodríguez-Ramallo H, Báez-Gutiérrez N, Otero-Candelera R, Martín LAK. Subgroup Analysis in Pulmonary Hypertension-Specific Therapy Clinical Trials: A Systematic Review. J Pers Med 2022; 12:863. [PMID: 35743648 PMCID: PMC9224970 DOI: 10.3390/jpm12060863] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/21/2022] [Revised: 05/18/2022] [Accepted: 05/23/2022] [Indexed: 12/20/2022]
Abstract
Pulmonary hypertension (PH) treatment decisions are driven by the results of randomized controlled trials (RCTs). Subgroup analyses are often performed to assess whether the intervention effect changes with patient characteristics, thus allowing for individualized decisions. This review aimed to evaluate the appropriateness and interpretation of subgroup analyses performed in PH-specific therapy RCTs published between 2000 and 2020. Claims of subgroup effects were evaluated with prespecified criteria. Overall, 30 RCTs were included. The subgroup analyses showed recurring problems: a high number of analyses reported, lack of prespecification, and lack of interaction tests. The trial protocol was not available for most RCTs; significant differences were found in those articles that published the protocol. Authors reported 13 claims of subgroup effect, with 12 claims meeting four or fewer of Sun's criteria. Although most RCTs were generally at low risk of bias and were published in high-impact journals, the credibility and general quality of subgroup analyses and subgroup claims were low due to methodological flaws. Clinicians should be skeptical of claims of subgroup effects and interpret subgroup analyses with caution: because of their poor quality, these analyses may not serve as guidance for personalized care.
Affiliation(s)
- Héctor Rodríguez-Ramallo
  - Hospital Pharmacy Department, Virgen del Rocio University Hospital, 41004 Seville, Spain
- Nerea Báez-Gutiérrez
  - Hospital Pharmacy Department, Reina Sofía University Hospital, 14004 Cordoba, Spain
- Laila Abdel-kader Martín
  - Hospital Pharmacy Department, Virgen del Rocio University Hospital, 41004 Seville, Spain
8
Carland C, Hansra B, Parsons C, Lyubarova R, Khandelwal A. Adequate enrollment of women in cardiovascular drug trials and the need for sex-specific assessment and reporting. American Heart Journal Plus: Cardiology Research and Practice 2022; 17:100155. [PMID: 38559887 PMCID: PMC10978324 DOI: 10.1016/j.ahjo.2022.100155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Revised: 06/09/2022] [Accepted: 06/14/2022] [Indexed: 04/04/2024]
Abstract
Cardiovascular disease (CVD) is the leading cause of death for women in the United States and globally. There is an abundance of evidence-based trials evaluating the efficacy of drug therapies to reduce morbidity and mortality in CVD. Additionally, there are well-established influences of sex, through a variety of mechanisms, on pharmacologic treatments in CVD. Despite this, the majority of drug trials are not powered to evaluate sex-specific outcomes, and much of the data that exists is gathered post hoc and through meta-analysis. The FDA established a committee in 1993 to increase the enrollment of women in clinical trials to improve this situation. Several authors, reviewing committees, and professional societies have highlighted the importance of sex-specific analysis and reporting. Despite these statements, there has not been a major improvement in representation or reporting. There are ongoing efforts to assess trial design, female representation on steering committees, and clinical trial processes to improve the representation of women. This review will describe the pharmacologic basis for the need for sex-specific assessment of cardiovascular drug therapies. It will also review the sex-specific reporting of landmark drug trials in hypertension, coronary artery disease (CAD), hyperlipidemia, and heart failure (HF). In reporting enrollment of women, several therapeutic areas like antihypertensives and newer anticoagulation trials fare better than therapeutics for HF and acute coronary syndromes. Further, drug trials and cardiometabolic or lifestyle intervention trials had a higher percentage of female participants than the device or procedural trials.
Affiliation(s)
- Corinne Carland
  - Department of Medicine, University of Pennsylvania, United States of America
- Barinder Hansra
  - Division of Cardiology and Department of Critical Care Medicine, UPMC, United States of America
- Cody Parsons
  - Cardiovascular Health, Stanford Health Care, United States of America
- Radmila Lyubarova
  - Division of Cardiology, Albany Medical College, United States of America
- Abha Khandelwal
  - Division of Cardiology, Stanford School of Medicine, United States of America
9
Chusyd DE, Austad SN, Brown AW, Chen X, Dickinson SL, Ejima K, Fluharty D, Golzarri-Arroyo L, Holden R, Jamshidi-Naeini Y, Landsittel D, Lartey S, Mannix E, Vorland CJ, Allison DB. From Model Organisms to Humans, the Opportunity for More Rigor in Methodologic and Statistical Analysis, Design, and Interpretation of Aging and Senescence Research. J Gerontol A Biol Sci Med Sci 2021; 77:2155-2164. [PMID: 34950945 PMCID: PMC9678201 DOI: 10.1093/gerona/glab382] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/08/2021] [Indexed: 12/26/2022]
Abstract
This review identifies frequent design and analysis errors in aging and senescence research and discusses best practices in study design, statistical methods, analyses, and interpretation. Recommendations are offered for how to avoid these problems. The following issues are addressed: (a) errors in randomization, (b) errors related to testing within-group instead of between-group differences, (c) failing to account for clustering, (d) failing to consider interference effects, (e) standardizing metrics of effect size, (f) maximum life-span testing, (g) testing for effects beyond the mean, (h) tests for power and sample size, (i) compression of morbidity versus survival curve squaring, and (j) other hot topics, including modeling high-dimensional data and complex relationships and assessing model assumptions and biases. We hope that bringing increased awareness of these topics to the scientific community will emphasize the importance of employing sound statistical practices in all aspects of aging and senescence research.
Affiliation(s)
- Daniella E Chusyd
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Steven N Austad
  - Department of Biology, University of Alabama at Birmingham, Birmingham, Alabama, USA; Nathan Shock Center, University of Alabama at Birmingham, Birmingham, Alabama, USA
- Andrew W Brown
  - Department of Applied Health Science, Indiana University Bloomington, Bloomington, Indiana, USA
- Xiwei Chen
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Stephanie L Dickinson
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Keisuke Ejima
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- David Fluharty
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA; Departments of Mathematics and Economics, Ivy Tech Community College, Columbus, Indiana, USA
- Lilian Golzarri-Arroyo
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Richard Holden
  - Department of Health and Wellness Design, Indiana University Bloomington, Bloomington, Indiana, USA
- Yasaman Jamshidi-Naeini
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Doug Landsittel
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Stella Lartey
  - Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, Indiana, USA
- Edward Mannix
  - Department of Anatomy, Cell Biology, and Physiology, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Colby J Vorland
  - Department of Applied Health Science, Indiana University Bloomington, Bloomington, Indiana, USA
- David B Allison
  - Address correspondence to: David B. Allison, PhD, Department of Epidemiology and Biostatistics, Indiana University Bloomington, 1025 E. 7th St., PH 111, Bloomington, IN 47405, USA
10
Abstract
A survey reveals that many researchers do not use appropriate statistical analyses to evaluate sex differences in biomedical research.
Affiliation(s)
- Colby J Vorland
  - Department of Applied Health Science, Indiana University School of Public Health, Bloomington, United States
11
Vorland CJ, Brown AW, Dawson JA, Dickinson SL, Golzarri-Arroyo L, Hannon BA, Heo M, Heymsfield SB, Jayawardene WP, Kahathuduwa CN, Keith SW, Oakes JM, Tekwe CD, Thabane L, Allison DB. Errors in the implementation, analysis, and reporting of randomization within obesity and nutrition research: a guide to their avoidance. Int J Obes (Lond) 2021; 45:2335-2346. [PMID: 34326476 PMCID: PMC8528702 DOI: 10.1038/s41366-021-00909-z] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 11/10/2020] [Revised: 06/26/2021] [Accepted: 07/06/2021] [Indexed: 02/06/2023]
Abstract
Randomization is an important tool used to establish causal inferences in studies designed to further our understanding of questions related to obesity and nutrition. To take advantage of the inferences afforded by randomization, scientific standards must be upheld during the planning, execution, analysis, and reporting of such studies. We discuss ten errors in randomized experiments from real-world examples from the literature and outline best practices for their avoidance. These ten errors include: representing nonrandom allocation as random, failing to adequately conceal allocation, not accounting for changing allocation ratios, replacing subjects in nonrandom ways, failing to account for non-independence, drawing inferences by comparing statistical significance from within-group comparisons instead of between-groups, pooling data and breaking the randomized design, failing to account for missing data, failing to report sufficient information to understand study methods, and failing to frame the causal question as testing the randomized assignment per se. We hope that these examples will aid researchers, reviewers, journal editors, and other readers to endeavor to a high standard of scientific rigor in randomized experiments within obesity and nutrition research.
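One of the errors listed above, inferring a group difference because one within-group test is significant and the other is not, can be illustrated with a minimal sketch. The summary statistics below are hypothetical, the variable names are assumptions made for this example, and the p-values use a normal approximation rather than an exact t-test.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(estimate, standard_error):
    """Two-sided p-value for estimate / standard_error (normal approximation)."""
    z = estimate / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical summary statistics: mean pre-to-post change in two randomized arms.
n_per_arm = 30
sd_change = 5.0
mean_change_a = 2.0   # arm A
mean_change_b = 1.2   # arm B

se_arm = sd_change / sqrt(n_per_arm)

# Within-group tests: is each arm's change different from zero?
p_within_a = two_sided_p(mean_change_a, se_arm)
p_within_b = two_sided_p(mean_change_b, se_arm)

# Between-group test: is the change in A different from the change in B?
se_difference = sqrt(se_arm**2 + se_arm**2)
p_between = two_sided_p(mean_change_a - mean_change_b, se_difference)

print(f"within A:  p = {p_within_a:.3f}")
print(f"within B:  p = {p_within_b:.3f}")
print(f"A minus B: p = {p_between:.3f}")
```

With these made-up numbers, arm A crosses the 0.05 threshold and arm B does not, yet the direct comparison of the two arms is far from significant. Only the between-group test addresses whether the arms differ.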
Affiliation(s)
- Colby J Vorland
  - Department of Applied Health Science, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA
- Andrew W Brown
  - Department of Applied Health Science, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA
- John A Dawson
  - Department of Nutritional Sciences, Texas Tech University, Lubbock, TX, USA
- Stephanie L Dickinson
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA
- Lilian Golzarri-Arroyo
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA
- Bridget A Hannon
  - Division of Nutritional Sciences, University of Illinois at Urbana-Champaign, Urbana, IL, USA
- Moonseong Heo
  - Department of Public Health Sciences, Clemson University, Clemson, SC, USA
- Steven B Heymsfield
  - Pennington Biomedical Research Center, Louisiana State University, Baton Rouge, LA, USA
- Wasantha P Jayawardene
  - Department of Applied Health Science, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA
- Chanaka N Kahathuduwa
  - Department of Psychiatry, School of Medicine, Texas Tech University Health Sciences Center, Lubbock, TX, USA
- Scott W Keith
  - Department of Pharmacology and Experimental Therapeutics, Division of Biostatistics, Thomas Jefferson University, Philadelphia, PA, USA
- J Michael Oakes
  - Department of Epidemiology, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Carmen D Tekwe
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA
- Lehana Thabane
  - Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, ON, Canada
- David B Allison
  - Department of Epidemiology and Biostatistics, Indiana University School of Public Health-Bloomington, Bloomington, IN, USA
12
Rojanaworarit C. Misleading Epidemiological and Statistical Evidence in the Presence of Simpson's Paradox: An Illustrative Study Using Simulated Scenarios of Observational Study Designs. J Med Life 2020; 13:37-44. [PMID: 32341699 PMCID: PMC7175433 DOI: 10.25122/jml-2019-0120] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/17/2023]
Abstract
This study empirically illustrates the mechanism by which epidemiological effect measures and statistical evidence can be misleading in the presence of Simpson's paradox and identifies possible alternative methods of analysis to manage the paradox. Three scenarios of observational study designs, including cross-sectional, cohort, and case-control approaches, are simulated. In each scenario, data are generated, and various methods of epidemiological and statistical analysis are undertaken to obtain empirical results that illustrate Simpson's paradox and lead to misleading conclusions. Rational methods of analysis are also performed to illustrate how to avoid pitfalls and obtain valid results. In the presence of Simpson's paradox, results from analyses of the overall data contradict the findings from all subgroups of the same data. This paradox occurs when distributions of confounding characteristics are unequal in the groups being compared. Data analysis methods that do not take confounding factors into account, including epidemiological 2×2 table analysis, the independent samples t-test, the Wilcoxon rank-sum test, the chi-square test, and univariable regression analysis, cannot manage the problem of Simpson's paradox and lead to misleading research conclusions. The Mantel-Haenszel procedure and multivariable regression methods are examples of rational analysis methods leading to valid results. Therefore, Simpson's paradox arises as a consequence of extremely unequal distributions of a specific inherent characteristic in the groups being compared. Analytical methods that control for confounding effects must be applied to manage the paradox and obtain valid research evidence regarding the causal association.
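The contrast between a crude analysis and a stratified one can be sketched in a few lines. The counts below are illustrative only and are not taken from the study; the variable names and the two-stratum layout are assumptions made for this example. The sketch computes stratum-specific risk ratios, the crude risk ratio from the collapsed table, and a Mantel-Haenszel pooled risk ratio.

```python
# Illustrative counts only (not data from the study above).
# Each stratum maps to ((events, total) in exposed, (events, total) in unexposed).
strata = {
    "stratum_1": ((81, 87), (234, 270)),
    "stratum_2": ((192, 263), (55, 80)),
}

def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    (a, n1), (c, n0) = exposed, unexposed
    return (a / n1) / (c / n0)

# Stratum-specific risk ratios.
stratum_rr = {name: risk_ratio(e, u) for name, (e, u) in strata.items()}

# Crude risk ratio: collapse the strata and ignore the stratifying variable.
exposed_all = tuple(sum(e[i] for e, _ in strata.values()) for i in range(2))
unexposed_all = tuple(sum(u[i] for _, u in strata.values()) for i in range(2))
crude_rr = risk_ratio(exposed_all, unexposed_all)

# Mantel-Haenszel pooled risk ratio:
# sum(a * n0 / N) / sum(c * n1 / N), with N = n1 + n0 within each stratum.
numerator = sum(a * n0 / (n1 + n0) for (a, n1), (c, n0) in strata.values())
denominator = sum(c * n1 / (n1 + n0) for (a, n1), (c, n0) in strata.values())
mh_rr = numerator / denominator

for name, rr in stratum_rr.items():
    print(f"{name}: RR = {rr:.3f}")
print(f"crude:  RR = {crude_rr:.3f}")
print(f"pooled: RR = {mh_rr:.3f} (Mantel-Haenszel)")
```

In this made-up table the crude risk ratio falls below 1 while both stratum-specific ratios and the pooled estimate sit above 1, because the exposed and unexposed groups are distributed very unequally across the two strata.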
Affiliation(s)
- Chanapong Rojanaworarit
  - Department of Health Professions, School of Health Professions and Human Services, Hofstra University, Hempstead, New York, United States of America
13
Borg DN, Lohse KR, Sainani KL. Ten Common Statistical Errors from All Phases of Research, and Their Fixes. PM R 2020; 12:610-614. [PMID: 32358859 DOI: 10.1002/pmrj.12395] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/23/2020] [Accepted: 04/24/2020] [Indexed: 11/06/2022]
Affiliation(s)
- David N Borg
- The Hopkins Centre, Menzies Health Institute Queensland, Griffith University, Brisbane, Queensland, Australia
- Keith R Lohse
- Department of Health, Kinesiology, and Recreation, University of Utah, Salt Lake City, UT, USA; Department of Physical Therapy and Athletic Training, University of Utah, Salt Lake City, UT, USA
- Kristin L Sainani
- Department of Epidemiology and Population Health, Stanford University, Stanford, CA, USA
|
14
|
Trepanowski JF, Ioannidis JPA. Perspective: Limiting Dependence on Nonrandomized Studies and Improving Randomized Trials in Human Nutrition Research: Why and How. Adv Nutr 2018; 9:367-377. [PMID: 30032218 PMCID: PMC6054237 DOI: 10.1093/advances/nmy014] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Indexed: 12/14/2022] Open
Abstract
A large majority of human nutrition research uses nonrandomized observational designs, but this has led to little reliable progress. This is mostly due to many epistemologic problems, the most important of which are as follows: difficulty detecting small (or even tiny) effect sizes reliably for nutritional risk factors and nutrition-related interventions; difficulty properly accounting for massive confounding among many nutrients, clinical outcomes, and other variables; difficulty measuring diet accurately; and suboptimal research reporting. Tiny effect sizes and massive confounding are largely unfixable problems that narrowly confine the scenarios in which nonrandomized observational research is useful. Although nonrandomized studies and randomized trials have different priorities (assessment of long-term causality compared with assessment of treatment effects), the odds for obtaining reliable information with the former are limited. Randomized study designs should therefore largely replace nonrandomized studies in human nutrition research going forward. To achieve this, many of the limitations that have traditionally plagued most randomized trials in nutrition, such as small sample size, short length of follow-up, high cost, and selective reporting, among others, must be overcome. Pivotal megatrials with tens of thousands of participants and lifelong follow-up are possible in nutrition science with proper streamlining of operational costs. Fixable problems that have undermined observational research, such as dietary measurement error and selective reporting, need to be addressed in randomized trials. For focused questions in which dietary adherence is important to maximize, trials with direct observation of participants in experimental in-house settings may offer clean answers on short-term metabolic outcomes. Other study designs of randomized trials to consider in nutrition include registry-based designs and "N-of-1" designs. 
Mendelian randomization designs may also offer some more reliable leads for testing interventions in trials. Collectively, an improved randomized agenda may clarify many things in nutrition science that might never be answered credibly with nonrandomized observational designs.
Affiliation(s)
- John P A Ioannidis
- Stanford Prevention Research Center
- Meta-Research Innovation Center at Stanford (METRICS)
- Departments of Medicine, Stanford University, Stanford, CA
- Departments of Health Research and Policy, Stanford University, Stanford, CA
- Departments of Biomedical Data Science, Stanford University, Stanford, CA
- Departments of Statistics, Stanford University, Stanford, CA
|
15
|
Wallach JD, Sullivan PG, Trepanowski JF, Sainani KL, Steyerberg EW, Ioannidis JPA. Evaluation of Evidence of Statistical Support and Corroboration of Subgroup Claims in Randomized Clinical Trials. JAMA Intern Med 2017; 177:554-560. [PMID: 28192563 PMCID: PMC6657347 DOI: 10.1001/jamainternmed.2016.9125] [Citation(s) in RCA: 86] [Impact Index Per Article: 12.3] [Indexed: 11/14/2022]
Abstract
Importance: Many published randomized clinical trials (RCTs) make claims for subgroup differences.
Objective: To evaluate how often subgroup claims reported in the abstracts of RCTs are actually supported by statistical evidence (P < .05 from an interaction test) and corroborated by subsequent RCTs and meta-analyses.
Data Sources: This meta-epidemiological survey examines data sets of trials with at least 1 subgroup claim, including Subgroup Analysis of Trials Is Rarely Easy (SATIRE) articles and Discontinuation of Randomized Trials (DISCO) articles. We used Scopus (updated July 2016) to search for English-language articles citing each of the eligible index articles with at least 1 subgroup finding in the abstract.
Study Selection: Articles with a subgroup claim in the abstract with or without evidence of statistical heterogeneity (P < .05 from an interaction test) in the text and articles attempting to corroborate the subgroup findings.
Data Extraction and Synthesis: Study characteristics of trials with at least 1 subgroup claim in the abstract were recorded. Two reviewers extracted the data necessary to calculate subgroup-level effect sizes, standard errors, and the P values for interaction. For individual RCTs and meta-analyses that attempted to corroborate the subgroup findings from the index articles, trial characteristics were extracted. Cochran Q test was used to reevaluate heterogeneity with the data from all available trials.
Main Outcomes and Measures: The number of subgroup claims in the abstracts of RCTs, the number of subgroup claims in the abstracts of RCTs with statistical support (subgroup findings), and the number of subgroup findings corroborated by subsequent RCTs and meta-analyses.
Results: Sixty-four eligible RCTs made a total of 117 subgroup claims in their abstracts. Of these 117 claims, only 46 (39.3%) in 33 articles had evidence of statistically significant heterogeneity from a test for interaction. In addition, out of these 46 subgroup findings, only 16 (34.8%) ensured balance between randomization groups within the subgroups (eg, through stratified randomization), 13 (28.3%) entailed a prespecified subgroup analysis, and 1 (2.2%) was adjusted for multiple testing. Only 5 (10.9%) of the 46 subgroup findings had at least 1 subsequent pure corroboration attempt by a meta-analysis or an RCT. In all 5 cases, the corroboration attempts found no evidence of a statistically significant subgroup effect. In addition, all effect sizes from meta-analyses were attenuated toward the null.
Conclusions and Relevance: A minority of subgroup claims made in the abstracts of RCTs are supported by their own data (ie, a significant interaction effect). For those that have statistical support (P < .05 from an interaction test), most fail to meet other best practices for subgroup tests, including prespecification, stratified randomization, and adjustment for multiple testing. Attempts to corroborate statistically significant subgroup differences are rare; when done, the initially observed subgroup differences are not reproduced.
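The distinction this abstract draws between a subgroup claim and a claim supported by an interaction test can be sketched numerically. The Python below is a minimal illustration, not the authors' analysis code: the subgroup estimates and standard errors are hypothetical log hazard ratios, the function names are invented for this sketch, and P values rely on a normal approximation. It also computes Cochran's Q across the two subgroup estimates with inverse-variance weights, which is an assumption about how such a heterogeneity check might be set up.

```python
import math
from typing import Sequence


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def interaction_z(est1: float, se1: float, est2: float, se2: float) -> float:
    """z statistic for the difference between two independent estimates."""
    return (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)


def cochran_q(estimates: Sequence[float], std_errors: Sequence[float]) -> float:
    """Cochran's Q: inverse-variance weighted squared deviations from the pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))


# Hypothetical subgroup results (log hazard ratio, standard error).
est_a, se_a = -0.40, 0.15  # subgroup A
est_b, se_b = -0.10, 0.18  # subgroup B

p_a = two_sided_p(est_a / se_a)  # treatment effect within subgroup A
p_b = two_sided_p(est_b / se_b)  # treatment effect within subgroup B

z_int = interaction_z(est_a, se_a, est_b, se_b)
p_int = two_sided_p(z_int)       # test of the subgroup difference itself

q = cochran_q([est_a, est_b], [se_a, se_b])
p_q = math.erfc(math.sqrt(q / 2))  # chi-square P value, valid for 1 df only

print(f"subgroup A: P = {p_a:.3f}")
print(f"subgroup B: P = {p_b:.3f}")
print(f"interaction: z = {z_int:.2f}, P = {p_int:.3f}")
print(f"Cochran Q = {q:.2f} on 1 df, P = {p_q:.3f}")
```

With these made-up inputs the effect is nominally significant in subgroup A and not in subgroup B, yet the interaction test does not reach P < .05, which is the pattern the abstract counts as a claim without statistical support. For two subgroups, Q equals the squared interaction z statistic, so the two P values agree.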
Affiliation(s)
- Joshua D Wallach
- Department of Health Research and Policy, Stanford University School of Medicine, Stanford, California; Meta-Research Innovation Center at Stanford (METRICS), Stanford University School of Medicine, Stanford, California
- Patrick G Sullivan
- Department of Health Research and Policy, Stanford University School of Medicine, Stanford, California; Meta-Research Innovation Center at Stanford (METRICS), Stanford University School of Medicine, Stanford, California; Department of Medicine, Stanford University School of Medicine, Stanford, California
- John F Trepanowski
- Stanford Prevention Research Center, Department of Medicine, Stanford University School of Medicine, Stanford, California
- Kristin L Sainani
- Department of Health Research and Policy, Stanford University School of Medicine, Stanford, California
- John P A Ioannidis
- Department of Health Research and Policy, Stanford University School of Medicine, Stanford, California; Meta-Research Innovation Center at Stanford (METRICS), Stanford University School of Medicine, Stanford, California; Department of Medicine, Stanford University School of Medicine, Stanford, California; Stanford Prevention Research Center, Department of Medicine, Stanford University School of Medicine, Stanford, California; Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, California
|