1
Sundar S, Agarwal R, Davenport C, Scandrett K, Johnson S, Sengupta P, Selvi-Vikram R, Kwong FL, Mallett S, Rick C, Kehoe S, Timmerman D, Bourne T, Van Calster B, Stobart H, Neal RD, Menon U, Gentry-Maharaj A, Sturdy L, Ottridge R, Deeks J. Risk-prediction models in postmenopausal patients with symptoms of suspected ovarian cancer in the UK (ROCkeTS): a multicentre, prospective diagnostic accuracy study. Lancet Oncol 2024; 25:1371-1386. [PMID: 39362250 DOI: 10.1016/s1470-2045(24)00406-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2024] [Revised: 07/16/2024] [Accepted: 07/18/2024] [Indexed: 10/05/2024]
Abstract
BACKGROUND Multiple risk-prediction models are used in clinical practice to triage patients as being at low risk or high risk of ovarian cancer. In the ROCkeTS study, we aimed to identify the best diagnostic test for ovarian cancer in symptomatic patients, through head-to-head comparisons of risk-prediction models, in a real-world setting. Here, we report the results for the postmenopausal cohort. METHODS In this multicentre, prospective diagnostic accuracy study, we recruited newly presenting female patients aged 16-90 years with non-specific symptoms and raised CA125 or abnormal ultrasound results (or both) who had been referred via rapid access, elective clinics, or emergency presentations from 23 hospitals in the UK. Patients with normal CA125 and simple ovarian cysts of smaller than 5 cm in diameter, active non-ovarian malignancy, or previous ovarian malignancy, or those who were pregnant or declined a transvaginal scan, were ineligible. In this analysis, only postmenopausal participants were included. Participants completed a symptom questionnaire, gave a blood sample, and had transabdominal and transvaginal ultrasounds performed by International Ovarian Tumour Analysis consortium (IOTA)-certified sonographers. Index tests were Risk of Malignancy 1 (RMI1) at a threshold of 200, Risk of Malignancy Algorithm (ROMA) at multiple thresholds, IOTA Assessment of Different Neoplasias in the Adnexa (ADNEX) at thresholds of 3% and 10%, IOTA SRRisk model at thresholds of 3% and 10%, IOTA Simple Rules (malignant vs benign, or inconclusive), and CA125 at 35 IU/mL. In a post-hoc analysis, the Ovarian Adnexal and Reporting Data System (ORADS) at 10% was derived from IOTA ultrasound variables using established methods since ORADS was described after completion of recruitment. Index tests were conducted by study staff masked to the results of the reference standard. The comparator was RMI1 at the 250 threshold (the current UK National Health Service standard of care). 
The reference standard was surgical or biopsy tissue histology or cytology within 3 months, or a self-reported diagnosis of ovarian cancer at 12 month follow-up. The primary outcome was diagnostic accuracy at predicting primary invasive ovarian cancer versus benign or normal histology, assessed by analysing the sensitivity, specificity, C-index, area under receiver operating characteristic curve, positive and negative predictive values, and calibration plots in participants with conclusive reference standard results and available index test data. This study is registered with the International Standard Randomised Controlled Trial Number registry (ISRCTN17160843). FINDINGS Between July 13, 2015, and Nov 30, 2018, 1242 postmenopausal patients were recruited, of whom 215 (17%) had primary ovarian cancer. 166 participants had missing, inconclusive, or other reference standard results; therefore, data from a maximum of 1076 participants were used to assess the index tests for the primary outcome. Compared with RMI1 at 250 (sensitivity 82·9% [95% CI 76·7 to 88·0], specificity 87·4% [84·9 to 89·6]), IOTA ADNEX at 10% was more sensitive (difference of -13·9% [-20·2 to -7·6], p<0·0001) but less specific (difference of 28·5% [24·7 to 32·3], p<0·0001). ROMA at 29·9 had similar sensitivity (difference of -3·6% [-9·1 to 1·9], p=0·24) but lower specificity (difference of 5·2% [2·5 to 8·0], p=0·0001). RMI1 at 200 had similar sensitivity (difference of -2·1% [-4·7 to 0·5], p=0·13) but lower specificity (difference of 3·0% [1·7 to 4·3], p<0·0001). IOTA SRRisk model at 10% had similar sensitivity (difference of -4·3% [-11·0 to -2·3], p=0·23) but lower specificity (difference of 16·2% [12·6 to 19·8], p<0·0001). IOTA Simple Rules had similar sensitivity (difference of -1·6% [-9·3 to 6·2], p=0·82) and specificity (difference of -2·2% [-5·1 to 0·6], p=0·14). 
CA125 at 35 IU/mL had similar sensitivity (difference of -2·1% [-6·6 to 2·3], p=0·42) but higher specificity (difference of 6·7% [4·3 to 9·1], p<0·0001). In a post-hoc analysis, when compared with RMI1 at 250, ORADS achieved similar sensitivity (difference of -2·1%, 95% CI -8·6 to 4·3, p=0·60) and lower specificity (difference of 10·2%, 95% CI 6·8 to 13·6, p<0·0001). INTERPRETATION In view of its higher sensitivity than RMI1 at 250, despite some loss in specificity, we recommend that IOTA ADNEX at 10% should be considered as the new standard-of-care diagnostic in ovarian cancer for postmenopausal patients. FUNDING UK National Institute of Health Research.
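The accuracy measures named in this abstract (sensitivity, specificity, positive and negative predictive values) all derive from a 2x2 table of index test result against reference standard. The sketch below shows those definitions only. The class name `TwoByTwo` and the counts in `example` are hypothetical and are not taken from the ROCkeTS data, which the abstract does not report at this level.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of index test result versus reference standard (hypothetical helper)."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    def sensitivity(self) -> float:
        # Proportion of diseased participants the test flags as high risk.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased participants the test flags as low risk.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of positive tests that are true positives.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of negative tests that are true negatives.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts for illustration only, not study data.
example = TwoByTwo(tp=80, fp=30, fn=20, tn=270)
print(f"sensitivity={example.sensitivity():.3f}, specificity={example.specificity():.3f}")
print(f"PPV={example.ppv():.3f}, NPV={example.npv():.3f}")
```

Lowering a decision threshold (for example, from 250 to 200 on a risk score) typically moves counts from `fn` to `tp` and from `tn` to `fp`, which is the sensitivity versus specificity trade-off the abstract describes.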
Affiliation(s)
- Sudha Sundar
  - Pan Birmingham Gynaecological Cancer Centre, Sandwell and West Birmingham Hospitals NHS Trust, Birmingham, UK; Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK
- Ridhi Agarwal
  - Institute of Applied Health Research, University of Birmingham, Birmingham, UK
- Clare Davenport
  - Institute of Applied Health Research, University of Birmingham, Birmingham, UK
- Katie Scandrett
  - Institute of Applied Health Research, University of Birmingham, Birmingham, UK; NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust, University of Birmingham, Birmingham, UK
- Susanne Johnson
  - University Hospital Southampton NHS Foundation Trust, Southampton, UK
- Partha Sengupta
  - County Durham and Darlington NHS Foundation Trust, Darlington, UK
- Fong Lien Kwong
  - Pan Birmingham Gynaecological Cancer Centre, Sandwell and West Birmingham Hospitals NHS Trust, Birmingham, UK
- Sue Mallett
  - Centre for Medical Imaging, University College London, London, UK
- Caroline Rick
  - School of Medicine, University of Nottingham, Nottingham, UK
- Sean Kehoe
  - St Peter's College, University of Oxford, Oxford, UK
- Dirk Timmerman
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Obstetrics and Gynecology, University Hospitals KU Leuven, Leuven, Belgium
- Tom Bourne
  - Faculty of Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Leuven Unit for Health Technology Assessment Research (LUHTAR), KU Leuven, Leuven, Belgium
- Richard D Neal
  - University of Exeter Medical School, University of Exeter, Exeter, UK
- Usha Menon
  - Department of Women's Cancer, Elizabeth Garrett Anderson Institute for Women's Health, University College London, London, UK; MRC Clinical Trials Unit, Institute of Clinical Trials and Methodology, University College London, London, UK
- Alex Gentry-Maharaj
  - Department of Women's Cancer, Elizabeth Garrett Anderson Institute for Women's Health, University College London, London, UK; MRC Clinical Trials Unit, Institute of Clinical Trials and Methodology, University College London, London, UK
- Lauren Sturdy
  - Birmingham Clinical Trials Unit, University of Birmingham, Birmingham, UK
- Ryan Ottridge
  - Birmingham Clinical Trials Unit, University of Birmingham, Birmingham, UK
- Jon Deeks
  - Institute of Applied Health Research, University of Birmingham, Birmingham, UK; NIHR Birmingham Biomedical Research Centre, University Hospitals Birmingham NHS Foundation Trust, University of Birmingham, Birmingham, UK
2
Andaur Navarro CL, Damen JAA, Ghannad M, Dhiman P, van Smeden M, Reitsma JB, Collins GS, Riley RD, Moons KGM, Hooft L. SPIN-PM: a consensus framework to evaluate the presence of spin in studies on prediction models. J Clin Epidemiol 2024; 170:111364. [PMID: 38631529 DOI: 10.1016/j.jclinepi.2024.111364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 04/01/2024] [Accepted: 04/08/2024] [Indexed: 04/19/2024]
Abstract
OBJECTIVES To develop a framework to identify and evaluate spin practices and their facilitators in studies on clinical prediction models, regardless of the modeling technique. STUDY DESIGN AND SETTING We followed a three-phase consensus process: (1) a premeeting literature review to generate items to be included; (2) a series of structured meetings with a panel of experienced researchers to comment on, discuss, and exchange viewpoints on the items to be included; and (3) a postmeeting review of the final list of items and examples to be included. Through this iterative consensus process, a framework was derived once all researchers on the panel agreed. RESULTS This consensus process involved a panel of eight researchers and resulted in SPIN-Prediction Models, which consists of two categories of spin (misleading interpretation and misleading transportability) and, within these categories, two forms of spin (spin practices and facilitators of spin). We provide criteria and examples. CONCLUSION We propose this guidance to facilitate not only accurate reporting but also accurate interpretation and extrapolation of clinical prediction models, which will likely improve the reporting quality of subsequent research and reduce research waste.
Affiliation(s)
- Constanza L Andaur Navarro
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Johanna A A Damen
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Mona Ghannad
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Paula Dhiman
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology & Musculoskeletal Sciences, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Maarten van Smeden
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Johannes B Reitsma
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Gary S Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology & Musculoskeletal Sciences, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Richard D Riley
  - Institute of Applied Health Research, College of Medical and Dental Sciences, University of Birmingham, Birmingham B15 2TT, UK
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Lotty Hooft
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
3
Raynaud M, Al-Awadhi S, Louis K, Zhang H, Su X, Goutaudier V, Wang J, Demir Z, Wei Y, Truchot A, Bouquegneau A, Del Bello A, Bailly É, Lombardi Y, Maanaoui M, Giarraputo A, Naser S, Divard G, Aubert O, Murad MH, Wang C, Liu L, Bestard O, Naesens M, Friedewald JJ, Lefaucheur C, Riella L, Collins G, Ioannidis JP, Loupy A. Prognostic Biomarkers in Kidney Transplantation: A Systematic Review and Critical Appraisal. J Am Soc Nephrol 2024; 35:177-188. [PMID: 38053242 PMCID: PMC10843205 DOI: 10.1681/asn.0000000000000260] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 06/20/2023] [Accepted: 10/08/2023] [Indexed: 12/07/2023] Open
Abstract
SIGNIFICANCE STATEMENT Why are there so few biomarkers accepted by health authorities and implemented in clinical practice, despite the high and growing number of biomarker studies in medical research? In this meta-epidemiological study, including 804 studies that were critically appraised by expert reviewers, the authors have identified all prognostic kidney transplant biomarkers and showed overall suboptimal study designs, methods, results, interpretation, reproducible research standards, and transparency. The authors also demonstrated for the first time that only a limited number of studies challenged the added value of their candidate biomarkers against standard-of-care routine patient monitoring parameters. Most biomarker studies tended to be single-center, retrospective studies with a small number of patients and clinical events. Less than 5% of the studies performed an external validation. The authors also showed the poor transparency reporting and identified a data beautification phenomenon. These findings suggest that there is much wasted research effort in transplant biomarker medical research and highlight the need to produce more rigorous studies so that more biomarkers may be validated and successfully implemented in clinical practice. BACKGROUND Despite the increasing number of biomarker studies published in the transplant literature over the past 20 years, demonstrations of their clinical benefit and their implementation in routine clinical practice are lacking. We hypothesized that suboptimal design, data, methodology, and reporting might contribute to this phenomenon. METHODS We formed a consortium of experts in systematic reviews, nephrologists, methodologists, and epidemiologists. A systematic literature search was performed in PubMed, Embase, Scopus, Web of Science, and Cochrane Library between January 1, 2005, and November 12, 2022 (PROSPERO ID: CRD42020154747).
All English language, original studies investigating the association between a biomarker and kidney allograft outcome were included. The final set of publications was assessed by expert reviewers. After data collection, two independent reviewers randomly evaluated the inconsistencies for 30% of the references for each reviewer. If more than 5% of inconsistencies were observed for one given reviewer, a re-evaluation was conducted for all the references of the reviewer. The biomarkers were categorized according to their type and the biological milieu from which they were measured. The study characteristics related to the design, methods, results, and their interpretation were assessed, as well as reproducible research practices and transparency indicators. RESULTS A total of 7372 publications were screened and 804 studies met the inclusion criteria. A total of 1143 biomarkers were assessed among the included studies from blood (n=821, 71.8%), intragraft (n=169, 14.8%), or urine (n=81, 7.1%) compartments. The number of studies significantly increased, with a median, yearly number of 31.5 studies (interquartile range [IQR], 23.8-35.5) between 2005 and 2012 and 57.5 (IQR, 53.3-59.8) between 2013 and 2022 (P < 0.001). A total of 655 studies (81.5%) were retrospective, while 595 (74.0%) used data from a single center. The median number of patients included was 232 (IQR, 96-629) with a median follow-up post-transplant of 4.8 years (IQR, 3.0-6.2). Only 4.7% of studies were externally validated. A total of 346 studies (43.0%) did not adjust their biomarker for key prognostic factors, while only 3.1% of studies adjusted the biomarker for standard-of-care patient monitoring factors. Data sharing, code sharing, and registration occurred in 8.8%, 1.1%, and 4.6% of studies, respectively. A total of 158 studies (20.0%) emphasized the clinical relevance of the biomarker, despite the reported nonsignificant association of the biomarker with the outcome measure.
A total of 288 studies assessed rejection as an outcome. We showed that these rejection studies shared the same characteristics as other studies. CONCLUSIONS Biomarker studies in kidney transplantation lack validation, rigorous design and methodology, accurate interpretation, and transparency. Higher standards are needed in biomarker research to prove the clinical utility and support clinical use.
Affiliation(s)
- Marc Raynaud
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Solaf Al-Awadhi
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Kevin Louis
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Huanxi Zhang
  - The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Xiaojun Su
  - The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Valentin Goutaudier
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Jiali Wang
  - The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Zeynep Demir
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Yongcheng Wei
  - The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Agathe Truchot
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Antoine Bouquegneau
  - Department of Nephrology-Dialysis-Transplantation, University Hospital of Liège, Liège, Belgium
- Arnaud Del Bello
  - Department of Nephrology and Organ Transplantation, INSERM, CHU Rangueil & Purpan, Université Paul Sabatier, Toulouse, France
- Élodie Bailly
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
  - Thomas E. Starzl Transplantation Institute, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
- Yannis Lombardi
  - Kidney Transplant Department, Tenon Hospital, Assistance Publique – Hôpitaux de Paris, Paris, France
- Mehdi Maanaoui
  - Nephrology Department, CHU Lille, Lille University, Lille, France
  - INSERM U1190, Translational Research for Diabetes, Lille, France
- Alessia Giarraputo
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
  - Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padua, Padua, Italy
- Sofia Naser
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Gillian Divard
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Olivier Aubert
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
- Changxi Wang
  - The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Longshan Liu
  - The First Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Oriol Bestard
  - Nephrology Department, Hospital de Vall d'Hebron, Barcelona, Spain
- Maarten Naesens
  - Department of Microbiology, Immunology and Transplantation, Nephrology and Renal Transplantation Research Group, KU Leuven, Leuven, Belgium
- John J. Friedewald
  - Division of Transplantation, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
- Carmen Lefaucheur
  - Kidney Transplant Department, Saint-Louis Hospital, Assistance Publique - Hôpitaux de Paris, Paris, France
- Leonardo Riella
  - Renal Division, Schuster Family Transplantation Research Center, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Gary Collins
  - Center for Statistics in Medicine, NDORMS, Botnar Research Center, University of Oxford, Oxford, United Kingdom
- John P.A. Ioannidis
  - Departments of Medicine, of Epidemiology and Population Health, of Biomedical Data Science, and of Statistics and Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California
- Alexandre Loupy
  - INSERM, PARCC, Paris Institute for Transplantation and Organ Regeneration, Université de Paris Cité, Paris, France
4
Andaur Navarro CL, Damen JAA, Takada T, Nijman SWJ, Dhiman P, Ma J, Collins GS, Bajpai R, Riley RD, Moons KGM, Hooft L. Systematic review finds "spin" practices and poor reporting standards in studies on machine learning-based prediction models. J Clin Epidemiol 2023; 158:99-110. [PMID: 37024020 DOI: 10.1016/j.jclinepi.2023.03.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 07/29/2022] [Revised: 02/24/2023] [Accepted: 03/28/2023] [Indexed: 04/08/2023]
Abstract
OBJECTIVES We evaluated the presence and frequency of spin practices and poor reporting standards in studies that developed and/or validated clinical prediction models using supervised machine learning techniques. STUDY DESIGN AND SETTING We systematically searched PubMed from 01/2018 to 12/2019 to identify diagnostic and prognostic prediction model studies using supervised machine learning. No restrictions were placed on data source, outcome, or clinical specialty. RESULTS We included 152 studies: 38% reported diagnostic models and 62% prognostic models. When reported, discrimination was described without precision estimates in 53/71 abstracts (74.6% [95% CI 63.4-83.3]) and 53/81 main texts (65.4% [95% CI 54.6-74.9]). Of the 21 abstracts that recommended the model to be used in daily practice, 20 (95.2% [95% CI 77.3-99.8]) lacked any external validation of the developed models. Likewise, 74/133 (55.6% [95% CI 47.2-63.8]) studies made recommendations for clinical use in their main text without any external validation. Reporting guidelines were cited in 13/152 (8.6% [95% CI 5.1-14.1]) studies. CONCLUSION Spin practices and poor reporting standards are also present in studies on prediction models using machine learning techniques. A tailored framework for the identification of spin will enhance the sound reporting of prediction model studies.
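The proportions in this abstract are reported with 95% confidence intervals, but the abstract does not say which interval method was used. As a minimal sketch, assuming a Wilson score interval (an assumption, not a detail confirmed by the source), the calculation could look like this. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p_hat = successes / total
    denom = 1 + z * z / total
    centre = (p_hat + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z * z / (4 * total * total)
    ) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract: reporting guidelines cited in 13 of 152 studies.
low, high = wilson_interval(13, 152)
print(f"13/152 = {13 / 152:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

Under this assumption the 13/152 case lands close to the reported interval. Other methods (for example, an exact binomial interval) give slightly different bounds, especially for proportions near 0 or 1 such as 20/21.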
Affiliation(s)
- Constanza L Andaur Navarro
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Johanna A A Damen
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Toshihiko Takada
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Steven W J Nijman
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Paula Dhiman
  - Center for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Jie Ma
  - Center for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK
- Gary S Collins
  - Center for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Ram Bajpai
  - Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Richard D Riley
  - Centre for Prognosis Research, School of Medicine, Keele University, Keele, UK
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Lotty Hooft
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands; Cochrane Netherlands, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
5
Dhiman P, Ma J, Andaur Navarro CL, Speich B, Bullock G, Damen JAA, Hooft L, Kirtley S, Riley RD, Van Calster B, Moons KGM, Collins GS. Overinterpretation of findings in machine learning prediction model studies in oncology: a systematic review. J Clin Epidemiol 2023; 157:120-133. [PMID: 36935090 DOI: 10.1016/j.jclinepi.2023.03.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 09/22/2022] [Revised: 03/03/2023] [Accepted: 03/14/2023] [Indexed: 03/19/2023]
Abstract
OBJECTIVES In biomedical research, spin is the overinterpretation of findings, and it is a growing concern. To date, the presence of spin has not been evaluated in prognostic model research in oncology, including studies developing and validating models for individualized risk prediction. STUDY DESIGN AND SETTING We conducted a systematic review, searching MEDLINE and EMBASE for oncology-related studies that developed and validated a prognostic model using machine learning published between 1st January, 2019, and 5th September, 2019. We used existing spin frameworks and described areas of highly suggestive spin practices. RESULTS We included 62 publications (including 152 developed models; 37 validated models). Reporting was inconsistent between methods and the results in 27% of studies due to additional analysis and selective reporting. Thirty-two studies (out of 36 applicable studies) reported comparisons between developed models in their discussion and predominantly used discrimination measures to support their claims (78%). Thirty-five studies (56%) used an overly strong or leading word in their title, abstract, results, discussion, or conclusion. CONCLUSION The potential for spin needs to be considered when reading, interpreting, and using studies that developed and validated prognostic models in oncology. Researchers should carefully report their prognostic model research using words that reflect their actual results and strength of evidence.
Affiliation(s)
- Paula Dhiman
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Jie Ma
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Constanza L Andaur Navarro
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Benjamin Speich
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; Meta-Research Centre, Department of Clinical Research, University Hospital Basel, University of Basel, Basel, Switzerland
- Garrett Bullock
  - Nuffield Department of Orthopaedics, Rheumatology, and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- Johanna A A Damen
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Lotty Hooft
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Shona Kirtley
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK
- Richard D Riley
  - Centre for Prognosis Research, School of Medicine, Keele University, Staffordshire, UK, ST5 5BG
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands; EPI-centre, KU Leuven, Leuven, Belgium
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- Gary S Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford OX3 7LD, UK; NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
6
Jankowski S, Boutron I, Clarke M. Influence of the statistical significance of results and spin on readers' interpretation of the results in an abstract for a hypothetical clinical trial: a randomised trial. BMJ Open 2022; 12:e056503. [PMID: 35396295 PMCID: PMC8996040 DOI: 10.1136/bmjopen-2021-056503] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/27/2021] [Accepted: 02/28/2022] [Indexed: 11/03/2022] Open
Abstract
OBJECTIVES To assess the impact on readers' interpretation of the results reported in an abstract for a hypothetical clinical trial with (1) a statistically significant result (SSR), (2) spin, (3) both an SSR and spin compared with (4) no spin and no SSR. PARTICIPANTS Health students and professionals from universities and health institutions in France and the UK. INTERVENTIONS Participants completed an online questionnaire using Likert scales and free text, after reading one of the four versions of an abstract about a hypothetical randomised trial evaluating 'Naranex' and 'Bulofil' (two hypothetical drugs) for chronic low back pain. The abstracts differed in (1) reported result of 'mean difference of 1.31 points (95% CI 0.08 to 2.54, p= 0.04)' or 'mean difference of 1.31 points (95% CI -0.08 to 2.70, p= 0.06)' and (2) presence or absence of spin. The effect size for the trial's primary outcome (pain disability score) was the same in each abstract, slightly in favour of Naranex. PRIMARY OUTCOME The reader's interpretation of the trial's results, based on their answer (1, disagree; 4, neutral; 7, agree) to the following statement: 'About the main findings of the study, what is your opinion about the following statement: 'Naranex is better than Bulofil'?' RESULTS Two hundred and ninety-seven of the 404 people randomised to receive one of the four abstracts completed the study. Respondents were more likely to favour Naranex when the abstract reported an SSR without spin, an SSR with spin, or a non-SSR with spin than when it reported a non-SSR without spin. CONCLUSION Statistical significance appears to have influenced readers' perception whatever the level of spin, while spin influenced readers' perception when the results were not statistically significant but did not appear to have an impact when results were statistically significant.
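The two hypothetical results quoted in this abstract share the same point estimate and differ only in the width of the confidence interval, which is what moves the p-value across 0.05. A short sketch of that arithmetic follows, assuming a normal approximation in which the 95% interval is the estimate plus or minus 1.96 standard errors. The function name `p_from_ci` is hypothetical, and the abstract does not state how its p-values were derived.

```python
import math


def p_from_ci(estimate: float, lower: float, upper: float, z_crit: float = 1.959964) -> float:
    """Approximate two-sided p-value from an estimate and its 95% CI.

    Assumes a symmetric, normally distributed estimator (an assumption).
    """
    if upper <= lower:
        raise ValueError("upper must exceed lower")
    standard_error = (upper - lower) / (2 * z_crit)
    z = abs(estimate) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(z / math.sqrt(2))


# The two versions of the result quoted in the abstract.
p_significant = p_from_ci(1.31, 0.08, 2.54)
p_nonsignificant = p_from_ci(1.31, -0.08, 2.70)
print(f"CI 0.08 to 2.54  -> p = {p_significant:.2f}")
print(f"CI -0.08 to 2.70 -> p = {p_nonsignificant:.2f}")
```

The effect size is identical in both calls; only the implied standard error changes, which illustrates why the study could vary statistical significance while holding the effect size fixed.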
Affiliation(s)
- Sofyan Jankowski
- Université Paris Cité, INSERM, INRAE, CNAM, Centre for Research in Epidemiology and Statistics (CRESS), F-75004, Paris, France
- Centre for Public Health, Queen's University Belfast, Belfast, UK
| | - Isabelle Boutron
- Université Paris Cité, INSERM, INRAE, CNAM, Centre for Research in Epidemiology and Statistics (CRESS), F-75004, Paris, France
| | - Mike Clarke
- Centre for Public Health, Queen's University Belfast, Belfast, UK
| |
|
7
|
Ito C, Hashimoto A, Uemura K, Oba K. Misleading Reporting (Spin) in Noninferiority Randomized Clinical Trials in Oncology With Statistically Not Significant Results: A Systematic Review. JAMA Netw Open 2021; 4:e2135765. [PMID: 34874407 PMCID: PMC8652604 DOI: 10.1001/jamanetworkopen.2021.35765] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/14/2022] Open
Abstract
IMPORTANCE Spin, the inaccurate reporting of randomized clinical trials (RCTs) with results that are not statistically significant for the primary end point, distorts interpretation of results and leads to misinterpretation. However, the prevalence of spin and related factors in noninferiority cancer RCTs remains unclear. OBJECTIVE To examine misleading reporting, or spin, and the associated factors in noninferiority cancer RCTs through a systematic review. DATA SOURCES A systematic search of the PubMed database was performed for articles published between January 1, 2010, and December 31, 2019, using the Cochrane Highly Sensitive Search Strategy. STUDY SELECTION Two investigators independently selected studies using the inclusion criteria of noninferiority parallel-group RCTs aiming to confirm effects to cancer treatments published between January 1, 2010, and December 31, 2019, reporting results that were not statistically significant for the primary end points. DATA EXTRACTION AND SYNTHESIS Standardized data abstraction was used to extract information concerning the trial characteristics and spin based on a prespecified definition. The main investigator extracted the trial characteristics while both readers independently evaluated the spin. The Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) reporting guideline was followed. MAIN OUTCOMES AND MEASURES The main outcome was spin prevalence in any section of the report. Spin was defined as use of specific reporting strategies, from whatever motive, to highlight that the experimental treatment is beneficial, despite no statistically significant difference for the primary outcome, or to distract the reader from results that are not statistically significant. The associations (prevalence difference and odds ratios [ORs]) between spin and trial characteristics were also evaluated. RESULTS The analysis included 52 of 2752 reports identified in the PubMed search. 
Spin was identified in 39 reports (75.0%; 95% CI, 61.6%-84.9%), including the abstract (34 reports [65.4%; 95% CI, 51.1%-76.9%]) and the main text (38 reports [73.1%; 95% CI, 59.7%-83.3%]). Univariate analysis found that the spin prevalence was higher in reports with data managers (prevalence difference, 27%; 95% CI, 1.1%-50.3%), reports without funding from for-profit sources (prevalence difference, 31.2%; 95% CI, 4.8%-53.8%), and reports of novel experimental treatments (prevalence difference, 37.5%; 95% CI, 5.8%-64.7%). Multivariable analysis found that novel experimental treatment (OR, 4.64; 95% CI, 0.98-22.02) and funding only from nonprofit sources (OR, 5.20; 95% CI, 1.21-22.29) were associated with spin. CONCLUSIONS AND RELEVANCE In this systematic review, most noninferiority RCTs reporting results that were not statistically significant for the primary end points showed distorted interpretation and inaccurate reporting. The novelty of an experimental treatment and funding only from nonprofit sources were associated with spin.
Affiliation(s)
- Chiyo Ito
- Graduate School of Interdisciplinary Information Studies, The University of Tokyo, Tokyo, Japan
- Clinical and Translational Research Center, Niigata University Medical and Dental Hospital, Niigata, Japan
| | - Atsushi Hashimoto
- Clinical and Translational Research Center, Niigata University Medical and Dental Hospital, Niigata, Japan
| | - Kohei Uemura
- Interfaculty Initiative in Information Studies, The University of Tokyo, Tokyo, Japan
| | - Koji Oba
- Interfaculty Initiative in Information Studies, The University of Tokyo, Tokyo, Japan
- Department of Biostatistics, School of Public Health, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
| |
|
8
|
Shirvani S, Rives-Lange C, Rassy N, Berger A, Carette C, Poghosyan T, Czernichow S. Spin in the Scientific Literature on Bariatric Endoscopy: a Systematic Review of Randomized Controlled Trials. Obes Surg 2021; 32:503-511. [PMID: 34783961 DOI: 10.1007/s11695-021-05790-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/24/2021] [Revised: 10/28/2021] [Accepted: 11/09/2021] [Indexed: 11/28/2022]
Abstract
Bariatric endoscopy (BE) is an emerging treatment option for people with obesity. Spin (i.e., the practice of frequent misrepresentation or overinterpretation of study findings) may lead to imbalanced and unjustified optimism in the interpretation of the results. The aim of this systematic review was to determine the frequency and type of spin in randomized controlled trials (RCTs) of endoscopic primary weight loss techniques with statistically significant and nonsignificant primary outcomes. In conclusion, spin is observed in the abstract and main text of BE reports and can lead to misinterpretation or overinterpretation of the results. Since BE challenges the available non-endoscopic treatments for obesity, further research with predefined hypotheses and analyses is needed to better establish whether these techniques are effective and safe.
Affiliation(s)
- Sayeh Shirvani
- UMR1153, Epidemiology and Biostatistics Sorbonne Paris Cité Center (CRESS), METHODS team, INSERM, Paris, France
| | - Claire Rives-Lange
- UMR1153, Epidemiology and Biostatistics Sorbonne Paris Cité Center (CRESS), METHODS team, INSERM, Paris, France; Assistance Publique - Hôpitaux de Paris (AP-HP), Hôpital Européen Georges Pompidou, Service de Nutrition, Centre Spécialisé Obésité, Université de Paris, 20 rue Leblanc, 75015, Paris, France
| | - Nathalie Rassy
- Département de Médecine Oncologique, Gustave Roussy, Villejuif, France
| | - Arthur Berger
- Pôle hépato-gastro-entérologie, diabétologie, nutrition et endocrinologie, Centre Hospitalier Universitaire (CHU) de Bordeaux, Bordeaux, France
| | - Claire Carette
- Assistance Publique - Hôpitaux de Paris (AP-HP), Hôpital Européen Georges Pompidou, Service de Nutrition, Centre Spécialisé Obésité, Université de Paris, 20 rue Leblanc, 75015, Paris, France; INSERM, U1418, Centre d'Investigation Clinique (CIC), Université de Paris, Paris, France
| | - Tigran Poghosyan
- Assistance Publique - Hôpitaux de Paris (AP-HP), Service de chirurgie digestive, Hôpital Européen Georges Pompidou, Université de Paris, Inserm UMRS 1149, Paris, France
| | - Sébastien Czernichow
- UMR1153, Epidemiology and Biostatistics Sorbonne Paris Cité Center (CRESS), METHODS team, INSERM, Paris, France; Assistance Publique - Hôpitaux de Paris (AP-HP), Hôpital Européen Georges Pompidou, Service de Nutrition, Centre Spécialisé Obésité, Université de Paris, 20 rue Leblanc, 75015, Paris, France
| |
|
9
|
Pereira GC, Prates G, Medina M, Ferreira C, Latorraca CDOC, Pacheco RL, Martimbianco ALC, Riera R. High frequency of spin bias in controlled trials of cannabis derivatives and their synthetic analogues: A meta-epidemiologic study. J Clin Epidemiol 2021; 140:3-12. [PMID: 34450305 DOI: 10.1016/j.jclinepi.2021.08.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2021] [Revised: 08/18/2021] [Accepted: 08/19/2021] [Indexed: 11/19/2022]
Abstract
OBJECTIVE To investigate the frequency and perform a qualitative analysis of spin bias in publications of controlled trials assessing the therapeutic use of cannabis derivatives and their synthetic analogues. STUDY DESIGN AND SETTING Meta-epidemiologic study carried out at the Universidade Federal de São Paulo, Brazil. RESULTS A total of 65 publications with at least one efficacy primary outcome were considered. The results analysis for the primary outcome indicated statistically significant effects in 44.6% (29/65) of the publications, and 70.7% (45/65) of the conclusions were considered favorable to the intervention. Among the 36 publications that found statistically nonsignificant results for the primary outcome, 44.4% (16/36) presented conclusions favorable to or recommending the intervention, which represents spin bias according to the definition adopted in this study. Qualitative analysis of the 16 studies with spin bias showed selective outcomes reporting (elevating secondary outcomes that had positive results or reporting only subgroup results), deviations from the planned statistical analysis, and failure to consider or report uncertainty in the estimates of treatment effects. CONCLUSION The frequency of spin bias among publications of controlled trials with statistically nonsignificant results assessing the therapeutic use of cannabis derivatives and their synthetic analogues reached 44.4%. When not observed by readers, such deviation can lead to misconduct in clinical practice through the adoption of interventions that are not effective or whose effectiveness is uncertain.
Affiliation(s)
| | - Gabriela Prates
- Universidade Federal de São Paulo (Unifesp), São Paulo - SP, Brazil
| | - Matheus Medina
- Universidade Federal de São Paulo (Unifesp), São Paulo - SP, Brazil
| | | | | | - Rafael Leite Pacheco
- Centro Universitário São Camilo (CUSC), São Paulo - SP, Brazil. Hospital Sírio-Libanês. Universidade Federal de São Paulo (Unifesp). Oxford-Brazil EBM Alliance.
| | | | - Rachel Riera
- Centre of Health Technology Assessment, Hospital Sírio-Libanês, São Paulo, Brazil. Discipline of Evidence-Based Medicine, Escola Paulista de Medicina (EPM), Universidade Federal de São Paulo (Unifesp), São Paulo - SP, Brazil. Oxford-Brazil EBM Alliance
| |
|
10
|
Faulkner JJ, Polson C, Dodd AH, Ottwell R, Arthur W, Neff J, Chronister J, Hartwell M, Wright DN, Vassar M. Evaluation of spin in the abstracts of systematic reviews and meta-analyses focused on the treatment of obesity. Obesity (Silver Spring) 2021; 29:1285-1293. [PMID: 34314111 DOI: 10.1002/oby.23192] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/16/2020] [Revised: 02/27/2021] [Accepted: 03/23/2021] [Indexed: 01/08/2023]
Abstract
OBJECTIVE Spin, i.e., the misrepresentation of research findings, has the potential to affect patient care. Evidence suggests that spin is prevalent in obesity randomized controlled trials. Therefore, the primary objective of this study was to evaluate spin in abstracts of systematic reviews covering obesity treatments. METHODS MEDLINE and Embase were searched to retrieve systematic reviews on obesity treatments. Each systematic review abstract was inspected for the nine most severe types of spin, i.e., the misrepresentation of study findings by exaggeration or omission, regardless of intentionality. Screening and data extraction occurred in a masked, triplicate fashion. Methodological quality was determined using A MeaSurement Tool to Assess systematic Reviews (AMSTAR-2). RESULTS Spin was identified in 20 (out of 200, 10%) abstracts, with spin type 5 (claiming efficacy despite high risk of bias among primary studies) being most common (11/200, 5.5%). Spin types 2 and 7, both related to unsupported efficacy claims, were not found. No associations were found between spin and extracted study characteristics. The methodological quality of the sample was rated as follows: critically low (23.0%), low (13.5%), moderate (60.5%), and high (3%). CONCLUSIONS Although these findings demonstrate a low proportion of spin in the abstracts of systematic reviews for obesity treatment, increased preventive measures may further reduce its presence.
Affiliation(s)
- Jantzen J Faulkner
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Connor Polson
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Andrew H Dodd
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Ryan Ottwell
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Wade Arthur
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Jenny Neff
- Department of Internal Medicine, Oklahoma State University Medical Center, Tulsa, Oklahoma, USA
| | - Justin Chronister
- Department of Internal Medicine, Oklahoma State University Medical Center, Tulsa, Oklahoma, USA
| | - Micah Hartwell
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
- Department of Psychiatry and Behavioral Sciences, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Drew N Wright
- Samuel J. Wood Library & C. V. Starr Biomedical Information Center, Weill Cornell Medical College, New York, New York, USA
| | - Matt Vassar
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
- Department of Psychiatry and Behavioral Sciences, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| |
|
11
|
Corcoran A, Neale M, Arthur W, Ottwell R, Roberts W, Hartwell M, Cates S, Wright DN, Beaman J, Vassar M. Evaluating spin in the abstracts of systematic reviews and meta-analyses on cannabis use disorder. Subst Abus 2021; 43:1-9. [PMID: 34283700 DOI: 10.1080/08897077.2021.1944953] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/08/2023]
Abstract
BACKGROUND Clinicians rely upon abstracts to provide them quick synopses of research findings that may apply to their practice. Spin can exist within these abstracts that distorts or misrepresents the findings. Our goal was to evaluate the level of spin within systematic reviews (SRs) focused on the treatment of cannabis use disorder (CUD). METHODS A systematic search was conducted in May 2020. To meet inclusion criteria, publications had to be either an SR or meta-analysis related to the treatment of cannabis use. Screening and data extraction were performed in a duplicate and masked fashion. Study quality was assessed using AMSTAR-2. RESULTS 16/24 SRs (66.7%) contained at least one form of spin in the abstract. The most common forms of spin identified were type 3, selective reporting of or overemphasis on efficacy outcomes or analysis favoring the beneficial effect of the experimental intervention (45.8%), and type 8, extrapolation of the review's findings from a surrogate marker or a specific outcome to the global improvement of the disease (37.5%). No significant association between spin and intervention type, PRISMA requirements, or funding source was identified. Weak positive correlations were found between the presence of spin and abstract word count (r = 0.217) and between spin and AMSTAR-2 rating (r = 0.143). "Moderate" was the most common AMSTAR-2 rating (9/24, 37.5%), followed by "low" (7/24, 29.2%) and "critically low" (7/24, 29.2%). One systematic review received an AMSTAR-2 rating of "high" (1/24, 4.2%). CONCLUSIONS Spin was common among abstracts from the SRs focused on the treatments for CUD. Higher quality studies may help reduce the overall rate as well as standardizing treatment outcomes. To facilitate this, we encourage all authors, peer-reviewers, and editors to be more aware of the various types of spin as they can help reduce the overall amount of spin seen within the literature.
Affiliation(s)
- Adam Corcoran
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Monika Neale
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
- College of Osteopathic Medicine, Kansas City University of Medicine and Biosciences, Kansas City, Missouri, USA
| | - Wade Arthur
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Ryan Ottwell
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Will Roberts
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Micah Hartwell
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
- Department of Psychiatry and Behavioral Sciences, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Stephens Cates
- Department of Psychiatry and Behavioral Sciences, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Drew N Wright
- Samuel J. Wood Library & C. V. Starr Biomedical Information Center, Weill Cornell Medical College, New York, New York, USA
| | - Jason Beaman
- Department of Psychiatry and Behavioral Sciences, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| | - Matt Vassar
- Office of Medical Student Research, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
- Department of Psychiatry and Behavioral Sciences, Oklahoma State University Center for Health Sciences, Tulsa, Oklahoma, USA
| |
|
12
|
Velde HM, van Heteren JAA, Smit AL, Stegeman I. Spin in Published Reports of Tinnitus Randomized Controlled Trials: Evidence of Overinterpretation of Results. Front Neurol 2021; 12:693937. [PMID: 34335451 PMCID: PMC8322656 DOI: 10.3389/fneur.2021.693937] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/12/2021] [Accepted: 06/01/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Spin refers to reporting practices that could distort the interpretation and mislead readers by being more optimistic than the results justify, thereby possibly changing the perception of clinicians and influencing their decisions. Because of the clinical importance of accurate interpretation of results and the evidence of spin in other research fields, we aim to identify the nature and frequency of spin in published reports of tinnitus randomized controlled trials (RCTs) and to assess possible determinants and effects of spin. Methods: We searched PubMed systematically for RCTs with tinnitus-related outcomes published from 2015 to 2019. All eligible articles were assessed on actual and potential spin using prespecified criteria. Results: Our search identified 628 studies, of which 87 were eligible for evaluation. A total of 95% of the studies contained actual or potential spin. Actual spin was found mostly in the conclusion of articles, which reflected something other than the reported point estimate (or CI) of the outcome (n = 34, 39%) or which was selectively focused (n = 49, 56%). Linguistic spin ("trend," "marginally significant," or "tendency toward an effect") was found in 17% of the studies. We were not able to assess the association between study characteristics and the occurrence of spin due to the low number of trials for some categories of the study characteristics. We found no effect of spin on type of journal [odds ratio (OR) -0.13, 95% CI -0.56-0.31], journal impact factor (OR 0.17, 95% CI -0.18-0.51), or number of citations (OR 1.95, CI -2.74-6.65). Conclusion: There is a large amount of spin in tinnitus RCTs. Our findings show that there is room for improvement in reporting and interpretation of results. Awareness of different forms of spin must be raised to improve research quality and reduce research waste.
Affiliation(s)
- Hedwig M. Velde
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Utrecht, Utrecht, Netherlands
- University Medical Center Utrecht Brain Center, University Medical Center Utrecht, Utrecht, Netherlands
| | - Jan A. A. van Heteren
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Utrecht, Utrecht, Netherlands
- University Medical Center Utrecht Brain Center, University Medical Center Utrecht, Utrecht, Netherlands
| | - Adriana L. Smit
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Utrecht, Utrecht, Netherlands
- University Medical Center Utrecht Brain Center, University Medical Center Utrecht, Utrecht, Netherlands
| | - Inge Stegeman
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Utrecht, Utrecht, Netherlands
- University Medical Center Utrecht Brain Center, University Medical Center Utrecht, Utrecht, Netherlands
- Department of Ophthalmology, University Medical Center Utrecht, Utrecht, Netherlands
- Epidemiology and Data Science, Amsterdam University Medical Center, University of Amsterdam, Amsterdam, Netherlands
| |
|
13
|
Olarte Parra C, Bertizzolo L, Schroter S, Dechartres A, Goetghebeur E. Consistency of causal claims in observational studies: a review of papers published in a general medical journal. BMJ Open 2021; 11:e043339. [PMID: 34016660 PMCID: PMC8141434 DOI: 10.1136/bmjopen-2020-043339] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVE To evaluate the consistency of causal statements in observational studies published in The BMJ. DESIGN Review of observational studies published in a general medical journal. DATA SOURCE Cohort and other longitudinal studies describing an exposure-outcome relationship published in The BMJ in 2018. We also had access to the submitted papers and reviewer reports. MAIN OUTCOME MEASURES Proportion of published research papers with 'inconsistent' use of causal language. Papers where language was consistently causal or non-causal were classified as 'consistently causal' or 'consistently not causal', respectively. For the 'inconsistent' papers, we then compared the published and submitted versions. RESULTS Of 151 published research papers, 60 described eligible studies. Of these 60, we classified the causal language used as 'consistently causal' (48%), 'inconsistent' (20%) and 'consistently not causal' (32%). Eleven out of 12 (92%) of the 'inconsistent' papers were already inconsistent on submission. The inconsistencies found in both submitted and published versions were mainly due to mismatches between objectives and conclusions. One section might be carefully phrased in terms of association while the other presented causal language. When identifying only an association, some authors jumped to recommending acting on the findings as if motivated by the evidence presented. CONCLUSION Further guidance is necessary for authors on what constitutes a causal statement and how to justify or discuss assumptions involved. Based on screening these papers, we provide a list of expressions beyond the obvious 'cause' word which may inspire a useful, more comprehensive compendium on causal language.
Affiliation(s)
- Camila Olarte Parra
- Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
- Clinical Epidemiology, Biostatistics and Bioinformatics, University of Amsterdam, Amsterdam, Noord-Holland, The Netherlands
| | - Lorenzo Bertizzolo
- Clinical Epidemiology, Biostatistics and Bioinformatics, University of Amsterdam, Amsterdam, Noord-Holland, The Netherlands
- U 1153, Equipe Methods, INSERM, Paris, France
| | | | - Agnès Dechartres
- Département de Santé Publique, Centre de Pharmacoépidémiologie de l'AP-HP (Cephepi), CIC-1422, F75013, Sorbonne Université, Hôpital Pitié Salpêtrière, Paris, France
- Institut Pierre Louis d'Epidémiologie et de Santé Publique, AP-HP, Sorbonne Université, Institut National de la Santé et de la Recherche Médicale (INSERM), Paris, France
| | - Els Goetghebeur
- Applied Mathematics, Computer Science and Statistics, Ghent University, Gent, Belgium
| |
|
14
|
Flores H, Kannan D, Ottwell R, Arthur W, Hartwell M, Patel N, Bowers A, Po W, Wright DN, Chen S, Miao Z, Vassar M. Evaluation of spin in the abstracts of systematic reviews and meta-analyses on breast cancer treatment, screening, and quality of life outcomes: A cross-sectional study. J Cancer Policy 2021; 27:100268. [DOI: 10.1016/j.jcpo.2020.100268] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/13/2020] [Revised: 12/13/2020] [Accepted: 12/14/2020] [Indexed: 01/23/2023]
|
15
|
Ghannad M, Yang B, Leeflang M, Aldcroft A, Bossuyt PM, Schroter S, Boutron I. A randomized trial of an editorial intervention to reduce spin in the abstract's conclusion of manuscripts showed no significant effect. J Clin Epidemiol 2020; 130:69-77. [PMID: 33096222 DOI: 10.1016/j.jclinepi.2020.10.014] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 07/08/2020] [Revised: 10/13/2020] [Accepted: 10/15/2020] [Indexed: 01/10/2023]
Abstract
OBJECTIVE To estimate the effect of an intervention compared to the usual peer-review process on reducing spin in the abstract's conclusion of biomedical study reports. STUDY DESIGN AND SETTING We conducted a two-arm, parallel-group RCT in a sample of primary research manuscripts submitted to BMJ Open. The authors received short instructions alongside the peer reviewers' comments in the intervention group. We assessed the presence of spin (primary outcome), types of spin, and wording change in the revised abstract's conclusion. Outcome assessors were blinded to the intervention assignment. RESULTS Of the 184 manuscripts randomized, 108 (54 intervention, 54 control) were selected for revision and could be evaluated for the presence of spin. The proportion of manuscripts with spin was 6% lower (95% CI: 24% lower to 13% higher) in the intervention group (57%, 31/54) than in the control group (63%, 34/54). The wording of the revised abstract's conclusion was changed in 34/54 (63%) manuscripts in the intervention group and 26/54 (48%) in the control group. The four prespecified types of spin involved (i) selective reporting (12 in the intervention group vs. 8 in the control group), (ii) including information not supported by evidence (9 vs. 9), (iii) interpretation not consistent with the study results (14 vs. 18), and (iv) unjustified recommendations for practice (5 vs. 11). CONCLUSION These short instructions to authors did not have a statistically significant effect on reducing spin in revised abstract conclusions, and based on the confidence interval, the existence of a large effect can be excluded. Other interventions to reduce spin in reports of original research should be evaluated. STUDY REGISTRATION osf.io/xnuyt.
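As a worked illustration of the arithmetic behind the reported risk difference, the sketch below recomputes the difference in spin proportions (31/54 vs. 34/54) with a simple Wald 95% confidence interval. The abstract does not state which interval method the authors used, so the Wald formula and the function name `risk_difference_wald_ci` are assumptions made here for illustration only.

```python
import math


def risk_difference_wald_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Return (difference, lower, upper) for p_a - p_b using a Wald interval.

    Assumption: a plain Wald (normal approximation) interval; the source
    abstract does not say which method was actually used.
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Counts taken from the abstract: intervention 31/54, control 34/54.
diff, lower, upper = risk_difference_wald_ci(31, 54, 34, 54)
print(f"difference: {diff:+.1%}, 95% CI: {lower:+.1%} to {upper:+.1%}")
```

Under this assumption the interval rounds to roughly 24% lower to 13% higher, which is consistent with the figures quoted in the abstract.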
Affiliation(s)
- Mona Ghannad
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Amsterdam UMC, University of Amsterdam, Amsterdam Public Health Research Institute, Meibergdreef 9, 1105 AZ Amsterdam, the Netherlands; Université de Paris, CRESS, INSERM, INRA, F-75004 Paris, France.
| | - Bada Yang
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Amsterdam UMC, University of Amsterdam, Amsterdam Public Health Research Institute, Meibergdreef 9, 1105 AZ Amsterdam, the Netherlands
| | | | | | - Patrick M Bossuyt
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Amsterdam UMC, University of Amsterdam, Amsterdam Public Health Research Institute, Meibergdreef 9, 1105 AZ Amsterdam, the Netherlands
| | | | | |
|
16
|
Dormire KD, Whitehead AJ, Wayant C, Bowers A, Vassar M. Evaluation of misrepresented findings in the abstracts of acute respiratory distress syndrome randomized trials with nonsignificant primary endpoints. The Clinical Respiratory Journal 2020; 15:287-292. [PMID: 33080096 DOI: 10.1111/crj.13295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2020] [Accepted: 10/08/2020] [Indexed: 11/30/2022]
Abstract
OBJECTIVE We investigated the randomized controlled trials (RCTs) related to acute respiratory distress syndrome (ARDS) to assess the presentation and frequency of misrepresented research findings, also known as spin. METHODS We searched PubMed (MEDLINE) for studies published from January 1, 2011 to December 31, 2019. We included randomized controlled trials with an ARDS intervention and a nonsignificant primary endpoint. Trial screening and data extraction were performed on all studies independently and in duplicate. The primary endpoint was to investigate the frequency and manifestation of spin in RCT abstracts. Our secondary endpoint was to investigate associations between funding source and spin. RESULTS Our PubMed search returned 766 articles with 37 meeting inclusion criteria. Spin was present in 14 (14/37, 37.8%; 95% CI 22.5%-55.2%) abstracts. The most common manifestations of spin were claiming benefit based on a significant secondary endpoint (6/14, 42.9%), followed by the use of 'trend' statements, such as 'trend toward significance' (2/14, 14.3%; 95% CI 1.8%-42.8%). The most common spin in abstract conclusions was in the form of claiming benefit due to a significant secondary endpoint (3/4, 75%; 95% CI 19.4%-99.4%). Our secondary endpoint did not identify a significant difference in the prevalence of spin in publicly funded (5/19, 26.3%; 95% CI 9.1%-51.2%) compared to privately funded (4/12, 33.3%; 95% CI 9.9%-65.1%) studies (p>.05). CONCLUSIONS RCTs of ARDS interventions with nonsignificant primary endpoints often included spin in the abstract. Spin in the abstract may influence clinician appraisal and interpretation of diagnostic or treatment modalities.
Affiliation(s)
- Kody D Dormire
- Oklahoma State University Center for Health Sciences, Tulsa, OK, USA
| | - Aldon J Whitehead
- Oklahoma State University Center for Health Sciences, Tulsa, OK, USA
| | - Cole Wayant
- Oklahoma State University Center for Health Sciences, Tulsa, OK, USA
| | - Aaron Bowers
- Oklahoma State University Center for Health Sciences, Tulsa, OK, USA
| | - Matt Vassar
- Oklahoma State University Center for Health Sciences, Tulsa, OK, USA
| |
|
17
|
Bovbjerg ML. Current Resources for Evidence-Based Practice, September 2020. J Obstet Gynecol Neonatal Nurs 2020; 49:487-499. [PMID: 32805207 PMCID: PMC7428455 DOI: 10.1016/j.jogn.2020.08.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
Abstract
An extensive review of new resources to support the provision of evidence-based care for women and infants. The current column includes a discussion of “spin” in scientific reporting and its effect on summaries and syntheses of the literature and commentaries on reviews about early versus late amniotomy as part of labor induction protocols and the economic burden associated with maternal morbidity.
|
18
|
Soriano Sánchez JA, Soriano Solís S, Romero Rangel JAI. Role of the Checklist in Neurosurgery, a Realistic Perspective to "The Need for Surgical Safety Checklists in Neurosurgery Now and in the Future - a Systematic Review". World Neurosurg 2019; 134:121-122. [PMID: 31678319 DOI: 10.1016/j.wneu.2019.10.143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2019] [Accepted: 10/22/2019] [Indexed: 10/25/2022]
|