1
Jiang L, Lan M, Menke JD, Vorland CJ, Kilicoglu H. CONSORT-TM: Text classification models for assessing the completeness of randomized controlled trial publications. medRxiv 2024:2024.03.31.24305138. [PMID: 38633775; PMCID: PMC11023672; DOI: 10.1101/2024.03.31.24305138]
Abstract
Objective To develop text classification models for determining whether the checklist items in the CONSORT reporting guidelines are reported in randomized controlled trial publications.
Materials and Methods Using a corpus annotated at the sentence level with 37 fine-grained CONSORT items, we trained several sentence classification models (PubMedBERT fine-tuning, BioGPT fine-tuning, and in-context learning with GPT-4) and compared their performance. To address the small size of the training dataset, we used several data augmentation methods (EDA, UMLS-EDA, and text generation and rephrasing with GPT-4) and assessed their impact on the fine-tuned PubMedBERT model. We also fine-tuned PubMedBERT models limited to checklist items associated with specific sections (e.g., Methods) to evaluate whether such models could outperform the single full model. We performed 5-fold cross-validation and report precision, recall, F1 score, and area under the curve (AUC).
Results The fine-tuned PubMedBERT model that takes as input the sentence and the surrounding sentence representations and uses section headers yielded the best overall performance (0.71 micro-F1, 0.64 macro-F1). Data augmentation had a limited positive effect, with UMLS-EDA yielding slightly better results than data augmentation using GPT-4. BioGPT fine-tuning and GPT-4 in-context learning exhibited suboptimal results. The Methods-specific model yielded higher performance for methodology items; other section-specific models had no significant impact.
Conclusion Most CONSORT checklist items can be recognized reasonably well with the fine-tuned PubMedBERT model, but there is room for improvement. Improved models can underpin journal editorial workflows and CONSORT adherence checks and can help authors improve the reporting quality and completeness of their manuscripts.
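The micro- and macro-F1 scores reported above differ in how they weight the 37 checklist items: micro-F1 pools counts over all sentence-level decisions, while macro-F1 averages per-item F1 so rare items count as much as frequent ones. A minimal stdlib sketch (the labels and predictions below are invented for illustration; the study itself uses 37 items):

```python
from collections import Counter

def micro_macro_f1(gold, pred):
    """Micro-F1 pools TP/FP/FN over all labels; macro-F1 averages per-label F1."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    def f1(t, f_p, f_n):
        prec = t / (t + f_p) if t + f_p else 0.0
        rec = t / (t + f_n) if t + f_n else 0.0
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels)
    return micro, macro

# Three hypothetical CONSORT items over six sentences.
gold = ["title", "randomization", "randomization", "outcomes", "outcomes", "outcomes"]
pred = ["title", "randomization", "outcomes", "outcomes", "outcomes", "title"]
micro, macro = micro_macro_f1(gold, pred)
```

A gap between the two scores (as in the 0.71 vs. 0.64 above) typically signals weaker performance on infrequent checklist items.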
Affiliation(s)
- Lan Jiang
- School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Mengfei Lan
- School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Joe D. Menke
- School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
- Colby J Vorland
- Indiana University, School of Public Health, Bloomington, IN, USA
- Halil Kilicoglu
- School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, USA
2
Menke J, Eckmann P, Ozyurt IB, Roelandse M, Anderson N, Grethe J, Gamst A, Bandrowski A. Establishing Institutional Scores With the Rigor and Transparency Index: Large-scale Analysis of Scientific Reporting Quality. J Med Internet Res 2022; 24:e37324. [PMID: 35759334; PMCID: PMC9274430; DOI: 10.2196/37324]
Abstract
BACKGROUND Improving rigor and transparency measures should lead to improvements in reproducibility across the scientific literature; however, the assessment of measures of transparency tends to be very difficult if performed manually.
OBJECTIVE This study addresses the enhancement of the Rigor and Transparency Index (RTI, version 2.0), which attempts to automatically assess the rigor and transparency of journals, institutions, and countries using manuscripts scored on criteria found in reproducibility guidelines (eg, Materials Design, Analysis, and Reporting checklist criteria).
METHODS The RTI tracks 27 entity types using natural language processing techniques such as Bidirectional Long Short-term Memory Conditional Random Field-based models and regular expressions; this allowed us to assess over 2 million papers accessed through PubMed Central.
RESULTS Between 1997 and 2020 (where data were readily available in our data set), rigor and transparency measures showed general improvement (RTI 2.29 to 4.13), suggesting that authors are taking the need for improved reporting seriously. The top-scoring journals in 2020 were the Journal of Neurochemistry (6.23), British Journal of Pharmacology (6.07), and Nature Neuroscience (5.93). We extracted the institution and country of origin from the author affiliations to expand our analysis beyond journals. Among institutions publishing >1000 papers in 2020 (in the PubMed Central open access set), Capital Medical University (4.75), Yonsei University (4.58), and University of Copenhagen (4.53) were the top performers in terms of RTI. In country-level performance, we found that Ethiopia and Norway consistently topped the RTI charts of countries with 100 or more papers per year. In addition, we tested our assumption that the RTI may serve as a reliable proxy for scientific replicability (ie, a high RTI represents papers containing sufficient information for replication efforts). Using work by the Reproducibility Project: Cancer Biology, we determined that replication papers (RTI 7.61, SD 0.78) scored significantly higher (P<.001) than the original papers (RTI 3.39, SD 1.12), which according to the project required additional information from authors to begin replication efforts.
CONCLUSIONS These results align with our view that the RTI may serve as a reliable proxy for scientific replicability. Unfortunately, RTI measures for journals, institutions, and countries fall short of the replicated-paper average. If we consider the RTI of these replication studies as a target for future manuscripts, more work will be needed to ensure that the average manuscript contains sufficient information for replication attempts.
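The replication-versus-original comparison above rests on a standard two-sample test. A sketch of Welch's t statistic computed from the reported means and SDs; the group sizes are not given in the abstract, so n = 18 per group is an invented placeholder:

```python
import math

# Summary statistics from the abstract; n per group is an assumption.
m_rep, sd_rep, n_rep = 7.61, 0.78, 18      # replication papers
m_orig, sd_orig, n_orig = 3.39, 1.12, 18   # original papers

# Welch's t: difference in means over the combined standard error,
# without assuming equal variances in the two groups.
se = math.sqrt(sd_rep**2 / n_rep + sd_orig**2 / n_orig)
t = (m_rep - m_orig) / se
```

With any plausible group size, a |t| of this magnitude is consistent with the reported P < .001.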
Affiliation(s)
- Joe Menke
- Center for Research in Biological Systems, University of California, San Diego, La Jolla, CA, United States
- SciCrunch Inc., San Diego, CA, United States
- Peter Eckmann
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
- Ibrahim Burak Ozyurt
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
- Jeffrey Grethe
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
- Anthony Gamst
- Department of Mathematics, University of California, San Diego, CA, United States
- Anita Bandrowski
- SciCrunch Inc., San Diego, CA, United States
- Department of Neuroscience, University of California, San Diego, La Jolla, CA, United States
3
Kearney A, Rosala-Hallas A, Rainford N, Blazeby JM, Clarke M, Lane AJ, Gamble C. Increased transparency was required when reporting imputation of primary outcome data in clinical trials. J Clin Epidemiol 2022; 146:60-67. [PMID: 35218883; DOI: 10.1016/j.jclinepi.2022.02.008]
Abstract
OBJECTIVE To explore the transparency of reporting primary outcome data within randomised controlled trials (RCTs) in the presence of missing data.
STUDY DESIGN AND SETTING A cohort examination of RCTs published in four major medical journals (NEJM, JAMA, BMJ, Lancet) in 2013 and the first quarter of 2018. Data were extracted on reporting quality, the number of randomised participants, and the number of participants included within the primary outcome analysis with observed or imputed data.
RESULTS 91/159 (57%) of the studies analysed from 2013 and 19/46 (41%) from 2018 included imputed data within the primary outcome analysis. Of these, only 13/91 (14%) studies from 2013 and 1/19 (5%) from 2018 explicitly reported the number of imputed values in the CONSORT diagram. Results tables included levels of imputed data in 12/91 (13%) studies in 2013 and 4/19 (21%) in 2018. Consequently, identifying imputed data was a time-consuming task requiring extensive cross-referencing of all manuscript elements.
CONCLUSION Imputed primary outcome data are poorly reported. Participant flow diagrams frequently reported participant status, which does not necessarily correspond to the availability of data. We recommend that the number of imputed values be explicitly reported within CONSORT flow diagrams to increase transparency.
Affiliation(s)
- Anna Kearney
- Department of Health Data Science, University of Liverpool, Liverpool, UK.
- Anna Rosala-Hallas
- Liverpool Clinical Trials Centre, University of Liverpool, Liverpool, UK
- Naomi Rainford
- Liverpool Clinical Trials Centre, University of Liverpool, Liverpool, UK
- Jane M Blazeby
- Bristol Biomedical Research Centre, Population Health Sciences, University of Bristol, Bristol, UK
- Mike Clarke
- Northern Ireland Methodology Hub, Centre for Public Health, Queen's University Belfast, Belfast, UK
- Athene J Lane
- CONDuCTII Hub for Trials Methodology Research, School of Social and Community Medicine, University of Bristol, Bristol, UK
- Carrol Gamble
- Department of Health Data Science, University of Liverpool, Liverpool, UK; Liverpool Clinical Trials Centre, University of Liverpool, Liverpool, UK
4
Kahan D. Critical Appraisal of Qualitative Studies of Muslim Females' Perceptions of Physical Activity Barriers and Facilitators. Int J Environ Res Public Health 2019; 16:E5040. [PMID: 31835677; DOI: 10.3390/ijerph16245040]
Abstract
Muslim women's perceptions of cultural, religious, and secular determinants of physical activity have been studied for many years, with information typically acquired through focus groups or interviews. Multiple reviews synthesizing this research have been published; however, individual studies have not been scrutinized for their quality and rigor. I therefore critically appraised the quality of the body of qualitative research studies that used focus groups to identify Muslim women's perceptions of physical activity barriers and facilitators. I used 26 items from the Consolidated Criteria for Reporting Qualitative Research (COREQ) to assess the quality of 56 papers published between 1987 and 2016. Using crosstabulations, I also examined associations between paper quality (low vs. high) and binary categorical variables for impact factor, maximum allowed paper length, publication year, and the database in which the paper was indexed. Overall, papers reported an average of only 10.5 of the 26 COREQ criteria, and only two of the 26 items were reported in more than 75% of the papers. Paper quality was not associated with impact factor or length. High-quality papers were more likely than low-quality papers to have been published recently (i.e., 2011 or later) and in journals indexed in the PubMed database. There is contention among qualitative researchers about standardizing reporting criteria, and while the trend in quality appears to be improving, journal reviewers and editors ought to hold authors more accountable for their reporting.
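The quality-by-covariate crosstabulations described above reduce to 2x2 tables of counts. A Pearson chi-square sketch with invented counts (the paper's actual cell counts are not given in the abstract):

```python
# Rows: low- vs high-quality papers; columns: published pre-2011 vs 2011 or later.
# Counts are invented for illustration only.
table = [[8, 20],
         [4, 24]]

row = [sum(r) for r in table]          # row totals
col = [sum(c) for c in zip(*table)]    # column totals
n = sum(row)                           # grand total

# Pearson chi-square: sum of (observed - expected)^2 / expected over cells,
# where expected[i][j] = row_total[i] * col_total[j] / grand_total.
chi2 = sum(
    (table[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
    for i in range(2) for j in range(2)
)
```

The statistic is then compared against the chi-square distribution with 1 degree of freedom to judge whether quality is associated with the covariate.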
5
McManus E, Turner D, Gray E, Khawar H, Okoli T, Sach T. Barriers and Facilitators to Model Replication Within Health Economics. Value Health 2019; 22:1018-1025. [PMID: 31511178; DOI: 10.1016/j.jval.2019.04.1928]
Abstract
BACKGROUND Model replication is important because it enables researchers to check research integrity and transparency and, potentially, to inform the model conceptualization process when developing a new or updated model.
OBJECTIVE The aim of this study was to evaluate the replicability of published decision-analytic models and to identify barriers and facilitators to replication.
METHODS Replication attempts of 5 published economic modeling studies were made. The replications were conducted using only publicly available information within the manuscripts and supplementary materials. The replicator attempted to reproduce the key results detailed in each paper, for example, the total costs, total outcomes, and, if applicable, the reported incremental cost-effectiveness ratio. Although a replication attempt was not explicitly classed as a success or failure, the percentage difference between the replicated and original results was calculated.
RESULTS In conducting the replication attempts, common barriers and facilitators emerged. For most case studies, the replicator needed to make additional assumptions when recreating the model, a problem often exacerbated by conflicting information in the text and the tables. Across the case studies, the variation between original and replicated results ranged from -4.54% to 108.00% for costs and from -3.81% to 0.40% for outcomes.
CONCLUSION This study demonstrates that although models may appear to be comprehensively reported, this is often not enough to facilitate a precise replication. Further work is needed to understand how to improve model transparency and, in turn, increase the chances of replication, thus ensuring future usability.
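The -4.54% to 108.00% spread above is a signed percentage difference of the replicated result from the original. A one-function sketch; the example figures are invented, not taken from the study:

```python
def pct_diff(original, replicated):
    """Signed percentage difference of a replicated result from the original."""
    return (replicated - original) / original * 100.0

# E.g. a replicated total cost of 1040 against an original of 500 is a
# +108% discrepancy; 9546 against 10000 is -4.54%.
high = pct_diff(500.0, 1040.0)
low = pct_diff(10000.0, 9546.0)
```

Keeping the sign distinguishes over- from under-estimation, which a plain absolute error would hide.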
Affiliation(s)
- Emma McManus
- Norwich Medical School, University of East Anglia, Norwich, England, UK.
- David Turner
- Norwich Medical School, University of East Anglia, Norwich, England, UK
- Ewan Gray
- Division of Population Health, Health Services Research & Primary Care, The University of Manchester, Manchester, England, UK
- Haseeb Khawar
- Norwich Medical School, University of East Anglia, Norwich, England, UK
- Toochukwu Okoli
- Norwich Medical School, University of East Anglia, Norwich, England, UK
- Tracey Sach
- Norwich Medical School, University of East Anglia, Norwich, England, UK
6
Fink A, Parhami I, Rosenthal RJ, Campos MD, Siani A, Fong TW. How transparent is behavioral intervention research on pathological gambling and other gambling-related disorders? A systematic literature review. Addiction 2012; 107:1915-28. [PMID: 22487136; PMCID: PMC3401241; DOI: 10.1111/j.1360-0443.2012.03911.x]
Abstract
AIMS To review the transparency of reports of behavioral interventions for pathological gambling and other gambling-related disorders.
METHODS We used the Transparent Reporting of Evaluations with Nonrandomized Designs (TREND) Statement to develop the 59-question adapted TREND questionnaire (ATQ). Each ATQ question corresponds to a transparency guideline and asks how clearly a study reports its objectives, research design, analytical methods, and conclusions. A subset of 23 ATQ questions is considered particularly important. We searched PubMed, PsycINFO, and Web of Science to identify experimental evaluations published between 2000 and 2011 aiming to reduce problem gambling behaviors or decrease problems caused by gambling. Twenty-six English-language reports met the inclusion criteria and were reviewed by three abstractors using the ATQ.
RESULTS The average report adhered to 38.4 (65.1%) of the 59 ATQ transparency guidelines. Each of the 59 ATQ questions received positive responses from an average of 16.9 (63.8%) of the reports. The subset of 23 particularly relevant questions received an average of 15.3 (66.5%) positive responses. Thirty-two of the 59 (54%) ATQ questions were answered positively by 75% or more of the study reports, while 12 (20.3%) received positive responses from 25% or fewer. Publication year did not affect these findings.
CONCLUSIONS Gambling intervention reports need to improve their transparency by adhering to currently neglected and particularly relevant guidelines. Among these are recommendations for comparing study participants who are lost to follow-up with those who are retained, comparing study participants with the target population, describing methods used to minimize potential bias due to group assignment, and reporting adverse events or unintended effects.
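The per-report adherence and per-question positive-response rates above follow directly from the matrix of ATQ answers. A toy sketch with 4 invented reports by 5 questions (the real review covers 26 reports by 59 questions):

```python
# 1 = positive response (guideline followed), 0 = negative; values invented.
answers = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0],
]

# Adherence per report and positive-response rate per question, in percent.
per_report = [100 * sum(r) / len(r) for r in answers]
per_question = [100 * sum(c) / len(c) for c in zip(*answers)]

# Questions answered positively by >=75% of reports, and the neglected
# ones answered positively by <=25%, mirroring the review's thresholds.
well_reported = sum(q >= 75 for q in per_question)
neglected = sum(q <= 25 for q in per_question)
```

Transposing the answer matrix with `zip(*answers)` switches the unit of analysis from reports to questions.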
Affiliation(s)
- Arlene Fink
- University of California, Los Angeles (UCLA) Gambling Studies Program, Los Angeles, CA 90095, USA.
- Iman Parhami
- UCLA Gambling Studies Program, Los Angeles, CA, USA
- Department of Psychiatry and Biobehavioral Sciences at University of California, Los Angeles, USA
- Richard J. Rosenthal
- UCLA Gambling Studies Program, Los Angeles, CA, USA
- David Geffen School of Medicine, University of California, Los Angeles, USA
- Department of Psychiatry and Biobehavioral Sciences at University of California, Los Angeles, USA
- Michael D. Campos
- UCLA Gambling Studies Program, Los Angeles, CA, USA
- Department of Psychiatry and Biobehavioral Sciences at University of California, Los Angeles, USA
- Aaron Siani
- UCLA Gambling Studies Program, Los Angeles, CA, USA
- David Geffen School of Medicine, University of California, Los Angeles, USA
- Timothy W. Fong
- UCLA Gambling Studies Program, Los Angeles, CA, USA
- David Geffen School of Medicine, University of California, Los Angeles, USA
- Department of Psychiatry and Biobehavioral Sciences at University of California, Los Angeles, USA