1
Wright EC, Kapuria D, Ben-Yakov G, Sharma D, Basu D, Cho MH, Abijo T, Wilkins KJ. Time to Publication for Randomized Clinical Trials Presented as Abstracts at Three Gastroenterology and Hepatology Conferences in 2017. Gastro Hep Advances 2023; 2:370-379. [PMID: 36938381] [PMCID: PMC10022591] [DOI: 10.1016/j.gastha.2022.12.003]
Abstract
Background & Aims Results of randomized clinical trials are often first presented as conference abstracts, but these abstracts may be difficult to find, and trial results included in the abstract may not be followed by subsequent journal publications. In a review of abstracts submitted to eight major medical and surgical conferences in 2017, we identified 237 abstracts reporting primary results of randomized clinical trials accepted for presentation at three major gastroenterology and hepatology conferences. The aims of this new analysis were to determine the publication rate for these abstracts and the proportion of publications that included trial registration numbers in the publication abstract. Methods Clinical trial registries, PubMed, Europe PMC, and Google Scholar were searched through November 1, 2021, for publications reporting trial results for the selected abstracts. Publications were reviewed to determine if they included a trial registration number and if the registration number was in the abstract. Results Publications were found for 157 abstracts (66%) within four years of the conference. Publications were found more frequently for the 194 abstracts reporting results of registered trials (144, 74%) than for the 43 abstracts reporting unregistered trials (13, 30%), but only 67% of these 144 publications included the registration number in the publication abstract. Ten unpublished trials had summary results posted on ClinicalTrials.gov. Conclusions Clinical trial results could be more accessible if all trials were registered, authors included registration numbers in both conference and journal abstracts, and journal editors required the inclusion of registration numbers in publication abstracts for registered clinical trials.
Affiliation(s)
- Elizabeth C. Wright
- Office of the Director, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland
- Devika Kapuria
- Department of Gastroenterology, Washington University in St. Louis, St. Louis, Missouri
- Gil Ben-Yakov
- The Center for Liver Diseases, Sheba Medical Center, Tel-Hashomer, Ramat Gan, Israel
- Disha Sharma
- Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland
- Dev Basu
- MedStar Good Samaritan Hospital, Baltimore, Maryland
- Min Ho Cho
- Department of Medicine, Baystate Medical Center, Springfield, Massachusetts
- Tomilowo Abijo
- Office of the Director, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland
- Kenneth J. Wilkins
- Office of the Director, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland
2
Šuster S, Baldwin T, Verspoor K. Predicting Publication of Clinical Trials Using Structured and Unstructured Data: Model Development and Validation Study. J Med Internet Res 2022; 24:e38859. [PMID: 36563029] [PMCID: PMC9823568] [DOI: 10.2196/38859]
Abstract
BACKGROUND Publication of registered clinical trials is a critical step in the timely dissemination of trial findings. However, a significant proportion of completed clinical trials are never published, motivating the need to analyze the factors behind success or failure to publish. This could inform study design, help regulatory decision-making, and improve resource allocation. It could also enhance our understanding of bias in the publication of trials and publication trends based on the research direction or strength of the findings. Although the publication of clinical trials has been addressed in several descriptive studies at an aggregate level, there is a lack of research on the predictive analysis of a trial's publishability given an individual (planned) clinical trial description. OBJECTIVE We aimed to conduct a study that combined structured and unstructured features relevant to publication status in a single predictive approach. Established natural language processing techniques as well as recent pretrained language models enabled us to incorporate information from the textual descriptions of clinical trials into a machine learning approach. We were particularly interested in whether and which textual features could improve the classification accuracy for publication outcomes. METHODS In this study, we used metadata from ClinicalTrials.gov (a registry of clinical trials) and MEDLINE (a database of academic journal articles) to build a data set of clinical trials (N=76,950) that contained the description of a registered trial and its publication outcome (27,702/76,950, 36% published and 49,248/76,950, 64% unpublished). This is the largest data set of its kind, which we released as part of this work. The publication outcome in the data set was identified from MEDLINE based on clinical trial identifiers. We carried out a descriptive analysis and predicted the publication outcome using 2 approaches: a neural network with a large domain-specific language model and a random forest classifier using a weighted bag-of-words representation of text. RESULTS First, our analysis of the newly created data set corroborates several findings from the existing literature regarding attributes associated with a higher publication rate. Second, a crucial observation from our predictive modeling was that the addition of textual features (eg, eligibility criteria) offers consistent improvements over using only structured data (F1-score=0.62-0.64 vs F1-score=0.61 without textual features). Both pretrained language models and more basic word-based representations provide high-utility text representations, with no significant empirical difference between the two. CONCLUSIONS Different factors affect the publication of a registered clinical trial. Our approach to predictive modeling combines heterogeneous features, both structured and unstructured. We show that methods from natural language processing can provide effective textual features to enable more accurate prediction of publication success, which has not been explored for this task previously.
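The modelling setup described above can be illustrated with a short, hedged sketch: a random forest over a weighted bag-of-words (TF-IDF) representation of trial text combined with structured registry fields. This is not the authors' released code; the column names and toy records below are assumptions made for illustration.

```python
# Illustrative only: scikit-learn pipeline combining TF-IDF text features with
# structured registry fields, classified with a random forest. Column names and
# the two toy records are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

trials = pd.DataFrame({
    "eligibility_criteria": [
        "Adults 18-65 with type 2 diabetes and HbA1c above 7%",
        "Children 6-12 with mild persistent asthma",
    ],
    "phase": ["Phase 3", "Phase 2"],
    "enrollment": [450, 80],
    "published": [1, 0],  # outcome: a linked MEDLINE publication was found (or not)
})

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "eligibility_criteria"),          # weighted bag of words
    ("phase", OneHotEncoder(handle_unknown="ignore"), ["phase"]),  # structured field
    ("numeric", "passthrough", ["enrollment"]),
])

model = Pipeline([
    ("features", features),
    ("classifier", RandomForestClassifier(n_estimators=200, random_state=0)),
])
model.fit(trials.drop(columns="published"), trials["published"])
print(model.predict_proba(trials.drop(columns="published"))[:, 1])
```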
Affiliation(s)
- Simon Šuster
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Timothy Baldwin
- School of Computing and Information Systems, University of Melbourne, Melbourne, Australia
- Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Karin Verspoor
- School of Computing Technologies, RMIT University, Melbourne, Australia
3
Liu S, Bourgeois FT, Dunn AG. Identifying unreported links between ClinicalTrials.gov trial registrations and their published results. Res Synth Methods 2022; 13:342-352. [PMID: 34970844] [PMCID: PMC9090946] [DOI: 10.1002/jrsm.1545]
Abstract
A substantial proportion of trial registrations are not linked to corresponding published articles, limiting downstream analyses and the development of new tools. Our aim was to develop a method for finding articles reporting the results of trials that are registered on ClinicalTrials.gov when they do not include metadata links. We used a set of 27,280 trial registration and article pairs to train and evaluate methods for identifying missing links in both directions: from articles to registrations and from registrations to articles. We trained a classifier with six distance metrics as feature representations to rank the correct article or registration, using recall@K to evaluate performance and compare to baseline methods. When identifying links from registrations to published articles, the classifier ranked the correct article first (recall@1) among 378,048 articles in 80.8% of evaluation cases, compared with 34.9% for the baseline method. Recall@10 was 85.1% compared to 60.7% in the baseline. When predicting links from articles to registrations, recall@1 was 83.4% for the classifier and 39.8% in the baseline. Recall@10 was 89.5% compared to 65.8% in the baseline. The proposed method improves sufficiently on our baseline document similarity method to be feasible for identifying missing links in practice. Given a ClinicalTrials.gov registration, a user checking 10 ranked articles can expect to identify the matching article in at least 85% of cases, if the trial has been published. The proposed method can be used to improve the coupling of ClinicalTrials.gov and PubMed, with applications related to automating systematic review and evidence synthesis processes.
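The recall@K measure used above can be made concrete with a minimal sketch (not the study code): a ranker returns candidate article identifiers for each registration, and recall@K is the fraction of registrations whose true linked article appears in the top K. The identifiers in the toy example are hypothetical.

```python
# Minimal recall@K sketch; registration and article identifiers are invented.
def recall_at_k(ranked_candidates, true_links, k):
    """ranked_candidates: {registration_id: [article_id, ...]} best-first.
    true_links: {registration_id: article_id} gold-standard pairs."""
    hits = sum(
        1 for reg_id, true_article in true_links.items()
        if true_article in ranked_candidates.get(reg_id, [])[:k]
    )
    return hits / len(true_links)

ranked = {"NCT0000001": ["pmid9", "pmid2", "pmid5"], "NCT0000002": ["pmid7", "pmid3"]}
truth = {"NCT0000001": "pmid2", "NCT0000002": "pmid8"}
print(recall_at_k(ranked, truth, 1))   # 0.0 - neither true article ranked first
print(recall_at_k(ranked, truth, 10))  # 0.5 - one true article found in the top 10
```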
Affiliation(s)
- Shifeng Liu
- Faculty of Medicine and Health, The University of Sydney, Biomedical Informatics and Digital Health, School of Medical Sciences, Sydney, New South Wales, Australia
- Florence T Bourgeois
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Adam G Dunn
- Faculty of Medicine and Health, The University of Sydney, Biomedical Informatics and Digital Health, School of Medical Sciences, Sydney, New South Wales, Australia
- Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA
4
Nicholls SG, McDonald S, McKenzie JE, Carroll K, Taljaard M. A review identified challenges distinguishing primary reports of randomized trials for meta-research: A proposal for improved reporting. J Clin Epidemiol 2022; 145:121-125. [PMID: 35081448] [PMCID: PMC9233092] [DOI: 10.1016/j.jclinepi.2022.01.013]
Abstract
Meta-research is the discipline of studying research itself. A core investigative tool in meta-research is the use of systematic or scoping reviews to study the characteristics, methods and reporting of primary research studies. In the context of identifying eligible publications for methodological reviews of randomized controlled trials (RCTs), a challenge is to efficiently distinguish the primary trial report - which reports results for the primary outcome - from other types of reports, including design papers and secondary or supplementary analyses, or what we collectively refer to as non-primary reports. This may not be a straightforward task and may contribute to inefficiencies in the review process. Here, we draw on our recent methodological review of over 13,000 records to identify primary reports of pragmatic RCTs. We offer recommendations to improve the reporting of RCTs to facilitate more efficient identification of primary trial reports. We suggest that future updates to existing CONSORT guidelines include consideration of multiple trial reports and recommendations to clarify the primary or non-primary nature of each report. Our recommendations, together with improved adherence to inclusion of the trial registration number in the abstract and citation of a protocol or previously published primary report, would facilitate the conduct of methodological reviews.
Affiliation(s)
- Stuart G Nicholls
- Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
- Steve McDonald
- School of Public Health and Preventive Medicine, Monash University, 553 St Kilda Road, Melbourne, Victoria 3004, Australia
- Joanne E McKenzie
- School of Public Health and Preventive Medicine, Monash University, 553 St Kilda Road, Melbourne, Victoria 3004, Australia
- Kelly Carroll
- Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
- Monica Taljaard
- Clinical Epidemiology Program, Ottawa Hospital Research Institute, Ottawa, Canada
- School of Epidemiology and Public Health, University of Ottawa, Ottawa, Canada
5
Smalheiser NR, Holt AW. A web-based tool for automatically linking clinical trials to their publications. J Am Med Inform Assoc 2022; 29:822-830. [PMID: 35020887] [PMCID: PMC9006700] [DOI: 10.1093/jamia/ocab290]
Abstract
OBJECTIVE Evidence synthesis teams, physicians, policy makers, and patients and their families all have an interest in following the outcomes of clinical trials and would benefit from being able to evaluate both the results posted in trial registries and those reported in the publications that arise from them. Manual searching for publications arising from a given trial is a laborious and uncertain process. We sought to create a statistical model to automatically identify PubMed articles likely to report clinical outcome results from each registered trial in ClinicalTrials.gov. MATERIALS AND METHODS A machine learning-based model was trained on known pairs of registered trials and the publications linked to them. Multiple features were constructed based on the degree of matching between the PubMed article metadata and specific fields of the trial registry, as well as matching with the set of publications already known to be linked to that trial. RESULTS Evaluation of the model using known linked articles as the gold standard showed that they tend to be top ranked (median best rank = 1.0), and 91% of them are ranked in the top 10. DISCUSSION Based on this model, we have created a free, public web-based tool that, given any registered trial in ClinicalTrials.gov, presents a ranked list of the PubMed articles in order of estimated probability that they report clinical outcome data from that trial. The tool should greatly facilitate studies of trial outcome results and their relation to the original trial designs.
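A minimal sketch, under stated assumptions rather than the authors' implementation, of the kind of matching features described above: each candidate registration-article pair is scored by how well specific registry fields agree with the PubMed metadata, and a trained classifier would turn such scores into the ranked list. All field names and records are hypothetical.

```python
# Hypothetical matching features for one candidate registration-article pair;
# field names and records are invented for illustration.
def jaccard(a, b):
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def pair_features(registration, article):
    return [
        jaccard(registration["brief_title"], article["title"]),
        jaccard(registration["condition"], article["abstract"]),
        float(registration["nct_id"] in article.get("databank_ids", [])),
        float(any(name in article["authors"] for name in registration["investigators"])),
    ]

reg = {"nct_id": "NCT00000000", "brief_title": "Aspirin for migraine prevention",
       "condition": "Migraine", "investigators": ["Smith J"]}
art = {"title": "Aspirin versus placebo for migraine prevention",
       "abstract": "We randomized adults with migraine to aspirin or placebo.",
       "authors": ["Smith J", "Lee K"], "databank_ids": []}
print(pair_features(reg, art))  # four matching scores for this candidate pair
```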
Affiliation(s)
- Neil R Smalheiser
- Department of Psychiatry, University of Illinois College of Medicine, 1601 W. Taylor Street, MC912, Chicago, IL 60612, USA (corresponding author)
- Arthur W Holt
- Department of Psychiatry, University of Illinois College of Medicine, Chicago, Illinois, USA
6
Surian D, Bourgeois FT, Dunn AG. The automation of relevant trial registration screening for systematic review updates: an evaluation study on a large dataset of ClinicalTrials.gov registrations. BMC Med Res Methodol 2021; 21:281. [PMID: 34922458] [PMCID: PMC8684229] [DOI: 10.1186/s12874-021-01485-6]
Abstract
BACKGROUND Clinical trial registries can be used as sources of clinical evidence for systematic review synthesis and updating. Our aim was to evaluate methods for identifying clinical trial registrations that should be screened for inclusion in updates of published systematic reviews. METHODS A set of 4644 clinical trial registrations (ClinicalTrials.gov) included in 1089 systematic reviews (PubMed) was used to evaluate two methods (document similarity and hierarchical clustering) and three representations (L2-normalised TF-IDF, Latent Dirichlet Allocation, and Doc2Vec) for ranking 163,501 completed clinical trials by relevance. Clinical trial registrations were ranked for each systematic review using seeding clinical trials, simulating how new relevant clinical trials could be automatically identified for an update. Performance was measured by the number of clinical trials that needed to be screened to identify all relevant clinical trials. RESULTS Using the document similarity method with TF-IDF feature representation and Euclidean distance metric, all relevant clinical trials for half of the systematic reviews were identified after screening 99 trials (IQR 19 to 491). The best-performing hierarchical clustering approach used Ward agglomerative clustering (with TF-IDF representation and Euclidean distance) and required screening 501 clinical trials (IQR 43 to 4363) to achieve the same result. CONCLUSION An evaluation using a large set of mined links between published systematic reviews and clinical trial registrations showed that document similarity outperformed hierarchical clustering for identifying relevant clinical trials to include in systematic review updates.
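The best-performing configuration reported above (document similarity with L2-normalised TF-IDF and Euclidean distance) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the study code; the registration texts are invented.

```python
# Illustrative ranking step: candidate registrations are ranked by Euclidean
# distance from the seed trials already included in a review, using an
# L2-normalised TF-IDF representation of the registration text.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import pairwise_distances

candidates = [
    "Randomized trial of statin therapy after acute stroke",
    "Phase 2 study of a vaccine adjuvant in healthy adults",
    "Secondary prevention with statins in ischaemic stroke patients",
]
seeds = ["Statins for secondary stroke prevention: a randomized trial"]  # trials already in the review

vectorizer = TfidfVectorizer(norm="l2")                 # L2-normalised TF-IDF
X = vectorizer.fit_transform(candidates + seeds)
cand_vecs, seed_vecs = X[: len(candidates)], X[len(candidates):]

# Rank candidate registrations by their minimum Euclidean distance to any seed trial.
distances = pairwise_distances(cand_vecs, seed_vecs, metric="euclidean").min(axis=1)
for idx in np.argsort(distances):
    print(f"{distances[idx]:.3f}  {candidates[idx]}")
```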
Affiliation(s)
- Didi Surian
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia
- Florence T Bourgeois
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Adam G Dunn
- Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA
- The University of Sydney, Discipline of Biomedical Informatics and Digital Health, School of Medical Sciences, Faculty of Medicine and Health, Sydney, NSW, 2006, Australia
7
Smalheiser NR, Holt AW. New improved Aggregator: predicting which clinical trial articles derive from the same registered clinical trial. JAMIA Open 2020; 3:338-341. [PMID: 33215068] [PMCID: PMC7660960] [DOI: 10.1093/jamiaopen/ooaa042]
Abstract
Objectives To identify separate publications that report outcomes from the same underlying clinical trial, in order to avoid over-counting these as independent pieces of evidence. Materials and Methods We updated our previous model by creating larger, more recent, and more diverse positive and negative training sets consisting of article pairs that were (or not) linked to the same ClinicalTrials.gov trial registry number. Features were extracted from PubMed metadata; pairwise similarity scores were modeled using logistic regression and used to form clusters of articles that are likely to arise from the same registered clinical trial. Results Articles from the same trial were identified with high accuracy (F1 = 0.859), nominally better than the previous model (F1 = 0.843). Predicted clusters showed a low error rate of splitting of 8–11% (ie, when 2 articles belonged to the same trial but were assigned to different clusters). Performance was similar whether only randomized controlled trial articles or a more diverse set of clinical trial articles were processed. Discussion Metadata are surprisingly accurate in predicting when 2 articles derive from the same underlying clinical trial. Conclusion We have continued confidence in the Aggregator tool which can be accessed publicly at http://arrowsmith.psych.uic.edu/cgi-bin/arrowsmith_uic/RCT_Tagger.cgi.
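A minimal sketch of the approach described above, assuming hypothetical metadata features and training pairs (this is not the Aggregator's code): pairwise similarity scores are modelled with logistic regression, and article pairs above a probability threshold are merged into clusters standing in for the same underlying trial.

```python
# Illustrative sketch: pairwise features, a logistic-regression similarity score,
# and a simple union-find merge into clusters. Features and records are invented.
from itertools import combinations
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training rows: [shared-author fraction, title word overlap, same journal]
X_train = np.array([[0.8, 0.6, 1], [0.5, 0.7, 0], [0.0, 0.1, 0], [0.1, 0.2, 1]])
y_train = np.array([1, 1, 0, 0])            # 1 = articles from the same trial
clf = LogisticRegression().fit(X_train, y_train)

articles = ["pmidA", "pmidB", "pmidC"]
pair_feats = {("pmidA", "pmidB"): [0.7, 0.8, 1],
              ("pmidA", "pmidC"): [0.0, 0.1, 0],
              ("pmidB", "pmidC"): [0.1, 0.2, 0]}

parent = {a: a for a in articles}           # union-find to form clusters
def find(a):
    while parent[a] != a:
        a = parent[a]
    return a

for a, b in combinations(articles, 2):
    same_trial_prob = clf.predict_proba([pair_feats[(a, b)]])[0, 1]
    if same_trial_prob > 0.5:
        parent[find(a)] = find(b)

clusters = {}
for a in articles:
    clusters.setdefault(find(a), []).append(a)
print(list(clusters.values()))              # e.g. [['pmidA', 'pmidB'], ['pmidC']]
```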
Affiliation(s)
- Neil R Smalheiser
- Department of Psychiatry, University of Illinois at Chicago, Chicago, Illinois, USA
- Arthur W Holt
- Department of Psychiatry, University of Illinois at Chicago, Chicago, Illinois, USA
8
Harrison E, Martin P, Surian D, Dunn AG. Recommending research articles to consumers of online vaccination information. Quantitative Science Studies 2020. [DOI: 10.1162/qss_a_00030]
Abstract
Online health communications often provide biased interpretations of evidence and have unreliable links to the source research. We tested the feasibility of a tool for matching web pages to their source evidence. From 207,538 eligible vaccination-related PubMed articles, we evaluated several approaches using 3,573 unique links to web pages from Altmetric. We evaluated methods for ranking the source articles for vaccine-related research described on web pages, comparing simple baseline feature representation and dimensionality reduction approaches to those augmented with canonical correlation analysis (CCA). Performance measures included the median rank of the correct source article; the percentage of web pages for which the source article was correctly ranked first (recall@1); and the percentage ranked within the top 50 candidate articles (recall@50). While augmenting baseline methods using CCA generally improved results, no CCA-based approach outperformed a baseline method, which ranked the correct source article first for over one quarter of web pages and in the top 50 for more than half. Tools to help people identify evidence-based sources for the content they access on vaccination-related web pages are potentially feasible and may support the prevention of bias and misrepresentation of research in news and social media.
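A minimal sketch, with invented toy texts and not the study pipeline, of the CCA augmentation described above: paired web-page and article texts are projected into a shared space learned by canonical correlation analysis, and candidate source articles are then ranked by similarity in that space.

```python
# Illustrative CCA-based matching using a handful of invented page-article pairs.
from sklearn.cross_decomposition import CCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pages = ["News story: flu vaccine cuts hospitalisations in older adults",
         "Blog post questions HPV vaccine safety after a trial report",
         "Article on a measles outbreak and local vaccination coverage",
         "Op-ed about COVID-19 booster uptake among healthcare workers"]
articles = ["Influenza vaccination and hospitalisation among elderly patients",
            "Safety outcomes in a randomised trial of HPV vaccination",
            "Measles incidence and vaccine coverage: a surveillance study",
            "Uptake of COVID-19 booster doses in hospital staff: a cohort study"]

vectorizer = TfidfVectorizer().fit(pages + articles)
Xp = vectorizer.transform(pages).toarray()
Xa = vectorizer.transform(articles).toarray()

cca = CCA(n_components=2).fit(Xp, Xa)       # learn the shared space from known pairs
Pp, Pa = cca.transform(Xp, Xa)

# Rank candidate source articles for the first web page by similarity in the shared space.
scores = cosine_similarity(Pp[:1], Pa)[0]
for i in scores.argsort()[::-1]:
    print(f"{scores[i]:.2f}  {articles[i]}")
```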
Affiliation(s)
- Eliza Harrison
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Paige Martin
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Didi Surian
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Adam G. Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Discipline of Biomedical Informatics and Digital Health, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
9
Bashir R, Dunn AG. Software engineering principles address current problems in the systematic review ecosystem. J Clin Epidemiol 2019; 109:136-141. [PMID: 30582972] [DOI: 10.1016/j.jclinepi.2018.12.014]
Abstract
Systematic reviewers are simultaneously unable to produce systematic reviews fast enough to keep up with the availability of new trial evidence while overproducing systematic reviews that are unlikely to change practice because they are redundant or biased. Although the transparency and completeness of trial reporting have improved with changes in policy and new technologies, systematic reviews have not yet benefited from the same level of effort. We found that new methods and tools used to automate aspects of systematic review processes have focused on improving the efficiency of individual systematic reviews rather than the efficiency of the entire ecosystem of systematic review production. We use software engineering principles to review challenges and opportunities for improving the interoperability, integrity, efficiency, and maintainability of the systematic review ecosystem. We conclude by recommending ways to improve access to structured systematic review results. Major opportunities for improving systematic reviews will come from new tools and changes in policy focused on doing the right systematic reviews rather than just doing more of them faster.
Affiliation(s)
- Rabia Bashir
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Adam G Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
10
Martin P, Surian D, Bashir R, Bourgeois FT, Dunn AG. Trial2rev: Combining machine learning and crowd-sourcing to create a shared space for updating systematic reviews. JAMIA Open 2019; 2:15-22. [PMID: 31984340] [PMCID: PMC6951914] [DOI: 10.1093/jamiaopen/ooy062]
Abstract
Objectives Systematic reviews of clinical trials could be updated faster by automatically monitoring relevant trials as they are registered, completed, and reported. Our aim was to provide a public interface to a database of curated links between systematic reviews and trial registrations. Materials and Methods We developed the server-side system components in Python, connected them to a PostgreSQL database, and implemented the web-based user interface using JavaScript, HTML, and CSS. All code is available on GitHub under an open source MIT license, and registered users can access and download all available data. Results The trial2rev system is a web-based interface to a database that collates and augments information from multiple sources, including bibliographic databases, the ClinicalTrials.gov registry, and the actions of registered users. Users interact with the system by browsing, searching, or adding systematic reviews, verifying links to trials included in the review, and adding or voting on trials that they would expect to include in an update of the systematic review. The system can trigger the actions of software agents that add or vote on included and relevant trials, in response to user interactions or by scheduling updates from external resources. Discussion and Conclusion We designed a publicly accessible resource to help systematic reviewers make decisions about systematic review updates. Where previous approaches have sought to reactively filter published reports of trials for inclusion in systematic reviews, our approach is to proactively monitor for relevant trials as they are registered and completed.
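The core data relationship the system curates (links between systematic reviews and trial registrations, with user or software-agent votes on trials expected in an update) can be sketched minimally as below. This is not the trial2rev code; sqlite3 stands in for the PostgreSQL database mentioned above, and the table and column names are hypothetical.

```python
# Hypothetical review-trial link table plus a vote table; sqlite3 used so the
# sketch is self-contained and runnable without a database server.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE review_trial_link (
    review_doi TEXT,
    nct_id     TEXT,
    included   INTEGER,            -- 1 if already included in the published review
    PRIMARY KEY (review_doi, nct_id)
);
CREATE TABLE vote (
    review_doi TEXT,
    nct_id     TEXT,
    voter      TEXT                -- registered user or software agent
);
""")

def vote_for_inclusion(review_doi, nct_id, voter):
    """Record that a user or agent expects this trial in the review's next update."""
    con.execute("INSERT OR IGNORE INTO review_trial_link VALUES (?, ?, 0)",
                (review_doi, nct_id))
    con.execute("INSERT INTO vote VALUES (?, ?, ?)", (review_doi, nct_id, voter))

vote_for_inclusion("10.1000/example-review", "NCT00000000", "agent:registry-monitor")
vote_for_inclusion("10.1000/example-review", "NCT00000000", "user:reviewer1")
print(con.execute("SELECT nct_id, COUNT(*) FROM vote GROUP BY nct_id").fetchall())
```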
Affiliation(s)
- Paige Martin
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Didi Surian
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Rabia Bashir
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia
- Florence T Bourgeois
- Computational Health Informatics Program, Children's Hospital Boston, Boston, Massachusetts, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Adam G Dunn
- Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia