Title of Project: A Novel Tool that Allows Interactive Screening of PubMed Citations Showed Promise for the Semi-Automation of Identification of Biomedical Literature.
J Clin Epidemiol 2022; 150:63-71. [PMID: 35738306 DOI: 10.1016/j.jclinepi.2022.06.007]
[Received: 02/09/2022] [Revised: 06/10/2022] [Accepted: 06/13/2022] [Indexed: 11/21/2022]
Abstract
OBJECTIVE
Systematic reviews form the basis of evidence-based medicine but are expensive and time-consuming to produce. To address this burden, we have developed a literature identification system (Pythia) that combines the query formulation and citation screening steps.
STUDY DESIGN
Pythia incorporates a set of natural-language questions with machine-learning algorithms to rank all PubMed citations based on relevance, returning the 100 top-ranked citations for human screening. The tagged citations are iteratively exploited by Pythia to refine the search and re-rank the citations.
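The rank, screen, and re-rank cycle described above can be illustrated with a minimal sketch. This is not Pythia's implementation: the abstract does not specify its machine-learning algorithms, so simple term-weight scoring stands in for them, and every name here (`tokenize`, `score`, `screen`, `is_relevant`) is hypothetical. The `is_relevant` callable stands in for the human screener, and the default batch size of 100 mirrors the figure given above.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase alphabetic tokens only; a stand-in for real text processing.
    return re.findall(r"[a-z]+", text.lower())


def score(weights, text):
    # Sum of term weights over the distinct terms in the citation text.
    return sum(weights.get(term, 0) for term in set(tokenize(text)))


def screen(citations, questions, is_relevant, batch_size=100, rounds=3):
    """Toy relevance-feedback loop (an assumption, not the published method).

    citations: dict mapping citation id -> title/abstract text
    questions: list of natural-language questions seeding the search
    is_relevant: callable(id) -> bool, standing in for the human screener
    Returns a dict mapping each screened id to its relevance tag.
    """
    weights = Counter()
    for question in questions:
        weights.update(tokenize(question))

    tagged = {}
    for _ in range(rounds):
        unseen = [cid for cid in citations if cid not in tagged]
        if not unseen:
            break
        # Rank unscreened citations by current weights; ties break on id.
        ranked = sorted(unseen, key=lambda cid: (-score(weights, citations[cid]), cid))
        for cid in ranked[:batch_size]:
            tagged[cid] = is_relevant(cid)
            if tagged[cid]:
                # Feedback step: terms from relevant citations gain weight,
                # which changes the ranking in the next round.
                weights.update(set(tokenize(citations[cid])))
    return tagged
```

In this sketch only citations tagged as relevant feed back into the ranking; a real system would presumably also learn from the citations tagged as irrelevant.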
RESULTS
Across seven systematic reviews, the ability of Pythia to identify the relevant citations (sensitivity) ranged from 0.09 to 0.58. The number of abstracts reviewed per relevant abstract (number needed to read, NNR) was lower than in the manually screened project in four reviews, higher in two, and had mixed results in one. The reviews that had greater overall sensitivity retrieved more relevant citations in early batches, but retrieval was generally unaffected by other aspects, such as study design, study size, and specific key question.
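As a worked illustration of the two metrics reported above, the sketch below computes sensitivity as the fraction of all relevant citations that were found, and NNR as the number of abstracts screened per relevant abstract found. The function names and the input counts are hypothetical and are not taken from the study.

```python
def sensitivity(relevant_found, relevant_total):
    # Fraction of all relevant citations that the screening identified.
    if relevant_total <= 0:
        raise ValueError("relevant_total must be positive")
    return relevant_found / relevant_total


def number_needed_to_read(abstracts_screened, relevant_found):
    # Abstracts screened per relevant abstract found (lower is better).
    if relevant_found <= 0:
        raise ValueError("relevant_found must be positive")
    return abstracts_screened / relevant_found


# Hypothetical counts for illustration only: 500 abstracts screened,
# 20 relevant ones found, out of 50 relevant citations in total.
example_sensitivity = sensitivity(20, 50)          # 0.4
example_nnr = number_needed_to_read(500, 20)       # 25.0
```

The two measures can move independently: a tool can have a low NNR (little screening effort per relevant hit) while still missing most relevant citations, which is consistent with the pattern reported here.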
CONCLUSIONS
Due to its low sensitivity, Pythia is not ready for widespread use. Future research should explore ways to encode domain knowledge in query formulation to better enrich the questions used in the search.