Ye C, Fabbri D. Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews. J Biomed Inform 2018; 83:63-72. [PMID: 29793071 DOI: 10.1016/j.jbi.2018.05.014]
[Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/27/2017] [Revised: 04/24/2018] [Accepted: 05/20/2018] [Indexed: 01/20/2023]
Abstract
OBJECTIVE
Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks.
MATERIALS AND METHODS
Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms to be used in keyword search and text highlighting. To evaluate their quality, we measured the similar terms' information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally, a user study evaluated users' search term preferences, while a timing study measured the time to answer a question from a clinical chart.
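As an illustration of the precision-at-K metric named above, the following is a minimal sketch and not the authors' evaluation code. The function name `precision_at_k`, the example term list, and the relevance set are all hypothetical; the sketch also assumes the common convention of dividing by K even when fewer than K terms are returned.

```python
from typing import Sequence, Set


def precision_at_k(ranked_terms: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked terms that are judged relevant.

    Assumption: the denominator is always k, so a list shorter than k
    is penalized for the missing positions.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_terms[:k]
    hits = sum(1 for term in top_k if term in relevant)
    return hits / k


# Hypothetical ranked similar terms for a query, plus a hypothetical
# set of terms that a reviewer judged relevant.
ranked = ["mi", "heart attack", "stemi", "chest pain", "aspirin", "nstemi"]
relevant = {"mi", "heart attack", "stemi", "nstemi"}

p_at_5 = precision_at_k(ranked, relevant, 5)    # 3 of the top 5 are relevant
p_at_10 = precision_at_k(ranked, relevant, 10)  # 4 relevant, divided by 10
```

Averaging this value over a set of query terms would give an average P@5 or P@10 of the kind reported in the results.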
RESULTS
The refined terms outperformed the baseline method's information retrieval performance (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users, and reduced the average time to answer a question.
CONCLUSIONS
Clinical information can be more quickly retrieved and synthesized when using semantically similar terms from multiple embeddings.