Zeng X, Liao T, Xu L, Wang Z. AERMNet: Attention-enhanced relational memory network for medical image report generation. Computer Methods and Programs in Biomedicine 2024;244:107979. [PMID: 38113805 DOI: 10.1016/j.cmpb.2023.107979]
[Received: 08/01/2022] [Revised: 11/26/2023] [Accepted: 12/12/2023]
Abstract
BACKGROUND AND OBJECTIVES
The automatic generation of medical image diagnostic reports can assist doctors by reducing their workload and improving the efficiency and accuracy of diagnosis. However, most existing report generation models suffer from two problems: weak correlation between generated words and a lack of contextual information during report generation.
METHODS
To address these problems, we propose an Attention-Enhanced Relational Memory Network (AERMNet), in which the relational memory module is continuously updated with the words generated at previous time steps, strengthening the correlation between words in the generated medical image report. A double LSTM with an interaction module reduces the loss of contextual information and makes full use of the feature information. AERMNet can therefore generate more accurate disease information in medical image reports.
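To make the described mechanism concrete, below is a minimal PyTorch sketch of the two components the abstract names: a relational memory refreshed each step with the previously generated word, and two interacting LSTMs that decode the report. All module names (RelationalMemory, DoubleLSTMDecoder), slot counts, dimensions, and the gated update and interaction rules are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

```python
# Illustrative sketch only; sizes and update rules are assumptions,
# not the authors' AERMNet implementation.
import torch
import torch.nn as nn


class RelationalMemory(nn.Module):
    """Memory slots refreshed via self-attention over [memory; prev word]."""
    def __init__(self, num_slots=4, d_model=256, num_heads=4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, d_model))
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def init_state(self, batch_size):
        return self.memory.unsqueeze(0).expand(batch_size, -1, -1).contiguous()

    def forward(self, memory, prev_word_emb):
        # memory: (B, S, d); prev_word_emb: (B, d)
        kv = torch.cat([memory, prev_word_emb.unsqueeze(1)], dim=1)
        updated, _ = self.attn(memory, kv, kv)
        # Gated residual update preserves earlier report context.
        g = torch.sigmoid(self.gate(torch.cat([memory, updated], dim=-1)))
        return g * updated + (1 - g) * memory


class DoubleLSTMDecoder(nn.Module):
    """Two LSTM cells whose hidden states are mixed at every step."""
    def __init__(self, vocab_size, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rm = RelationalMemory(d_model=d_model)
        self.lstm_a = nn.LSTMCell(2 * d_model, d_model)  # word + image feature
        self.lstm_b = nn.LSTMCell(2 * d_model, d_model)  # word + memory summary
        self.interact = nn.Linear(2 * d_model, d_model)  # interaction module
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, image_feat, tokens):
        # image_feat: (B, d) pooled visual feature; tokens: (B, T) report words
        B, T = tokens.shape
        mem = self.rm.init_state(B)
        ha = torch.zeros(B, image_feat.size(1), device=tokens.device)
        hb, ca, cb = ha.clone(), ha.clone(), ha.clone()
        logits = []
        for t in range(T):
            w = self.embed(tokens[:, t])   # previous word (teacher forcing)
            mem = self.rm(mem, w)          # refresh relational memory
            mem_summary = mem.mean(dim=1)
            ha, ca = self.lstm_a(torch.cat([w, image_feat], -1), (ha, ca))
            hb, cb = self.lstm_b(torch.cat([w, mem_summary], -1), (hb, cb))
            h = self.interact(torch.cat([ha, hb], -1))  # mix the two streams
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)  # (B, T, vocab)


if __name__ == "__main__":
    dec = DoubleLSTMDecoder(vocab_size=1000)
    img = torch.randn(2, 256)
    toks = torch.randint(0, 1000, (2, 12))
    print(dec(img, toks).shape)  # torch.Size([2, 12, 1000])
```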
RESULTS
Experimental results on four medical datasets, Fetal Heart (FH), Ultrasound, IU X-Ray, and MIMIC-CXR, show that our proposed method outperforms several previous models on language generation metrics (CIDEr improves by 2.4% on FH, BLEU-1 by 2.4% on Ultrasound, CIDEr by 16.4% on IU X-Ray, and BLEU-2 by 9.7% on MIMIC-CXR).
CONCLUSIONS
This work advances medical image report generation and broadens the prospects for computer-aided diagnosis applications. Our code is released at https://github.com/llttxx/AERMNET.