Li Z, Wang Y, Zhu Y, Xu J, Wei J, Xie J, Zhang J. Modality-based attention and dual-stream multiple instance convolutional neural network for predicting microvascular invasion of hepatocellular carcinoma.
Front Oncol 2023;13:1195110. [PMID: 37434971 PMCID: PMC10331018 DOI: 10.3389/fonc.2023.1195110]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Accepted: 05/30/2023] [Indexed: 07/13/2023] [Imported: 08/29/2023]
Abstract
Background and purpose
The presence of microvascular invasion (MVI) is a crucial indicator of postoperative recurrence in patients with hepatocellular carcinoma (HCC). Detecting MVI before surgery can improve personalized surgical planning and enhance patient survival. However, existing automatic diagnosis methods for MVI have limitations. Some methods analyze information from only a single slice and overlook the context of the entire lesion, while others process the entire tumor with a three-dimensional (3D) convolutional neural network (CNN), which requires high computational resources and can be challenging to train. To address these limitations, this paper proposes a modality-based attention and dual-stream multiple instance learning (MIL) CNN.
Materials and methods
In this retrospective study, 283 patients with histologically confirmed HCC who underwent surgical resection between April 2017 and September 2019 were included. Five magnetic resonance (MR) modalities were acquired for each patient: T2-weighted, arterial phase, venous phase, delay phase and apparent diffusion coefficient images. First, each two-dimensional (2D) slice of the HCC magnetic resonance image (MRI) was converted into an instance embedding. Second, a modality attention module was designed to emulate the decision-making process of doctors and help the model focus on the important MRI sequences. Third, the instance embeddings of each 3D scan were aggregated into a bag embedding by a dual-stream MIL aggregator, in which the critical slices were given greater consideration. The dataset was split into a training set and a testing set in a 4:1 ratio, and model performance was evaluated using five-fold cross-validation.
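The three steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the layers, so the scoring rules here are assumptions loosely modeled on a common dual-stream MIL formulation (score every instance, take the highest-scoring one as the critical slice, then weight the other slices by similarity to it). The function names, the weight arrays `w_mod`, `w_cls` and `w_q`, and the equal averaging of the two streams are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def modality_attention(feats, w_mod):
    """Fuse per-modality slice features into one instance embedding per slice.

    feats: (n_slices, n_modalities, dim) features, one row per MR modality.
    w_mod: (dim,) hypothetical scoring vector (learned in a real model).
    """
    scores = feats @ w_mod                      # (n_slices, n_modalities)
    alpha = softmax(scores, axis=1)             # weights over modalities
    fused = (alpha[..., None] * feats).sum(axis=1)
    return fused, alpha                         # (n_slices, dim), weights


def dual_stream_mil(instances, w_cls, w_q):
    """Aggregate slice embeddings into a bag embedding and a bag score.

    instances: (n_slices, dim) instance embeddings.
    w_cls:     (dim,) hypothetical linear classifier shared by both streams.
    w_q:       (dim, d_q) hypothetical query projection.
    """
    # Stream 1: score each slice and select the critical one.
    inst_scores = instances @ w_cls             # (n_slices,)
    critical = int(np.argmax(inst_scores))

    # Stream 2: attend to slices by similarity to the critical slice.
    q = instances @ w_q                         # (n_slices, d_q)
    attn = softmax(q @ q[critical], axis=0)     # (n_slices,)
    bag = attn @ instances                      # (dim,) bag embedding
    bag_score = bag @ w_cls

    # Assumption: the two stream scores are simply averaged.
    score = 0.5 * (inst_scores[critical] + bag_score)
    return score, bag, attn, critical
```

With five modalities, `feats` would have shape `(n_slices, 5, dim)`; the fused output of `modality_attention` is what `dual_stream_mil` expects as `instances`. In a trained model the three weight arrays would be learned jointly with the CNN that produces `feats`.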
Results
Using the proposed method, the prediction of MVI achieved an accuracy of 76.43% and an AUC of 74.22%, significantly surpassing the performance of the baseline methods.
Conclusion
Our modality-based attention and dual-stream MIL CNN can achieve outstanding results for MVI prediction.