Liang ZJ, Chang S, Gao Y, Cao W, Kuo LR, Pomeroy MJ, Li LC, Abbasi AF, Bandovic J, Reiter MJ, Pickhardt PJ. Leveraging prior knowledge in machine intelligence to improve lesion diagnosis for early cancer detection.
Med Phys 2025. [PMID: 40268724 DOI: 10.1002/mp.17841]
[Received: 08/17/2024] [Revised: 03/09/2025] [Accepted: 04/04/2025] [Indexed: 04/25/2025]
Abstract
BACKGROUND
Experts' interpretations of medical images for lesion diagnosis may not always align with the underlying in vivo tissue pathology and, therefore, cannot be considered the definitive truth regarding malignancy or benignity. While current machine learning (ML) models in medical imaging can replicate expert interpretations, their results may also diverge from the actual ground truth.
PURPOSE
This study investigates various factors contributing to these discrepancies and proposes solutions.
METHODS
The central idea of the proposed solution is to integrate prior knowledge into ML models to enhance the characterization of in vivo tissues. The incorporation of prior knowledge into decision-making is task-specific, tailored to the data acquired for that task. This central idea was tested on the diagnosis of lesions using low-dose computed tomography (LdCT) for early cancer detection, focusing on the more challenging ambiguous or indeterminate lesions (IDLs), as classified by experts. One key piece of prior knowledge involves the CT x-ray energy spectrum, where different energies interact with in vivo tissues within a lesion, producing variable but reproducible image contrasts that encapsulate biological information. Typically, CT imaging devices use only the high-energy portion of this spectrum for data acquisition; however, this study considers the full spectrum for lesion diagnostics. Another critical aspect of prior knowledge includes the functional or dynamic properties of in vivo tissues, such as elasticity, which can indicate pathological conditions. Instead of relying solely on abstract image features as current ML models do, this study extracts these pathological tissue characteristics from the image contrast variations.
RESULTS
The method was tested on LdCT images of four sets of IDLs, including pulmonary nodules and colorectal polyps, with pathological reports serving as the ground truth for malignancy or benignity. The method achieved an area under the receiver operating characteristic curve (AUC) of 0.98 ± 0.03, demonstrating a significant improvement over existing state-of-the-art ML models, which typically have AUCs in the 0.70 range.
CONCLUSION
Leveraging prior knowledge in machine intelligence can enhance lesion diagnosis, resolve the ambiguity of IDLs interpreted by experts, and improve the effectiveness of LdCT screening for early-stage cancers.