Chen S, Phuc PT, Nguyen P, Burton W, Lin S, Lin W, Lu CY, Hsu M, Cheng C, Hsu JC. A novel prediction model of the risk of pancreatic cancer among diabetes patients using multiple clinical data and machine learning. Cancer Med 2023; 12:19987-19999. [PMID: 37737056 PMCID: PMC10587954 DOI: 10.1002/cam4.6547]
[Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/24/2023] [Revised: 08/14/2023] [Accepted: 09/06/2023] [Indexed: 09/23/2023]
Abstract
INTRODUCTION
Pancreatic cancer is associated with poor prognosis. Given the rising global incidence of diabetes, and because individuals with diabetes are considered a high-risk subpopulation for pancreatic cancer, it is critical to assess pancreatic cancer risk among people living with diabetes. This study aimed to develop a novel prediction model for pancreatic cancer risk among patients with diabetes, using a real-world database containing clinical features and several artificial intelligence algorithms.
METHODS
This retrospective observational study analyzed data on patients with Type 2 diabetes from a multisite Taiwanese EMR database between 2009 and 2019. Predictors were selected in accordance with the literature review and clinical perspectives. The prediction models were constructed using machine learning algorithms such as logistic regression, linear discriminant analysis, gradient boosting machine, and random forest.
RESULTS
The cohort consisted of 66,384 patients. The linear discriminant analysis (LDA) model generated the highest AUROC of 0.9073, followed by the voting ensemble and gradient boosting machine models. LDA, the best-performing model, exhibited an accuracy of 84.03%, a sensitivity of 0.8611, and a specificity of 0.8403. The most significant predictors identified for pancreatic cancer risk were glucose, glycated hemoglobin, hyperlipidemia comorbidity, antidiabetic drug use, and lipid-modifying drug use.
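For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUROC are conventionally computed from labels and model scores. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and none of the numbers come from the study.

```python
# Illustrative only: toy data and helper names are hypothetical, not from the study.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = case, 0 = non-case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def auroc(y_true, scores):
    """AUROC as the probability that a random case outscores a random non-case.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 2 cases among 10 patients (assumed values).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.35, 0.05, 0.15, 0.25]
threshold = 0.5  # assumed cutoff; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)            # share of true cases that were flagged
specificity = tn / (tn + fp)            # share of non-cases correctly not flagged
accuracy = (tp + tn) / len(y_true)      # share of all patients classified correctly
auc = auroc(y_true, scores)             # threshold-free ranking quality

print(sensitivity, specificity, accuracy, auc)
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so when cases are rare it is dominated by specificity. This may be why the reported accuracy (84.03%) and specificity (0.8403) are nearly identical.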
CONCLUSION
This study successfully developed a highly accurate 4-year risk model for pancreatic cancer in patients with diabetes using real-world clinical data and multiple machine-learning algorithms. The identified predictors may offer an opportunity to detect pancreatic cancer early, widening the windows for prevention and intervention and potentially improving survival in patients with diabetes.