Yang Q, Luo T, Zhang W, Zhong X, He P, Zheng H. Data-driven treatment pathways mining for early breast cancer using cSPADE algorithm and system clustering. Int J Health Plann Manage 2022;37:2569-2584. [PMID: 35445441 DOI: 10.1002/hpm.3483]
[Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/28/2020] [Revised: 02/09/2022] [Accepted: 03/30/2022] [Indexed: 02/05/2023]
Abstract
OBJECTIVES
Cancer data are multidimensional, multilayered, and chronologically ordered, which makes treatment paths challenging to extract. This study aimed to determine whether the proposed combination of the cSPADE algorithm and system clustering can effectively identify treatment pathways for early breast cancer.
METHODS
We applied data mining to the electronic medical records (EMR) of 6891 early breast cancer patients to mine treatment pathways. We provided a method for extracting data from the EMR and performed three-stage mining: determining the treatment stage through the cSPADE algorithm → system clustering for treatment plan extraction → cSPADE mining of treatment sequence patterns. The Kolmogorov-Smirnov test and correlation analysis were used to cross-validate the sequence rules of early breast cancer treatment pathways.
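As a rough illustration of what a sequence rule over treatment events measures, the sketch below counts support, confidence, and lift by brute force on a handful of invented patient sequences. It is not cSPADE, which searches for frequent sequences far more efficiently, and it is not the authors' pipeline; the event names, the toy data, and the helper functions `contains_in_order`, `support`, and `rule_metrics` are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical toy data: one chronologically ordered list of treatment
# events per patient. These are illustrative values, not study data.
PATIENTS: List[List[str]] = [
    ["neoadjuvant_chemo", "surgery", "radiotherapy"],
    ["surgery", "chemo", "radiotherapy"],
    ["surgery", "chemo", "endocrine"],
    ["neoadjuvant_chemo", "surgery", "chemo"],
    ["chemo"],
]


def contains_in_order(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` is an ordered subsequence of `sequence` (gaps allowed)."""
    remaining = iter(sequence)
    # `event in remaining` consumes the iterator up to the first match,
    # so later pattern events must appear after earlier ones.
    return all(event in remaining for event in pattern)


def support(pattern: Sequence[str], sequences: List[List[str]]) -> float:
    """Fraction of patient sequences that contain `pattern` in order."""
    hits = sum(contains_in_order(s, pattern) for s in sequences)
    return hits / len(sequences)


def rule_metrics(
    antecedent: Sequence[str],
    consequent: Sequence[str],
    sequences: List[List[str]],
) -> Dict[str, float]:
    """Support, confidence, and lift of the rule antecedent => consequent."""
    supp_rule = support(list(antecedent) + list(consequent), sequences)
    supp_a = support(antecedent, sequences)
    supp_c = support(consequent, sequences)
    confidence = supp_rule / supp_a if supp_a else 0.0
    lift = confidence / supp_c if supp_c else 0.0
    return {"support": supp_rule, "confidence": confidence, "lift": lift}


# Rule: surgery is followed (at some later point) by chemo.
metrics = rule_metrics(["surgery"], ["chemo"], PATIENTS)
```

A lift below 1 for a rule such as surgery => chemo on this toy data would mean that chemotherapy is slightly less common after surgery than it is overall; on real records the same three numbers are what the correlation analysis compares.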
RESULTS
We identified 55 sequence rules for early breast cancer treatment, 3 preoperative neoadjuvant chemotherapy regimens, 3 postoperative chemotherapy regimens, and 2 chemotherapy regimens for patients without surgery. Pearson and Spearman correlation tests were performed under 5-fold cross-validation. At the significance level of p < 0.05, all correlation coefficients of support, confidence, and lift were greater than 0.89. The Kolmogorov-Smirnov test found no significant differences between the sequence distributions.
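One way to picture this kind of robustness check is to compare the same rule metric (for example, support) computed on two different folds. The sketch below implements Pearson and Spearman correlation and the two-sample Kolmogorov-Smirnov statistic in plain Python. The abstract does not say how folds were paired or compared, so the pairing here is an assumption, the values in `fold_a` and `fold_b` are invented, and only the test statistics are computed, not p-values.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ks_statistic(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    largest_gap = 0.0
    for v in list(x) + list(y):
        cdf_x = sum(a <= v for a in x) / len(x)
        cdf_y = sum(b <= v for b in y) / len(y)
        largest_gap = max(largest_gap, abs(cdf_x - cdf_y))
    return largest_gap


# Hypothetical support values for the same five rules on two folds.
fold_a = [0.62, 0.41, 0.35, 0.20, 0.12]
fold_b = [0.60, 0.43, 0.33, 0.22, 0.10]

r_pearson = pearson(fold_a, fold_b)
r_spearman = spearman(fold_a, fold_b)
d_ks = ks_statistic(fold_a, fold_b)
```

High correlations together with a small KS statistic would indicate that rules keep both their ranking and their overall distribution of metric values from fold to fold. In practice a statistics library would normally be used instead of these hand-written functions, since it also supplies the p-values needed for the significance statements.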
CONCLUSIONS
We showed that the cSPADE algorithm combined with system clustering is an effective technique for identifying temporal relationships between treatment modalities, enabling hierarchical and vertical mining of breast cancer treatment models. In addition, we confirmed the robustness of the results by cross-validating the treatment pathway sequence rules. This method can reveal the treatment paths of early breast cancer patients and allows the real-world breast cancer treatment behaviour model to be evaluated, which can provide a reference for redesigning and optimizing treatment paths.