1. Wu Y, Zhang Y, Duan S, Gu C, Wei C, Fang Y. Survival prediction in second primary breast cancer patients with machine learning: An analysis of SEER database. Computer Methods and Programs in Biomedicine 2024; 254:108310. [PMID: 38996803 DOI: 10.1016/j.cmpb.2024.108310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 06/01/2024] [Accepted: 06/25/2024] [Indexed: 07/14/2024]
Abstract
BACKGROUND Studies have found that first primary cancer (FPC) survivors are at high risk of developing second primary breast cancer (SPBC). However, prognostic studies focusing specifically on patients with SPBC are lacking. METHODS This retrospective study used data from the Surveillance, Epidemiology, and End Results (SEER) Program. We selected female FPC survivors diagnosed with SPBC from 12 registries (January 1998 to December 2018) to construct prognostic models. SPBC patients selected from another five registries (January 2010 to December 2018) were used as the validation set to test the models' generalization ability. Four machine learning models and a Cox proportional hazards regression (CoxPH) model were constructed to predict the overall survival of SPBC patients. Univariate and multivariate Cox regression analyses were used for feature selection. Model performance was assessed using the time-dependent area under the ROC curve (t-AUC) and the integrated Brier score (iBrier). RESULTS A total of 10,321 female FPC survivors with SPBC (mean age [SD]: 66.03 [11.17]) were included for model construction. These patients were randomly split into a training set (mean age [SD]: 65.98 [11.15]) and a test set (mean age [SD]: 66.15 [11.23]) at a ratio of 7:3. In the validation set, a total of 3,638 SPBC patients (mean age [SD]: 66.28 [10.68]) were enrolled. Sixteen features were selected for model construction through univariate and multivariate Cox regression analyses. Among the five models, the random survival forest model showed excellent performance, with a t-AUC of 0.805 (95% CI: 0.803-0.807) and an iBrier of 0.123 (95% CI: 0.122-0.124) on the test set, as well as a t-AUC of 0.803 (95% CI: 0.801-0.807) and an iBrier of 0.098 (95% CI: 0.096-0.103) on the validation set. Feature importance ranking identified the six key predictive features of the random survival forest model: age, stage, regional nodes positive, latency, radiotherapy, and surgery. CONCLUSIONS The random survival forest model outperformed CoxPH and the other machine learning models in predicting the overall survival of patients with SPBC, which is helpful for monitoring high-risk populations.
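The integrated Brier score named in this abstract can be illustrated with a minimal sketch. It assumes no censoring, which is a simplification: the study presumably used a censoring-adjusted estimator that the abstract does not describe. The function names, toy survival times, and time grid below are hypothetical.

```python
def brier_score(event_times, surv_probs, t):
    """Mean squared gap between observed survival status at time t
    and the predicted survival probability at t (no censoring assumed)."""
    observed = [1.0 if T > t else 0.0 for T in event_times]
    return sum((o - s) ** 2 for o, s in zip(observed, surv_probs)) / len(event_times)


def integrated_brier(event_times, surv_fns, grid):
    """Trapezoidal average of the Brier score over a time grid.
    surv_fns holds one predicted survival function S_i(t) per subject."""
    scores = [brier_score(event_times, [f(t) for f in surv_fns], t) for t in grid]
    area = sum(
        (grid[i + 1] - grid[i]) * (scores[i] + scores[i + 1]) / 2.0
        for i in range(len(grid) - 1)
    )
    return area / (grid[-1] - grid[0])


# Hypothetical toy data: three subjects with known event times (in years).
toy_times = [2.0, 5.0, 8.0]
toy_grid = [1.0, 3.0, 6.0, 9.0]

# An uninformative model that always predicts a 50% survival probability.
coin_flip = [lambda t: 0.5 for _ in toy_times]
ibs_coin_flip = integrated_brier(toy_times, coin_flip, toy_grid)

# A perfect model that knows each subject's event time.
perfect = [lambda t, T=T: 1.0 if t < T else 0.0 for T in toy_times]
ibs_perfect = integrated_brier(toy_times, perfect, toy_grid)
```

Lower values indicate better-calibrated survival predictions; a constant 0.5 prediction gives the familiar uninformative baseline.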
Affiliation(s)
- Yafei Wu
  - School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China; School of Nursing, Faculty of Health and Social Sciences, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Yaheng Zhang
  - School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China
- Siyu Duan
  - School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China
- Chenming Gu
  - School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China
- Chongtao Wei
  - School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China
- Ya Fang
  - School of Public Health, Xiamen University, Xiang'an South Road, Xiang'an District, Xiamen, Fujian 361102, China; Key Laboratory of Health Technology Assessment of Fujian Province, Xiamen, Fujian, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian, China
2. Nwaiwu CA, Rivera Perla KM, Abel LB, Sears IJ, Barton AT, Peterson RC, Liu YZ, Khatri IS, Sarkar IN, Shah N. Predicting Colonic Neoplasia Surgical Complications: A Machine Learning Approach. Dis Colon Rectum 2024; 67:700-713. [PMID: 38319746 DOI: 10.1097/dcr.0000000000003166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2024]
Abstract
BACKGROUND A range of statistical approaches have been used to help predict outcomes associated with colectomy. The multifactorial nature of complications suggests that machine learning algorithms may be more accurate in determining postoperative outcomes by detecting nonlinear associations, which are not readily measured by traditional statistics. OBJECTIVE The aim of this study was to investigate the utility of machine learning algorithms to predict complications in patients undergoing colectomy for colonic neoplasia. DESIGN Retrospective analysis using decision tree, random forest, and artificial neural network classifiers to predict postoperative outcomes. SETTINGS National Inpatient Sample database (2003-2017). PATIENTS Adult patients who underwent elective colectomy with anastomosis for neoplasia. MAIN OUTCOME MEASURES Performance was quantified using sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve to predict the incidence of anastomotic leak, prolonged length of stay, and inpatient mortality. RESULTS A total of 14,935 patients (4731 laparoscopic, 10,204 open) were included. They had an average age of 67 ± 12.2 years, and 53% of patients were women. The 3 machine learning models successfully identified patients who developed the measured complications. Although differences between model performances were largely insignificant, the neural network scored highest for most outcomes: predicting anastomotic leak, area under the receiver operating characteristic curve 0.88/0.93 (open/laparoscopic, 95% CI, 0.73-0.92/0.80-0.96); prolonged length of stay, area under the receiver operating characteristic curve 0.84/0.88 (open/laparoscopic, 95% CI, 0.82-0.85/0.85-0.91); and inpatient mortality, area under the receiver operating characteristic curve 0.90/0.92 (open/laparoscopic, 95% CI, 0.85-0.96/0.86-0.98). 
LIMITATIONS The National Inpatient Sample database may not accurately represent all patients undergoing colectomy for colonic neoplasia, and it does not account for specific institutional and patient factors. CONCLUSIONS Machine learning predicted postoperative complications in patients with colonic neoplasia undergoing colectomy with good performance. Although validation using external data and optimization of data quality will be required, these machine learning tools show great promise in assisting surgeons with risk-stratification of perioperative care to improve postoperative outcomes.
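The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen patient with the complication receives a higher risk score than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the function name, labels, and scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: 1 = complication occurred, 0 = no complication.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of patients, so it is suited to illustration only; rank-based implementations give the same value on large samples.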
Affiliation(s)
- Chibueze A Nwaiwu
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
- Krissia M Rivera Perla
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
  - Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Logan B Abel
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
- Isaac J Sears
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
- Andrew T Barton
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
- Yao Z Liu
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
- Ishaani S Khatri
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
- Indra N Sarkar
  - Center for Biomedical Informatics, Brown University, Providence, Rhode Island
  - Rhode Island Quality Institute, Providence, Rhode Island
- Nishit Shah
  - Department of Surgery, Warren Alpert Medical School of Brown University, Rhode Island Hospital, Providence, Rhode Island
3. Huguet N, Chen J, Parikh RB, Marino M, Flocke SA, Likumahuwa-Ackman S, Bekelman J, DeVoe JE. Applying Machine Learning Techniques to Implementation Science. Online J Public Health Inform 2024; 16:e50201. [PMID: 38648094 DOI: 10.2196/50201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Revised: 11/15/2023] [Accepted: 03/14/2024] [Indexed: 04/25/2024] Open
Abstract
Machine learning (ML) approaches could expand the usefulness and application of implementation science methods in clinical medicine and public health settings. The aim of this viewpoint is to introduce a roadmap for applying ML techniques to address implementation science questions, such as predicting what will work best, for whom, under what circumstances, and with what predicted level of support, and what and when adaptation or deimplementation are needed. We describe how ML approaches could be used and discuss challenges that implementation scientists and methodologists will need to consider when using ML throughout the stages of implementation.
Affiliation(s)
- Nathalie Huguet
  - Department of Family Medicine, Oregon Health & Science University, Portland, OR, United States
  - BRIDGE-C2 Implementation Science Center for Cancer Control, Oregon Health & Science University, Portland, OR, United States
- Jinying Chen
  - Section of Preventive Medicine and Epidemiology, Department of Medicine, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States
  - Data Science Core, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, United States
  - iDAPT Implementation Science Center for Cancer Control, Wake Forest School of Medicine, Winston-Salem, NC, United States
- Ravi B Parikh
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Miguel Marino
  - Department of Family Medicine, Oregon Health & Science University, Portland, OR, United States
  - BRIDGE-C2 Implementation Science Center for Cancer Control, Oregon Health & Science University, Portland, OR, United States
- Susan A Flocke
  - Department of Family Medicine, Oregon Health & Science University, Portland, OR, United States
  - BRIDGE-C2 Implementation Science Center for Cancer Control, Oregon Health & Science University, Portland, OR, United States
- Sonja Likumahuwa-Ackman
  - Department of Family Medicine, Oregon Health & Science University, Portland, OR, United States
  - BRIDGE-C2 Implementation Science Center for Cancer Control, Oregon Health & Science University, Portland, OR, United States
- Justin Bekelman
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
  - Penn Center for Cancer Care Innovation, Abramson Cancer Center, Penn Medicine, Philadelphia, PA, United States
- Jennifer E DeVoe
  - Department of Family Medicine, Oregon Health & Science University, Portland, OR, United States
  - BRIDGE-C2 Implementation Science Center for Cancer Control, Oregon Health & Science University, Portland, OR, United States
4. Xia Y, Zhang B, Zhang Y. Deep survival analysis using pseudo values and its application to predict the recurrence of stage IV colorectal cancer after tumor resection. Comput Methods Biomech Biomed Engin 2023:1-10. [PMID: 37916498 DOI: 10.1080/10255842.2023.2275246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 10/18/2023] [Indexed: 11/03/2023]
Abstract
An improved DeepSurv model is proposed for predicting the prognosis of patients with stage IV colorectal cancer. Our model, called PseudoDeepSurv, is optimized with a novel loss function that combines the average negative log partial likelihood with the mean-squared error derived from the pseudo-observations approach. The public BioStudies dataset, which includes 999 patients, was used for performance evaluation. The PseudoDeepSurv model produced a C-index of 0.684 and 0.633 on the training and testing datasets, respectively; the corresponding values for the original DeepSurv model were 0.671 and 0.618.
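The combined loss described in this abstract can be sketched as follows. The abstract does not state how the two terms are weighted, so the weighted sum and the `alpha` parameter are assumptions, as are the function names and toy inputs. The partial likelihood here ignores tied event times for simplicity.

```python
import math


def neg_log_partial_likelihood(risk, time, event):
    """Average negative log Cox partial likelihood over observed events.
    risk[i] is the model's log-risk output for subject i."""
    total, n_events = 0.0, 0
    for i in range(len(risk)):
        if event[i]:
            # Risk set: everyone still under observation at subject i's event time.
            denom = sum(math.exp(risk[j]) for j in range(len(risk)) if time[j] >= time[i])
            total -= risk[i] - math.log(denom)
            n_events += 1
    return total / n_events if n_events else 0.0


def mean_squared_error(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(risk, time, event, surv_pred, pseudo_obs, alpha=1.0):
    """Hypothetical weighting: partial-likelihood term plus alpha times the
    MSE between predicted survival and pseudo-observations."""
    return neg_log_partial_likelihood(risk, time, event) + alpha * mean_squared_error(
        surv_pred, pseudo_obs
    )


# Hypothetical toy inputs: two subjects, both with observed events.
toy_loss = combined_loss(
    risk=[0.0, 0.0],
    time=[1.0, 2.0],
    event=[1, 1],
    surv_pred=[0.5, 0.5],
    pseudo_obs=[1.0, 0.0],
)
```

In a neural network setting both terms would be computed on mini-batches with automatic differentiation; the plain-Python version only shows how the two pieces fit together.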
Affiliation(s)
- Yi Xia
  - School of Electrical Engineering and Automation, Anhui University, Hefei, China
- Baifu Zhang
  - School of Electrical Engineering and Automation, Anhui University, Hefei, China
- Yongliang Zhang
  - Health Management Center, The First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Hefei, Anhui, China
5. Collatuzzo G, Ferrante M, Ippolito A, Di Prima A, Colarossi C, Scarpulla S, Boffetta P, Sciacca S. Second Primary Cancers following Colorectal Cancer in Sicily, Italy. Cancers (Basel) 2022; 14:5204. [PMID: 36358623 PMCID: PMC9657763 DOI: 10.3390/cancers14215204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 10/19/2022] [Accepted: 10/20/2022] [Indexed: 11/26/2022] Open
Abstract
Simple Summary: This study addressed the under-investigated issue of second primary cancer occurring in colorectal cancer survivors. Our aim was to explore whether patients who recovered from a first colorectal cancer were at higher risk of developing a subsequent primary cancer. The hypothesis was that exposure to cancer treatment, enhanced health surveillance and shared risk factors may lead to an excess risk of second primary cancer in this population. The number of cases of second primary cancer exceeded the expected number in this population, mainly driven by female genital cancers, and the excess was observed especially in the first years after colorectal cancer diagnosis. Our findings are overall consistent with previous studies, providing valuable information to better characterize and predict mortality from second primary cancer in subjects who suffered from first colorectal cancer.
Abstract: Background: Cancer survivors are at risk of developing second primary cancers (SPC). We investigated the risk of SPC in colorectal cancer (CRC) survivors in Sicily, Southern Italy. Methods: We analyzed data from the Eastern Sicily cancer registry covering 2.5 million people diagnosed and followed up between 2003 and 2017. We calculated the standardized incidence ratio (SIR) and 95% confidence interval (CI) of SPC overall and by cancer type, using the general Sicily population rates as reference. Results: A total of 19,040 cases of CRC and 1453 cases of SPC were included in the analysis. Mean age of occurrence of SPC was 68.1. The SIR for any SPC was 1.11 (95% CI 1.05–1.17); it was higher in women (1.18; 95% CI 1.08–1.29) than in men (1.07; 95% CI 0.97–1.14; p-value of difference 0.07). The SIR was increased for SPC from the ovary (SIR 2.01; 95% CI 1.33–2.95), kidney (SIR 2.00; 95% CI 1.54–2.56), endometrium (SIR 1.94; 95% CI 1.45–2.54), bladder (SIR 1.22; 95% CI 1.04–1.43) and stomach (SIR 1.29; 95% CI 0.98–1.66). The SIR for CRC as SPC was 0.84 (95% CI 0.70–1.01). No increased incidence was found for lung, prostate, breast, thyroid and liver cancer. The SIR for SPC overall and for several cancers decreased with time of follow-up. Conclusions: In this population, CRC survivors have an 11% higher risk of developing an SPC than the general population, particularly cancers of the ovary, kidney, endometrium, bladder and stomach. Follow-up for SPC is required, especially during the first 5 years from CRC diagnosis.
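A standardized incidence ratio is the observed number of cases divided by the number expected from reference population rates. The sketch below pairs it with Byar's approximation for the 95% CI; the abstract does not say which interval method the authors used, so that choice is an assumption, and the function name and counts are hypothetical.

```python
import math


def sir_with_ci(observed, expected, z=1.96):
    """Standardized incidence ratio with Byar's approximate confidence limits.
    observed: count of second cancers seen in the cohort.
    expected: count predicted by applying reference rates to the cohort."""
    if observed <= 0 or expected <= 0:
        raise ValueError("observed and expected must be positive")
    sir = observed / expected
    lower = observed * (1 - 1 / (9 * observed) - z / (3 * math.sqrt(observed))) ** 3 / expected
    shifted = observed + 1
    upper = shifted * (1 - 1 / (9 * shifted) + z / (3 * math.sqrt(shifted))) ** 3 / expected
    return sir, lower, upper


# Hypothetical counts for illustration only (not the registry's figures).
toy_sir, toy_lower, toy_upper = sir_with_ci(observed=120, expected=100)
```

An interval that includes 1 means the excess is not statistically distinguishable from the reference rate at the chosen level.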
Affiliation(s)
- Giulia Collatuzzo
  - Department of Medical and Surgical Sciences, University of Bologna, 40138 Bologna, Italy
- Margherita Ferrante
  - Department of Medical, Surgical and Advanced Technologies “G.F. Ingrassia”, University of Catania, Via Santa Sofia 87, 95123 Catania, Italy
  - Cancer Registry of Catania, Messina, Syracuse and Enna, Via Santa Sofia 87, 95123 Catania, Italy
- Antonella Ippolito
  - Cancer Registry of Catania, Messina, Syracuse and Enna, Via Santa Sofia 87, 95123 Catania, Italy
- Alessia Di Prima
  - Cancer Registry of Catania, Messina, Syracuse and Enna, Via Santa Sofia 87, 95123 Catania, Italy
- Cristina Colarossi
  - Mediterranean Institute of Oncology (IOM), Viagrande, 95029 Catania, Italy
- Paolo Boffetta
  - Department of Medical and Surgical Sciences, University of Bologna, 40138 Bologna, Italy
  - Stony Brook Cancer Center, Stony Brook University, Stony Brook, NY 11794, USA
- Salvatore Sciacca
  - Mediterranean Institute of Oncology (IOM), Viagrande, 95029 Catania, Italy
6. Khair S, Dort JC, Quan ML, Cheung WY, Sauro KM, Nakoneshny SC, Popowich BL, Liu P, Wu G, Xu Y. Validated algorithms for identifying timing of second event of oropharyngeal squamous cell carcinoma using real-world data. Head Neck 2022; 44:1909-1917. [PMID: 35653151 DOI: 10.1002/hed.27109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Revised: 04/29/2022] [Accepted: 05/18/2022] [Indexed: 11/07/2022] Open
Abstract
BACKGROUND Understanding the occurrence and timing of second events (recurrence and second primary cancer) is essential for cancer-specific survival analysis. However, this information is not readily available in administrative data. METHODS Alberta Cancer Registry, physician claims, and other administrative data were used. The timing of the second event was estimated using our developed algorithm. For validation, the difference in days between the algorithm-estimated and the chart-reviewed timing of the second event was calculated. Further, the results of Cox regression modeling of cancer-free survival were compared with those from chart review data. RESULTS For the majority (74.3%) of patients, the difference between the chart-reviewed and algorithm-estimated timing of the second event fell within the 0-60 day window. Kaplan-Meier curves generated from the estimated data and the chart review data were comparable, with 5-year second-event-free survival rates of 75.4% versus 72.5%. CONCLUSION The algorithm provided an estimated timing of second event similar to that of the chart review.
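The validation step described above, comparing algorithm-estimated dates with chart-reviewed dates, can be sketched as a simple agreement check. The use of the absolute difference, the function name, and the dates are assumptions for illustration; the abstract does not give the exact definition.

```python
from datetime import date


def share_within_window(estimated, reviewed, window_days=60):
    """Fraction of patients whose estimated second-event date falls within
    window_days of the chart-reviewed date (absolute difference assumed)."""
    diffs = [abs((e - r).days) for e, r in zip(estimated, reviewed)]
    within = sum(1 for d in diffs if d <= window_days)
    return within / len(diffs)


# Hypothetical dates for three patients.
toy_estimated = [date(2020, 1, 10), date(2020, 4, 1), date(2020, 6, 1)]
toy_reviewed = [date(2020, 1, 1), date(2020, 1, 1), date(2020, 5, 15)]
toy_share = share_within_window(toy_estimated, toy_reviewed)
```

Reporting the share inside a fixed window, rather than only the mean difference, keeps a few large disagreements from being averaged away.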
Affiliation(s)
- Shahreen Khair
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Joseph C Dort
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Surgery, Cumming School of Medicine, University of Calgary, North Tower, Foothills Medical Centre, Calgary, Alberta, Canada
- May Lynn Quan
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Surgery, Cumming School of Medicine, University of Calgary, North Tower, Foothills Medical Centre, Calgary, Alberta, Canada
  - Department of Oncology, Cumming School of Medicine, University of Calgary, Tom Baker, Cancer Centre, Calgary, Alberta, Canada
- Winson Y Cheung
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Surgery, Cumming School of Medicine, University of Calgary, North Tower, Foothills Medical Centre, Calgary, Alberta, Canada
- Khara M Sauro
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Surgery, Cumming School of Medicine, University of Calgary, North Tower, Foothills Medical Centre, Calgary, Alberta, Canada
  - Department of Oncology, Cumming School of Medicine, University of Calgary, Tom Baker, Cancer Centre, Calgary, Alberta, Canada
- Steven C Nakoneshny
  - The Ohlson Research Initiative, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Alberta, Canada
- Brittany Lynn Popowich
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Teaching Research and Wellness (TRW), Calgary, Alberta, Canada
- Ping Liu
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Guosong Wu
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Teaching Research and Wellness (TRW), Calgary, Alberta, Canada
- Yuan Xu
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
  - Department of Surgery, Cumming School of Medicine, University of Calgary, North Tower, Foothills Medical Centre, Calgary, Alberta, Canada
  - Department of Oncology, Cumming School of Medicine, University of Calgary, Tom Baker, Cancer Centre, Calgary, Alberta, Canada
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Teaching Research and Wellness (TRW), Calgary, Alberta, Canada
7. Wu X, Guan Q, Cheng ASK, Guan C, Su Y, Jiang J, Wang B, Zeng L, Zeng Y. Comparison of machine learning models for predicting the risk of breast cancer-related lymphedema in Chinese women. Asia Pac J Oncol Nurs 2022; 9:100101. [PMID: 36276882 PMCID: PMC9579303 DOI: 10.1016/j.apjon.2022.100101] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/30/2022] [Accepted: 05/30/2022] [Indexed: 11/21/2022] Open
Abstract
Objective Predictive models for the occurrence of cancer symptoms by using machine learning (ML) algorithms could be used to aid clinical decision-making in order to enhance the quality of cancer care. This study aimed to develop and validate a selection of classification models that used ML algorithms to predict the occurrence of breast cancer-related lymphedema (BCRL) among Chinese women. Methods This was a retrospective cohort study of consecutive cases that had been diagnosed with breast cancer, stages I-IV. Forty-eight variables were grouped into five feature sets. Five classification models with ML algorithms were developed, and the models' performance and the variables’ relative importance were assessed accordingly. Results Of 370 eligible female participants, 91 had BCRL (24.6%). The mean age of this study sample was 49.89 (SD = 7.45). All participants had had breast cancer surgery, and more than half of them had had a modified radical mastectomy (n = 206, 55.5%). The mean follow-up time after breast cancer surgery was 28.73 months (SD = 11.71). Most of the tumors were either stage I (n = 49, 31.2%) or stage II (n = 252, 68.1%). More than half of the sample had had postoperative chemotherapy (n = 227, 61.4%). Overall, the logistic regression model achieved the best performance in terms of accuracy (91.6%), precision (82.1%), and recall (91.4%) for BCRL. Although this study included 48 predicting variables, we found that the five models required only 22 variables to achieve predictive performance. The most important variable was the number of positive lymph nodes, followed in descending order by the BCRL occurring on the same side as the surgery, a history of sentinel lymph node biopsy, a dietary preference for meat and fried food, and an exercise frequency of less than three times per week. These factors were the most influential predictors for enhancing the ML models’ performance. 
Conclusions This study found that in the ML training dataset, the multilayer perceptron model and the logistic regression model discriminated best when predicting BCRL, and the k-nearest neighbors and support vector machine models demonstrated good calibration performance in the ML validation dataset. Future research will need to use large-sample datasets to establish a more robust ML model for predicting BCRL reliably.
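The accuracy, precision, and recall figures quoted in this abstract all derive from the counts in a confusion matrix. A minimal sketch follows, with hypothetical labels and predictions (1 = BCRL, 0 = no BCRL) that are not taken from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for a binary outcome."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision: of the patients flagged as BCRL, how many truly had it.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the patients who truly had BCRL, how many were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy labels and model predictions for eight patients.
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 1, 0, 0, 0, 0]
toy_metrics = classification_metrics(toy_true, toy_pred)
```

With an outcome that affects roughly a quarter of the sample, as here, accuracy alone can look strong even when recall is modest, which is why the three figures are usually reported together.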
8. Atuahene BT, Kanjanabootra S, Gajendran T. Preliminary benefits of big data in the construction industry: a case study. Proceedings of the Institution of Civil Engineers - Management, Procurement and Law 2022. [DOI: 10.1680/jmapl.21.00027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/14/2022]
Abstract
Big data applications consist of (a) data collection using big data sources, (b) storing and processing data and (c) analysing data to gain insights for creating organisational benefit. The influx of digital technologies and digitisation in the construction process includes big data as one newly emerging digital technology adopted in the construction industry. Big data application is in a nascent stage in construction, and there is a need to understand the tangible benefit(s) that big data can offer the construction industry. This study explores the benefits of big data in the construction industry. Using a qualitative case study design, construction professionals in an Australian construction firm were interviewed. The research highlights that the benefits of big data include reduction of litigation among project stakeholders, enablement of near-to-real-time communication and facilitation of effective subcontractor selection. By implication, on a broader scale, these benefits can improve contract management, procurement and management of construction projects. This study contributes to an ongoing discourse on big data application and, more generally, digitisation in the construction industry.
Affiliation(s)
- Bernard Tuffour Atuahene
  - Sessional Academic in Construction Management, School of Architecture and Built Environment, The University of Newcastle, Newcastle, Australia
- Sittimont Kanjanabootra
  - School of Architecture and Built Environment, The University of Newcastle, Newcastle, Australia
- Thayaparan Gajendran
  - School of Architecture and Built Environment, The University of Newcastle, Newcastle, Australia
9. The Role of Artificial Intelligence in Early Cancer Diagnosis. Cancers (Basel) 2022; 14:1524. [PMID: 35326674 PMCID: PMC8946688 DOI: 10.3390/cancers14061524] [Citation(s) in RCA: 61] [Impact Index Per Article: 30.5] [Received: 02/11/2022] [Revised: 03/08/2022] [Accepted: 03/10/2022] [Indexed: 02/01/2023] Open
Abstract
Improving the proportion of patients diagnosed with early-stage cancer is a key priority of the World Health Organisation. In many tumour groups, screening programmes have led to improvements in survival, but patient selection and risk stratification are key challenges. In addition, there are concerns about limited diagnostic workforces, particularly in light of the COVID-19 pandemic, placing a strain on pathology and radiology services. In this review, we discuss how artificial intelligence algorithms could assist clinicians in (1) screening asymptomatic patients at risk of cancer, (2) investigating and triaging symptomatic patients, and (3) more effectively diagnosing cancer recurrence. We provide an overview of the main artificial intelligence approaches, including historical models such as logistic regression, as well as deep learning and neural networks, and highlight their early diagnosis applications. Many data types are suitable for computational analysis, including electronic healthcare records, diagnostic images, pathology slides and peripheral blood, and we provide examples of how these data can be utilised to diagnose cancer. We also discuss the potential clinical implications for artificial intelligence algorithms, including an overview of models currently used in clinical practice. Finally, we discuss the potential limitations and pitfalls, including ethical concerns, resource demands, data security and reporting standards.
10
A Prediction Model for Tumor Recurrence in Stage II–III Colorectal Cancer Patients: From a Machine Learning Model to Genomic Profiling. Biomedicines 2022; 10:340. [PMID: 35203549 PMCID: PMC8961774 DOI: 10.3390/biomedicines10020340] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/06/2021] [Revised: 01/27/2022] [Accepted: 01/28/2022] [Indexed: 01/27/2023] Open
Abstract
Background: Colorectal cancer (CRC) is one of the most prevalent malignant diseases worldwide. Risk prediction for tumor recurrence is important for making effective treatment decisions and for the survival outcomes of patients with CRC after surgery. Herein, we aimed to explore a prediction algorithm and the risk factors for postoperative tumor recurrence using a machine learning (ML) approach with standardized pathology reports for patients with stage II and III CRC. Methods: Pertinent clinicopathological features were compiled from medical records and standardized pathology reports of patients with stage II and III CRC. Four ML models based on logistic regression (LR), random forest (RF), classification and regression decision trees (CARTs), and support vector machine (SVM) were applied for the development of the prediction algorithm. The area under the curve (AUC) of the ML models was determined in order to compare the prediction accuracy. Genomic studies were performed using a panel-targeted next-generation sequencing approach. Results: A total of 1073 patients who received curative intent surgery at the National Cheng Kung University Hospital between January 2004 and January 2019 were included. Based on conventional statistical methods, chemotherapy (p = 0.003), endophytic tumor configuration (p = 0.008), TNM stage III disease (p < 0.001), pT4 (p < 0.001), pN2 (p < 0.001), increased numbers of lymph node metastases (p < 0.001), higher lymph node ratios (LNR) (p < 0.001), lymphovascular invasion (p < 0.001), perineural invasion (p < 0.001), tumor budding (p = 0.004), and neoadjuvant chemoradiotherapy (p = 0.025) were found to be correlated with the tumor recurrence of patients with stage II–III CRC. While comparing the performance of different ML models for predicting cancer recurrence, the AUCs for LR, RF, CART, and SVM were found to be 0.678, 0.639, 0.593, and 0.581, respectively. 
The LR model had a better accuracy value of 0.87 and a specificity value of 1 in the testing set. Two prognostic factors, age and LNR, were selected by multivariable analysis and the four ML models. In terms of age, older patients received fewer cycles of chemotherapy and radiotherapy (p < 0.001). Right-sided colon tumors (p = 0.002), larger tumor sizes (p = 0.008) and tumor volumes (p = 0.049), TNM stage II disease (p < 0.001), and advanced pT3–4 stage diseases (p = 0.04) were found to be correlated with the older age of patients. However, pN2 diseases (p = 0.005), lymph node metastasis number (p = 0.001), LNR (p = 0.004), perineural invasion (p = 0.018), and overall survival rate (p < 0.001) were found to be decreased in older patients. Furthermore, PIK3CA and DNMT3A mutations (p = 0.032 and 0.039, respectively) were more frequently found in older patients with stage II–III CRC compared to their younger counterparts. Conclusions: This study demonstrated that ML models have a comparable predictive power for determining cancer recurrence in patients with stage II–III CRC after surgery. Advanced age and high LNR were significant risk factors for cancer recurrence, as determined by ML algorithms and multivariable analyses. Distinctive genomic profiles may contribute to discrete clinical behaviors and survival outcomes between patients of different age groups. Studies incorporating complete molecular and genomic profiles in cancer prediction models are beneficial for patients with stage II–III CRC.
11
Li S, Teng Z, Qiu Y, Pan P, Wu C, Jin K, Wang L, Chen J, Tang H, Xiang H, De Leon SA, Huang J, Guo W, Wang B, Wu H. Dissociation Pattern in Default-Mode Network Homogeneity in Drug-Naive Bipolar Disorder. Front Psychiatry 2021; 12:699292. [PMID: 34434127 PMCID: PMC8380964 DOI: 10.3389/fpsyt.2021.699292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2021] [Accepted: 07/13/2021] [Indexed: 11/20/2022] Open
Abstract
The default mode network (DMN) plays a key role in the pathophysiology of bipolar disorder (BD). However, the homogeneity of this network in BD is still poorly understood. This study aimed to investigate abnormalities in the network homogeneity (NH) of the DMN at rest and the correlation between the NH of the DMN and clinical variables in patients with BD. Forty drug-naive patients with BD and thirty-seven healthy control subjects participated in the study. NH and independent component analysis (ICA) methods were used for data analysis. A support vector machine (SVM) method was used to analyze NH in different brain regions. Compared with healthy controls, patients with BD showed significantly increased NH in the left superior medial prefrontal cortex (MPFC) and decreased NH in the right posterior cingulate cortex (PCC) and bilateral precuneus. NH in the right PCC was positively correlated with the verbal fluency test and verbal function total scores. NH in the left superior MPFC was negatively correlated with triglyceride (TG). NH in the right PCC was positively correlated with TG but negatively correlated with high-density lipoprotein cholesterol (HDL-C). NH in the bilateral precuneus was positively correlated with cholesterol and low-density lipoprotein cholesterol (LDL-C). In addition, NH in the left superior MPFC showed high sensitivity (80.00%), specificity (71.43%), and accuracy (75.61%) in the SVM results. These findings contribute new evidence that altered NH of the DMN participates in the pathophysiology of BD.
Affiliation(s)
- Sujuan Li
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Ziwei Teng
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Yan Qiu
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Pan Pan
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Chujun Wu
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Kun Jin
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Lu Wang
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Jindong Chen
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Hui Tang
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Hui Xiang
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Sara Arenas De Leon
- Department of Biochemistry and Molecular Biology, University of New Mexico Health Sciences Center, Albuquerque, NM, United States
- Jing Huang
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Wenbin Guo
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
- Bolun Wang
- Department of Radiology, The Second Xiangya Hospital of Central South University, Changsha, China
- Haishan Wu
- National Clinical Research Center for Mental Disorders, Department of Psychiatry, China National Technology Institute on Mental Disorders, The Second Xiangya Hospital of Central South University, Changsha, China
12
Achilonu OJ, Fabian J, Bebington B, Singh E, Eijkemans MJC, Musenge E. Predicting Colorectal Cancer Recurrence and Patient Survival Using Supervised Machine Learning Approach: A South African Population-Based Study. Front Public Health 2021; 9:694306. [PMID: 34307286 PMCID: PMC8292767 DOI: 10.3389/fpubh.2021.694306] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/12/2021] [Accepted: 05/31/2021] [Indexed: 12/12/2022] Open
Abstract
Background: South Africa (SA) has the highest incidence of colorectal cancer (CRC) in Sub-Saharan Africa (SSA). However, there is limited research on CRC recurrence and survival in SA. CRC recurrence and overall survival are highly variable across studies. Accurate prediction of patients at risk can enhance clinical expectations and decisions within the South African CRC patient population. We explored the feasibility of integrating statistical and machine learning (ML) algorithms to achieve higher predictive performance and interpretability in findings. Methods: We selected and compared six algorithms: logistic regression (LR), naïve Bayes (NB), C5.0, random forest (RF), support vector machine (SVM) and artificial neural network (ANN). Commonly selected features based on OneR and information gain, within 10-fold cross-validation, were used for model development. The validity and stability of the predictive models were further assessed using simulated datasets. Results: The six algorithms achieved high discriminative accuracies (AUC-ROC). ANN achieved the highest AUC-ROC for recurrence (87.0%) and survival (82.0%), and the other models showed performance comparable to ANN. We observed no statistical difference in the performance of the models. Features including radiological stage and patient's age, histology, and race are risk factors for CRC recurrence and patient survival, respectively. Conclusions: Based on other studies and what is known in the field, we have affirmed important predictive factors for recurrence and survival using rigorous procedures. Outcomes of this study can be generalised to CRC patient populations elsewhere in SA and in other SSA countries with similar patient profiles.
Affiliation(s)
- Okechinyere J Achilonu
- Division of Epidemiology and Biostatistics, School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Parktown, Johannesburg, South Africa
- June Fabian
- Medical Research Council/Wits University Rural Public Health and Health Transitions Research Unit (Agincourt), School of Public Health, Faculty of Health Sciences, University of Witwatersrand, Johannesburg, South Africa; Wits Donald Gordon Medical Centre, School of Clinical Medicine, Faculty of Health Sciences, University of Witwatersrand, Johannesburg, South Africa
- Brendan Bebington
- Wits Donald Gordon Medical Centre, School of Clinical Medicine, Faculty of Health Sciences, University of Witwatersrand, Johannesburg, South Africa; Department of Surgery, Faculty of Health Science University of the Witwatersrand Faculty of Science, Parktown, Johannesburg, South Africa
- Elvira Singh
- Division of Epidemiology and Biostatistics, School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Parktown, Johannesburg, South Africa; National Cancer Registry, National Health Laboratory Service, 1 Modderfontein Road, Sandringham, Johannesburg, South Africa
- M J C Eijkemans
- Julius Center for Health Sciences and Primary Care, University Medical Center, Utrecht University, Utrecht, Netherlands
- Eustasius Musenge
- Division of Epidemiology and Biostatistics, School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Parktown, Johannesburg, South Africa; Industrialization, Science, Technology and Innovation Hub, African Union Development Agency (AUDA-NEPAD), Johannesburg, South Africa