1
Izadi Z, Gianfrancesco M, Anastasiou C, Schmajuk G, Yazdany J. Development and validation of a risk scoring system to identify patients with lupus nephritis in electronic health record data. Lupus Sci Med 2024; 11:e001170. [PMID: 38769054 PMCID: PMC11110552 DOI: 10.1136/lupus-2024-001170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Accepted: 04/20/2024] [Indexed: 05/22/2024]
Abstract
OBJECTIVE Accurate identification of lupus nephritis (LN) cases is essential for patient management, research and public health initiatives. However, LN diagnosis codes in electronic health records (EHRs) are underused, hindering efficient identification. We investigated the current performance of International Classification of Diseases (ICD) codes, 9th and 10th editions (ICD9/10), for identifying prevalent LN, and developed scoring systems to increase identification of LN that are adaptable to settings with and without LN ICD codes. METHODS Training and test sets were derived from EHR data from a large health system. An external set comprised data from the EHR of a second large health system. Adults with ICD9/10 codes for SLE were included. LN cases were ascertained through manual chart reviews conducted by rheumatologists. Two definitions of LN were used: strict (definite LN) and inclusive (definite, potential or diagnostic uncertainty). Gradient boosting models including structured EHR fields were used for predictor selection. Two logistic regression-based scoring systems were developed ('LN-Code' included LN ICD codes and 'LN-No Code' did not), calibrated and validated using standard performance metrics. RESULTS A total of 4152 patients from University of California San Francisco Medical Center and 370 patients from Zuckerberg San Francisco General Hospital and Trauma Center met the eligibility criteria. Mean age was 50 years and 87% were female. LN diagnosis codes demonstrated low sensitivity (43-73%) but high specificity (92-97%). LN-Code achieved an area under the curve (AUC) of 0.93 and a sensitivity of 0.88 for identifying LN using the inclusive definition. LN-No Code reached an AUC of 0.91 and a sensitivity of 0.95 (0.97 for the strict definition). Both scoring systems had good external validity, calibration and performance across racial and ethnic groups.
CONCLUSIONS This study quantified the underutilisation of LN diagnosis codes in EHRs and introduced two adaptable scoring systems to enhance LN identification. Further validation in diverse healthcare settings is essential to ensure their broader applicability.
Affiliation(s)
- Zara Izadi
  - University of California San Francisco, San Francisco, California, USA
- Gabriela Schmajuk
  - University of California San Francisco, San Francisco, California, USA
  - San Francisco VA Medical Center, San Francisco, California, USA
- Jinoos Yazdany
  - University of California San Francisco, San Francisco, California, USA
2
Tesfie TK, Anlay DZ, Abie B, Chekol YM, Gelaw NB, Tebeje TM, Animut Y. Nomogram to predict risk of neonatal mortality among preterm neonates admitted with sepsis at University of Gondar Comprehensive Specialized Hospital: risk prediction model development and validation. BMC Pregnancy Childbirth 2024; 24:139. [PMID: 38360591 PMCID: PMC10868119 DOI: 10.1186/s12884-024-06306-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 01/29/2024] [Indexed: 02/17/2024] Open
Abstract
BACKGROUND Mortality in premature neonates is a global public health problem. In developing countries, nearly 50% of preterm births end in death. Sepsis is one of the major causes of death in preterm neonates. A risk prediction model for mortality in preterm septic neonates helps guide clinicians' decision making. OBJECTIVE We aimed to develop and validate a nomogram for the prediction of neonatal mortality. Nomograms are tools that assist clinical decision making through early estimation of risk, prompting early interventions. METHODS A three-year retrospective follow-up study was conducted at University of Gondar Comprehensive Specialized Hospital, and a total of 603 preterm neonates with sepsis were included. Data were collected using KoboCollect and analyzed using STATA version 16 and R version 4.2.1. Lasso regression was used to select the most potent predictors and to minimize the problem of overfitting. The nomogram was developed using multivariable binary logistic regression analysis. Model performance was evaluated using discrimination and calibration. Internal model validation was done using bootstrapping. The net benefit of the nomogram was assessed through decision curve analysis (DCA) to assess the clinical relevance of the model. RESULT The nomogram was developed using nine predictors: gestational age, maternal history of premature rupture of membrane, hypoglycemia, respiratory distress syndrome, perinatal asphyxia, necrotizing enterocolitis, total bilirubin, platelet count and kangaroo-mother care. The model had a discriminatory power of 96.7% (95% CI: 95.6, 97.9) and a P-value of 0.165 in the calibration test before and after internal validation, with a Brier score of 0.07. Based on the net benefit analysis, the nomogram was found to be better than the treat-all and treat-none strategies.
CONCLUSION The developed nomogram can be used for individualized mortality risk prediction with excellent performance and better net benefit, and was found to be useful in clinical practice, contributing to the reduction of preterm neonatal mortality by giving greater emphasis to those at high risk.
Affiliation(s)
- Tigabu Kidie Tesfie
  - Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
- Degefaye Zelalem Anlay
  - School of Nursing, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
- Birhanu Abie
  - Department of Pediatrics and Child Health, School of Medicine, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
- Yazachew Moges Chekol
  - Department of Health Information Technology, Mizan Aman College of Health Science, Mizan Aman, Ethiopia
- Negalgn Byadgie Gelaw
  - Department of Public Health, Mizan Aman College of Health Science, Mizan Aman, Ethiopia
- Tsion Mulat Tebeje
  - School of Public Health, College of Medicine and Health Sciences, Dilla University, Dilla, Ethiopia
- Yaregal Animut
  - Department of Epidemiology and Biostatistics, Institute of Public Health, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia
3
Chen YL, Kraus SW, Freeman MJ, Freeman AJ. A Machine-Learning Approach to Assess Factors Associated With Hospitalization of Children and Youths in Psychiatric Crisis. Psychiatr Serv 2023; 74:943-949. [PMID: 36916060 DOI: 10.1176/appi.ps.20220201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Abstract
OBJECTIVE The authors used a machine-learning approach to model clinician decision making regarding psychiatric hospitalization of children and youths in crisis and to identify factors associated with the decision to hospitalize. METHODS Data consisted of 4,786 mobile crisis response team assessments of children and youths, ages 4.0-19.5 years (mean±SD=14.0±2.7 years, 56% female), in Nevada. The sample assessments were split into training and testing data sets. A random-forest machine-learning algorithm was used to identify variables related to the decision to hospitalize a child or youth after the crisis assessment. Results from the training sample were externally validated in the testing sample. RESULTS The random-forest model had good performance (area under the curve training sample=0.91, testing sample=0.92). Variables found to be important in the decision to hospitalize a child or youth were acute suicidality, followed by poor judgment or decision making, danger to others, impulsivity, runaway behavior, other risky behaviors, nonsuicidal self-injury, psychotic or depressive symptoms, sleep problems, oppositional behavior, poor functioning at home or with peers, depressive or schizophrenia spectrum disorders, and age. CONCLUSIONS In crisis settings, clinicians were found to mostly focus on acute factors that increased risk for danger to self or others (e.g., suicidality, poor judgment), current psychiatric symptoms (e.g., psychotic symptoms), and functioning (e.g., poor home functioning, problems with peer relationships) when deciding whether to hospitalize or stabilize a child or youth. To reduce psychiatric hospitalization, community-based services should target interventions to address these important factors associated with the need for a higher level of care among youths in psychiatric crisis.
Affiliation(s)
- Yen-Ling Chen
  - Department of Psychology, University of Nevada, Las Vegas, Las Vegas (Chen, Kraus); Boys and Girls Clubs of Southern Nevada, Las Vegas (M. J. Freeman); Inspiring Children Foundation, Las Vegas (A. J. Freeman)
- Shane W Kraus
  - Department of Psychology, University of Nevada, Las Vegas, Las Vegas (Chen, Kraus); Boys and Girls Clubs of Southern Nevada, Las Vegas (M. J. Freeman); Inspiring Children Foundation, Las Vegas (A. J. Freeman)
- Megan J Freeman
  - Department of Psychology, University of Nevada, Las Vegas, Las Vegas (Chen, Kraus); Boys and Girls Clubs of Southern Nevada, Las Vegas (M. J. Freeman); Inspiring Children Foundation, Las Vegas (A. J. Freeman)
- Andrew J Freeman
  - Department of Psychology, University of Nevada, Las Vegas, Las Vegas (Chen, Kraus); Boys and Girls Clubs of Southern Nevada, Las Vegas (M. J. Freeman); Inspiring Children Foundation, Las Vegas (A. J. Freeman)
4
Chen SL, Chin SC, Chan KC, Ho CY. A Machine Learning Approach to Assess Patients with Deep Neck Infection Progression to Descending Mediastinitis: Preliminary Results. Diagnostics (Basel) 2023; 13:2736. [PMID: 37685275 PMCID: PMC10486957 DOI: 10.3390/diagnostics13172736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/25/2023] [Accepted: 08/22/2023] [Indexed: 09/10/2023] Open
Abstract
BACKGROUND Deep neck infection (DNI) is a serious infectious disease, and descending mediastinitis is a fatal infection of the mediastinum. However, no study has applied artificial intelligence to assess progression to descending mediastinitis in DNI patients. Thus, we developed a model to assess the possible progression of DNI to descending mediastinitis. METHODS Between August 2017 and December 2022, 380 patients with DNI were enrolled; 75% of patients (n = 285) were assigned to the training group for validation, whereas the remaining 25% (n = 95) were assigned to the test group to determine the accuracy. The patients' clinical and computed tomography (CT) parameters were analyzed via the k-nearest neighbor method. The predicted and actual progression of DNI patients to descending mediastinitis were compared. RESULTS There were no statistically significant differences between the training and test groups (all p > 0.05) in clinical variables (age, gender, chief complaint period, white blood cells, C-reactive protein, diabetes mellitus, and blood sugar), deep neck space (parapharyngeal, submandibular, retropharyngeal, and multiple spaces involved, ≥3), tracheostomy performance, imaging parameters (maximum diameter of abscess and nearest distance from abscess to level of sternum notch), or progression to mediastinitis. The model had a predictive accuracy of 82.11% (78/95 patients), with sensitivity and specificity of 41.67% and 87.95%, respectively. CONCLUSIONS Our model can assess the progression of DNI to descending mediastinitis depending on clinical and imaging parameters. It can be used to identify DNI patients who will benefit from prompt treatment.
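The accuracy, sensitivity and specificity reported in this abstract are tied together by a single confusion matrix, and a short sketch can make that relationship concrete. The abstract gives only 78/95 correct predictions; the four cell counts below (5 true positives, 7 false negatives, 73 true negatives, 10 false positives) are an assumption, chosen because they are one split of the 95 test patients that reproduces all three published percentages. They are not taken from the paper.

```python
from fractions import Fraction


def rates(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return sensitivity, specificity and accuracy as exact fractions."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": Fraction(tp, tp + fn),   # true positives / all actual positives
        "specificity": Fraction(tn, tn + fp),   # true negatives / all actual negatives
        "accuracy": Fraction(tp + tn, total),   # correct predictions / all patients
    }


# Hypothetical cell counts (assumed, not stated in the abstract):
# 12 patients progressed to mediastinitis, 83 did not.
metrics = rates(tp=5, fn=7, tn=73, fp=10)

for name, value in metrics.items():
    print(f"{name}: {float(value):.2%}")
```

Under these assumed counts the overall accuracy is dominated by the 83 patients who did not progress, which is why it can exceed 80% while fewer than half of the progressing patients are detected.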
Affiliation(s)
- Shih-Lung Chen
  - Department of Otorhinolaryngology & Head and Neck Surgery, Chang Gung Memorial Hospital, New Taipei City 333, Taiwan
  - School of Medicine, Chang Gung University, Taoyuan 333, Taiwan
- Shy-Chyi Chin
  - School of Medicine, Chang Gung University, Taoyuan 333, Taiwan
  - Department of Medical Imaging and Intervention, Chang Gung Memorial Hospital, New Taipei City 333, Taiwan
- Kai-Chieh Chan
  - Department of Otorhinolaryngology & Head and Neck Surgery, Chang Gung Memorial Hospital, New Taipei City 333, Taiwan
  - School of Medicine, Chang Gung University, Taoyuan 333, Taiwan
- Chia-Ying Ho
  - School of Medicine, Chang Gung University, Taoyuan 333, Taiwan
  - Division of Chinese Internal Medicine, Center for Traditional Chinese Medicine, Chang Gung Memorial Hospital, Taoyuan 333, Taiwan
5
Timilsina M, Fey D, Buosi S, Janik A, Costabello L, Carcereny E, Abreu DR, Cobo M, Castro RL, Bernabé R, Minervini P, Torrente M, Provencio M, Nováček V. Synergy between imputed genetic pathway and clinical information for predicting recurrence in early stage non-small cell lung cancer. J Biomed Inform 2023; 144:104424. [PMID: 37352900 DOI: 10.1016/j.jbi.2023.104424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 06/06/2023] [Accepted: 06/11/2023] [Indexed: 06/25/2023]
Abstract
OBJECTIVE Lung cancer exhibits unpredictable recurrence in low-stage tumors and variable responses to different therapeutic interventions. Predicting relapse in early-stage lung cancer can facilitate precision medicine and improve patient survivability. While existing machine learning models rely on clinical data, incorporating genomic information could enhance their efficiency. This study aims to impute and integrate specific types of genomic data with clinical data to improve the accuracy of machine learning models for predicting relapse in early-stage, non-small cell lung cancer patients. METHODS The study utilized a publicly available TCGA lung cancer cohort and imputed genetic pathway scores into the Spanish Lung Cancer Group (SLCG) data, specifically in 1348 early-stage patients. Initially, tumor recurrence was predicted without imputed pathway scores. Subsequently, the SLCG data were augmented with pathway scores imputed from TCGA. The integrative approach aimed to enhance relapse risk prediction performance. RESULTS The integrative approach achieved improved relapse risk prediction with the following evaluation metrics: an area under the precision-recall curve (PR-AUC) score of 0.75, an area under the ROC (ROC-AUC) score of 0.80, an F1 score of 0.61, and a Precision of 0.80. The prediction explanation model SHAP (SHapley Additive exPlanations) was employed to explain the machine learning model's predictions. CONCLUSION We conclude that our explainable predictive model is a promising tool for oncologists that addresses an unmet clinical need of post-treatment patient stratification based on the relapse risk while also improving the predictive power by incorporating proxy genomic data not available for specific patients.
Affiliation(s)
- Mohan Timilsina
  - Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland
- Dirk Fey
  - Systems Biology Ireland, University College Dublin, Ireland
- Samuele Buosi
  - Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland
- Enric Carcereny
  - Catalan Institute of Oncology, Hospital Universitari Germans Trias i Pujol, B-ARGO, IGTP, Badalona, Spain
- Manuel Cobo
  - Medical Oncology Intercenter Unit, Regional and Virgen de la Victoria University Hospitals, IBIMA, Málaga, Spain
- Reyes Bernabé
  - Hospital Universitario Virgen del Rocio, Sevilla, Spain
- Maria Torrente
  - Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, Madrid, Spain
- Mariano Provencio
  - Medical Oncology Department, Hospital Universitario Puerta de Hierro Majadahonda, Madrid, Spain
- Vít Nováček
  - Data Science Institute, Insight Centre for Data Analytics, University of Galway, Ireland; Faculty of Informatics, Masaryk University Brno, Czech Republic; Masaryk Memorial Cancer Institute, Brno, Czech Republic
6
Reyes-Santias F, García-García C, Aibar-Guzmán B, García-Campos A, Cordova-Arevalo O, Mendoza-Pintos M, Cinza-Sanjurjo S, Portela-Romero M, Mazón-Ramos P, Gonzalez-Juanatey JR. Cost Analysis of Magnetic Resonance Imaging and Computed Tomography in Cardiology: A Case Study of a University Hospital Complex in the Euro Region. Healthcare (Basel) 2023; 11:2084. [PMID: 37510526 PMCID: PMC10379578 DOI: 10.3390/healthcare11142084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 07/12/2023] [Accepted: 07/17/2023] [Indexed: 07/30/2023] Open
Abstract
INTRODUCTION In recent years, several hospitals have incorporated MRI equipment managed directly by their cardiology departments. The aim of our work is to determine the total cost per test of both CT and MRI in the setting of a Cardiology Department of a tertiary hospital. MATERIALS AND METHODS The process followed for estimating the costs of CT and MRI tests consists of three phases: (1) Identification of the phases of the testing process; (2) Identification of the resources consumed in carrying out the tests; (3) Quantification and assessment of inputs. RESULTS MRI involves higher personnel (EUR 66.03 vs. EUR 49.17) and equipment (EUR 89.98 vs. EUR 33.73) costs, while CT consumes higher expenditures in consumables (EUR 93.28 vs. EUR 22.95) and overheads (EUR 1.64 vs. EUR 1.55). The total cost of performing each test is higher in MRI (EUR 180.60 vs. EUR 177.73). CONCLUSIONS We can conclude that the unit cost of each CT and MRI performed in that unit are EUR 177.73 and EUR 180.60, respectively, attributable to consumables in the case of CT and to amortization of equipment and staff time in the case of MRI.
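The per-test totals in this abstract are sums of four cost components, and the arithmetic can be checked directly. The sketch below is only an illustration using the figures quoted in the abstract. One assumption is labeled in the code: the published totals are reproduced when overheads are taken as EUR 1.64 for MRI and EUR 1.55 for CT, whereas the sentence in the abstract lists overheads among the items that are higher for CT. Which reading the authors intended cannot be determined from the abstract alone.

```python
from decimal import Decimal

# Component costs per test in EUR, as quoted in the abstract.
# ASSUMPTION: overheads are assigned as MRI 1.64 and CT 1.55, because
# that assignment is the one that reproduces the reported totals.
costs = {
    "MRI": {
        "personnel": "66.03",
        "equipment": "89.98",
        "consumables": "22.95",
        "overheads": "1.64",
    },
    "CT": {
        "personnel": "49.17",
        "equipment": "33.73",
        "consumables": "93.28",
        "overheads": "1.55",
    },
}

# Decimal avoids binary floating-point rounding when adding currency amounts.
totals = {
    test: sum(Decimal(amount) for amount in parts.values())
    for test, parts in costs.items()
}

for test, total in totals.items():
    print(f"{test}: EUR {total}")
```

With the opposite overhead assignment each total shifts by EUR 0.09, so the two tests would no longer match the reported EUR 180.60 and EUR 177.73.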
Affiliation(s)
- Francisco Reyes-Santias
  - Servicio de Cardiología, Complejo Hospitalario Universitario de Santiago de Compostela, Choupana s/n, 15706 Santiago de Compostela, Spain
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Choupana s/n, 15706 Santiago de Compostela, Spain
  - Centro de Investigación Biomédica en Red-Enfermedades Cardiovasculares (CIBERCV), Av. Monforte de Lemos, 3-5. Pabellón 11. Planta 0, 28029 Madrid, Spain
  - Department of Business, University of Vigo, 36310 Vigo, Spain
- Carlos García-García
  - Department of Pharmacology, Pharmacy and Pharmaceutical Technology, R+D Pharma Group (GI-1645), Faculty of Pharmacy, Health Research Institute of Santiago de Compostela (IDIS), University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Beatriz Aibar-Guzmán
  - Departamento de Economía Financiera y Contabilidad, Facultad de Ciencias Económicas y Empresariales, Universidad de Santiago de Compostela, Av. Burgo, s/n, 15782 Santiago Compostela, Spain
- Ana García-Campos
  - Servicio de Cardiología, Complejo Hospitalario Universitario de Santiago de Compostela, Choupana s/n, 15706 Santiago de Compostela, Spain
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Choupana s/n, 15706 Santiago de Compostela, Spain
  - Centro de Investigación Biomédica en Red-Enfermedades Cardiovasculares (CIBERCV), Av. Monforte de Lemos, 3-5. Pabellón 11. Planta 0, 28029 Madrid, Spain
- Sergio Cinza-Sanjurjo
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Choupana s/n, 15706 Santiago de Compostela, Spain
  - Centro de Investigación Biomédica en Red-Enfermedades Cardiovasculares (CIBERCV), Av. Monforte de Lemos, 3-5. Pabellón 11. Planta 0, 28029 Madrid, Spain
  - CS Milladoiro, Área Sanitaria Integrada Santiago de Compostela, 15895 Travesía do Porto, Spain
- Manuel Portela-Romero
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Choupana s/n, 15706 Santiago de Compostela, Spain
  - Centro de Investigación Biomédica en Red-Enfermedades Cardiovasculares (CIBERCV), Av. Monforte de Lemos, 3-5. Pabellón 11. Planta 0, 28029 Madrid, Spain
  - CS Concepción Arenal, Área Sanitaria Integrada Santiago de Compostela, Rúa de Santiago León de Caracas, 12, 15701 Santiago de Compostela, Spain
- Pilar Mazón-Ramos
  - Servicio de Cardiología, Complejo Hospitalario Universitario de Santiago de Compostela, Choupana s/n, 15706 Santiago de Compostela, Spain
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Choupana s/n, 15706 Santiago de Compostela, Spain
  - Centro de Investigación Biomédica en Red-Enfermedades Cardiovasculares (CIBERCV), Av. Monforte de Lemos, 3-5. Pabellón 11. Planta 0, 28029 Madrid, Spain
- Jose Ramon Gonzalez-Juanatey
  - Servicio de Cardiología, Complejo Hospitalario Universitario de Santiago de Compostela, Choupana s/n, 15706 Santiago de Compostela, Spain
  - Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Choupana s/n, 15706 Santiago de Compostela, Spain
  - Centro de Investigación Biomédica en Red-Enfermedades Cardiovasculares (CIBERCV), Av. Monforte de Lemos, 3-5. Pabellón 11. Planta 0, 28029 Madrid, Spain
7
Wang B, Tian P, Sun Q, Zhang H, Han L, Zhu B. A novel, effective machine learning-based RNA editing profile for predicting the prognosis of lower-grade gliomas. Heliyon 2023; 9:e18075. [PMID: 37483735 PMCID: PMC10362151 DOI: 10.1016/j.heliyon.2023.e18075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Revised: 07/02/2023] [Accepted: 07/05/2023] [Indexed: 07/25/2023] Open
Abstract
Patients with low-grade glioma (LGG) may survive for long time periods, but their tumors often progress to higher-grade lesions. Currently, no cure for LGG is available. A-to-I RNA editing accounts for nearly 90% of all RNA editing events in humans and plays a role in tumorigenesis in various cancers. However, little is known regarding its prognostic role in LGG. On the basis of The Cancer Genome Atlas (TCGA) data, we used LASSO and univariate Cox regression to construct an RNA editing site signature. The results derived from the TCGA dataset were further validated with Gene Expression Omnibus (GEO) and Chinese Glioma Genome Atlas (CGGA) datasets. Five machine learning algorithms (Decision Trees C5.0, XGboost, GBDT, Lightgbm, and Catboost) were used to confirm the prognosis associated with the RNA editing site signature. Finally, we explored immune function, immunotherapy, and potential therapeutic agents in the high- and low-risk groups by using multiple biological prediction websites. A total of 22,739 RNA editing sites were identified, and a signature model consisting of four RNA editing sites (PRKCSH|chr19:11561032, DSEL|chr18:65174489, UGGT1|chr2:128952084, and SOD2|chr6:160101723) was established. Cox regression analysis indicated that the RNA editing signature was an independent prognostic factor, according to the ROC curve (AUC = 0.823), and the nomogram model had good predictive power (C-index = 0.824). In addition, the predictive ability of the RNA editing signature was confirmed with the machine learning model. The sensitivity of PCI-34051 and Elephantin was significantly higher in the high-risk group than the low-risk group, thus potentially providing a marker to predict the effects of lung cancer drug treatment. RNA editing may serve as a novel survival prediction tool, thus offering hope for developing editing-based therapeutic strategies to combat LGG progression. 
In addition, this tool may help optimize survival risk assessment and individualized care for patients with low-grade gliomas.
Affiliation(s)
- Boshen Wang
  - Jiangsu Provincial Center for Disease Prevention and Control, Nanjing 210000, Jiangsu, China
  - Key Laboratory of Environmental Medicine Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, Jiangsu, China
- Peijie Tian
  - Department of Pathology, Weifang Medical University, China
- Qianyu Sun
  - Key Laboratory of Environmental Medicine Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, Jiangsu, China
- Hengdong Zhang
  - Jiangsu Provincial Center for Disease Prevention and Control, Nanjing 210000, Jiangsu, China
- Lei Han
  - Jiangsu Provincial Center for Disease Prevention and Control, Nanjing 210000, Jiangsu, China
- Baoli Zhu
  - Jiangsu Provincial Center for Disease Prevention and Control, Nanjing 210000, Jiangsu, China
  - Key Laboratory of Environmental Medicine Engineering of Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, Jiangsu, China
8
Liu YS, Thaliffdeen R, Han S, Park C. Use of machine learning to predict bladder cancer survival outcomes: a systematic literature review. Expert Rev Pharmacoecon Outcomes Res 2023; 23:761-771. [PMID: 37306511 DOI: 10.1080/14737167.2023.2224963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 06/09/2023] [Indexed: 06/13/2023]
Abstract
INTRODUCTION The objective of this systematic review is to summarize the use of machine learning (ML) in predicting overall survival (OS) in patients with bladder cancer. METHODS Search terms for bladder cancer, ML algorithms, and mortality were used to identify studies in PubMed and Web of Science as of February 2022. Notable inclusion/exclusion criteria contained the inclusion of studies that utilized patient-level datasets and exclusion of primary gene expression-related dataset studies. Study quality and bias were assessed using the International Journal of Medical Informatics (IJMEDI) checklist. RESULTS Of the 14 included studies, the most common algorithms were artificial neural networks (n = 8) and logistic regression (n = 4). Nine articles described missing data handling, with five articles removing patients with missing data entirely. With respect to feature selection, the most common sociodemographic variables were age (n = 9), gender (n = 9), and smoking status (n = 3), with clinical variables most commonly including tumor stage (n = 8), grade (n = 7), and lymph node involvement (n = 6). Most studies (n = 10) were of medium IJMEDI quality, with common areas of improvement being the descriptions of data preparation and deployment. CONCLUSIONS ML holds promise for optimizing bladder cancer care through accurate OS predictions, but challenges related to data processing, feature selection, and data source quality must be resolved to develop robust models. While this review is limited by its inability to compare models across studies, this systematic review will inform decision-making by various stakeholders to improve understanding of ML-based OS prediction in bladder cancer and foster interpretability of future models.
Affiliation(s)
- Yi-Shao Liu
  - College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
- Ryan Thaliffdeen
  - College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
- Sola Han
  - College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
- Chanhyun Park
  - College of Pharmacy, The University of Texas at Austin, 2409 University Ave, Austin, TX, USA
9
Popović Krneta M, Šobić Šaranović D, Mijatović Teodorović L, Krajčinović N, Avramović N, Bojović Ž, Bukumirić Z, Marković I, Rajšić S, Djorović BB, Artiko V, Karličić M, Tanić M. Prediction of Cervical Lymph Node Metastasis in Clinically Node-Negative T1 and T2 Papillary Thyroid Carcinoma Using Supervised Machine Learning Approach. J Clin Med 2023; 12:3641. [PMID: 37297835 DOI: 10.3390/jcm12113641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 05/19/2023] [Accepted: 05/22/2023] [Indexed: 06/12/2023] Open
Abstract
Papillary thyroid carcinoma (PTC) is generally considered an indolent cancer. However, patients with cervical lymph node metastasis (LNM) have a higher risk of local recurrence. This study evaluated and compared four machine learning (ML)-based classifiers to predict the presence of cervical LNM in clinically node-negative (cN0) T1 and T2 PTC patients. The algorithm was developed using clinicopathological data from 288 patients who underwent total thyroidectomy and prophylactic central neck dissection, with sentinel lymph node biopsy performed to identify lateral LNM. The final ML classifier was selected based on the highest specificity and the lowest degree of overfitting while maintaining a sensitivity of 95%. Among the models evaluated, the k-Nearest Neighbor (k-NN) classifier was found to be the best fit, with an area under the receiver operating characteristic curve of 0.72, and sensitivity, specificity, positive and negative predictive values, F1 and F2 scores of 98%, 27%, 56%, 93%, 72%, and 85%, respectively. A web application based on a sensitivity-optimized kNN classifier was also created to predict the potential of cervical LNM, allowing users to explore and potentially build upon the model. These findings suggest that ML can improve the prediction of LNM in cN0 T1 and T2 PTC patients, thereby aiding in individual treatment planning.
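The F1 and F2 scores quoted in this abstract follow from the reported positive predictive value (precision) and sensitivity (recall) through the standard F-beta formula. The sketch below applies that formula to the rounded figures from the abstract (precision 0.56, recall 0.98). The function name `f_beta` is chosen here for illustration and is not taken from the paper or its web application.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily than precision; beta = 1 gives F1.
    """
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded values from the abstract: PPV 56%, sensitivity 98%.
precision, recall = 0.56, 0.98

f1 = f_beta(precision, recall)           # balanced weighting
f2 = f_beta(precision, recall, beta=2)   # recall counted more heavily

print(f"F1 = {f1:.3f}, F2 = {f2:.3f}")
```

Computed from the rounded inputs, the two values land close to the reported 72% and 85%; the small gap on F1 is presumably a rounding effect. F2 sits well above F1 because the classifier was tuned for high sensitivity at the expense of precision.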
Collapse
Affiliation(s)
- Marina Popović Krneta
  - Department of Nuclear Medicine, Institute for Oncology and Radiology of Serbia, 11 000 Belgrade, Serbia
- Dragana Šobić Šaranović
  - Faculty of Medicine, University of Belgrade, 11 000 Belgrade, Serbia
  - Center for Nuclear Medicine with PET, University Clinical Center of Serbia, 11 000 Belgrade, Serbia
- Ljiljana Mijatović Teodorović
  - Department of Nuclear Medicine, Institute for Oncology and Radiology of Serbia, 11 000 Belgrade, Serbia
  - Faculty of Medical Sciences, University of Kragujevac, 34 000 Kragujevac, Serbia
- Nemanja Krajčinović
  - Department of Power, Electronics and Telecommunications, Faculty of Technical Sciences, University of Novi Sad, 21 000 Novi Sad, Serbia
- Nataša Avramović
  - Department of Power, Electronics and Telecommunications, Faculty of Technical Sciences, University of Novi Sad, 21 000 Novi Sad, Serbia
- Živko Bojović
  - Department of Power, Electronics and Telecommunications, Faculty of Technical Sciences, University of Novi Sad, 21 000 Novi Sad, Serbia
- Zoran Bukumirić
  - Institute of Medical Statistics and Informatics, Faculty of Medicine, University of Belgrade, 11 000 Belgrade, Serbia
- Ivan Marković
  - Faculty of Medicine, University of Belgrade, 11 000 Belgrade, Serbia
  - Surgical Oncology Clinic, Institute for Oncology and Radiology of Serbia, 11 000 Belgrade, Serbia
- Saša Rajšić
  - Department of Anesthesiology and Intensive Care Medicine, Medical University Innsbruck, 6020 Innsbruck, Austria
- Biljana Bazić Djorović
  - Department of Nuclear Medicine, Institute for Oncology and Radiology of Serbia, 11 000 Belgrade, Serbia
- Vera Artiko
  - Faculty of Medicine, University of Belgrade, 11 000 Belgrade, Serbia
  - Center for Nuclear Medicine with PET, University Clinical Center of Serbia, 11 000 Belgrade, Serbia
- Mihajlo Karličić
  - School of Electrical Engineering, University of Belgrade, 11 000 Belgrade, Serbia
- Miljana Tanić
  - Department of Experimental Oncology, Institute for Oncology and Radiology of Serbia, 11 000 Belgrade, Serbia
  - UCL Cancer Institute, London WC1E 6DD, UK
10
Brehon K, Carriere J, Churchill K, Loyola-Sanchez A, Papathanassoglou E, MacIsaac R, Tavakoli M, Ho C, Manhas KP. Evaluating Efficiency of a Provincial Telerehabilitation Service in Improving Access to Care During the COVID-19 Pandemic. Int J Telerehabil 2023; 15:e6523. [PMID: 38046552 PMCID: PMC10687995 DOI: 10.5195/ijt.2023.6523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
Scope: Early in the COVID-19 pandemic, community rehabilitation stakeholders from a provincial health system designed a novel telerehabilitation service. The service provided wayfinding and self-management advice to individuals with musculoskeletal concerns, neurological conditions, or post-COVID-19 recovery needs. This study evaluated the efficiency of the service in improving access to care. Methodology: We used multiple methods, including secondary data analyses of call metrics, narrative analyses of clinical notes using artificial intelligence (AI) and machine learning (ML), and qualitative interviews. Conclusions: Interviews revealed that the telerehabilitation service had the potential to positively impact access to rehabilitation during the COVID-19 pandemic, for individuals living rurally, and for individuals on wait lists. Call metric analyses revealed that efficiency may be enhanced if call handling time were reduced. AI/ML analyses found that pain was the most frequently mentioned keyword in clinical notes, suggesting an area for additional telerehabilitation resources to ensure efficiency.
Affiliation(s)
- Katelyn Brehon
  - Department of Physical Therapy, University of Alberta, Edmonton, Alberta, Canada
- Jay Carriere
  - Department of Electrical and Software Engineering, University of Calgary, Calgary, Alberta, Canada
- Katie Churchill
  - Allied Health Professional Practice and Education, Alberta Health Services, Alberta, Canada
  - Department of Occupational Therapy, University of Alberta, Edmonton, Alberta, Canada
- Elizabeth Papathanassoglou
  - Neurosciences, Rehabilitation, and Vision Strategic Clinical Network, Alberta Health Services, Alberta, Canada
  - Faculty of Nursing, University of Alberta, Edmonton, Alberta, Canada
- Rob MacIsaac
  - Spinal Cord Injury Alberta, Edmonton, Alberta, Canada
- Mahdi Tavakoli
  - Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada
- Chester Ho
  - Department of Medicine, University of Alberta, Edmonton, Alberta, Canada
  - Neurosciences, Rehabilitation, and Vision Strategic Clinical Network, Alberta Health Services, Alberta, Canada
- Kiran Pohar Manhas
  - Neurosciences, Rehabilitation, and Vision Strategic Clinical Network, Alberta Health Services, Alberta, Canada
  - Faculty of Nursing, University of Alberta, Edmonton, Alberta, Canada
  - Department of Community Health Sciences, University of Calgary, Calgary, Alberta, Canada
11
Román-Belmonte JM, De la Corte-Rodríguez H, Rodríguez-Damiani BA, Rodríguez-Merchán EC. Artificial Intelligence in Musculoskeletal Conditions. Artif Intell 2023. [DOI: 10.5772/intechopen.110696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023]
Abstract
Artificial intelligence (AI) refers to computer capabilities that resemble human intelligence. AI implies the ability to learn and perform tasks that have not been specifically programmed. Moreover, it is an iterative process involving the ability of computerized systems to capture information, transform it into knowledge, and process it to produce adaptive changes in the environment. A large labeled database is needed to train the AI system and generate a robust algorithm. Otherwise, the algorithm cannot be applied in a generalized way. AI can facilitate the interpretation and acquisition of radiological images. In addition, it can facilitate the detection of trauma injuries and assist in orthopedic and rehabilitative processes. The applications of AI in musculoskeletal conditions are promising and are likely to have a significant impact on the future management of these patients.
12
Sheehy J, Rutledge H, Acharya UR, Loh HW, Gururajan R, Tao X, Zhou X, Li Y, Gurney T, Kondalsamy-Chennakesavan S. Gynecological cancer prognosis using machine learning techniques: A systematic review of last three decades (1990–2022). Artif Intell Med 2023; 139:102536. [PMID: 37100507 DOI: 10.1016/j.artmed.2023.102536] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/27/2021] [Revised: 03/19/2023] [Accepted: 03/23/2023] [Indexed: 03/30/2023]
Abstract
OBJECTIVE Many Computer Aided Prognostic (CAP) systems based on machine learning techniques have been proposed in the field of oncology. The objective of this systematic review was to assess and critically appraise the methodologies and approaches used in predicting the prognosis of gynecological cancers using CAPs. METHODS Electronic databases were used to systematically search for studies utilizing machine learning methods in gynecological cancers. Study risk of bias (ROB) and applicability were assessed using the PROBAST tool. 139 studies met the inclusion criteria, of which 71 predicted outcomes for ovarian cancer patients, 41 predicted outcomes for cervical cancer patients, 28 predicted outcomes for uterine cancer patients, and 2 predicted outcomes for gynecological malignancies broadly. RESULTS Random forest (22.30 %) and support vector machine (21.58 %) classifiers were used most commonly. Use of clinicopathological, genomic and radiomic data as predictors was observed in 48.20 %, 51.08 % and 17.27 % of studies, respectively, with some studies using multiple modalities. 21.58 % of studies were externally validated. Twenty-three individual studies compared ML and non-ML methods. Study quality was highly variable and methodologies, statistical reporting and outcome measures were inconsistent, preventing generalized commentary or meta-analysis of performance outcomes. CONCLUSION There is significant variability in model development when prognosticating gynecological malignancies with respect to variable selection, machine learning (ML) methods and endpoint selection. This heterogeneity prevents meta-analysis and conclusions regarding the superiority of ML methods. Furthermore, PROBAST-mediated ROB and applicability analysis demonstrates concern for the translatability of existing models. This review identifies ways that this can be improved upon in future works to develop robust, clinically translatable models within this promising field.
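The percentages in the results above can be converted back into approximate study counts using the 139 included studies as the denominator. The following is a small sketch of that arithmetic only; the dictionary keys are labels chosen here for illustration, and the counts are derived values, not figures stated in the abstract.

```python
TOTAL_STUDIES = 139  # studies meeting the inclusion criteria, per the abstract

# Percentages as reported in the abstract (labels are illustrative).
reported_pct = {
    "random forest": 22.30,
    "support vector machine": 21.58,
    "clinicopathological predictors": 48.20,
    "genomic predictors": 51.08,
    "radiomic predictors": 17.27,
    "externally validated": 21.58,
}

# Nearest whole number of studies implied by each percentage.
implied_counts = {
    label: round(TOTAL_STUDIES * pct / 100) for label, pct in reported_pct.items()
}

for label, count in implied_counts.items():
    back = 100 * count / TOTAL_STUDIES  # recompute the percentage from the count
    print(f"{label}: ~{count} studies ({back:.2f} %)")
```

The predictor-type counts sum to more than 139, which is consistent with the abstract's note that some studies used multiple data modalities.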
13
Scott-Fordsmand JJ, Amorim MJB. Using Machine Learning to make nanomaterials sustainable. Sci Total Environ 2023; 859:160303. [PMID: 36410486 DOI: 10.1016/j.scitotenv.2022.160303] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/22/2022] [Revised: 11/06/2022] [Accepted: 11/15/2022] [Indexed: 06/16/2023]
Abstract
Sustainable development is a key challenge for contemporary human societies; failure to achieve sustainability could threaten human survival. In this review article, we illustrate how Machine Learning (ML) could support more sustainable development, covering the basics of data gathering through each step of the Environmental Risk Assessment (ERA). The literature provides several examples showing how ML can be employed in most steps of a typical ERA. A key observation is that there is currently no clear guidance on using such autonomous technologies in ERAs or on which standards/checks are required. Steering thus seems to be the most important task for supporting the use of ML in the ERA of nano- and smart-materials. Resources should be devoted to developing a strategy for implementing ML in ERA with a strong emphasis on data foundations, methodologies, and the related sensitivities/uncertainties. We should recognise historical errors and biases (e.g., in data) to avoid embedding them during ML programming.
Affiliation(s)
- Mónica J B Amorim
  - Department of Biology & CESAM, University of Aveiro, 3810-193 Aveiro, Portugal.
14
Burnett B, Zhou SM, Brophy S, Davies P, Ellis P, Kennedy J, Bandyopadhyay A, Parker M, Lyons RA. Machine Learning in Colorectal Cancer Risk Prediction from Routinely Collected Data: A Review. Diagnostics (Basel) 2023; 13:301. [PMID: 36673111 PMCID: PMC9858109 DOI: 10.3390/diagnostics13020301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/05/2023] [Accepted: 01/07/2023] [Indexed: 01/15/2023]
Abstract
The inclusion of machine-learning-derived models in systematic reviews of risk prediction models for colorectal cancer is rare. Whilst such reviews have highlighted methodological issues and limited performance of the models included, it is unclear why machine-learning-derived models are absent and whether such models suffer similar methodological problems. This scoping review aims to identify machine-learning models, assess their methodology, and compare their performance with that found in previous reviews. A literature search of four databases was performed for colorectal cancer prediction and prognosis model publications that included at least one machine-learning model. A total of 14 publications were identified for inclusion in the scoping review. Data were extracted using an adapted CHARMS checklist, against which the models were benchmarked. The review found methodological problems with machine-learning models similar to those observed in systematic reviews of non-machine-learning models, although model performance was better. The inclusion of machine-learning models in systematic reviews is required, as they offer improved performance despite similar methodological omissions; however, to achieve this, the methodological issues that affect many prediction models need to be addressed.
Affiliation(s)
- Bruce Burnett
  - Swansea University Medical School, Swansea SA2 8PP, UK
- Shang-Ming Zhou
  - Faculty of Health, University of Plymouth, Plymouth PL4 8AA, UK
- Sinead Brophy
  - Swansea University Medical School, Swansea SA2 8PP, UK
15
Kotsyfakis S, Iliaki-Giannakoudaki E, Anagnostopoulos A, Papadokostaki E, Giannakoudakis K, Goumenakis M, Kotsyfakis M. The application of machine learning to imaging in hematological oncology: A scoping review. Front Oncol 2022; 12:1080988. [PMID: 36605438 PMCID: PMC9808781 DOI: 10.3389/fonc.2022.1080988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Accepted: 12/05/2022] [Indexed: 12/24/2022]
Abstract
Background: Here, we conducted a scoping review to (i) establish which machine learning (ML) methods have been applied to hematological malignancy imaging; (ii) establish how ML is being applied to hematological cancer radiology; and (iii) identify addressable research gaps. Methods: The review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis Extension for Scoping Reviews guidelines. The inclusion criteria were (i) pediatric and adult patients with suspected or confirmed hematological malignancy undergoing imaging (population); (ii) any study using ML techniques to derive models using radiological images to apply to the clinical management of these patients (concept); and (iii) original research articles conducted in any setting globally (context). Quality Assessment of Diagnostic Accuracy Studies 2 criteria were used to assess diagnostic and segmentation studies, while the Newcastle-Ottawa scale was used to assess the quality of observational studies. Results: Of 53 eligible studies, 33 applied diverse ML techniques to diagnose hematological malignancies or to differentiate them from other diseases, especially discriminating gliomas from primary central nervous system lymphomas (n=18); 11 applied ML to segmentation tasks, while 9 applied ML to prognostication or predicting therapeutic responses, especially for diffuse large B-cell lymphoma. All studies reported discrimination statistics, but no study calculated calibration statistics. Every diagnostic/segmentation study had a high risk of bias due to its case-control design; many studies failed to provide adequate details of the reference standard; and only a few studies used independent validation. Conclusion: To deliver validated ML-based models to radiologists managing hematological malignancies, future studies should (i) adhere to standardized, high-quality reporting guidelines such as the Checklist for Artificial Intelligence in Medical Imaging; (ii) validate models in independent cohorts; (iii) standardize volume segmentation methods for segmentation tasks; (iv) establish comprehensive prospective studies that include different tumor grades, comparisons with radiologists, optimal imaging modalities, sequences, and planes; (v) include side-by-side comparisons of different methods; and (vi) include low- and middle-income countries in multicentric studies to enhance generalizability and reduce inequity.
Affiliation(s)
- Michail Kotsyfakis
  - Biology Center of the Czech Academy of Sciences, Budweis (Ceske Budejovice), Czechia
  - Correspondence: Michail Kotsyfakis
16
Butner JD, Dogra P, Chung C, Pasqualini R, Arap W, Lowengrub J, Cristini V, Wang Z. Mathematical modeling of cancer immunotherapy for personalized clinical translation. Nat Comput Sci 2022; 2:785-796. [PMID: 38126024 PMCID: PMC10732566 DOI: 10.1038/s43588-022-00377-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/19/2022] [Accepted: 11/14/2022] [Indexed: 12/23/2023]
Abstract
Encouraging advances are being made in cancer immunotherapy modeling, especially in the key areas of developing personalized treatment strategies based on individual patient parameters, predicting treatment outcomes and optimizing immunotherapy synergy when used in combination with other treatment approaches. Here we present a focused review of the most recent mathematical modeling work on cancer immunotherapy with a focus on clinical translatability. It can be seen that this field is transitioning from pure basic science to applications that can make impactful differences in patients' lives. We discuss how researchers are integrating experimental and clinical data to fully inform models so that they can be applied for clinical predictions, and present the challenges that remain to be overcome if widespread clinical adaptation is to be realized. Lastly, we discuss the most promising future applications and areas that are expected to be the focus of extensive upcoming modeling studies.
Affiliation(s)
- Joseph D. Butner
  - Mathematics in Medicine Program, Houston Methodist Research Institute, Houston, TX, USA
- Prashant Dogra
  - Mathematics in Medicine Program, Houston Methodist Research Institute, Houston, TX, USA
- Caroline Chung
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Renata Pasqualini
  - Rutgers Cancer Institute of New Jersey, Newark, NJ, USA
  - Department of Radiation Oncology, Division of Cancer Biology, Rutgers New Jersey Medical School, Newark, NJ, USA
- Wadih Arap
  - Rutgers Cancer Institute of New Jersey, Newark, NJ, USA
  - Department of Medicine, Division of Hematology/Oncology, Rutgers New Jersey Medical School, Newark, NJ, USA
- John Lowengrub
  - Department of Mathematics, University of California at Irvine, Irvine, CA, USA
- Vittorio Cristini
  - Mathematics in Medicine Program, Houston Methodist Research Institute, Houston, TX, USA
  - Neal Cancer Center, Houston Methodist Research Institute, Houston, TX, USA
  - Department of Imaging Physics, University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Physiology, Biophysics, and Systems Biology Program, Graduate School of Medical Sciences, Weill Cornell Medicine, New York, NY, USA
- Zhihui Wang
  - Mathematics in Medicine Program, Houston Methodist Research Institute, Houston, TX, USA
  - Neal Cancer Center, Houston Methodist Research Institute, Houston, TX, USA
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
17
Hanis TM, Ruhaiyem NIR, Arifin WN, Haron J, Wan Abdul Rahman WF, Abdullah R, Musa KI. Over-the-Counter Breast Cancer Classification Using Machine Learning and Patient Registration Records. Diagnostics (Basel) 2022; 12:2826. [PMID: 36428886 PMCID: PMC9689364 DOI: 10.3390/diagnostics12112826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Revised: 10/13/2022] [Accepted: 10/15/2022] [Indexed: 11/18/2022]
Abstract
This study aims to determine the feasibility of using machine learning (ML) and patient registration records to develop an over-the-counter (OTC) screening model for breast cancer risk estimation. Data were retrospectively collected from women who presented to the Hospital Universiti Sains Malaysia, Malaysia, with breast-related problems. Eight ML models were used: k-nearest neighbour (kNN), elastic-net logistic regression, multivariate adaptive regression splines, artificial neural network, partial least square, random forest, support vector machine (SVM), and extreme gradient boosting. Features utilised for the development of the screening models were limited to information in the patient registration form. The final model was evaluated in terms of performance across mammographic density groups. Additionally, the feature importance of the final model was assessed using a model-agnostic approach. kNN had the highest Youden J index, precision, and PR-AUC, while SVM had the highest F2 score. The kNN model was selected as the final model. The model had a balanced performance in terms of sensitivity, specificity, and PR-AUC across the mammographic density groups. The most important feature was the age at examination. In conclusion, this study showed that ML and patient registration information can feasibly be used to build an OTC screening model for breast cancer.
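The Youden J index used above for model comparison is defined as sensitivity plus specificity minus one. The sketch below shows that calculation from confusion-matrix counts; the function name `youden_j` and the example counts are hypothetical and are not drawn from the study.

```python
def youden_j(tp: int, fn: int, tn: int, fp: int) -> float:
    """Youden J index = sensitivity + specificity - 1.

    J ranges from -1 to 1; 0 means no better than chance, 1 is perfect.
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1.0


# Hypothetical confusion-matrix counts, for illustration only.
j = youden_j(tp=80, fn=20, tn=150, fp=50)
print(f"Youden J = {j:.2f}")
```

Because J weights sensitivity and specificity equally, it can rank models differently from the F2 score, which favours sensitivity; this is one way a single study can name different "best" models under different metrics.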
Affiliation(s)
- Tengku Muhammad Hanis
  - Department of Community Medicine, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
  - Correspondence: (T.M.H.); (K.I.M.)
- Wan Nor Arifin
  - Biostatistics and Research Methodology Unit, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
- Juhara Haron
  - Department of Radiology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
  - Breast Cancer Awareness and Research Unit, Hospital Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
- Wan Faiziah Wan Abdul Rahman
  - Breast Cancer Awareness and Research Unit, Hospital Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
  - Department of Pathology, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
- Rosni Abdullah
  - School of Computer Sciences, Universiti Sains Malaysia, Gelugor 11800, Penang, Malaysia
- Kamarul Imran Musa
  - Department of Community Medicine, School of Medical Sciences, Universiti Sains Malaysia, Kubang Kerian 16150, Kelantan, Malaysia
  - Correspondence: (T.M.H.); (K.I.M.)
18
Bai Z, Bai Y, Fang C, Chen W. Oxidative stress-related patterns determination for establishment of prognostic models, and characteristics of tumor microenvironment infiltration. Front Surg 2022; 9:1013794. [PMID: 36386530 PMCID: PMC9665876 DOI: 10.3389/fsurg.2022.1013794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2022] [Accepted: 10/14/2022] [Indexed: 12/02/2022]
Abstract
Oxidative stress-mediated excessive accumulation of reactive oxygen species (ROS) in the body destroys cell homeostasis and participates in various diseases. However, the relationship between oxidative stress-related genes (ORGs) and the tumor microenvironment (TME) in gastric cancer (GC) remains poorly understood. To improve treatment strategies for GC, this relationship needs to be explored. We describe the changes of ORGs in 732 gastric cancer samples from two data sets. The two different molecular subtypes revealed that the changes of ORGs were associated with clinical features, prognosis, and TME. Subsequently, the OE_score was related to RFS, as confirmed by the correlation between OE_score and TME, TMB, MSI, immunotherapy, stem cell analysis, chemotherapeutic drugs, etc. OE_score can be used as an independent predictive marker for the treatment and prognosis of gastric cancer. Further, a nomogram was established to improve clinical practicability. Our research showed a potential role of ORGs in clinical features, prognosis, and tumor microenvironment of gastric cancer. Our research findings broaden the understanding of gastric cancer ORGs as a potential target for individualized treatment of gastric cancer and a new direction to evaluate the prognosis.
Affiliation(s)
- Zihao Bai
  - Graduate Department, Shanxi Medical University, Taiyuan, China
- Yihua Bai
  - Graduate Department, Shanxi Medical University, Taiyuan, China
- Changzhong Fang
  - Graduate Department, Shanxi Medical University, Taiyuan, China
- Wenliang Chen
  - Department of General Surgery, The 2nd Affiliated Hospital of Shanxi Medical University, Taiyuan, China
  - Correspondence: Wenliang Chen
19
Briggs E, de Kamps M, Hamilton W, Johnson O, McInerney CD, Neal RD. Machine Learning for Risk Prediction of Oesophago-Gastric Cancer in Primary Care: Comparison with Existing Risk-Assessment Tools. Cancers (Basel) 2022; 14:5023. [PMID: 36291807 PMCID: PMC9600097 DOI: 10.3390/cancers14205023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 10/12/2022] [Accepted: 10/12/2022] [Indexed: 11/18/2022]
Abstract
Simple Summary: Oesophago-gastric cancer is one of the commonest cancers worldwide, yet it can be particularly difficult to diagnose given that initial symptoms are often non-specific and routine screening is not available. Cancer risk-assessment tools, which calculate cancer risk based on symptoms and other risk factors present in the primary care record, can aid decisions on referrals for cancer investigations, facilitating earlier diagnosis. Diagnosing common cancers earlier could help improve survival rates. Using UK primary care electronic health record data, we compared five different machine learning techniques for probabilistic classification of cancer patients against a current widely used UK primary care cancer risk-assessment tool. The machine learning algorithms outperformed the current risk-assessment tool, with a higher overall accuracy and an ability to reasonably identify 11–25% more cancer patients. We conclude that machine-learning-based risk-assessment tools could help better identify suitable patients for further investigation and support earlier diagnosis. Abstract: Oesophago-gastric cancer is difficult to diagnose in the early stages given its typical non-specific initial manifestation. We hypothesise that machine learning can improve upon the diagnostic performance of current primary care risk-assessment tools by using advanced analytical techniques to exploit the wealth of evidence available in the electronic health record. We used a primary care electronic health record dataset derived from the UK General Practice Research Database (7471 cases; 32,877 controls) and developed five probabilistic machine learning classifiers: Support Vector Machine, Random Forest, Logistic Regression, Naïve Bayes, and Extreme Gradient Boosted Decision Trees. Features included basic demographics, symptoms, and lab test results. The Logistic Regression, Support Vector Machine, and Extreme Gradient Boosted Decision Tree models achieved the highest performance in terms of accuracy and AUROC (0.89 accuracy, 0.87 AUROC), outperforming a current UK oesophago-gastric cancer risk-assessment tool (ogRAT). Machine learning also identified more cancer patients than the ogRAT: 11.0% more with little to no effect on false positives, or up to 25.0% more with a slight increase in false positives (for Logistic Regression, results threshold-dependent). Feature contribution estimates and individual prediction explanations indicated clinical relevance. We conclude that machine learning could improve primary care cancer risk-assessment tools, potentially helping clinicians to identify additional cancer cases earlier. This could, in turn, improve survival outcomes.
Affiliation(s)
- Emma Briggs
  - School of Computing, University of Leeds, Leeds LS2 9JT, UK
  - Correspondence:
- Marc de Kamps
  - School of Computing, University of Leeds, Leeds LS2 9JT, UK
  - Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
  - The Alan Turing Institute, London NW1 2DB, UK
- Willie Hamilton
  - Department of Health and Community Sciences, University of Exeter, Exeter EX1 2LU, UK
- Owen Johnson
  - School of Computing, University of Leeds, Leeds LS2 9JT, UK
  - Leeds Institute for Data Analytics, University of Leeds, Leeds LS2 9NL, UK
- Ciarán D. McInerney
  - Academic Unit of Primary Medical Care, University of Sheffield, Sheffield S10 2TN, UK
- Richard D. Neal
  - Department of Health and Community Sciences, University of Exeter, Exeter EX1 2LU, UK
20
Izadi Z, Gianfrancesco MA, Aguirre A, Strangfeld A, Mateus EF, Hyrich KL, Gossec L, Carmona L, Lawson‐Tovey S, Kearsley‐Fleet L, Schaefer M, Seet AM, Schmajuk G, Jacobsohn L, Katz P, Rush S, Al‐Emadi S, Sparks JA, Hsu TY, Patel NJ, Wise L, Gilbert E, Duarte‐García A, Valenzuela‐Almada MO, Ugarte‐Gil MF, Ribeiro SLE, de Oliveira Marinho A, de Azevedo Valadares LD, Giuseppe DD, Hasseli R, Richter JG, Pfeil A, Schmeiser T, Isnardi CA, Reyes Torres AA, Alle G, Saurit V, Zanetti A, Carrara G, Labreuche J, Barnetche T, Herasse M, Plassart S, Santos MJ, Rodrigues AM, Robinson PC, Machado PM, Sirotich E, Liew JW, Hausmann JS, Sufka P, Grainger R, Bhana S, Costello W, Wallace ZS, Yazdany J. Development of a Prediction Model for COVID-19 Acute Respiratory Distress Syndrome in Patients With Rheumatic Diseases: Results From the Global Rheumatology Alliance Registry. ACR Open Rheumatol 2022; 4:872-882. [PMID: 35869686 PMCID: PMC9350083 DOI: 10.1002/acr2.11481] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/21/2022] [Accepted: 05/31/2022] [Indexed: 11/09/2022]
Abstract
OBJECTIVE Some patients with rheumatic diseases might be at higher risk for coronavirus disease 2019 (COVID-19) acute respiratory distress syndrome (ARDS). We aimed to develop a prediction model for COVID-19 ARDS in this population and to create a simple risk score calculator for use in clinical settings. METHODS Data were derived from the COVID-19 Global Rheumatology Alliance Registry from March 24, 2020, to May 12, 2021. Seven machine learning classifiers were trained on ARDS outcomes using 83 variables obtained at COVID-19 diagnosis. Predictive performance was assessed in a US test set and was validated in patients from four countries with independent registries using area under the curve (AUC), accuracy, sensitivity, and specificity. A simple risk score calculator was developed using a regression model incorporating the most influential predictors from the best performing classifier. RESULTS The study included 8633 patients from 74 countries, of whom 523 (6%) had ARDS. Gradient boosting had the highest mean AUC (0.78; 95% confidence interval [CI]: 0.67-0.88) and was considered the top performing classifier. Ten predictors were identified as key risk factors and were included in a regression model. The regression model that predicted ARDS with 71% (95% CI: 61%-83%) sensitivity in the test set, and with sensitivities ranging from 61% to 80% in countries with independent registries, was used to develop the risk score calculator. CONCLUSION We were able to predict ARDS with good sensitivity using information readily available at COVID-19 diagnosis. The proposed risk score calculator has the potential to guide risk stratification for treatments, such as monoclonal antibodies, that have potential to reduce COVID-19 disease progression.
Affiliation(s)
- Elsa F. Mateus
- Portuguese League Against Rheumatic Diseases, Lisbon, Portugal
- Kimme L. Hyrich
- The University of Manchester and National Institute for Health Research Manchester Biomedical Research Centre, Manchester University and NHS Foundation Trust, Manchester, UK
- Laure Gossec
- INSERM, Sorbonne Universite and Hopital Universitaire Pitie Salpetriere, AP‐HP, Paris, France
- Saskia Lawson‐Tovey
- The University of Manchester and National Institute for Health Research Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust and Manchester Academic Health Science Centre, Manchester, UK
- Lianne Kearsley‐Fleet
- The University of Manchester and Manchester Academic Health Science Centre, Manchester, UK
- Gabriela Schmajuk
- University of California, San Francisco and San Francisco Department of Veterans Affairs Medical Center
- Jeffrey A. Sparks
- Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Tiffany Y‐T Hsu
- Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
- Naomi J. Patel
- Massachusetts General Hospital and Harvard Medical School, Boston
- Leanna Wise
- University of Southern California, Los Angeles
- Manuel F. Ugarte‐Gil
- Universidad Científica del Sur and Hospital Nacional Guillermo Almenara Irigoyen, EsSalud, Lima, Peru
- Rebecca Hasseli
- Justus‐Liebig University Giessen, Campus Kerckhoff, Giessen, Germany
- Alexander Pfeil
- Jena University Hospital and Friedrich Schiller University Jena, Jena, Germany
- Tim Schmeiser
- Rheumatology im Veedel (Private Practice), Cologne, Germany
- Anna Zanetti
- Italian Society for Rheumatology and University of Milano‐Bicocca, Milan, Italy
- Greta Carrara
- Italian Society for Rheumatology and University of Milano‐Bicocca, Milan, Italy
- Thomas Barnetche
- FHU ACRONIM, Centre for Autoimmune Systemic Rare Diseases, Bordeaux University Hospital, Bordeaux, France
- Muriel Herasse
- Filière des Maladies Autoimmunes et Autoinflammatoires Rares, Hôpital Huriez, Centre Hospitalier Universitaire de Lille, Lille, France
- Samira Plassart
- Filière des Maladies Autoimmunes et Autoinflammatoires Rares, Hôpital Huriez, Centre Hospitalier Universitaire de Lille, Lille, France
- Maria José Santos
- Hospital Garcia de Orta, Almada, Portugal, and Instituto de Medicina Molecular Faculdade Medicina and Rheumatic Diseases Portuguese Register, Lisbon, Portugal
- Ana Maria Rodrigues
- Rheumatic Diseases Portuguese Register, Sociedade Portuguesa de Reumatologia, Nova Medical School, and Hospital dos Lusiadas, Lisbon, Portugal
- Philip C. Robinson
- The University of Queensland, Brisbane, Queensland, Australia, and Royal Brisbane and Women's Hospital, Metro North Hospital and Health Service, Herston, Queensland, Australia
- Pedro M. Machado
- University College London, University College London Hospitals NHS Foundation Trust and Northwick Park Hospital, London North West University Healthcare NHS Trust, London, UK
- Emily Sirotich
- McMaster University, Hamilton, Ontario, Canada, and Canadian Arthritis Patient Alliance, Toronto, Ontario, Canada
- Jean W. Liew
- Boston University School of Medicine, Boston, Massachusetts
- Jonathan S. Hausmann
- Beth Israel Deaconess Medical Center, Harvard Medical School and Boston Children's Hospital, Boston, Massachusetts
|
21
|
Roman-Belmonte JM, De la Corte-Rodriguez H, Rodriguez-Merchan EC, Vazquez-Sasot A, Rodriguez-Damiani BA, Resino-Luís C, Sanchez-Laguna F. The three horizons model applied to medical science. Postgrad Med 2022; 134:776-783. [DOI: 10.1080/00325481.2022.2124086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
Affiliation(s)
- Juan M. Roman-Belmonte
- Department of Physical Medicine and Rehabilitation, Cruz Roja San José y Santa Adela University Hospital, Madrid, Spain
- E. Carlos Rodriguez-Merchan
- Department of Orthopedic Surgery, La Paz University Hospital, Madrid, Spain
- Osteoarticular Surgery Research, Hospital La Paz Institute for Health Research – IdiPAZ (La Paz University Hospital – Autonomous University of Madrid), Madrid, Spain
- Aranzazu Vazquez-Sasot
- Department of Physical Medicine and Rehabilitation, Cruz Roja San José y Santa Adela University Hospital, Madrid, Spain
- Beatriz A. Rodriguez-Damiani
- Department of Physical Medicine and Rehabilitation, Cruz Roja San José y Santa Adela University Hospital, Madrid, Spain
- Cristina Resino-Luís
- Department of Physical Medicine and Rehabilitation, Cruz Roja San José y Santa Adela University Hospital, Madrid, Spain
|
22
|
Lui TKL, Cheung KS, Leung WK. Machine learning models in the prediction of 1-year mortality in patients with advanced hepatocellular cancer on immunotherapy: a proof-of-concept study. Hepatol Int 2022; 16:879-891. [PMID: 35779202 DOI: 10.1007/s12072-022-10370-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/02/2021] [Accepted: 05/22/2022] [Indexed: 11/28/2022]
Abstract
INTRODUCTION Immunotherapy is a promising new treatment for patients with advanced hepatocellular carcinoma (HCC), but it is costly and potentially associated with considerable side effects. This study aimed to evaluate the role of machine learning (ML) models in predicting the 1-year cancer-related mortality in advanced HCC patients treated with immunotherapy. METHOD A total of 395 HCC patients who had received immunotherapy (including nivolumab, pembrolizumab or ipilimumab) between 2014 and 2019 in Hong Kong were included. The whole data set was randomly divided into training (n = 316) and internal validation (n = 79) sets. The data set, including 47 clinical variables, was used to construct six different ML models for predicting the risk of 1-year mortality. The performance of the ML models was measured by the area under the receiver operating characteristic curve (AUC) and compared with the C-Reactive protein and Alpha Fetoprotein in ImmunoTherapY score (CRAFITY) and the albumin-bilirubin (ALBI) score. The ML models were further validated with an external cohort between 2020 and 2021. RESULTS The 1-year cancer-related mortality was 51.1%. Of the six ML models, the random forest (RF) had the highest AUC of 0.92 (95% CI 0.87-0.98), which was better than logistic regression (0.82, p = 0.01) as well as the CRAFITY (0.68, p < 0.01) and ALBI scores (0.84, p = 0.04). RF had the lowest false positive (2.0%) and false negative (5.2%) rates, and performed better than the CRAFITY score in the external validation cohort (0.91 vs 0.66, p < 0.01). High baseline AFP, bilirubin and alkaline phosphatase were three common risk factors identified by all ML models. CONCLUSION ML models could predict 1-year cancer-related mortality in HCC patients treated with immunotherapy, which may help to select patients who would benefit from this treatment.
Affiliation(s)
- Thomas Ka Luen Lui
- Department of Medicine, University of Hong Kong, 4/F, Professorial Block, Queen Mary Hospital, 102 Pokfulam Road, Hong Kong, China
- Ka Shing Cheung
- Department of Medicine, University of Hong Kong, 4/F, Professorial Block, Queen Mary Hospital, 102 Pokfulam Road, Hong Kong, China
- Wai Keung Leung
- Department of Medicine, University of Hong Kong, 4/F, Professorial Block, Queen Mary Hospital, 102 Pokfulam Road, Hong Kong, China
|
23
|
Exploring the Utility of Anonymized EHR Datasets in Machine Learning Experiments in the Context of the MODELHealth Project. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12125942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/10/2022]
Abstract
The object of this paper was the application of machine learning to a clinical dataset that was anonymized using the Mondrian algorithm. (1) Background: The preservation of patient privacy is a necessity rising from the increasing digitization of health data; however, the effect of data anonymization on the performance of machine learning models remains to be explored. (2) Methods: The original EHR derived dataset was subjected to anonymization by applying the Mondrian algorithm for various k values and quasi identifier (QI) set attributes. The logistic regression, decision trees, k-nearest neighbors, Gaussian naive Bayes and support vector machine models were applied to the different dataset versions. (3) Results: The classifiers demonstrated different degrees of resilience to the anonymization, with the decision tree and the KNN models showing remarkably stable performance, as opposed to the Gaussian naïve Bayes model. The choice of the QI set attributes and the generalized information loss value played a more important role than the size of the QI set or the k value. (4) Conclusions: Data anonymization can reduce the performance of certain machine learning models, although the appropriate selection of classifier and parameter values can mitigate this effect.
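To make the roles of the k value and the quasi-identifier (QI) set concrete, here is a minimal sketch of a k-anonymity check. It is not the Mondrian algorithm itself, which partitions and generalizes the data. It only verifies the property that such an algorithm aims to produce. The function name and the sample records are hypothetical.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    that occurs in `records` is shared by at least k records."""
    groups = Counter(
        tuple(record[qi] for qi in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())


# Hypothetical, already-generalized records (age bands, truncated ZIP codes).
records = [
    {"age": "30-39", "zip": "941**", "diagnosis": "A"},
    {"age": "30-39", "zip": "941**", "diagnosis": "B"},
    {"age": "40-49", "zip": "941**", "diagnosis": "A"},
    {"age": "40-49", "zip": "941**", "diagnosis": "C"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))  # each QI group has 2 rows
print(is_k_anonymous(records, ["age", "zip"], k=3))  # no group reaches 3 rows
```

Raising k or widening the QI set forces coarser generalization, which is the source of the information loss that the abstract relates to classifier performance.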
|
24
|
Greenberg JK, Otun A, Ghogawala Z, Yen PY, Molina CA, Limbrick DD, Foraker RE, Kelly MP, Ray WZ. Translating Data Analytics Into Improved Spine Surgery Outcomes: A Roadmap for Biomedical Informatics Research in 2021. Global Spine J 2022; 12:952-963. [PMID: 33973491 PMCID: PMC9344511 DOI: 10.1177/21925682211008424] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Open
Abstract
STUDY DESIGN Narrative review. OBJECTIVES There is growing interest in the use of biomedical informatics and data analytics tools in spine surgery. Yet despite the rapid growth in research on these topics, few analytic tools have been implemented in routine spine practice. The purpose of this review is to provide a health information technology (HIT) roadmap to help translate data assets and analytics tools into measurable advances in spine surgical care. METHODS We conducted a narrative review of PubMed and Google Scholar to identify publications discussing data assets, analytical approaches, and implementation strategies relevant to spine surgery practice. RESULTS A variety of data assets are available for spine research, ranging from commonly used datasets, such as administrative billing data, to emerging resources, such as mobile health and biobanks. Both regression and machine learning techniques are valuable for analyzing these assets, and researchers should recognize the particular strengths and weaknesses of each approach. Few studies have focused on the implementation of HIT, and a variety of methods exist to help translate analytic tools into clinically useful interventions. Finally, a number of HIT-related challenges must be recognized and addressed, including stakeholder acceptance, regulatory oversight, and ethical considerations. CONCLUSIONS Biomedical informatics has the potential to support the development of new HIT that can improve spine surgery quality and outcomes. By understanding the development life-cycle that includes identifying an appropriate data asset, selecting an analytic approach, and leveraging an effective implementation strategy, spine researchers can translate this potential into measurable advances in patient care.
Affiliation(s)
- Jacob K. Greenberg
- Department of Neurological Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Jacob K. Greenberg, Department of Neurosurgery, Washington University School of Medicine, 660 S. Euclid Ave., Box 8057, St. Louis, MO 63110, USA
- Ayodamola Otun
- Department of Neurological Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Zoher Ghogawala
- Department of Neurosurgery, Lahey Hospital and Medical Center, Burlington, MA, USA
- Po-Yin Yen
- Institute for Informatics, Washington University School of Medicine, St. Louis, MO, USA
- Camilo A. Molina
- Department of Neurological Surgery, Washington University School of Medicine, St. Louis, MO, USA
- David D. Limbrick
- Department of Neurological Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Randi E Foraker
- Institute for Informatics, Washington University School of Medicine, St. Louis, MO, USA
- Michael P. Kelly
- Department of Orthopaedic Surgery, Washington University School of Medicine, St. Louis, MO, USA
- Wilson Z. Ray
- Department of Neurological Surgery, Washington University School of Medicine, St. Louis, MO, USA
|
25
|
Prediction of Trypanosoma evansi infection in dromedaries using artificial neural network (ANN). Vet Parasitol 2022; 306:109716. [DOI: 10.1016/j.vetpar.2022.109716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2021] [Revised: 05/05/2022] [Accepted: 05/06/2022] [Indexed: 11/20/2022]
|
26
|
Kwong JC, Khondker A, Kim JK, Chua M, Keefe DT, Dos Santos J, Skreta M, Erdman L, D'Souza N, Selman AF, Weaver J, Weiss DA, Long C, Tasian G, Teoh CW, Rickard M, Lorenzo AJ. Posterior Urethral Valves Outcomes Prediction (PUVOP): a machine learning tool to predict clinically relevant outcomes in boys with posterior urethral valves. Pediatr Nephrol 2022; 37:1067-1074. [PMID: 34686914 DOI: 10.1007/s00467-021-05321-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/29/2021] [Revised: 09/11/2021] [Accepted: 09/28/2021] [Indexed: 10/20/2022]
Abstract
BACKGROUND Early kidney and anatomic features may be predictive of future progression and need for additional procedures in patients with posterior urethral valve (PUV). The objective of this study was to use machine learning (ML) to predict clinically relevant outcomes in these patients. METHODS Patients diagnosed with PUV with kidney function measurements at our institution between 2000 and 2020 were included. Pertinent clinical measures were abstracted, including estimated glomerular filtration rate (eGFR) at each visit, initial vesicoureteral reflux grade, and renal dysplasia at presentation. ML models were developed to predict clinically relevant outcomes: progression in CKD stage, initiation of kidney replacement therapy (KRT), and need for clean-intermittent catheterization (CIC). Model performance was assessed by concordance index (c-index) and the model was externally validated. RESULTS A total of 103 patients were included with a median follow-up of 5.7 years. Of these patients, 26 (25%) had CKD progression, 18 (17%) required KRT, and 32 (31%) were prescribed CIC. Additionally, 22 patients were included for external validation. The ML model predicted CKD progression (c-index = 0.77; external C-index = 0.78), KRT (c-index = 0.95; external C-index = 0.89) and indicated CIC (c-index = 0.70; external C-index = 0.64), and all performed better than Cox proportional-hazards regression. The models have been packaged into a simple easy-to-use tool, available at https://share.streamlit.io/jcckwong/puvop/main/app.py CONCLUSION: ML-based approaches for predicting clinically relevant outcomes in PUV are feasible. Further validation is warranted, but this implementable model can act as a decision-making aid. A higher resolution version of the Graphical abstract is available as Supplementary information.
Affiliation(s)
- Jethro Cc Kwong
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada; Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Adree Khondker
- Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada; Temerty Faculty of Medicine, University of Toronto, Toronto, ON, Canada
- Jin Kyu Kim
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada; Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Michael Chua
- Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Daniel T Keefe
- Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Joana Dos Santos
- Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Marta Skreta
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Lauren Erdman
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON, Canada
- Neeta D'Souza
- Division of Urology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- John Weaver
- Division of Urology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Dana A Weiss
- Division of Urology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Christopher Long
- Division of Urology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Gregory Tasian
- Division of Urology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Chia Wei Teoh
- Division of Nephrology, Hospital for Sick Children, Toronto, ON, Canada; Department of Paediatrics, University of Toronto, Toronto, ON, Canada
- Mandy Rickard
- Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Armando J Lorenzo
- Division of Urology, Department of Surgery, University of Toronto, Toronto, ON, Canada; Division of Urology, Department of Surgery, Hospital for Sick Children, 555 University Avenue, Toronto, ON, M5G 1X8, Canada
|
27
|
Appelbaum L, Kaplan ID, Palchuk MB, Kundrot S, Winer-Jones JP, Rinard M. Development and Experience with Cancer Risk Prediction Models Using Federated Databases and Electronic Health Records. Digit Health 2022. [DOI: 10.36255/exon-publications-digital-health-federated-databases] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
|
28
|
Lin KW, Ang TL, Li JW. Role of artificial intelligence in early detection and screening for pancreatic adenocarcinoma. Artif Intell Med Imaging 2022; 3:21-32. [DOI: 10.35711/aimi.v3.i2.21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/16/2021] [Revised: 02/12/2022] [Accepted: 03/17/2022] [Indexed: 02/06/2023] Open
Abstract
Pancreatic adenocarcinoma remains one of the deadliest malignancies in the world despite treatment advances over the past few decades. Its low survival rates and poor prognosis can be attributed to ambiguity in recommendations for screening and late symptom onset, contributing to its late presentation. In recent years, artificial intelligence (AI) has emerged as a field to aid in the process of clinical decision making. Considerable efforts have been made in the realm of AI to screen for and predict future development of pancreatic ductal adenocarcinoma. This review discusses the use of AI in early detection and screening for pancreatic adenocarcinoma, and factors which may limit its use in a clinical setting.
Affiliation(s)
- Kenneth Weicong Lin
- Department of Gastroenterology and Hepatology, Changi General Hospital, Singapore 529889, Singapore
- Tiing Leong Ang
- Department of Gastroenterology and Hepatology, Changi General Hospital, Singapore 529889, Singapore
- James Weiquan Li
- Department of Gastroenterology and Hepatology, Changi General Hospital, Singapore 529889, Singapore
|
29
|
A review on machine learning techniques for the assessment of image grading in breast mammogram. INT J MACH LEARN CYB 2022. [DOI: 10.1007/s13042-022-01546-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
30
|
Qiu H, Ding S, Liu J, Wang L, Wang X. Applications of Artificial Intelligence in Screening, Diagnosis, Treatment, and Prognosis of Colorectal Cancer. Curr Oncol 2022; 29:1773-1795. [PMID: 35323346 PMCID: PMC8947571 DOI: 10.3390/curroncol29030146] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Revised: 02/28/2022] [Accepted: 03/03/2022] [Indexed: 12/29/2022] Open
Abstract
Colorectal cancer (CRC) is one of the most common cancers worldwide. Accurate early detection and diagnosis, comprehensive assessment of treatment response, and precise prediction of prognosis are essential to improve patients' survival rates. In recent years, due to the explosion of clinical and omics data and groundbreaking research in machine learning, artificial intelligence (AI) has shown great application potential in the clinical field of CRC, providing new auxiliary approaches for clinicians to identify high-risk patients, select precise and personalized treatment plans, and predict prognoses. This review comprehensively analyzes and summarizes the research progress and clinical application value of AI technologies in CRC screening, diagnosis, treatment, and prognosis, demonstrating the current status of AI in the main clinical stages. The limitations, challenges, and future perspectives in the clinical implementation of AI are also discussed.
Affiliation(s)
- Hang Qiu
- Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Correspondence: (H.Q.); (X.W.)
- Shuhan Ding
- School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA
- Jianbo Liu
- West China School of Medicine, Sichuan University, Chengdu 610041, China
- Department of Gastrointestinal Surgery, West China Hospital, Sichuan University, Chengdu 610041, China
- Liya Wang
- Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xiaodong Wang
- West China School of Medicine, Sichuan University, Chengdu 610041, China
- Department of Gastrointestinal Surgery, West China Hospital, Sichuan University, Chengdu 610041, China
- Correspondence: (H.Q.); (X.W.)
|
31
|
Machine Learning-Based Risk Prediction of Critical Care Unit Admission for Advanced Stage High Grade Serous Ovarian Cancer Patients Undergoing Cytoreductive Surgery: The Leeds-Natal Score. J Clin Med 2021; 11:jcm11010087. [PMID: 35011828 PMCID: PMC8745521 DOI: 10.3390/jcm11010087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2021] [Revised: 12/20/2021] [Accepted: 12/22/2021] [Indexed: 12/12/2022] Open
Abstract
Achieving complete surgical cytoreduction in advanced stage high grade serous ovarian cancer (HGSOC) patients warrants an availability of Critical Care Unit (CCU) beds. Machine Learning (ML) could be helpful in monitoring CCU admissions to improve standards of care. We aimed to improve the accuracy of predicting CCU admission in HGSOC patients by ML algorithms and developed an ML-based predictive score. A cohort of 291 advanced stage HGSOC patients with fully curated data was selected. Several linear and non-linear distances, and quadratic discriminant ML methods, were employed to derive prediction information for CCU admission. When all the variables were included in the model, the prediction accuracies were higher for linear discriminant (0.90) and quadratic discriminant (0.93) methods compared with conventional logistic regression (0.84). Feature selection identified pre-treatment albumin, surgical complexity score, estimated blood loss, operative time, and bowel resection with stoma as the most significant prediction features. The real-time prediction accuracy of the Graphical User Interface CCU calculator reached 95%. Limited, potentially modifiable, mostly intra-operative factors contributing to CCU admission were identified and suggest areas for targeted interventions. The accurate quantification of CCU admission patterns is critical information when counseling patients about peri-operative risks related to their cytoreductive surgery.
|
32
|
Francisco ME, Carvajal TM, Ryo M, Nukazawa K, Amalin DM, Watanabe K. Dengue disease dynamics are modulated by the combined influences of precipitation and landscape: A machine learning approach. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 792:148406. [PMID: 34157535 DOI: 10.1016/j.scitotenv.2021.148406] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/21/2021] [Revised: 05/25/2021] [Accepted: 06/08/2021] [Indexed: 06/13/2023]
Abstract
BACKGROUND Dengue is an endemic vector-borne disease influenced by environmental factors such as landscape and climate. Previous studies separately assessed the effects of landscape and climate factors on mosquito occurrence and dengue incidence. However, both factors concurrently coexist in time and space and can interact, affecting mosquito development and dengue disease transmission. For example, eggs laid in a suitable environment can hatch after being submerged in rain water. It has been difficult for conventional statistical modeling approaches to demonstrate these combined influences due to mathematical constraints. OBJECTIVES To investigate the combined influences of landscape and climate factors on mosquito occurrence and dengue incidence. METHODS Entomological, epidemiological, and landscape data from the rainy season (July-December) were obtained from respective government agencies in Metropolitan Manila, Philippines, from 2012 to 2014. Temperature, precipitation and vegetation data were obtained through remote sensing. A random forest algorithm was used to select the landscape and climate variables. Afterward, using the identified key variables, a model-based (MOB) recursive partitioning was implemented to test the combined influences of landscape and climate factors on ovitrap index (vector mosquito occurrence) and dengue incidence. RESULTS The MOB recursive partitioning for ovitrap index indicated a high sensitivity of vector mosquito occurrence on environmental conditions generated by a combination of high residential density areas with low precipitation. Moreover, the MOB recursive partitioning indicated high sensitivity of dengue incidence to the effects of precipitation in areas with high proportions of residential density and commercial areas. CONCLUSIONS Dengue dynamics are not solely influenced by individual effects of either climate or landscape, but rather by their synergistic or combined effects. 
The presented findings have the potential to target vector surveillance in areas identified as suitable for mosquito occurrence under specific climatic conditions and may be relevant as part of urban planning strategies to control dengue.
Affiliation(s)
- Micanaldo Ernesto Francisco
- Center for Marine Environmental Studies (CMES), Ehime University, Matsuyama 790-8577, Japan; Graduate School of Science and Engineering, Ehime University, Matsuyama 790-8577, Japan
- Thaddeus M Carvajal
- Center for Marine Environmental Studies (CMES), Ehime University, Matsuyama 790-8577, Japan; Graduate School of Science and Engineering, Ehime University, Matsuyama 790-8577, Japan; Biology Department, De La Salle University, Taft Ave, Manila 1004, Philippines; Biological Control Research Unit, Center for Natural Science and Environmental Research, De La Salle University, Taft Ave, Manila, Philippines
- Masahiro Ryo
- Leibniz Centre for Agricultural Landscape Research (ZALF), Eberswalder Str. 84, 15374 Müncheberg, Germany; Environment and Natural Sciences, Brandenburg University of Technology Cottbus-Senftenberg, 03046 Cottbus, Germany
- Kei Nukazawa
- Department of Civil and Environmental Engineering, University of Miyazaki, Miyazaki 889-2192, Japan
- Divina M Amalin
- Biology Department, De La Salle University, Taft Ave, Manila 1004, Philippines; Biological Control Research Unit, Center for Natural Science and Environmental Research, De La Salle University, Taft Ave, Manila, Philippines
- Kozo Watanabe
- Center for Marine Environmental Studies (CMES), Ehime University, Matsuyama 790-8577, Japan; Graduate School of Science and Engineering, Ehime University, Matsuyama 790-8577, Japan; Biology Department, De La Salle University, Taft Ave, Manila 1004, Philippines; Biological Control Research Unit, Center for Natural Science and Environmental Research, De La Salle University, Taft Ave, Manila, Philippines
|
33
|
MI-MOTE: Multiple imputation-based minority oversampling technique for imbalanced and incomplete data classification. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.06.043] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
|
34
|
Machine Learning in the Differentiation of Soft Tissue Neoplasms: Comparison of Fat-Suppressed T2WI and Apparent Diffusion Coefficient (ADC) Features-Based Models. J Digit Imaging 2021; 34:1146-1155. [PMID: 34545474 PMCID: PMC8554992 DOI: 10.1007/s10278-021-00513-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2021] [Revised: 08/18/2021] [Accepted: 08/22/2021] [Indexed: 12/26/2022] Open
Abstract
Machine learning has recently been widely used in the characterization of tumors. This article aims to explore the feasibility of whole tumor fat-suppressed (FS) T2WI and ADC features-based least absolute shrinkage and selection operator (LASSO)-logistic predictive models in the differentiation of soft tissue neoplasms (STN). The clinical and MR findings of 160 cases with 161 histologically proven STN were reviewed retrospectively, 75 of them with diffusion-weighted imaging (DWI with b values of 50, 400, and 800 s/mm²). They were divided into benign and malignant groups and further divided into training (70%) and validation (30%) cohorts. The MR FS T2WI and ADC features-based LASSO-logistic models were built and compared. The AUC of the FS T2WI features-based LASSO-logistic regression model for benign and malignant prediction was 0.65 and 0.75 for the training and validation cohorts, respectively. The model's sensitivity, specificity, and accuracy in the validation cohort were 55%, 96%, and 76.6%. In contrast, the AUC of the ADC features-based model was 0.932 and 0.955 for the training and validation cohorts, respectively. The model's sensitivity, specificity, and accuracy were 83.3%, 100%, and 91.7%. The performances of these models were also validated by decision curve analysis (DCA). The AUC of the whole tumor ADC features-based LASSO-logistic regression predictive model was larger than that of the FS T2WI features-based model (p = 0.017). The whole tumor fat-suppressed T2WI and ADC features-based LASSO-logistic predictive models can both serve as useful tools in the differentiation of STN. The ADC features-based LASSO-logistic regression predictive model performed better than the FS T2WI features-based model.
|
35
|
Abstract
OBJECTIVES The purpose of this scoping review is to: (1) identify existing supervised machine learning (ML) approaches on the prediction of cancer in asymptomatic adults; (2) to compare the performance of ML models with each other and (3) to identify potential gaps in research. DESIGN Scoping review using the population, concept and context approach. SEARCH STRATEGY PubMed search engine was used from inception to 10 November 2020 to identify literature meeting following inclusion criteria: (1) a general adult (≥18 years) population, either sex, asymptomatic (population); (2) any study using ML techniques to derive predictive models for future cancer risk using clinical and/or demographic and/or basic laboratory data (concept) and (3) original research articles conducted in all settings in any region of the world (context). RESULTS The search returned 627 unique articles, of which 580 articles were excluded because they did not meet the inclusion criteria, were duplicates or were related to benign neoplasm. Full-text reviews were conducted for 47 articles and a final set of 10 articles were included in this scoping review. These 10 very heterogeneous studies used ML to predict future cancer risk in asymptomatic individuals. All studies reported area under the receiver operating characteristics curve (AUC) values as metrics of model performance, but no study reported measures of model calibration. 
CONCLUSIONS Research gaps that must be addressed in order to deliver validated ML-based models to assist clinical decision-making include: (1) establishing model generalisability through validation in independent cohorts, including those from low-income and middle-income countries; (2) establishing models for all cancer types; (3) thorough comparisons of ML models with best available clinical tools to ensure transparency of their potential clinical utility; (4) reporting of model calibration performance and (5) comparisons of different methods on the same cohort to reveal important information about model generalisability and performance.
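Because the review highlights missing calibration reporting, a brief sketch of two common calibration summaries may help: the Brier score and a binned comparison of mean predicted risk against observed event rate. This is only an illustration under assumed inputs; the function names `brier_score` and `calibration_bins` are hypothetical and the numbers are toy values, not results from any reviewed study.

```python
from typing import List, Sequence, Tuple


def brier_score(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def calibration_bins(
    y_true: Sequence[int], probs: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted risk, observed event rate, count) per non-empty equal-width bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    rows = []
    for b in bins:
        if b:
            mean_pred = sum(p for _, p in b) / len(b)
            observed = sum(y for y, _ in b) / len(b)
            rows.append((mean_pred, observed, len(b)))
    return rows


# Toy predictions (assumed values, for illustration only).
y_true = [0, 0, 1, 1]
probs = [0.1, 0.3, 0.7, 0.9]
print(brier_score(y_true, probs))
print(calibration_bins(y_true, probs, n_bins=2))
```

A well-calibrated model has mean predicted risk close to the observed event rate in every bin; a model can have a high AUC and still fail this check.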
Affiliation(s)
- Asma Abdullah Alfayez
- Institute of Health Informatics, University College London, London, UK
- King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Holger Kunz
- Institute of Health Informatics, University College London, London, UK
- Alvina Grace Lai
- Institute of Health Informatics, University College London, London, UK
36
Oei RW, Lyu Y, Ye L, Kong F, Du C, Zhai R, Xu T, Shen C, He X, Kong L, Hu C, Ying H. Progression-Free Survival Prediction in Patients with Nasopharyngeal Carcinoma after Intensity-Modulated Radiotherapy: Machine Learning vs. Traditional Statistics. J Pers Med 2021; 11:jpm11080787. [PMID: 34442430 PMCID: PMC8398698 DOI: 10.3390/jpm11080787] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/30/2021] [Revised: 08/08/2021] [Accepted: 08/10/2021] [Indexed: 12/24/2022]
Abstract
Background: The Cox proportional hazards (CPH) model is the most commonly used statistical method for nasopharyngeal carcinoma (NPC) prognostication. Recently, machine learning (ML) models have been increasingly adopted for this purpose. However, only a few studies have compared the performance of CPH and ML models. This study aimed to compare CPH with two state-of-the-art ML algorithms, namely conditional survival forest (CSF) and DeepSurv, for disease progression prediction in NPC. Methods: From January 2010 to March 2013, 412 eligible NPC patients were reviewed. The entire dataset was split into a training cohort and a testing cohort in a ratio of 90%:10%. Ten features from patient-related, disease-related, and treatment-related data were used to train the models for progression-free survival (PFS) prediction. Model performance was compared using the concordance index (c-index), Brier score, and log-rank test based on the risk stratification results. Results: DeepSurv (c-index = 0.68, Brier score = 0.13, log-rank test p = 0.02) achieved the best performance compared to CSF (c-index = 0.63, Brier score = 0.14, log-rank test p = 0.38) and CPH (c-index = 0.57, Brier score = 0.15, log-rank test p = 0.81). Conclusions: Both CSF and DeepSurv outperformed CPH in our relatively small dataset. ML-based survival prediction may guide physicians in choosing the most suitable treatment strategy for NPC patients.
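The concordance index used above to compare the models can be illustrated with a short sketch of Harrell's c-index for right-censored data. This is a simplified, assumed implementation for illustration (pairs with tied event times are skipped), not the code used in the study; `concordance_index` is a hypothetical local function and the inputs are toy values.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's c-index: fraction of comparable pairs ranked correctly by predicted risk.

    A pair (i, j) is comparable when subject i had an observed event (events[i] == 1)
    strictly before subject j's follow-up time. The pair is concordant when the
    subject with the earlier event has the higher predicted risk; risk ties count half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy follow-up data (assumed): months to progression or censoring, event flag, risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.3, 0.5, 0.1]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported range of 0.57 to 0.68 in context.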
Affiliation(s)
- Ronald Wihal Oei
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Yingchen Lyu
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Lulu Ye
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Fangfang Kong
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Chengrun Du
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Ruiping Zhai
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Tingting Xu
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Chunying Shen
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Xiayun He
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Lin Kong
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Chaosu Hu
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Hongmei Ying
- Department of Radiation Oncology, Fudan University Shanghai Cancer Center, Shanghai 200032, China; (R.W.O.); (Y.L.); (L.Y.); (F.K.); (C.D.); (R.Z.); (T.X.); (C.S.); (X.H.); (L.K.); (C.H.)
- Department of Oncology, Shanghai Medical College, Fudan University, Shanghai 200032, China
- Correspondence: Tel.: +86-21-64175590; Fax: +86-21-6417477
37
Nordin N, Zainol Z, Mohd Noor MH, Lai Fong C. A comparative study of machine learning techniques for suicide attempts predictive model. Health Informatics J 2021; 27:1460458221989395. [PMID: 33745355 DOI: 10.1177/1460458221989395] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/23/2022]
Abstract
Current suicide risk assessments for predicting suicide attempts are time consuming, of low predictive value and have inadequate reliability. This paper aims to develop a predictive model for suicide attempts among patients with depression using machine learning algorithms, and presents a comparative study of single and ensemble predictive models for differentiating depressed patients with suicide attempts from non-attempters. We trained eight different machine learning algorithms using a dataset of 75 patients diagnosed with a depressive disorder. Recursive feature elimination with three-fold cross-validation was used to reduce the features. Ensemble predictive models outperformed single predictive models. Voting and bagging achieved the highest accuracy, 92%, among the machine learning algorithms. Our findings indicate that history of suicide attempt, religion, race, suicide ideation and severity of clinical depression are useful factors for prediction of suicide attempts.
Affiliation(s)
- Chan Lai Fong
- National University of Malaysia Medical Centre, Malaysia
38
Chakraborty D, Ivan C, Amero P, Khan M, Rodriguez-Aguayo C, Başağaoğlu H, Lopez-Berestein G. Explainable Artificial Intelligence Reveals Novel Insight into Tumor Microenvironment Conditions Linked with Better Prognosis in Patients with Breast Cancer. Cancers (Basel) 2021; 13:3450. [PMID: 34298668 PMCID: PMC8303703 DOI: 10.3390/cancers13143450] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/04/2021] [Revised: 07/06/2021] [Accepted: 07/06/2021] [Indexed: 12/29/2022]
Abstract
We investigated the data-driven relationship between immune cell composition in the tumor microenvironment (TME) and the ≥5-year survival rates of breast cancer patients using explainable artificial intelligence (XAI) models. We acquired TCGA breast invasive carcinoma data from the cbioPortal and retrieved immune cell composition estimates from bulk RNA sequencing data from TIMER2.0 based on EPIC, CIBERSORT, TIMER, and xCell computational methods. Novel insights derived from our XAI model showed that B cells, CD8+ T cells, M0 macrophages, and NK T cells are the most critical TME features for enhanced prognosis of breast cancer patients. Our XAI model also revealed the inflection points of these critical TME features, above or below which ≥5-year survival rates improve. Subsequently, we ascertained the conditional probabilities of ≥5-year survival under specific conditions inferred from the inflection points. In particular, the XAI models revealed that the B cell fraction (relative to all cells in a sample) exceeding 0.025, M0 macrophage fraction (relative to the total immune cell content) below 0.05, and NK T cell and CD8+ T cell fractions (based on cancer type-specific arbitrary units) above 0.075 and 0.25, respectively, in the TME could enhance the ≥5-year survival in breast cancer patients. The findings could lead to accurate clinical predictions and enhanced immunotherapies, and to the design of innovative strategies to reprogram the breast TME.
Affiliation(s)
- Debaditya Chakraborty
- Department of Construction Science, The University of Texas at San Antonio, San Antonio, TX 78249, USA
- Cristina Ivan
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; (C.I.); (P.A.); (C.R.-A.); (G.L.-B.)
- Center for RNA Interference and Non-Coding RNA, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Paola Amero
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; (C.I.); (P.A.); (C.R.-A.); (G.L.-B.)
- Maliha Khan
- Department of Lymphoma and Myeloma, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA;
- Cristian Rodriguez-Aguayo
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; (C.I.); (P.A.); (C.R.-A.); (G.L.-B.)
- Center for RNA Interference and Non-Coding RNA, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Gabriel Lopez-Berestein
- Department of Experimental Therapeutics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; (C.I.); (P.A.); (C.R.-A.); (G.L.-B.)
- Center for RNA Interference and Non-Coding RNA, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
39
A Classification Approach for Cancer Survivors from Those Cancer-Free, Based on Health Behaviors: Analysis of the Lifelines Cohort. Cancers (Basel) 2021; 13:cancers13102335. [PMID: 34066093 PMCID: PMC8151639 DOI: 10.3390/cancers13102335] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/23/2021] [Revised: 05/06/2021] [Accepted: 05/07/2021] [Indexed: 12/29/2022]
Abstract
Simple Summary Health behaviors affect health status in cancer survivors. We aimed to identify such key health behaviors using nonlinear algorithms and compare their classification performance with logistic regression, for distinguishing cancer survivors from those cancer-free in a population-based cohort. We used health behaviors and socioeconomic factors for analysis. Participants from the Lifelines population-based cohort were binary classified as cancer survivors or cancer-free using nonlinear algorithms or logistic regression. Data were collected for 107,624 cancer-free participants and 2760 cancer survivors. Using all variables, algorithms obtained an area under the receiver operator curve (AUC) of 0.75 ± 0.01. Using only health behaviors, logistic regression and the nonlinear algorithms differentiated cancer survivors from cancer-free participants with AUCs of 0.62 ± 0.01 and 0.60 ± 0.01, respectively. In the case–control analyses, both algorithms produced AUCs of 0.52 ± 0.01. The main distinctive classifier was age. No key health behaviors were identified by linear and nonlinear algorithms to differentiate cancer survivors from cancer-free participants.
Abstract Health behaviors affect health status in cancer survivors. We hypothesized that nonlinear algorithms would identify distinct key health behaviors compared to a linear algorithm and better classify cancer survivors. We aimed to use three nonlinear algorithms to identify such key health behaviors and compare their performances with that of a logistic regression for distinguishing cancer survivors from those without cancer in a population-based cohort study. We used six health behaviors and three socioeconomic factors for analysis. Participants from the Lifelines population-based cohort were binary classified into a cancer-survivors group and a cancer-free group using either nonlinear algorithms or logistic regression, and their performances were compared by the area under the curve (AUC). In addition, we performed case–control analyses (matched by age, sex, and education level) to evaluate classification performance only by health behaviors. Data were collected for 107,624 cancer-free participants and 2760 cancer survivors. Using all variables resulted in an AUC of 0.75 ± 0.01; using only six health behaviors, the logistic regression and nonlinear algorithms differentiated cancer survivors from cancer-free participants with AUCs of 0.62 ± 0.01 and 0.60 ± 0.01, respectively. The main distinctive classifier was age. Though not relevant to classification, the main distinctive health behaviors were body mass index and alcohol consumption. In the case–control analyses, algorithms produced AUCs of 0.52 ± 0.01. No key health behaviors were identified by linear and nonlinear algorithms to differentiate cancer survivors from cancer-free participants in this population-based cohort.
40
Gupta S, Gupta MK. A comprehensive data‐level investigation of cancer diagnosis on imbalanced data. Comput Intell 2021. [DOI: 10.1111/coin.12452] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 01/22/2023]
Affiliation(s)
- Surbhi Gupta
- School of Computer Science and Engineering Department, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
- Manoj Kumar Gupta
- School of Computer Science and Engineering Department, Shri Mata Vaishno Devi University, Katra, Jammu and Kashmir, India
41
Prediction of Incident Cancers in the Lifelines Population-Based Cohort. Cancers (Basel) 2021; 13:cancers13092133. [PMID: 33925159 PMCID: PMC8125183 DOI: 10.3390/cancers13092133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2021] [Accepted: 04/23/2021] [Indexed: 12/23/2022]
Abstract
Simple Summary The accurate prediction of incident cancers could be relevant to understanding and reducing cancer incidence. The aim of this study was to develop machine learning (ML) models that could predict an incident diagnosis of cancer. Data were available for 116,188 cancer-free participants and 4232 incident cancer cases. The main outcome was an incident cancer (excluding skin cancer) during follow-up assessment in a population-based cohort. The performance of three ML algorithms was evaluated using supervised binary classification to identify incident cancers among participants. An overall area under the receiver operator curve (AUC) < 0.75 was obtained; the highest AUC was for prostate cancer (AUC > 0.80). Linear and non-linear ML algorithms including socioeconomic, lifestyle, and clinical variables produced a moderate predictive performance of incident cancers in the Lifelines cohort.
Abstract Cancer incidence is rising, and accurate prediction of incident cancers could be relevant to understanding and reducing cancer incidence. The aim of this study was to develop machine learning (ML) models that could predict an incident diagnosis of cancer. Participants without any history of cancer within the Lifelines population-based cohort were followed for a median of 7 years. Data were available for 116,188 cancer-free participants and 4232 incident cancer cases. At baseline, socioeconomic, lifestyle, and clinical variables were assessed. The main outcome was an incident cancer during follow-up (excluding skin cancer), based on linkage with the national pathology registry. The performance of three ML algorithms was evaluated using supervised binary classification to identify incident cancers among participants. Elastic net regularization and the Gini index were used for variable selection. An overall area under the receiver operator curve (AUC) < 0.75 was obtained; the highest AUC values were for prostate cancer (random forest AUC = 0.82 (95% CI 0.77–0.87), logistic regression AUC = 0.81 (95% CI 0.76–0.86), and support vector machines AUC = 0.83 (95% CI 0.78–0.88)); age was the most important predictor in these models. Linear and non-linear ML algorithms including socioeconomic, lifestyle, and clinical variables produced a moderate predictive performance of incident cancers in the Lifelines cohort.
42
Li J, Zhou Z, Dong J, Fu Y, Li Y, Luan Z, Peng X. Predicting breast cancer 5-year survival using machine learning: A systematic review. PLoS One 2021; 16:e0250370. [PMID: 33861809 PMCID: PMC8051758 DOI: 10.1371/journal.pone.0250370] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 01/21/2021] [Accepted: 04/06/2021] [Indexed: 12/24/2022]
Abstract
BACKGROUND Accurately predicting the survival rate of breast cancer patients is a major issue for cancer researchers. Machine learning (ML) has attracted much attention with the hope that it could provide accurate results, but its modeling methods and prediction performance remain controversial. The aim of this systematic review is to identify and critically appraise current studies regarding the application of ML in predicting the 5-year survival rate of breast cancer. METHODS In accordance with the PRISMA guidelines, two researchers independently searched the PubMed (including MEDLINE), Embase, and Web of Science Core databases from inception to November 30, 2020. The search terms included breast neoplasms, survival, machine learning, and specific algorithm names. Included studies used ML to build a breast cancer survival prediction model and reported model performance that could be measured from the verification results. Studies were excluded if the modeling process was not explained clearly or the information was incomplete. The extracted information included literature information, database information, data preparation and modeling process information, model construction and performance evaluation information, and candidate predictor information. RESULTS Thirty-one studies that met the inclusion criteria were included, most of which were published after 2013. The most frequently used ML methods were decision trees (19 studies, 61.3%), artificial neural networks (18 studies, 58.1%), support vector machines (16 studies, 51.6%), and ensemble learning (10 studies, 32.3%). The median sample size was 37,256 patients (range 200 to 659,820), and the median number of predictors was 16 (range 3 to 625). The accuracy of 29 studies ranged from 0.510 to 0.971. The sensitivity of 25 studies ranged from 0.037 to 1. The specificity of 24 studies ranged from 0.008 to 0.993. The AUC of 20 studies ranged from 0.500 to 0.972.
The precision of 6 studies ranged from 0.549 to 1. All of the models were internally validated, and only one was externally validated. CONCLUSIONS Overall, compared with traditional statistical methods, the performance of ML models does not necessarily show any improvement, and this area of research still faces limitations related to a lack of data preprocessing steps, the excessive differences of sample feature selection, and issues related to validation. Further optimization of the performance of the proposed model is also needed in the future, which requires more standardization and subsequent validation.
Affiliation(s)
- Jiaxin Li
- School of Nursing, Jilin University, Jilin, China
- Zijun Zhou
- Breast Surgery, Jilin Province Tumor Hospital, Jilin, China
- Jianyu Dong
- School of Nursing, Jilin University, Jilin, China
- Ying Fu
- School of Nursing, Jilin University, Jilin, China
- Yuan Li
- School of Nursing, Jilin University, Jilin, China
- Ze Luan
- School of Nursing, Jilin University, Jilin, China
- Xin Peng
- School of Nursing, Jilin University, Jilin, China
43
Hwangbo S, Kim SI, Kim JH, Eoh KJ, Lee C, Kim YT, Suh DS, Park T, Song YS. Development of Machine Learning Models to Predict Platinum Sensitivity of High-Grade Serous Ovarian Carcinoma. Cancers (Basel) 2021; 13:cancers13081875. [PMID: 33919797 PMCID: PMC8070756 DOI: 10.3390/cancers13081875] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/30/2021] [Revised: 04/02/2021] [Accepted: 04/12/2021] [Indexed: 01/07/2023]
Abstract
To support the implementation of individualized disease management, we aimed to develop machine learning models predicting platinum sensitivity in patients with high-grade serous ovarian carcinoma (HGSOC). We reviewed the medical records of 1002 eligible patients. Patients' clinicopathologic characteristics, surgical findings, details of chemotherapy, treatment response, and survival outcomes were collected. Using the stepwise selection method, based on the area under the receiver operating characteristic curve (AUC) values, six variables associated with platinum sensitivity were selected: age, initial serum CA-125 levels, neoadjuvant chemotherapy, pelvic lymph node status, involvement of pelvic tissue other than the uterus and tubes, and involvement of the small bowel and mesentery. Based on these variables, predictive models were constructed using four machine learning algorithms, logistic regression (LR), random forest, support vector machine, and deep neural network; the model performance was evaluated with the five-fold cross-validation method. The LR-based model performed best at identifying platinum-resistant cases with an AUC of 0.741. Adding the FIGO stage and residual tumor size after debulking surgery did not improve model performance. Based on the six-variable LR model, we also developed a web-based nomogram. The presented models may be useful in clinical practice and research.
Affiliation(s)
- Suhyun Hwangbo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea; (S.H.); (C.L.)
- Se Ik Kim
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul 03080, Korea;
- Ju-Hyun Kim
- Department of Obstetrics and Gynecology, Graduate School of Medicine, University of Ulsan, Seoul 05505, Korea;
- Kyung Jin Eoh
- Department of Obstetrics and Gynecology, Yongin Severance Hospital, Yonsei University College of Medicine, Yongin-si 17046, Korea;
- Chanhee Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea; (S.H.); (C.L.)
- Young Tae Kim
- Department of Obstetrics and Gynecology, Institute of Women’s Life Medical Science, Yonsei University College of Medicine, Seoul 03722, Korea;
- Dae-Shik Suh
- Department of Obstetrics and Gynecology, Asan Medical Center, University of Ulsan College of Medicine, Seoul 05505, Korea;
- Taesung Park
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea; (S.H.); (C.L.)
- Department of Statistics, Seoul National University, Seoul 08826, Korea
- Correspondence: (T.P.); (Y.S.S.); Tel.: +82-2-880-8924 (T.P.); +82-2-2072-2822 (Y.S.S.)
- Yong Sang Song
- Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul 03080, Korea;
- Cancer Research Institute, Seoul National University College of Medicine, Seoul 03080, Korea
- Correspondence: (T.P.); (Y.S.S.); Tel.: +82-2-880-8924 (T.P.); +82-2-2072-2822 (Y.S.S.)
44
Cirillo D, Núñez‐Carpintero I, Valencia A. Artificial intelligence in cancer research: learning at different levels of data granularity. Mol Oncol 2021; 15:817-829. [PMID: 33533192 PMCID: PMC8024732 DOI: 10.1002/1878-0261.12920] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/30/2020] [Revised: 12/20/2020] [Accepted: 01/10/2021] [Indexed: 02/06/2023]
Abstract
From genome-scale experimental studies to imaging data, behavioral footprints, and longitudinal healthcare records, the convergence of big data in cancer research and the advances in Artificial Intelligence (AI) is paving the way to develop a systems view of cancer. Nevertheless, this biomedical area is largely characterized by the co-existence of big data and small data resources, highlighting the need for a deeper investigation about the crosstalk between different levels of data granularity, including varied sample sizes, labels, data types, and other data descriptors. This review introduces the current challenges, limitations, and solutions of AI in the heterogeneous landscape of data granularity in cancer research. Such a variety of cancer molecular and clinical data calls for advancing the interoperability among AI approaches, with particular emphasis on the synergy between discriminative and generative models that we discuss in this work with several examples of techniques and applications.
Affiliation(s)
- Alfonso Valencia
- Barcelona Supercomputing Center (BSC), Barcelona, Spain
- ICREA, Barcelona, Spain
45
Yang DX, Khera R, Miccio JA, Jairam V, Chang E, Yu JB, Park HS, Krumholz HM, Aneja S. Prevalence of Missing Data in the National Cancer Database and Association With Overall Survival. JAMA Netw Open 2021; 4:e211793. [PMID: 33755165 PMCID: PMC7988369 DOI: 10.1001/jamanetworkopen.2021.1793] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Indexed: 12/28/2022]
Abstract
IMPORTANCE Cancer registries are important real-world data sources consisting of data abstraction from the medical record; however, patients with unknown or missing data are underrepresented in studies that use such data sources. OBJECTIVE To assess the prevalence of missing data and its association with overall survival among patients with cancer. DESIGN, SETTING, AND PARTICIPANTS In this retrospective cohort study, all variables within the National Cancer Database were reviewed for missing or unknown values for patients with the 3 most common cancers in the US who received diagnoses from January 1, 2006, to December 31, 2015. The prevalence of patient records with missing data and the association with overall survival were assessed. Data analysis was performed from February to August 2020. EXPOSURES Any missing data field within a patient record among 63 variables of interest from more than 130 total variables in the National Cancer Database. MAIN OUTCOMES AND MEASURES Prevalence of missing data in the medical records of patients with cancer and associated 2-year overall survival. RESULTS A total of 1 198 749 patients with non-small cell lung cancer (mean [SD] age, 68.5 [10.9] years; 628 811 men [52.5%]), 2 120 775 patients with breast cancer (mean [SD] age, 61.0 [13.3] years; 2 101 758 women [99.1%]), and 1 158 635 patients with prostate cancer (mean [SD] age, 65.2 [9.0] years; 100% men) were included in the analysis. Among those with non-small cell lung cancer, 851 295 patients (71.0%) were missing data for variables of interest; 2-year overall survival was 33.2% for patients with missing data and 51.6% for patients with complete data (P < .001). Among those with breast cancer, 1 161 096 patients (54.7%) were missing data for variables of interest; 2-year overall survival was 93.2% for patients with missing data and 93.9% for patients with complete data (P < .001). 
Among those with prostate cancer, 460 167 patients (39.7%) were missing data for variables of interest; 2-year overall survival was 91.0% for patients with missing data and 95.6% for patients with complete data (P < .001). CONCLUSIONS AND RELEVANCE This study found that within a large cancer registry-based real-world data source, there was a high prevalence of missing data that were unable to be ascertained from the medical record. The prevalence of missing data among patients with cancer was associated with heterogeneous differences in overall survival. Improvements in documentation and data quality are necessary to make optimal use of real-world data for clinical advancements.
Affiliation(s)
- Daniel X. Yang
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut
- Rohan Khera
- Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut
- Center for Outcomes Research and Evaluation, Yale School of Medicine, New Haven, Connecticut
- Joseph A. Miccio
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut
- Vikram Jairam
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut
- Enoch Chang
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut
- James B. Yu
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut
- Henry S. Park
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut
- Harlan M. Krumholz
- Department of Internal Medicine, Yale School of Medicine, New Haven, Connecticut
- Center for Outcomes Research and Evaluation, Yale School of Medicine, New Haven, Connecticut
- Sanjay Aneja
- Department of Therapeutic Radiology, Yale School of Medicine, New Haven, Connecticut
- Center for Outcomes Research and Evaluation, Yale School of Medicine, New Haven, Connecticut
46
Majnarić LT, Babič F, O’Sullivan S, Holzinger A. AI and Big Data in Healthcare: Towards a More Comprehensive Research Framework for Multimorbidity. J Clin Med 2021; 10:jcm10040766. [PMID: 33672914 PMCID: PMC7918668 DOI: 10.3390/jcm10040766] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 12/28/2020] [Revised: 02/02/2021] [Accepted: 02/11/2021] [Indexed: 12/11/2022]
Abstract
Multimorbidity refers to the coexistence of two or more chronic diseases in one person. Therefore, patients with multimorbidity have multiple and special care needs. However, in practice it is difficult to meet these needs because the organizational processes of current healthcare systems tend to be tailored to a single disease. To improve clinical decision making and patient care in multimorbidity, a radical change in the problem-solving approach to medical research and treatment is needed. In addition to the traditional reductionist approach, we propose interactive research supported by artificial intelligence (AI) and advanced big data analytics. Such a research approach, when applied to data routinely collected in healthcare settings, provides an integrated platform for research tasks related to multimorbidity. This may include, for example, prediction, correlation, and classification problems based on multiple interaction factors. However, to realize this paradigm shift in multimorbidity research, the optimization, standardization, and, most importantly, integration of electronic health data into a common national and international research infrastructure are needed. Ultimately, there is a need for the integration and implementation of efficient AI approaches, particularly deep learning, into clinical routine directly within the workflows of medical professionals.
Affiliation(s)
- Ljiljana Trtica Majnarić
  - Department of Internal Medicine, Family Medicine and the History of Medicine, Faculty of Medicine, University Josip Juraj Strossmayer, 31000 Osijek, Croatia
  - Department of Public Health, Faculty of Dental Medicine, University Josip Juraj Strossmayer, 31000 Osijek, Croatia
- František Babič
  - Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, 066 01 Košice, Slovakia
  - Correspondence: Tel.: +421-55-602-4220
- Shane O’Sullivan
  - Department of Pathology, Faculdade de Medicina, Universidade de São Paulo, 05508-220 São Paulo, Brazil
- Andreas Holzinger
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, 8036 Graz, Austria
|
47
|
Harrison JH, Gilbertson JR, Hanna MG, Olson NH, Seheult JN, Sorace JM, Stram MN. Introduction to Artificial Intelligence and Machine Learning for Pathology. Arch Pathol Lab Med 2021; 145:1228-1254. [PMID: 33493264 DOI: 10.5858/arpa.2020-0541-cp] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Accepted: 11/10/2020] [Indexed: 11/06/2022]
Abstract
CONTEXT.— Recent developments in machine learning have stimulated intense interest in software that may augment or replace human experts. Machine learning may impact pathology practice by offering new capabilities in analysis, interpretation, and outcomes prediction using images and other data. The principles of operation and management of machine learning systems are unfamiliar to pathologists, who anticipate a need for additional education to be effective as expert users and managers of the new tools. OBJECTIVE.— To provide a background on machine learning for practicing pathologists, including an overview of algorithms, model development, and performance evaluation; to examine the current status of machine learning in pathology and consider possible roles and requirements for pathologists in local deployment and management of machine learning systems; and to highlight existing challenges and gaps in deployment methodology and regulation. DATA SOURCES.— Sources include the biomedical and engineering literature, white papers from professional organizations, government reports, electronic resources, and authors' experience in machine learning. References were chosen when possible for accessibility to practicing pathologists without specialized training in mathematics, statistics, or software development. CONCLUSIONS.— Machine learning offers an array of techniques that in recent published results show substantial promise. Data suggest that human experts working with machine learning tools outperform humans or machines separately, but the optimal form for this combination in pathology has not been established. Significant questions related to the generalizability of machine learning systems, local site verification, and performance monitoring remain to be resolved before a consensus on best practices and a regulatory environment can be established.
Affiliation(s)
- James H Harrison
  - From the Department of Pathology, University of Virginia School of Medicine, Charlottesville (Harrison)
- John R Gilbertson
  - the Departments of Biomedical Informatics and Pathology, University of Pittsburgh, Pittsburgh, Pennsylvania (Gilbertson)
- Matthew G Hanna
  - the Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, New York (Hanna)
- Niels H Olson
  - the Defense Innovation Unit, Mountain View, California (Olson); the Department of Pathology, Uniformed Services University, Bethesda, Maryland (Olson)
- Jansen N Seheult
  - the Department of Pathology, University of Pittsburgh, and Vitalant Specialty Labs, Pittsburgh, Pennsylvania (Seheult)
- James M Sorace
  - the US Department of Health and Human Services, retired, Lutherville, Maryland (Sorace)
- Michelle N Stram
  - the Department of Forensic Medicine, New York University, and Office of Chief Medical Examiner, New York, New York (Stram)
|
48
|
de Winter MA, van Es N, Büller HR, Visseren FLJ, Nijkeuter M. Prediction models for recurrence and bleeding in patients with venous thromboembolism: A systematic review and critical appraisal. Thromb Res 2021; 199:85-96. [PMID: 33485094 DOI: 10.1016/j.thromres.2020.12.031] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 10/13/2020] [Revised: 12/07/2020] [Accepted: 12/31/2020] [Indexed: 12/23/2022]
Abstract
INTRODUCTION Prediction models for recurrence and bleeding are infrequently used when deciding on anticoagulant treatment duration after venous thromboembolism (VTE) due to concerns about performance and validity. Our aim was to critically appraise these models by systematically summarizing data from derivation and validation studies. MATERIALS AND METHODS MEDLINE and CENTRAL were searched until November 15th, 2019. Studies on prediction models for recurrence or bleeding after at least 3 months of anticoagulation in adult patients with VTE were included. The PROBAST, ROBINS-I and RoB2 tools were used to assess risk of bias and applicability. RESULTS Selection yielded 18 studies evaluating 8 models for recurrence (7 on development; 9 on validation; 1 update). Generally, models for recurrent VTE appeared to perform poorly to moderately in external validation studies (C-statistics 0.39-0.66, one 0.83). However, impact studies show that the HERDOO2 and Vienna prediction models may identify patients with unprovoked VTE at low recurrence risk. Sixteen studies evaluating 14 models for anticoagulation-related bleeding were identified (7 on development; 9 on validation). Although some models seemed promising in development studies, their predictive performance was poor to moderate in external validation (C-statistics 0.52-0.71). All but 3 studies were considered at high risk of bias, mainly due to limitations in the statistical analysis. CONCLUSIONS Prognostic models for recurrence and anticoagulation-related bleeding risk often have important methodological limitations and insufficient predictive accuracy. These findings do not support their use in clinical practice to weigh risks of recurrence and bleeding when deciding on continuing anticoagulation after initial treatment of VTE.
Affiliation(s)
- Maria A de Winter
  - University Medical Center, Utrecht, Department of Acute Internal Medicine, Heidelberglaan 100, 3584CX Utrecht, the Netherlands.
- Nick van Es
  - Amsterdam UMC, Department of Vascular Medicine; Meibergdreef 9, 1105 AZ Amsterdam, the Netherlands.
- Harry R Büller
  - Amsterdam UMC, Department of Vascular Medicine; Meibergdreef 9, 1105 AZ Amsterdam, the Netherlands.
- Frank L J Visseren
  - University Medical Center, Utrecht, Department of Vascular Medicine, Heidelberglaan 100, 3584CX Utrecht, the Netherlands.
- Mathilde Nijkeuter
  - University Medical Center, Utrecht, Department of Acute Internal Medicine, Heidelberglaan 100, 3584CX Utrecht, the Netherlands.
|
49
|
Shorten C, Khoshgoftaar TM, Furht B. Deep Learning applications for COVID-19. J Big Data 2021; 8:18. [PMID: 33457181 PMCID: PMC7797891 DOI: 10.1186/s40537-020-00392-9] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Received: 11/20/2020] [Accepted: 12/04/2020] [Indexed: 05/17/2023]
Abstract
This survey explores how Deep Learning has battled the COVID-19 pandemic and provides directions for future research on COVID-19. We cover Deep Learning applications in Natural Language Processing, Computer Vision, Life Sciences, and Epidemiology. We describe how each of these applications varies with the availability of big data and how learning tasks are constructed. We begin by evaluating the current state of Deep Learning and conclude with key limitations of Deep Learning for COVID-19 applications. These limitations include Interpretability, Generalization Metrics, Learning from Limited Labeled Data, and Data Privacy. Natural Language Processing applications include mining COVID-19 research for Information Retrieval and Question Answering, as well as Misinformation Detection and Public Sentiment Analysis. Computer Vision applications cover Medical Image Analysis, Ambient Intelligence, and Vision-based Robotics. Within Life Sciences, our survey looks at how Deep Learning can be applied to Precision Diagnostics, Protein Structure Prediction, and Drug Repurposing. Deep Learning has additionally been utilized in Spread Forecasting for Epidemiology. Our literature review has found many examples of Deep Learning systems to fight COVID-19. We hope that this survey will help accelerate the use of Deep Learning for COVID-19 research.
Affiliation(s)
- Connor Shorten
  - Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431 USA
- Borko Furht
  - Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431 USA
|
50
|
Chu J, Dong W, Wang J, He K, Huang Z. Treatment effect prediction with adversarial deep learning using electronic health records. BMC Med Inform Decis Mak 2020; 20:139. [PMID: 33317502 PMCID: PMC7735418 DOI: 10.1186/s12911-020-01151-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/25/2020] [Accepted: 06/08/2020] [Indexed: 11/29/2022]
Abstract
BACKGROUND Treatment effect prediction (TEP) plays an important role in disease management by ensuring that the expected clinical outcomes are obtained after performing specialized and sophisticated treatments on patients given their personalized clinical status. In recent years, the wide adoption of electronic health records (EHRs) has provided a comprehensive data source for intelligent clinical applications including the TEP investigated in this study. METHOD We examined the problem of using a large volume of heterogeneous EHR data to predict treatment effects and developed an adversarial deep treatment effect prediction model to address the problem. Our model employed two auto-encoders for learning the representative and discriminative features of both patient characteristics and treatments from EHR data. The discriminative power of the learned features was further enhanced by decoding the correlational information between the patient characteristics and subsequent treatments by means of a generated adversarial learning strategy. Thereafter, a logistic regression layer was appended on the top of the resulting feature representation layer for TEP. RESULT The proposed model was evaluated on two real clinical datasets collected from the cardiology department of a Chinese hospital. In particular, on the acute coronary syndrome (ACS) dataset, the proposed adversarial deep treatment effect prediction (ADTEP) (0.662) exhibited 1.4, 2.2, and 6.3% performance gains in terms of the area under the ROC curve (AUC) over deep treatment effect prediction (DTEP) (0.653), logistic regression (LR) (0.648), and support vector machine (SVM) (0.621), respectively. In the heart failure (HF) case study, the proposed ADTEP also outperformed all benchmarks. The experimental results demonstrated that our proposed model achieved competitive performance compared to state-of-the-art models in tackling the TEP problem.
CONCLUSION In this work, we propose a novel model to address the TEP problem by utilizing a large volume of observational data from EHRs. With an adversarial learning strategy, our proposed model can further explore the correlational information between patient statuses and treatments to extract a more robust and discriminative representation of patient samples from their EHR data. This representation ultimately benefits the model on TEP. The experimental results of two case studies demonstrate the superiority of our proposed method compared to state-of-the-art methods.
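The abstract reports performance gains as percentages but does not state how they were computed. The sketch below assumes a gain means the relative AUC improvement over each baseline, (new - baseline) / baseline; that definition and the helper name `relative_gain` are assumptions rather than the authors' stated method. The inputs are the three-decimal AUCs printed in the abstract, so the output may not match the reported percentages exactly.

```python
# AUC values on the ACS dataset, copied from the abstract (rounded to 3 decimals).
auc = {"ADTEP": 0.662, "DTEP": 0.653, "LR": 0.648, "SVM": 0.621}


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent (assumed definition)."""
    return 100.0 * (new - baseline) / baseline


# Gain of ADTEP over every other model listed in the abstract.
gains = {
    name: relative_gain(auc["ADTEP"], value)
    for name, value in auc.items()
    if name != "ADTEP"
}

for name, gain in gains.items():
    print(f"ADTEP vs {name}: {gain:.1f}% relative AUC gain")
```

An absolute difference in AUC points would be an equally plausible reading of "performance gain"; the relative form is used here only because it is the more common convention for percentage gains.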
Affiliation(s)
- Jiebin Chu
  - College of Biomedical Engineering and Instrumental Science, Zhejiang University, Hangzhou, China
- Wei Dong
  - Department of Cardiology, Chinese PLA General Hospital, Beijing, China
- Kunlun He
  - Department of Cardiology, Chinese PLA General Hospital, Beijing, China.
- Zhengxing Huang
  - College of Biomedical Engineering and Instrumental Science, Zhejiang University, Hangzhou, China.
|