1
Yang H, Liao Z, Zou H, Li K, Zhou Y, Gao Z, Mao Y, Song C. Machine learning-based gait adaptation dysfunction identification using CMill-based gait data. Front Neurorobot 2024; 18:1421401. [PMID: 39136036 PMCID: PMC11317473 DOI: 10.3389/fnbot.2024.1421401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2024] [Accepted: 07/15/2024] [Indexed: 08/15/2024] Open
Abstract
Background Combining machine learning (ML) with gait analysis is widely applicable for diagnosing abnormal gait patterns. Objective To analyze gait adaptability characteristics in stroke patients, develop ML models to identify individuals with gait adaptation dysfunction (GAD), and select optimal diagnostic models and key classification features. Methods This study included 30 stroke patients (mean age 42.69 years, 60% male) and 50 healthy adults (mean age 41.34 years, 58% male). Gait adaptability was assessed using a CMill treadmill on gait adaptation tasks: target stepping, slalom walking, obstacle avoidance, and speed adaptation. Preliminary analysis of variables in both groups was conducted using t-tests and Pearson correlation. Features were extracted from demographics, gait kinematics, and gait adaptability datasets. ML models based on the Support Vector Machine, Decision Tree, Multi-layer Perceptron, K-Nearest Neighbors, and AdaCost algorithms were trained to classify individuals with and without GAD. Model performance was evaluated using accuracy (ACC), sensitivity (SEN), F1-score, and the area under the receiver operating characteristic (ROC) curve (AUC). Results The stroke group showed significantly decreased gait speed (p = 0.000) and step length (SL) (p = 0.000), while the asymmetry of SL (p = 0.000) and ST (p = 0.000) was higher than in the healthy group. Performance on the gait adaptation tasks decreased significantly in slalom walking (p = 0.000), obstacle avoidance (p = 0.000), and speed adaptation (p = 0.000). Gait speed (p = 0.000) and obstacle avoidance (p = 0.000) were significantly correlated with the global F-A score in stroke patients. AdaCost demonstrated better classification performance, with an ACC of 0.85, SEN of 0.80, F1-score of 0.77, and ROC-AUC of 0.75. Obstacle avoidance and gait speed were identified as critical features in this model. Conclusion Stroke patients walk slower, with shorter SL and greater asymmetry of SL and ST. Their gait adaptability was decreased, particularly in obstacle avoidance and speed adaptation. Faster gait speed and better obstacle avoidance were correlated with better functional mobility. AdaCost identifies individuals with GAD and facilitates clinical decision-making. This advances the future development of user-friendly interfaces and computer-aided diagnosis systems.
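For readers unfamiliar with the evaluation metrics named in this abstract, the following is a minimal sketch of how ACC, SEN, and F1-score are commonly derived from a binary confusion matrix. It is not the authors' code: the function name `classification_metrics` and the label lists in the usage note are hypothetical, and AUC is omitted because it requires ranked scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and F1-score for binary labels.

    Illustrative helper only (hypothetical name); labels equal to
    `positive` are treated as the class of interest, e.g. GAD.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    acc = (tp + tn) / len(pairs)
    # Sensitivity (recall): share of true positives that were detected.
    sen = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = (2 * precision * sen / (precision + sen)) if (precision + sen) else 0.0
    return {"ACC": acc, "SEN": sen, "F1": f1}
```

As a usage example with made-up labels, `classification_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])` counts 2 true positives, 1 false negative, 1 false positive, and 4 true negatives.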
Affiliation(s)
- Hang Yang
- Department of Rehabilitation Medicine, the First Affiliated Hospital of Zhejiang Chinese Medical University, Zhejiang, China
- Zhenyi Liao
- Department of Rehabilitation Medicine, the First Affiliated Hospital of Zhejiang Chinese Medical University, Zhejiang, China
- Hailei Zou
- College of Science, China Jiliang University, Zhejiang, China
- Kuncheng Li
- MeritData Technology Co., Ltd., Shanxi, China
- Ye Zhou
- Department of Rehabilitation Medicine, the First Affiliated Hospital of Zhejiang Chinese Medical University, Zhejiang, China
- Zhenzhen Gao
- Department of Rehabilitation Medicine, the First Affiliated Hospital of Zhejiang Chinese Medical University, Zhejiang, China
- Yajun Mao
- Department of Rehabilitation Medicine, the First Affiliated Hospital of Zhejiang Chinese Medical University, Zhejiang, China
- Caiping Song
- Department of Rehabilitation Medicine, the First Affiliated Hospital of Zhejiang Chinese Medical University, Zhejiang, China
2
Garrido NJ, González-Martínez F, Losada S, Plaza A, del Olmo E, Mateo J. Innovation through Artificial Intelligence in Triage Systems for Resource Optimization in Future Pandemics. Biomimetics (Basel) 2024; 9:440. [PMID: 39056881 PMCID: PMC11274710 DOI: 10.3390/biomimetics9070440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 07/12/2024] [Accepted: 07/16/2024] [Indexed: 07/28/2024] Open
Abstract
Artificial intelligence (AI) systems are already being used in various healthcare areas. Similarly, they can offer many advantages in hospital emergency services. The objective of this work is to demonstrate that through the novel use of AI, a trained system can be developed to detect patients at potential risk of infection in a new pandemic more quickly than standardized triage systems. This identification would occur in the emergency department, thus allowing for the early implementation of organizational preventive measures to block the chain of transmission. MATERIALS AND METHODS In this study, we propose the use of a machine learning system in emergency department triage during pandemics to detect patients at the highest risk of death and infection, using the COVID-19 era as an example, where rapid decision making and comprehensive support have become increasingly crucial. All patients who consecutively presented to the emergency department were included, and more than 89 variables were automatically analyzed using the extreme gradient boosting (XGB) algorithm. RESULTS The XGB system demonstrated the highest balanced accuracy at 91.61%. Additionally, it obtained results more quickly than traditional triage systems. The variables that most influenced mortality prediction were procalcitonin level, age, and oxygen saturation, followed by lactate dehydrogenase (LDH) level, C-reactive protein, the presence of interstitial infiltrates on chest X-ray, and D-dimer. Our system also identified the importance of oxygen therapy in these patients. CONCLUSIONS These results highlight that XGB is a useful and novel tool in triage systems for guiding the care pathway in future pandemics, thus following the example set by the well-known COVID-19 pandemic.
Affiliation(s)
- Nicolás J. Garrido
- Internal Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Expert Medical Analysis Group, Institute of Technology, University of Castilla-La Mancha, 16071 Cuenca, Spain
- Félix González-Martínez
- Expert Medical Analysis Group, Institute of Technology, University of Castilla-La Mancha, 16071 Cuenca, Spain
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Expert Medical Analysis Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Susana Losada
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Adrián Plaza
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Eneida del Olmo
- Department of Emergency Medicine, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Jorge Mateo
- Expert Medical Analysis Group, Institute of Technology, University of Castilla-La Mancha, 16071 Cuenca, Spain
- Expert Medical Analysis Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
3
Zhou Y, Zhang Z, Li Q, Mao G, Zhou Z. Construction and validation of machine learning algorithm for predicting depression among home-quarantined individuals during the large-scale COVID-19 outbreak: based on Adaboost model. BMC Psychol 2024; 12:230. [PMID: 38659077 PMCID: PMC11044386 DOI: 10.1186/s40359-024-01696-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Accepted: 03/29/2024] [Indexed: 04/26/2024] Open
Abstract
OBJECTIVES COVID-19 epidemics often lead to elevated levels of depression. To accurately identify and predict depression levels in home-quarantined individuals during a COVID-19 epidemic, this study constructed a depression prediction model based on multiple machine learning algorithms and validated its effectiveness. METHODS A cross-sectional method was used to examine online the depression status of individuals quarantined at home during the epidemic. Characteristics included variables on sociodemographics, COVID-19 and its prevention and control measures, the impact on life, work, health, and economy after the city was sealed off, and PHQ-9 scale scores. The home-quarantined subjects were randomly divided into a training set and a validation set at a ratio of 7:3. The performance of different machine learning models was compared by 10-fold cross-validation, and the best-performing algorithm among 15 models was selected to construct and validate the depression prediction model for home-quarantined subjects. The validity of the models was compared based on accuracy, precision, receiver operating characteristic (ROC) curve, and area under the ROC curve (AUC), and the model best suited to the data framework of this study was identified. RESULTS The prevalence of depression among home-quarantined individuals during the epidemic was 31.66% (202/638), and the constructed Adaboost depression prediction model had an ACC of 0.7917, an accuracy of 0.7180, and an AUC of 0.7803, which was better than the other 15 models on the combination of various performance measures. In the validation sets, the AUC was greater than 0.83. CONCLUSIONS The Adaboost machine learning algorithm developed in this study can be used to construct a depression prediction model for home-quarantined individuals that has better machine learning performance, as well as high effectiveness, robustness, and generalizability.
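The data-partitioning scheme described in this abstract (a random 7:3 split followed by 10-fold cross-validation) can be sketched as below. This is an illustration under stated assumptions, not the authors' pipeline: the helper names `train_validation_split` and `k_fold` are hypothetical, the seed is arbitrary, and applying the folds to the training portion only is an assumption, since the abstract does not say which subset was cross-validated.

```python
import random


def train_validation_split(n_samples, train_ratio=0.7, seed=0):
    """Shuffle sample indices and cut them at the given ratio (hypothetical helper)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_ratio)
    return indices[:cut], indices[cut:]


def k_fold(indices, k=10):
    """Yield (fit, held_out) index lists; each index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, folds[i]


# 638 is the sample size reported in the abstract; everything else is illustrative.
train_idx, val_idx = train_validation_split(638)
for fit_idx, held_out_idx in k_fold(train_idx, k=10):
    pass  # a candidate model would be fitted on fit_idx and scored on held_out_idx
```

Averaging a metric such as AUC over the ten held-out folds gives one score per candidate algorithm, which is the usual basis for picking a single model before it is checked on the untouched validation portion.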
Affiliation(s)
- Yiwei Zhou
- Business School, University of Shanghai for Science and Technology, 200093, Shanghai, China
- School of Intelligent Emergency Management, University of Shanghai for Science and Technology, 200093, Shanghai, China
- Smart Urban Mobility Institute, University of Shanghai for Science and Technology, 200093, Shanghai, China
- Zejie Zhang
- Wenzhou Center for Disease Control and Prevention, 325000, Wenzhou, China
- Qin Li
- The Affiliated Kangning Hospital of Wenzhou Medical University Zhejiang Provincial Clinical Research Center for Mental Disorders, 325007, Wenzhou, China
- Guangyun Mao
- Department of Preventive Medicine, School of Public Health, Wenzhou Medical University, 325035, Wenzhou, China
- Zumu Zhou
- The Affiliated Kangning Hospital of Wenzhou Medical University Zhejiang Provincial Clinical Research Center for Mental Disorders, 325007, Wenzhou, China.
4
Gadár L, Abonyi J. Explainable prediction of node labels in multilayer networks: a case study of turnover prediction in organizations. Sci Rep 2024; 14:9036. [PMID: 38641683 PMCID: PMC11031594 DOI: 10.1038/s41598-024-59690-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 04/14/2024] [Indexed: 04/21/2024] Open
Abstract
In real-world classification problems, it is important to build accurate prediction models and provide information that can improve decision-making. Decision-support tools are often based on network models, and this article uses information encoded by social networks to address the problem of employee turnover. However, understanding the factors behind black-box prediction models can be challenging. Our question was about the predictability of employee turnover, given information from the multilayer network that describes collaborations and perceptions that assess the performance of organizations that indicate the success of cooperation. Our goal was to develop an accurate prediction procedure, preserve the interpretability of the classification, and capture the wide variety of specific reasons that explain positive cases. After feature engineering, we identified variables with the best predictive power using decision trees and ranked them based on their added value, considering their frequent co-occurrence. We applied the Random Forest using the SMOTE balancing technique for prediction. We calculated the SHAP values to identify the variables that contribute the most to individual predictions. As a last step, we clustered the sample based on SHAP values to fine-tune the explanations for quitting due to different background factors.
Affiliation(s)
- László Gadár
- HUN-REN-PE Complex Systems Monitoring Research Group, University of Pannonia, Veszprém, Hungary.
- János Abonyi
- HUN-REN-PE Complex Systems Monitoring Research Group, University of Pannonia, Veszprém, Hungary
5
Ali S, Akhlaq F, Imran AS, Kastrati Z, Daudpota SM, Moosa M. The enlightening role of explainable artificial intelligence in medical & healthcare domains: A systematic literature review. Comput Biol Med 2023; 166:107555. [PMID: 37806061 DOI: 10.1016/j.compbiomed.2023.107555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Revised: 08/13/2023] [Accepted: 09/28/2023] [Indexed: 10/10/2023]
Abstract
In domains such as medicine and healthcare, the interpretability and explainability of machine learning and artificial intelligence systems are crucial for building trust in their results. Errors caused by these systems, such as incorrect diagnoses or treatments, can have severe and even life-threatening consequences for patients. To address this issue, Explainable Artificial Intelligence (XAI) has emerged as a popular area of research, focused on understanding the black-box nature of complex and hard-to-interpret machine learning models. While humans can increase the accuracy of these models through technical expertise, understanding how these models actually function during training can be difficult or even impossible. XAI algorithms such as Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) can provide explanations for these models, improving trust in their predictions by providing feature importance and increasing confidence in the systems. Many articles have been published that propose solutions to medical problems by using machine learning models alongside XAI algorithms to provide interpretability and explainability. In our study, we identified 454 articles published from 2018 to 2022 and analyzed 93 of them to explore the use of these techniques in the medical domain.
Affiliation(s)
- Subhan Ali
- Department of Computer Science, Norwegian University of Science & Technology (NTNU), Gjøvik, 2815, Norway.
- Filza Akhlaq
- Department of Computer Science, Sukkur IBA University, Sukkur, 65200, Sindh, Pakistan.
- Ali Shariq Imran
- Department of Computer Science, Norwegian University of Science & Technology (NTNU), Gjøvik, 2815, Norway.
- Zenun Kastrati
- Department of Informatics, Linnaeus University, Växjö, 351 95, Sweden.
- Muhammad Moosa
- Department of Computer Science, Norwegian University of Science & Technology (NTNU), Gjøvik, 2815, Norway.
6
Kwon HJ, Park UH, Goh CJ, Park D, Lim YG, Lee IK, Do WJ, Lee KJ, Kim H, Yun SY, Joo J, Min NY, Lee S, Um SW, Lee MS. Enhancing Lung Cancer Classification through Integration of Liquid Biopsy Multi-Omics Data with Machine Learning Techniques. Cancers (Basel) 2023; 15:4556. [PMID: 37760525 PMCID: PMC10526503 DOI: 10.3390/cancers15184556] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/10/2023] [Revised: 08/30/2023] [Accepted: 09/07/2023] [Indexed: 09/29/2023] Open
Abstract
Early detection of lung cancer is crucial for patient survival and treatment. Recent advancements in next-generation sequencing (NGS) analysis enable cell-free DNA (cfDNA) liquid biopsy to detect changes, like chromosomal rearrangements, somatic mutations, and copy number variations (CNVs), in cancer. Machine learning (ML) analysis using cancer markers is a highly promising tool for identifying patterns and anomalies in cancers, making the development of ML-based analysis methods essential. We collected blood samples from 92 lung cancer patients and 80 healthy individuals to analyze the distinction between them. The detection of lung cancer markers Cyfra21 and carcinoembryonic antigen (CEA) in blood revealed significant differences between patients and controls. We performed machine learning analysis to obtain AUC values via Adaptive Boosting (AdaBoost), Multi-Layer Perceptron (MLP), and Logistic Regression (LR) using cancer markers, cfDNA concentrations, and CNV screening. Furthermore, combining the analysis of all multi-omics data for ML showed higher AUC values compared with analyzing each element separately, suggesting the potential for a highly accurate diagnosis of cancer. Overall, our results from ML analysis using multi-omics data obtained from blood demonstrate a remarkable ability of the model to distinguish between lung cancer and healthy individuals, highlighting the potential for a diagnostic model against lung cancer.
Affiliation(s)
- Hyuk-Jung Kwon
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Department of Computer Science and Engineering, Incheon National University (INU), Incheon 22012, Republic of Korea
- Ui-Hyun Park
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Chul Jun Goh
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Dabin Park
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Yu Gyeong Lim
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Isaac Kise Lee
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Department of Computer Science and Engineering, Incheon National University (INU), Incheon 22012, Republic of Korea
- NGENI Foundation, San Diego, CA 92123, USA
- Woo-Jung Do
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Kyoung Joo Lee
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Hyojung Kim
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Seon-Young Yun
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Joungsu Joo
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Na Young Min
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Sunghoon Lee
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Sang-Won Um
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-ro, Gangnam-gu, Seoul 06351, Republic of Korea
- Min-Seob Lee
- Eone-Diagnomics Genome Center, Inc., 143, Gaetbeol-ro, Yeonsu-gu, Incheon 21999, Republic of Korea
- Diagnomics, Inc., 5795 Kearny Villa Rd., San Diego, CA 92123, USA
7
Suárez M, Martínez R, Torres AM, Ramón A, Blasco P, Mateo J. A Machine Learning-Based Method for Detecting Liver Fibrosis. Diagnostics (Basel) 2023; 13:2952. [PMID: 37761319 PMCID: PMC10529519 DOI: 10.3390/diagnostics13182952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 09/03/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023] Open
Abstract
Cholecystectomy and metabolic-associated steatotic liver disease (MASLD) are prevalent in gastroenterology and frequently co-occur in clinical practice. Cholecystectomy has been shown to have metabolic consequences, sharing similar pathological mechanisms with MASLD. A database of MASLD patients who underwent cholecystectomy was analysed. This study aimed to develop a tool to identify the risk of liver fibrosis after cholecystectomy. For this purpose, the extreme gradient boosting (XGB) algorithm was used to construct an effective predictive model. The factors that contributed most to prediction were platelet level, followed by dyslipidaemia and type-2 diabetes (T2DM). Compared to other ML methods, our proposed method, XGB, achieved higher accuracy values. The XGB method had the highest balanced accuracy (93.16%). XGB outperformed KNN in accuracy (93.16% vs. 84.45%) and AUC (0.92 vs. 0.84). These results demonstrate that the proposed XGB method can be used as an automatic diagnostic aid for MASLD patients based on machine-learning techniques.
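Balanced accuracy, the headline metric in this abstract, is conventionally the mean of sensitivity and specificity, which makes it less flattering than plain accuracy when classes are imbalanced. The sketch below shows that standard definition; the function name `balanced_accuracy` and the counts in the usage note are hypothetical and are not taken from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity for a binary confusion matrix.

    Illustrative helper (hypothetical name); assumes both classes are
    present, i.e. tp + fn > 0 and tn + fp > 0.
    """
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2
```

For example, with made-up counts `tp=9, fn=1, tn=45, fp=45`, sensitivity is 0.9 and specificity is 0.5, so balanced accuracy averages the two per-class recalls rather than pooling all 100 cases as plain accuracy does.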
Affiliation(s)
- Miguel Suárez
- Gastroenterology Department, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Medical Analysis Expert Group, Institute of Technology, Universidad de Castilla-La Mancha, 16071 Cuenca, Spain
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Raquel Martínez
- Gastroenterology Department, Virgen de la Luz Hospital, 16002 Cuenca, Spain
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Ana María Torres
- Medical Analysis Expert Group, Institute of Technology, Universidad de Castilla-La Mancha, 16071 Cuenca, Spain
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
- Antonio Ramón
- Department of Pharmacy, General University Hospital, 46014 Valencia, Spain
- Pilar Blasco
- Department of Pharmacy, General University Hospital, 46014 Valencia, Spain
- Jorge Mateo
- Medical Analysis Expert Group, Institute of Technology, Universidad de Castilla-La Mancha, 16071 Cuenca, Spain
- Medical Analysis Expert Group, Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 45071 Toledo, Spain
8
Shen L, Du L, Hu Y, Chen X, Hou Z, Yan Z, Wang X. MRI-based radiomics model for distinguishing Stage I endometrial carcinoma from endometrial polyp: a multicenter study. Acta Radiol 2023; 64:2651-2658. [PMID: 37291882 DOI: 10.1177/02841851231175249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
BACKGROUND Patients with early endometrial carcinoma (EC) have a good prognosis, but early EC is difficult to distinguish from endometrial polyps (EPs). PURPOSE To develop and assess magnetic resonance imaging (MRI)-based radiomics models for discriminating Stage I EC from EP in a multicenter setting. MATERIAL AND METHODS Patients with Stage I EC (n = 202) and EP (n = 99) who underwent preoperative MRI scans were collected in three centers (seven devices). The images from devices 1-3 were utilized for training and validation, and the images from devices 4-7 were utilized for testing, leading to three models. They were evaluated by the area under the receiver operating characteristic curve (AUC) and metrics including accuracy, sensitivity, and specificity. Two radiologists evaluated the endometrial lesions, and their results were compared with those of the three models. RESULTS The AUCs of device 1, 2_ada, device 1, 3_ada, and device 2, 3_ada for discriminating Stage I EC from EP were 0.951, 0.912, and 0.896 for the training set, 0.755, 0.928, and 1.000 for the validation set, and 0.883, 0.956, and 0.878 for the external validation set, respectively. The specificity of the three models was higher, but the accuracy and sensitivity were lower than those of radiologists. CONCLUSION Our MRI-based models showed good potential in differentiating Stage I EC from EP and were validated in multiple centers. Their specificity was higher than that of radiologists, and they may be used for computer-aided diagnosis in the future to assist clinical diagnosis.
Affiliation(s)
- Liting Shen
- Department of Radiology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, PR China
- Lixin Du
- Department of Medical Imaging, Shenzhen Longhua District Central Hospital, Shenzhen, PR China
- Yumin Hu
- Department of Radiology, Lishui Central Hospital, Zhejiang, PR China
- Xiaojun Chen
- Department of Radiology, Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, PR China
- Zujun Hou
- Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, PR China
- Zhihan Yan
- Department of Radiology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, PR China
- Xue Wang
- Department of Radiology, the Second Affiliated Hospital and Yuying Children's Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, PR China
9
Alotaibi NS, Ahmed HI, Kamel SOM. Dynamic Adaptation Attack Detection Model for a Distributed Multi-Access Edge Computing Smart City. Sensors (Basel) 2023; 23:7135. [PMID: 37631671 PMCID: PMC10459074 DOI: 10.3390/s23167135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 08/05/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023]
Abstract
The internet of things (IoT) technology presents an intelligent way to improve our lives and contributes to many fields such as industry, communications, and agriculture. Unfortunately, IoT networks are exposed to many attacks that may destroy the entire network and consume network resources. This paper proposes intelligent process automation and an auto-configured intelligent automation detection model (IADM) to detect and prevent malicious network traffic and behaviors/events at distributed multi-access edge computing in an IoT-based smart city. The proposed model consists of two phases. The first phase relies on the intelligent process automation (IPA) technique and contains five modules: a dataset collection and pre-processing module, an intelligent automation detection module, an analysis module, a detection rules and action module, and a database module. In the first phase, each module includes an intelligent connection module (ICM) that gives feedback reports about the module and sends information to the next modules. Therefore, any change in each process can be easily detected and labeled as an intrusion. The ICM may reduce the search time, increase the speed, and increase the security level. The second phase is the dynamic adaptation of the attack detection model based on reinforcement one-shot learning. The first phase is based on a multi-classification technique using Random Forest Trees (RFT), k-Nearest Neighbor (K-NN), J48, AdaBoost, and Bagging. The second phase can learn the new changed behaviors based on reinforced learning to detect zero-day attacks and malicious events in IoT-based smart cities. The experiments are implemented using the UNSW-NB 15 dataset. The proposed model achieves high accuracy rates of approximately 98.8% using RFT, K-NN, and AdaBoost. The J48 classifier achieves an accuracy rate of 85.51%, which is lower than the others. Subsequently, the accuracy rates of AdaBoost and Bagging based on J48 are 98.9% and 91.41%, respectively. Additionally, the error rates of RFT, K-NN, and AdaBoost are very low. Similarly, the proposed model achieves high precision, recall, and F1-measure rates using RFT, K-NN, AdaBoost, and Bagging. The second phase depends on creating an auto-adaptive model through the dynamic adaptation of the attack detection model based on reinforcement one-shot learning, using a small number of instances to conserve the memory of any smart device in an IoT network. The proposed auto-adaptive model may reduce false rates of reporting by the intrusion detection system (IDS). It can detect any change in the behaviors of smart devices quickly and easily. The IADM can improve the performance rates for IDS by maintaining the memory consumption, time consumption, and speed of the detection process.
Affiliation(s)
- Nouf Saeed Alotaibi
- Computer Science Department, Shaqra University, Dawadmi City 11911, Saudi Arabia
- Hassan Ibrahim Ahmed
- Informatics Department, Electronic Research Institute, Cairo 12622, Egypt
- Samah Osama M. Kamel
- Informatics Department, Electronic Research Institute, Cairo 12622, Egypt
10
Kufel J, Bargieł-Łączek K, Kocot S, Koźlik M, Bartnikowska W, Janik M, Czogalik Ł, Dudek P, Magiera M, Lis A, Paszkiewicz I, Nawrat Z, Cebula M, Gruszczyńska K. What Is Machine Learning, Artificial Neural Networks and Deep Learning?-Examples of Practical Applications in Medicine. Diagnostics (Basel) 2023; 13:2582. [PMID: 37568945 PMCID: PMC10417718 DOI: 10.3390/diagnostics13152582] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/24/2023] [Revised: 07/19/2023] [Accepted: 08/01/2023] [Indexed: 08/13/2023] Open
Abstract
Machine learning (ML), artificial neural networks (ANNs), and deep learning (DL) are all topics that fall under the heading of artificial intelligence (AI) and have gained popularity in recent years. ML involves the application of algorithms to automate decision-making processes using models that have not been manually programmed but have been trained on data. ANNs, which are a part of ML, aim to simulate the structure and function of the human brain. DL, on the other hand, uses multiple layers of interconnected neurons. This enables the processing and analysis of large and complex databases. In medicine, these techniques are being introduced to improve the speed and efficiency of disease diagnosis and treatment. Each of the AI techniques presented in the paper is supported with an example of a possible medical application. Given the rapid development of technology, the use of AI in medicine shows promising results in the context of patient care. It is particularly important to keep a close eye on this issue and conduct further research in order to fully explore the potential of ML, ANNs, and DL, and bring further applications into clinical use in the future.
Affiliation(s)
- Jakub Kufel
- Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, 41-808 Zabrze, Poland
- Katarzyna Bargieł-Łączek
- Paediatric Radiology Students’ Scientific Association at the Division of Diagnostic Imaging, Department of Radiology and Nuclear Medicine, Faculty of Medical Science in Katowice, Medical University of Silesia, 40-752 Katowice, Poland; (K.B.-Ł.); (W.B.)
- Szymon Kocot
- Bright Coders’ Factory, Technologiczna 2, 45-839 Opole, Poland
- Maciej Koźlik
- Division of Cardiology and Structural Heart Disease, Medical University of Silesia, 40-635 Katowice, Poland
- Wiktoria Bartnikowska
- Paediatric Radiology Students’ Scientific Association at the Division of Diagnostic Imaging, Department of Radiology and Nuclear Medicine, Faculty of Medical Science in Katowice, Medical University of Silesia, 40-752 Katowice, Poland; (K.B.-Ł.); (W.B.)
- Michał Janik
- Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland; (M.J.); (Ł.C.); (P.D.); (M.M.); (I.P.)
- Łukasz Czogalik
- Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland; (M.J.); (Ł.C.); (P.D.); (M.M.); (I.P.)
- Piotr Dudek
- Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland; (M.J.); (Ł.C.); (P.D.); (M.M.); (I.P.)
- Mikołaj Magiera
- Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland; (M.J.); (Ł.C.); (P.D.); (M.M.); (I.P.)
- Anna Lis
- Cardiology Students’ Scientific Association at the III Department of Cardiology, Faculty of Medical Sciences in Katowice, Medical University of Silesia, 40-635 Katowice, Poland
- Iga Paszkiewicz
- Student Scientific Association Named after Professor Zbigniew Religa at the Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Jordana 19, 41-808 Zabrze, Poland; (M.J.); (Ł.C.); (P.D.); (M.M.); (I.P.)
- Zbigniew Nawrat
- Department of Biophysics, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, 41-808 Zabrze, Poland
- Maciej Cebula
- Individual Specialist Medical Practice Maciej Cebula, 40-754 Katowice, Poland
- Katarzyna Gruszczyńska
- Department of Radiodiagnostics, Invasive Radiology and Nuclear Medicine, Department of Radiology and Nuclear Medicine, School of Medicine in Katowice, Medical University of Silesia, Medyków 14, 40-752 Katowice, Poland
11
Ishfaq M, Shah SZA, Ahmad I, Rahman Z. Multinomial classification of NLRP3 inhibitory compounds based on large scale machine learning approaches. Mol Divers 2023:10.1007/s11030-023-10690-y. [PMID: 37418166 DOI: 10.1007/s11030-023-10690-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 07/03/2023] [Indexed: 07/08/2023]
Abstract
The role of NLRP3 inflammasome in innate immunity is newly recognized. The NLRP3 protein is a family of nucleotide-binding and oligomerization domain-like receptors as well as a pyrin domain-containing protein. It has been shown that NLRP3 may contribute to the development and progression of a variety of diseases, such as multiple sclerosis, metabolic disorders, inflammatory bowel disease, and other auto-immune and auto-inflammatory conditions. The use of machine learning methods in pharmaceutical research has been widespread for several decades. An important objective of this study is to apply machine learning approaches for the multinomial classification of NLRP3 inhibitors. However, data imbalances can affect machine learning. Therefore, a synthetic minority oversampling technique (SMOTE) has been developed to increase the sensitivity of classifiers to minority groups. The QSAR modelling was performed using 154 molecules retrieved from the ChEMBL database (version 29). The accuracy of the multiclass classification top six models was found to fall within ranges of 0.99 to 0.86, and log loss ranges of 0.2 to 2.3, respectively. The results showed that the receiver operating characteristic curve (ROC) plot values significantly improved when tuning parameters were adjusted and imbalanced data was handled. Moreover, the results demonstrated that SMOTE offers a significant advantage in handling imbalanced datasets as well as substantial improvements in overall accuracy of machine learning models. The top models were then used to predict data from unseen datasets. In summary, these QSAR classification models exhibited robust statistical results and were interpretable, which strongly supported their use for rapid screening of NLRP3 inhibitors.
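The abstract above names SMOTE but does not describe how it works. As commonly described, SMOTE creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The following is a minimal pure-Python sketch of that idea only; the function name `smote_sketch`, its parameters, and the toy data are hypothetical and are not taken from the cited study, which may have used a library implementation with different settings.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points built from a list of minority samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours
    (Euclidean distance). Assumes at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, closest first.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: four minority samples on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sketch(minority, n_new=5)
```

Because every synthetic point is an interpolation between two existing minority samples, it stays inside the region those samples already span; this is why oversampling of this kind adds no information from outside the observed minority class.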
Affiliation(s)
- Muhammad Ishfaq
- College of Computer Science, Huanggang Normal University, Huanggang, 438000, China
- Syed Zahid Ali Shah
- Department of Pathology, Faculty of Veterinary and Animal Sciences, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Ijaz Ahmad
- The University of Agriculture Peshawar, Peshawar, 25130, Khyber Pakhtunkhwa, Pakistan
- Ziaur Rahman
- College of Computer Science, Huanggang Normal University, Huanggang, 438000, China
12
Chen M, Lan Q, Nie S, Hu L, Fang Y, Cui W, Bai X, Liu L, Zhu B. Forensic efficiencies of individual identification, kinship testing and ancestral inference in three Yunnan groups based on a self-developed multiple DIP panel. Front Genet 2023; 13:1057231. [PMID: 36685924 PMCID: PMC9845582 DOI: 10.3389/fgene.2022.1057231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 11/25/2022] [Indexed: 01/06/2023] Open
Abstract
Deletion/insertion polymorphism (DIP), as a short insertion/deletion sequence polymorphic genetic marker, has attracted the attention of forensic genetic scientist due to its lack of stutter, short amplicon and abundant ancestral information. In this study, based on a self-developed 43 autosomal deletion/insertion polymorphism (A-DIP) loci panel which could meet the forensic application purposes of individual identification, kinship testing and ancestral inference to some extent, we evaluated the forensic efficiencies of the above three forensic objectives in Chinese Yi, Hani and Miao groups of Yunnan province. The cumulative match probability (CPM) and combined probability of exclusion (CPE) of these three groups were 1.11433E-18, 8.24299E-19, 4.21721E-18; 0.999610217, 0.999629285 and 0.999582084, respectively. Average 96.65% full sibling pairs could be identified from unrelated individual pairs (as likelihood ratios > 1) using this DIP panel, whereas the average false positive rate was 3.69% in three target Yunnan groups. With the biogeographical ancestor prediction models constructed by extreme gradient boosting (XGBoost) and support vector machine (SVM) algorithms, 0.8239 (95% CI 0.7984, 0.8474) of the unrelated individuals could be correctly divided according to the continental origins based on the 43 A-DIPs which were large frequency distribution differentiations among different continental populations. The present results of principal component analysis (PCA), multidimensional scaling (MDS), neighbor joining (NJ) and maximum likelihood (ML) phylogenetic trees and STRUCTURE analyses indicated that these three Yunnan groups had relatively close genetic distances with East Asian populations.
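The very small cumulative match probabilities and near-unity exclusion probabilities quoted above are what one would expect when per-locus values are combined across many loci. A minimal sketch of the standard combination rules follows, assuming the loci are independent: the cumulative match probability is the product of the per-locus match probabilities, and the combined probability of exclusion is one minus the product of the per-locus non-exclusion probabilities. The function names and the per-locus numbers are hypothetical illustrations, not values from the cited panel.

```python
import math


def cumulative_match_probability(match_probs):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return math.prod(match_probs)


def combined_probability_of_exclusion(exclusion_probs):
    """1 - product of (1 - PE_i) over loci (assumes independent loci)."""
    return 1.0 - math.prod(1.0 - pe for pe in exclusion_probs)


# Hypothetical per-locus values for a three-locus toy panel.
toy_match_probs = [0.4, 0.4, 0.4]
toy_exclusion_probs = [0.2, 0.2, 0.2]

cpm = cumulative_match_probability(toy_match_probs)
cpe = combined_probability_of_exclusion(toy_exclusion_probs)
```

Each additional locus multiplies the match probability by a factor below one, so a panel of several dozen moderately informative loci can drive the cumulative value down by many orders of magnitude.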
Affiliation(s)
- Man Chen
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Qiong Lan
- Microbiome Medicine Center, Department of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Shengjie Nie
- School of Forensic Medicine, Kunming Medical University, Kunming, China
- Liping Hu
- School of Forensic Medicine, Kunming Medical University, Kunming, China
- Yating Fang
- School of Basic Medical Sciences, Anhui Medical University, Hefei, China
- Wei Cui
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Xiaole Bai
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Liu Liu
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Bofeng Zhu
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
- Microbiome Medicine Center, Department of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou, China
- Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi’an Jiaotong University, Xi’an, China
- *Correspondence: Bofeng Zhu
13
Pan D, Li B, Wang S. Establishment and validation of a torsade de pointes prediction model based on human iPSC‑derived cardiomyocytes. Exp Ther Med 2022; 25:61. [PMID: 36588805 PMCID: PMC9780517 DOI: 10.3892/etm.2022.11760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 09/26/2022] [Indexed: 12/14/2022] Open
Abstract
Drug-induced cardiotoxicity is one of the main causes of drug failure, which leads to subsequent withdrawal from pharmaceutical development. Therefore, identifying the potential toxic candidate in the early stages of drug development is important. Human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) are a useful tool for assessing candidate compounds for arrhythmias. However, a suitable model using hiPSC-CMs to predict the risk of torsade de pointes (TdP) has not been fully established. The present study aimed to establish a predictive TdP model based on hiPSC-CMs. In the current study, 28 compounds recommended by the Comprehensive in vitro Proarrhythmia Assay (CiPA) were used as training set and models were established in different risk groups, high- and intermediate-risk versus low-risk groups. Subsequently, six endpoints of electrophysiological responses were used as potential model predictors. Accuracy, sensitivity and area under the curve (AUC) were used as evaluation indices of the models and seven compounds with known TdP risk were used to verify model differentiation and calibration. The results showed that among the seven models, the AUC of logistic regression and AdaBoost model was higher and had little difference in both training and test sets, which indicated that the discriminative ability and model stability was good and excellent, respectively. Therefore, these two models were taken as submodels, similar weight was configured and a new TdP risk prediction model was constructed using a soft voting strategy. The classification accuracy, sensitivity and AUC of the new model were 0.93, 0.95 and 0.92 on the training set, respectively and all 1.00 on the test set, which indicated good discrimination ability on both training and test sets. The risk threshold was defined as 0.50 and the consistency between the predicted and observed results were 92.8 and 100% on the training and test sets, respectively. 
Overall, the present study established a risk prediction model for TdP based on hiPSC-CMs which could be an effective predictive tool for compound-induced arrhythmias.
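The soft voting strategy mentioned in the abstract above can be sketched as a weighted average of the positive-class probabilities produced by each submodel, followed by a cut at the risk threshold. The sketch below is a pure-Python illustration under stated assumptions: the function name `soft_vote`, the equal default weights, the choice to treat a probability equal to the threshold as positive, and the example probabilities are all hypothetical, and the cited study's actual implementation is not shown in the abstract.

```python
def soft_vote(prob_lists, weights=None, threshold=0.5):
    """Combine per-model positive-class probabilities by weighted averaging.

    prob_lists holds one list of probabilities per submodel, all of the
    same length (one probability per sample). Returns the averaged
    probabilities and the 0/1 labels obtained by thresholding them.
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)  # equal weight per submodel
    total = sum(weights)
    averaged = [
        sum(w * p for w, p in zip(weights, sample_probs)) / total
        for sample_probs in zip(*prob_lists)
    ]
    # Assumption: a probability exactly at the threshold counts as positive.
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels


# Hypothetical outputs of two submodels (for example a logistic regression
# and an AdaBoost classifier) for three compounds.
lr_probs = [0.9, 0.4, 0.2]
ada_probs = [0.7, 0.7, 0.1]
averaged, labels = soft_vote([lr_probs, ada_probs])
```

In this toy case the second compound is rated below 0.5 by one submodel and above it by the other; averaging the probabilities rather than counting votes lets the more confident submodel decide.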
Affiliation(s)
- Dongsheng Pan
- Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, P.R. China
- National Center for Safety Evaluation of Drugs, National Institutes for Food and Drug Control, Beijing 100176, P.R. China
- Bo Li
- Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, P.R. China
- National Center for Safety Evaluation of Drugs, National Institutes for Food and Drug Control, Beijing 100176, P.R. China
- Sanlong Wang
- National Center for Safety Evaluation of Drugs, National Institutes for Food and Drug Control, Beijing 100176, P.R. China
- Correspondence to: Professor Sanlong Wang, National Center for Safety Evaluation of Drugs, National Institutes for Food and Drug Control, A8 Hongda Middle Street, Beijing Economic-Technological Development Area, Beijing 100176, P.R. China
14
Di Martino F, Delmastro F. Explainable AI for clinical and remote health applications: a survey on tabular and time series data. Artif Intell Rev 2022; 56:5261-5315. [PMID: 36320613 PMCID: PMC9607788 DOI: 10.1007/s10462-022-10304-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/02/2022]
Abstract
Nowadays Artificial Intelligence (AI) has become a fundamental component of healthcare applications, both clinical and remote, but the best performing AI systems are often too complex to be self-explaining. Explainable AI (XAI) techniques are defined to unveil the reasoning behind the system’s predictions and decisions, and they become even more critical when dealing with sensitive and personal health data. It is worth noting that XAI has not gathered the same attention across different research areas and data types, especially in healthcare. In particular, many clinical and remote health applications are based on tabular and time series data, respectively, and XAI is not commonly analysed on these data types, while computer vision and Natural Language Processing (NLP) are the reference applications. To provide an overview of XAI methods that are most suitable for tabular and time series data in the healthcare domain, this paper provides a review of the literature in the last 5 years, illustrating the type of generated explanations and the efforts provided to evaluate their relevance and quality. Specifically, we identify clinical validation, consistency assessment, objective and standardised quality evaluation, and human-centered quality assessment as key features to ensure effective explanations for the end users. Finally, we highlight the main research challenges in the field as well as the limitations of existing XAI methods.
15
Pérez-Jeldres T, Pizarro B, Ascui G, Orellana M, Cerda-Villablanca M, Alvares D, de la Vega A, Cannistra M, Cornejo B, Baéz P, Silva V, Arriagada E, Rivera-Nieves J, Estela R, Hernández-Rocha C, Álvarez-Lobos M, Tobar F. Ethnicity influences phenotype and clinical outcomes: Comparing a South American with a North American inflammatory bowel disease cohort. Medicine (Baltimore) 2022; 101:e30216. [PMID: 36086782 PMCID: PMC10980497 DOI: 10.1097/md.0000000000030216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Accepted: 07/12/2022] [Indexed: 11/27/2022] Open
Abstract
Inflammatory bowel disease (IBD), including ulcerative colitis (UC) and Crohn disease (CD), has emerged as a global disease with an increasing incidence in developing and newly industrialized regions such as South America. This global rise offers the opportunity to explore the differences and similarities in disease presentation and outcomes across different genetic backgrounds and geographic locations. Our study includes 265 IBD patients. We performed an exploratory analysis of the databases of Chilean and North American IBD patients to compare the clinical phenotypes between the cohorts. We employed an unsupervised machine-learning approach using principal component analysis, uniform manifold approximation, and projection, among others, for each disease. Finally, we predicted the cohort (North American vs Chilean) using a random forest. Several unsupervised machine learning methods have separated the 2 main groups, supporting the differences between North American and Chilean patients with each disease. The variables that explained the loadings of the clinical metadata on the principal components were related to the therapies and disease extension/location at diagnosis. Our random forest models were trained for cohort classification based on clinical characteristics, obtaining high accuracy (0.86 = UC; 0.79 = CD). Similarly, variables related to therapy and disease extension/location had a high Gini index. Similarly, univariate analysis showed a later CD age at diagnosis in Chilean IBD patients (37 vs 24; P = .005). Our study suggests a clinical difference between North American and Chilean IBD patients: later CD age at diagnosis with a predominantly less aggressive phenotype (39% vs 54% B1) and more limited disease, despite fewer biological therapies being used in Chile for both diseases.
Affiliation(s)
- Tamara Pérez-Jeldres
- Department of Gastroenterology, Faculty of Medicine, Pontifical Catholic University of Chile, Santiago, Chile
- Instituto Chileno-Japonés, University of Chile, Santiago, Chile
- Benjamín Pizarro
- Radiology Department, Hospital Clínico Universidad de Chile, Santiago, Chile
- Gabriel Ascui
- La Jolla Institute for Allergy and Immunology, San Diego, CA
- Matías Orellana
- Department of Computer Science, Faculty of Physical Sciences and Mathematics of the University of Chile, Santiago, Chile
- Mauricio Cerda-Villablanca
- Integrative Biology Program, Institute of Biomedical Sciences, Center for Medical Informatics and Telemedicine, Faculty of Medicine, Universidad de Chile, Santiago, Chile
- Danilo Alvares
- Department of Statistics, Pontifical Catholic University of Chile, Santiago, Chile
- Macarena Cannistra
- Department of Gastroenterology, Faculty of Medicine, Pontifical Catholic University of Chile, Santiago, Chile
- Bárbara Cornejo
- Department of Gastroenterology, Faculty of Medicine, Pontifical Catholic University of Chile, Santiago, Chile
- Pablo Baéz
- Integrative Biology Program, Institute of Biomedical Sciences, Center for Medical Informatics and Telemedicine, Faculty of Medicine, Universidad de Chile, Santiago, Chile
- Verónica Silva
- Instituto Chileno-Japonés, University of Chile, Santiago, Chile
- Jesús Rivera-Nieves
- Inflammatory Bowel Disease Center, Division of Gastroenterology, University of California, San Diego, La Jolla, CA
- Ricardo Estela
- Instituto Chileno-Japonés, University of Chile, Santiago, Chile
- Cristián Hernández-Rocha
- Department of Gastroenterology, Faculty of Medicine, Pontifical Catholic University of Chile, Santiago, Chile
- Manuel Álvarez-Lobos
- Department of Gastroenterology, Faculty of Medicine, Pontifical Catholic University of Chile, Santiago, Chile
- Felipe Tobar
- Initiative for Data & Artificial Intelligence, University of Chile
- Center for Mathematical Modeling, University of Chile, Santiago, Chile
16
Albores-Mendez EM, Aguilera Hernández AD, Melo-González A, Vargas-Hernández MA, Gutierrez de la Cruz N, Vazquez-Guzman MA, Castro-Marín M, Romero-Morelos P, Winkler R. A diagnostic model for overweight and obesity from untargeted urine metabolomics of soldiers. PeerJ 2022; 10:e13754. [PMID: 35898940 PMCID: PMC9310780 DOI: 10.7717/peerj.13754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Accepted: 06/28/2022] [Indexed: 01/17/2023] Open
Abstract
Soldiers in active military service need optimal physical fitness for successfully carrying out their operations. Therefore, their health status is regularly checked by army doctors. These inspections include physical parameters such as the body-mass index (BMI), functional tests, and biochemical studies. If a medical exam reveals an individual's excess weight, further examinations are made, and corrective actions for weight lowering are initiated. The collection of urine is non-invasive and therefore attractive for frequent metabolic screening. We compared the chemical profiles of urinary samples of 146 normal weight, excess weight, and obese soldiers of the Mexican Army, using untargeted metabolomics with liquid chromatography coupled to high-resolution mass spectrometry (LC-MS). In combination with data mining, statistical and metabolic pathway analyses suggest increased S-adenosyl-L-methionine (SAM) levels and changes of amino acid metabolites as important variables for overfeeding. We will use these potential biomarkers for the ongoing metabolic monitoring of soldiers in active service. In addition, after validation of our results, we will develop biochemical screening tests that are also suitable for civil applications.
Affiliation(s)
- Exsal M. Albores-Mendez
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Alexis D. Aguilera Hernández
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Alejandra Melo-González
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Marco A. Vargas-Hernández
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Neptalí Gutierrez de la Cruz
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Miguel A. Vazquez-Guzman
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Centro de Investigación en Ciencias de la Salud (CICSA), FCS, Universidad Anahuac Mexico, Campus Norte, Mexico City, Mexico
- Melchor Castro-Marín
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Pablo Romero-Morelos
- Escuela Militar de Graduados de Sanidad, Universidad del Ejército y Fuerza Aérea Mexicanos, Secretaría de la Defensa Nacional, Mexico City, Mexico
- Universidad Estatal del Valle de Ecatepec, Ecatepec, Mexico
- Robert Winkler
- UGA-Langebio, CINVESTAV, Irapuato, Gto., Mexico
- Biotechnology and Biochemistry, CINVESTAV Unidad Irapuato, Irapuato, Gto., Mexico
17
Supervised Learning Models for the Preliminary Detection of COVID-19 in Patients Using Demographic and Epidemiological Parameters. INFORMATION 2022. [DOI: 10.3390/info13070330] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022] Open
Abstract
The World Health Organization labelled the new COVID-19 breakout a public health crisis of worldwide concern on 30 January 2020, and it was named the new global pandemic in March 2020. It has had catastrophic consequences on the world economy and well-being of people and has put a tremendous strain on already-scarce healthcare systems globally, particularly in underdeveloped countries. Over 11 billion vaccine doses have already been administered worldwide, and the benefits of these vaccinations will take some time to appear. Today, the only practical approach to diagnosing COVID-19 is through the RT-PCR and RAT tests, which have sometimes been known to give unreliable results. Timely diagnosis and implementation of precautionary measures will likely improve the survival outcome and decrease the fatality rates. In this study, we propose an innovative way to predict COVID-19 with the help of alternative non-clinical methods such as supervised machine learning models to identify the patients at risk based on their characteristic parameters and underlying comorbidities. Medical records of patients from Mexico admitted between 23 January 2020 and 26 March 2022, were chosen for this purpose. Among several supervised machine learning approaches tested, the XGBoost model achieved the best results with an accuracy of 92%. It is an easy, non-invasive, inexpensive, instant and accurate way of forecasting those at risk of contracting the virus. However, it is pretty early to deduce that this method can be used as an alternative in the clinical diagnosis of coronavirus cases.
18
Liu YF, Shu X, Qiao XF, Ai GY, Liu L, Liao J, Qian S, He XJ. Radiomics-Based Machine Learning Models for Predicting P504s/P63 Immunohistochemical Expression: A Noninvasive Diagnostic Tool for Prostate Cancer. Front Oncol 2022; 12:911426. [PMID: 35795067 PMCID: PMC9252170 DOI: 10.3389/fonc.2022.911426] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/02/2022] [Accepted: 05/19/2022] [Indexed: 01/31/2023] Open
Abstract
Objective To develop and validate a noninvasive radiomic-based machine learning (ML) model to identify P504s/P63 status and further achieve the diagnosis of prostate cancer (PCa). Methods A retrospective dataset of patients with preoperative prostate MRI examination and P504s/P63 pathological immunohistochemical results between June 2016 and February 2021 was conducted. As indicated by P504s/P63 expression, the patients were divided into label 0 (atypical prostatic hyperplasia), label 1 (benign prostatic hyperplasia, BPH) and label 2 (PCa) groups. This study employed T2WI, DWI and ADC sequences to assess prostate diseases and manually segmented regions of interest (ROIs) with Artificial Intelligence Kit software for radiomics feature acquisition. Feature dimensionality reduction and selection were performed by using a mutual information algorithm. Based on screened features, P504s/P63 prediction models were established by random forest (RF), gradient boosting decision tree (GBDT), logistic regression (LR), adaptive boosting (AdaBoost) and k-nearest neighbor (KNN) algorithms. The performance was evaluated by the area under the ROC curve (AUC) and accuracy. Results A total of 315 patients were enrolled. Among the 851 radiomic features, the 32 top features were derived from T2WI, in which the gray-level run length matrix (GLRLM) and gray-level cooccurrence matrix (GLCM) features accounted for the largest proportion. Among the five models, the RF algorithm performed best in general evaluations (microaverage AUC=0.920, macroaverage AUC=0.870) and provided the most accurate result in further sublabel prediction (the accuracies of label 0, 1, and 2 were 0.831, 0.831, and 0.932, respectively). In comparative sequence analyses, T2WI was the best single-sequence candidate (microaverage AUC=0.94 and macroaverage AUC=0.78). The merged datasets of T2WI, DWI, and ADC yielded optimal AUCs (microaverage AUC=0.930 and macroaverage AUC=0.900). 
Conclusions The radiomic-based RF classifier has the potential to be used to evaluate the presurgical P504s/P63 status and further diagnose PCa noninvasively and accurately.
Affiliation(s)
- Yun-Fan Liu
- Department of Radiology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Xin Shu
- Department of Radiology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Xiao-Feng Qiao
- Department of Radiology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Guang-Yong Ai
- Department of Radiology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Li Liu
- Big Data and Software Engineering College, Chongqing University, Chongqing, China
- Jun Liao
- Big Data and Software Engineering College, Chongqing University, Chongqing, China
- Shuang Qian
- Big Data and Software Engineering College, Chongqing University, Chongqing, China
- Xiao-Jing He
- Department of Radiology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- *Correspondence: Xiao-Jing He
19
Hakkoum H, Abnane I, Idri A. Interpretability in the medical field: A systematic mapping and review study. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2021.108391] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022]
20
Douthit BJ, Walden RL, Cato K, Coviak CP, Cruz C, D'Agostino F, Forbes T, Gao G, Kapetanovic TA, Lee MA, Pruinelli L, Schultz MA, Wieben A, Jeffery AD. Data Science Trends Relevant to Nursing Practice: A Rapid Review of the 2020 Literature. Appl Clin Inform 2022; 13:161-179. [PMID: 35139564 PMCID: PMC8828453 DOI: 10.1055/s-0041-1742218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND The term "data science" encompasses several methods, many of which are considered cutting edge and are being used to influence care processes across the world. Nursing is an applied science and a key discipline in health care systems in both clinical and administrative areas, making the profession increasingly influenced by the latest advances in data science. The greater informatics community should be aware of current trends regarding the intersection of nursing and data science, as developments in nursing practice have cross-professional implications. OBJECTIVES This study aimed to summarize the latest (calendar year 2020) research and applications of nursing-relevant patient outcomes and clinical processes in the data science literature. METHODS We conducted a rapid review of the literature to identify relevant research published during the year 2020. We explored the following 16 topics: (1) artificial intelligence/machine learning credibility and acceptance, (2) burnout, (3) complex care (outpatient), (4) emergency department visits, (5) falls, (6) health care-acquired infections, (7) health care utilization and costs, (8) hospitalization, (9) in-hospital mortality, (10) length of stay, (11) pain, (12) patient safety, (13) pressure injuries, (14) readmissions, (15) staffing, and (16) unit culture. RESULTS Of 16,589 articles, 244 were included in the review. All topics were represented by literature published in 2020, ranging from 1 article to 59 articles. Numerous contemporary data science methods were represented in the literature including the use of machine learning, neural networks, and natural language processing. CONCLUSION This review provides an overview of the data science trends that were relevant to nursing practice in 2020. Examinations of such literature are important to monitor the status of data science's influence in nursing practice.
Affiliation(s)
- Brian J. Douthit: Tennessee Valley Healthcare System, U.S. Department of Veterans Affairs; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, United States
- Rachel L. Walden: Annette and Irwin Eskind Family Biomedical Library, Vanderbilt University, Nashville, Tennessee, United States
- Kenrick Cato: Department of Emergency Medicine, Columbia University School of Nursing, New York, New York, United States
- Cynthia P. Coviak: Professor Emerita of Nursing, Grand Valley State University, Allendale, Michigan, United States
- Christopher Cruz: Global Health Technology and Informatics, Chevron, San Ramon, California, United States
- Fabio D'Agostino: Department of Medicine and Surgery, Saint Camillus International University of Health Sciences, Rome, Italy
- Thompson Forbes: College of Nursing, East Carolina University, Greenville, North Carolina, United States
- Grace Gao: Department of Nursing, St Catherine University, Saint Paul, Minnesota, United States
- Theresa A. Kapetanovic: College of Nursing, East Carolina University, Greenville, North Carolina, United States
- Mikyoung A. Lee: College of Nursing, Texas Woman's University, Denton, Texas, United States
- Lisiane Pruinelli: School of Nursing, University of Minnesota, Minneapolis, Minnesota, United States
- Mary A. Schultz: Department of Nursing, California State University, San Bernardino, California, United States
- Ann Wieben: School of Nursing, University of Wisconsin-Madison, Wisconsin, United States
- Alvin D. Jeffery: School of Nursing, Vanderbilt University; Tennessee Valley Healthcare System, U.S. Department of Veterans Affairs, Nashville, Tennessee, United States. Address for correspondence: Alvin D. Jeffery, PhD, RN-BC, CCRN-K, FNP-BC, 461 21st Avenue South, Nashville, TN 37240, United States
|
21
|
A Systematic Review of Explainable Artificial Intelligence in Terms of Different Application Domains and Tasks. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12031353] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 12/31/2022]
Abstract
Artificial intelligence (AI) and machine learning (ML) have recently been radically improved and are now being employed in almost every application domain to develop automated or semi-automated systems. To facilitate greater human acceptability of these systems, explainable artificial intelligence (XAI) has experienced significant growth over the last couple of years with the development of highly accurate models but with a paucity of explainability and interpretability. The literature shows evidence from numerous studies on the philosophy and methodologies of XAI. Nonetheless, there is an evident scarcity of secondary studies in connection with the application domains and tasks, let alone review studies following prescribed guidelines, that can enable researchers’ understanding of the current trends in XAI, which could lead to future research for domain- and application-specific method development. Therefore, this paper presents a systematic literature review (SLR) on the recent developments of XAI methods and evaluation metrics concerning different application domains and tasks. This study considers 137 articles published in recent years and identified through the prominent bibliographic databases. This systematic synthesis of research articles resulted in several analytical findings: XAI methods are mostly developed for safety-critical domains worldwide, deep learning and ensemble models are being exploited more than other types of AI/ML models, visual explanations are more acceptable to end-users and robust evaluation metrics are being developed to assess the quality of explanations. Research studies have been performed on the addition of explanations to widely used AI/ML models for expert users. However, more attention is required to generate explanations for general users from sensitive domains such as finance and the judicial system.
|
22
|
A Survey on Artificial Intelligence (AI) and eXplainable AI in Air Traffic Management: Current Trends and Development with Future Research Trajectory. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12031295] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/15/2022]
Abstract
Air Traffic Management (ATM) will be more complex in the coming decades due to the growth and increased complexity of aviation and has to be improved in order to maintain aviation safety. It is agreed that without significant improvement in this domain, the safety objectives defined by international organisations cannot be achieved and a risk of more incidents/accidents is envisaged. Nowadays, computer science plays a major role in data management and decisions made in ATM. Despite this, Artificial Intelligence (AI), which is one of the most researched topics in computer science, has not quite reached end users in the ATM domain. In this paper, we analyse the state of the art with regard to the usefulness of AI within the aviation/ATM domain. It includes research work of the last decade of AI in ATM, the extraction of relevant trends and features, and the extraction of representative dimensions. We analysed how the general and ATM eXplainable Artificial Intelligence (XAI) works, analysing where and why XAI is needed, how it is currently provided, and the limitations, then synthesise the findings into a conceptual framework, named the DPP (Descriptive, Predictive, Prescriptive) model, and provide an example of its application in a scenario in 2030. It concludes that AI systems within ATM need further research for their acceptance by end-users. The development of appropriate XAI methods including the validation by appropriate authorities and end-users are key issues that need to be addressed.
|
23
|
Huang J, He R, Chen J, Li S, Deng Y, Wu X. Boosting Advanced Nasopharyngeal Carcinoma Stage Prediction Using a Two-Stage Classification Framework Based on Deep Learning. INT J COMPUT INT SYS 2021. [PMCID: PMC8523349 DOI: 10.1007/s44196-021-00026-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Nasopharyngeal carcinoma (NPC) is a common malignant tumor of the head and neck that is endemic in the world; more than 75% of NPC patients suffer from locoregionally advanced nasopharyngeal carcinoma (LA-NPC). The survival quality of these patients depends on the reliable prediction of NPC stages III and IVa. In this paper, we propose a two-stage framework to produce the classification probabilities for predicting NPC stages III and IVa. The preprocessing of MR images enhances the quality of the images for further analysis. In stage one, transfer learning is used to improve the classification effectiveness and the efficiency of training CNN models with limited images. Then, in stage two, the outputs of these models are aggregated using soft voting to boost the final prediction. The experimental results show that the preprocessing is quite effective, the transfer learning models perform better than the basic CNN model, and our ensemble model outperforms the single model as well as traditional methods, including the TNM staging system and the Radiomics method. Finally, the prediction accuracy boosted by the framework is, respectively, 0.81, indicating that our method achieves the SOTA effectiveness for LA-NPC stage prediction. In addition, the heatmaps generated with the Class Activation Map technique illustrate the interpretability of the CNN models and show their capability of assisting clinicians in medical diagnosis and follow-up treatment by producing discriminative regions related to NPC in the MR images.
|
24
|
A Perspective View of Cotton Leaf Image Classification Using Machine Learning Algorithms Using WEKA. ADVANCES IN HUMAN-COMPUTER INTERACTION 2021. [DOI: 10.1155/2021/9367778] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/16/2023] Open
Abstract
Cotton is one of the major crops in India, where 23% of cotton gets exported to other countries. The cotton yield depends on crop growth, which is affected by diseases. In this paper, cotton disease classification is performed using different machine learning algorithms. For this research, the cotton leaf image database was used, and the images were segmented from the natural background using a modified factorization-based active contour method. First, the color and texture features are extracted from the segmented images. These features are then fed to machine learning algorithms such as multilayer perceptron, support vector machine, Naïve Bayes, Random Forest, AdaBoost, and K-nearest neighbor. Four color features and eight texture features were extracted, and experimentation was done using three cases: (1) only color features, (2) only texture features, and (3) both color and texture features. The classifiers performed better with color features than with texture features. The color features are enough to classify the healthy and unhealthy cotton leaf images. The performance of the classifiers was evaluated using performance parameters such as precision, recall, F-measure, and Matthews correlation coefficient. The accuracies of classifiers such as support vector machine, Naïve Bayes, Random Forest, AdaBoost, and K-nearest neighbor are 93.38%, 90.91%, 95.86%, 92.56%, and 94.21%, respectively, whereas that of the multilayer perceptron classifier is 96.69%.
|
25
|
gbt-HIPS: Explaining the Classifications of Gradient Boosted Tree Ensembles. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11062511] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 01/22/2023]
Abstract
This research presents Gradient Boosted Tree High Importance Path Snippets (gbt-HIPS), a novel, heuristic method for explaining gradient boosted tree (GBT) classification models by extracting a single classification rule (CR) from the ensemble of decision trees that make up the GBT model. This CR contains the most statistically important boundary values of the input space as antecedent terms. The CR represents a hyper-rectangle of the input space inside which the GBT model is, very reliably, classifying all instances with the same class label as the explanandum instance. In a benchmark test using nine data sets and five competing state-of-the-art methods, gbt-HIPS offered the best trade-off between coverage (0.16–0.75) and precision (0.85–0.98). Unlike competing methods, gbt-HIPS is also demonstrably guarded against under- and over-fitting. A further distinguishing feature of our method is that, unlike much prior work, our explanations also provide counterfactual detail in accordance with widely accepted recommendations for what makes a good explanation.
|
26
|
Luo G, Johnson MD, Nkoy FL, He S, Stone BL. Automatically Explaining Machine Learning Prediction Results on Asthma Hospital Visits in Patients With Asthma: Secondary Analysis. JMIR Med Inform 2020; 8:e21965. [PMID: 33382379 PMCID: PMC7808890 DOI: 10.2196/21965] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/03/2020] [Revised: 10/25/2020] [Accepted: 11/15/2020] [Indexed: 12/27/2022] Open
Abstract
Background Asthma is a major chronic disease that poses a heavy burden on health care. To facilitate the allocation of care management resources aimed at improving outcomes for high-risk patients with asthma, we recently built a machine learning model to predict asthma hospital visits in the subsequent year in patients with asthma. Our model is more accurate than previous models. However, like most machine learning models, it offers no explanation of its prediction results. This creates a barrier for use in care management, where interpretability is desired. Objective This study aims to develop a method to automatically explain the prediction results of the model and recommend tailored interventions without lowering the performance measures of the model. Methods Our data were imbalanced, with only a small portion of data instances linking to future asthma hospital visits. To handle imbalanced data, we extended our previous method of automatically offering rule-formed explanations for the prediction results of any machine learning model on tabular data without lowering the model’s performance measures. In a secondary analysis of the 334,564 data instances from Intermountain Healthcare between 2005 and 2018 used to form our model, we employed the extended method to automatically explain the prediction results of our model and recommend tailored interventions. The patient cohort consisted of all patients with asthma who received care at Intermountain Healthcare between 2005 and 2018, and resided in Utah or Idaho as recorded at the visit. Results Our method explained the prediction results for 89.7% (391/436) of the patients with asthma who, per our model’s correct prediction, were likely to incur asthma hospital visits in the subsequent year. 
Conclusions This study is the first to demonstrate the feasibility of automatically offering rule-formed explanations for the prediction results of any machine learning model on imbalanced tabular data without lowering the performance measures of the model. After further improvement, our asthma outcome prediction model coupled with the automatic explanation function could be used by clinicians to guide the allocation of limited asthma care management resources and the identification of appropriate interventions.
Affiliation(s)
- Gang Luo: Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States
- Michael D Johnson: Department of Pediatrics, University of Utah, Salt Lake City, UT, United States
- Flory L Nkoy: Department of Pediatrics, University of Utah, Salt Lake City, UT, United States
- Shan He: Care Transformation and Information Systems, Intermountain Healthcare, Salt Lake City, UT, United States
- Bryan L Stone: Department of Pediatrics, University of Utah, Salt Lake City, UT, United States
|