1
Hosseini M, Rasekh AH, Keshavarzi A. Improving clinical abbreviation sense disambiguation using attention-based Bi-LSTM and hybrid balancing techniques in imbalanced datasets. J Eval Clin Pract 2024; 30:1327-1336. [PMID: 39031903 DOI: 10.1111/jep.14041] [Received: 02/03/2024] [Revised: 04/29/2024] [Accepted: 05/21/2024] [Indexed: 07/22/2024]
Abstract
RATIONALE Clinical abbreviations pose a challenge for clinical decision support systems due to their ambiguity. Additionally, clinical datasets often suffer from class imbalance, hindering the classification of such data. This imbalance leads to classifiers with low accuracy and high error rates. Traditional feature-engineered models struggle with this task, and class imbalance is a known factor that reduces the performance of neural network techniques. AIMS AND OBJECTIVES This study proposes an attention-based bidirectional long short-term memory (Bi-LSTM) model to improve clinical abbreviation disambiguation in clinical documents. We aim to address the challenges of limited training data and class imbalance by employing data generation techniques like reverse substitution and data augmentation with synonym substitution. METHOD We utilise a Bi-LSTM classification model with an attention mechanism to disambiguate each abbreviation. The model's performance is evaluated based on accuracy for each abbreviation. To address the limitations of imbalanced data, we employ data generation techniques to create a more balanced dataset. RESULTS The evaluation results demonstrate that our data balancing technique significantly improves the model's accuracy by 2.08%. Furthermore, the proposed attention-based Bi-LSTM model achieves an accuracy of 96.09% on the UMN dataset, outperforming state-of-the-art results. CONCLUSION Deep neural network methods, particularly Bi-LSTM, offer promising alternatives to traditional feature-engineered models for clinical abbreviation disambiguation. By employing data generation techniques, we can address the challenges posed by limited-resource and imbalanced clinical datasets. This approach leads to a significant improvement in model accuracy for clinical abbreviation disambiguation tasks.
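The balancing idea in this abstract (generating extra examples for rare senses through synonym substitution) can be illustrated with a minimal sketch. This is not the authors' implementation: the synonym table, the example sentences, and the function names `augment` and `balance` are all hypothetical, and the abstract's reverse substitution step and Bi-LSTM model are not shown.

```python
import random
from collections import Counter

# Hypothetical synonym table; a real system would draw on a clinical thesaurus.
SYNONYMS = {
    "patient": ["subject", "individual"],
    "prolonged": ["elevated", "increased"],
    "admission": ["intake"],
    "pain": ["ache", "discomfort"],
}

def augment(sentence, rng):
    """Return a copy of `sentence` with one known word swapped for a synonym."""
    tokens = sentence.split()
    candidates = [i for i, tok in enumerate(tokens) if tok in SYNONYMS]
    if not candidates:
        return sentence  # nothing to substitute; the copy is plain oversampling
    i = rng.choice(candidates)
    tokens[i] = rng.choice(SYNONYMS[tokens[i]])
    return " ".join(tokens)

def balance(samples, seed=0):
    """Add augmented copies of minority-sense examples until all senses are equal in size."""
    rng = random.Random(seed)
    by_sense = {}
    for text, sense in samples:
        by_sense.setdefault(sense, []).append(text)
    target = max(len(texts) for texts in by_sense.values())
    balanced = list(samples)
    for sense, texts in by_sense.items():
        for _ in range(target - len(texts)):
            balanced.append((augment(rng.choice(texts), rng), sense))
    return balanced

# Toy data: the abbreviation "PT" with a frequent and a rare sense.
samples = [
    ("patient sent to PT after surgery", "physical therapy"),
    ("patient referred to PT for back pain", "physical therapy"),
    ("PT evaluated the patient today", "physical therapy"),
    ("PT was prolonged on admission", "prothrombin time"),
]
balanced = balance(samples)
counts = Counter(sense for _, sense in balanced)
print(counts)
```

The augmented sentences keep the abbreviation itself untouched, so each new example still carries the label of the sentence it was derived from.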
Affiliation(s)
- Manda Hosseini
  - Department of Computer Engineering, Zand Institute of Higher Education, Shiraz, Iran
- Amir Hossein Rasekh
  - Department of Computer Engineering, Zand Institute of Higher Education, Shiraz, Iran
- Amin Keshavarzi
  - Department of Computer Engineering, Marvdasht Branch, Islamic Azad University, Marvdasht, Iran

2
Lokker C, Abdelkader W, Bagheri E, Parrish R, Cotoi C, Navarro T, Germini F, Linkins LA, Haynes RB, Chu L, Afzal M, Iorio A. Boosting efficiency in a clinical literature surveillance system with LightGBM. PLOS DIGITAL HEALTH 2024; 3:e0000299. [PMID: 39312500 PMCID: PMC11419392 DOI: 10.1371/journal.pdig.0000299] [Received: 06/18/2023] [Accepted: 08/14/2024] [Indexed: 09/25/2024]
Abstract
Given the suboptimal performance of Boolean searching to identify methodologically sound and clinically relevant studies in large bibliographic databases, exploring machine learning (ML) to efficiently classify studies is warranted. To boost the efficiency of a literature surveillance program, we used a large internationally recognized dataset of articles tagged for methodological rigor and applied an automated ML approach to train and test binary classification models to predict the probability of clinical research articles being of high methodologic quality. We trained over 12,000 models on a dataset of titles and abstracts of 97,805 articles indexed in PubMed from 2012-2018 which were manually appraised for rigor by highly trained research associates and rated for clinical relevancy by practicing clinicians. As the dataset is unbalanced, with more articles that do not meet the criteria for rigor, we used the unbalanced dataset and over- and under-sampled datasets. Models that maintained sensitivity for high rigor at 99% and maximized specificity were selected and tested in a retrospective set of 30,424 articles from 2020 and validated prospectively in a blinded study of 5253 articles. The final selected algorithm, combining a LightGBM (gradient boosting machine) model trained in each dataset, maintained high sensitivity and achieved 57% specificity in the retrospective validation test and 53% in the prospective study. The number of articles needed to read to find one that met appraisal criteria was 3.68 (95% CI 3.52 to 3.85) in the prospective study, compared with 4.63 (95% CI 4.50 to 4.77) when relying only on Boolean searching. Gradient-boosting ML models reduced the work required to classify high quality clinical research studies by 45%, improving the efficiency of literature surveillance and subsequent dissemination to clinicians and other evidence users.
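The selection rule described in this abstract (hold sensitivity at a target, then maximize specificity) and the "number needed to read" metric can be sketched as follows. The scores and labels are invented toy values, the function names are hypothetical, and no LightGBM model is trained here; the sketch only shows how a decision threshold could be chosen from model scores.

```python
def pick_threshold(scores, labels, min_sensitivity=0.99):
    """Return (threshold, sensitivity, specificity) with the highest specificity
    among candidate thresholds whose sensitivity is at least `min_sensitivity`."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for t in sorted(set(scores)):
        # An article is flagged as high rigor when its score is >= t.
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / positives, tn / negatives
        if sens >= min_sensitivity and (best is None or spec > best[2]):
            best = (t, sens, spec)
    return best

def number_needed_to_read(scores, labels, threshold):
    """Flagged articles a reader must screen per article that meets criteria (1 / precision)."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return len(flagged) / sum(flagged)

# Toy scores (hypothetical model outputs) and labels (1 = meets rigor criteria).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 1, 0, 1, 0, 1]

threshold, sens, spec = pick_threshold(scores, labels)
nnr_all = number_needed_to_read(scores, labels, threshold=0.0)   # read everything
nnr_filtered = number_needed_to_read(scores, labels, threshold)  # read only flagged
print(threshold, sens, spec, round(nnr_all, 2), round(nnr_filtered, 2))
```

Raising specificity at fixed sensitivity removes non-qualifying articles from the reading queue, which is what lowers the number needed to read.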
Affiliation(s)
- Cynthia Lokker
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Wael Abdelkader
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Elham Bagheri
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Rick Parrish
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Chris Cotoi
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Tamara Navarro
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Federico Germini
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Lori-Ann Linkins
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- R. Brian Haynes
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada
- Lingyang Chu
  - Department of Computing and Software, McMaster University, Hamilton, Ontario, Canada
- Muhammad Afzal
  - School of Computing and Digital Technology, Birmingham City University, Birmingham, United Kingdom
- Alfonso Iorio
  - Health Information Research Unit, Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
  - Department of Medicine, McMaster University, Hamilton, Ontario, Canada

3
Charizanos G, Demirhan H, İçen D. Binary classification with fuzzy logistic regression under class imbalance and complete separation in clinical studies. BMC Med Res Methodol 2024; 24:145. [PMID: 38970036 PMCID: PMC11225249 DOI: 10.1186/s12874-024-02270-x] [Received: 03/18/2024] [Accepted: 06/27/2024] [Indexed: 07/07/2024]
Abstract
BACKGROUND In binary classification for clinical studies, an imbalanced distribution of cases to classes and an extreme association level between the binary dependent variable and a subset of independent variables can create significant classification problems. These crucial issues, namely class imbalance and complete separation, lead to classification inaccuracy and biased results in clinical studies. METHOD To deal with class imbalance and complete separation problems, we propose using a fuzzy logistic regression framework for binary classification. Fuzzy logistic regression incorporates combinations of triangular fuzzy numbers for the coefficients, inputs, and outputs and produces crisp classification results. The fuzzy logistic regression framework shows strong classification performance due to fuzzy logic's better handling of imbalance and separation issues. Hence, classification accuracy is improved, mitigating the risk of misclassified conditions and biased insights for clinical study patients. RESULTS The performance of the fuzzy logistic regression model is assessed on twelve binary classification problems with clinical datasets. The model has consistently high sensitivity, specificity, F1, precision, and Matthews correlation coefficient scores across all clinical datasets. There is no evidence of impact from the imbalance or separation that exists in the datasets. Furthermore, we compare the fuzzy logistic regression classification performance against two versions of classical logistic regression and six different benchmark sources in the literature. These six sources provide a total of ten different proposed methodologies, and the comparison occurs by calculating the same set of classification performance scores for each method. Either imbalance or separation impacts seven out of ten methodologies. The remaining three produce better classification performance in their respective clinical studies.
However, these are all outperformed by the fuzzy logistic regression framework. CONCLUSION Fuzzy logistic regression showcases strong performance against imbalance and separation, providing accurate predictions and, hence, informative insights for classifying patients in clinical studies.
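Among the scores listed in this abstract, the Matthews correlation coefficient is the one most sensitive to class imbalance, which a short sketch can make concrete. The confusion-matrix counts below are invented for illustration and do not come from the paper; the function name `mcc` is an assumption.

```python
import math

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Returns 0.0 when a row or column of the matrix is empty (undefined case)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical imbalanced test set: 5 positives, 95 negatives.
always_negative = dict(tp=0, fp=0, fn=5, tn=95)  # never predicts the minority class
useful_model = dict(tp=4, fp=2, fn=1, tn=93)     # finds 4 of the 5 positives

for name, cm in [("always negative", always_negative), ("useful model", useful_model)]:
    print(name, round(accuracy(**cm), 3), round(mcc(**cm), 3))
```

The trivial classifier reaches 95% accuracy yet scores an MCC of zero, which is why imbalance-aware studies report MCC alongside accuracy.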
Affiliation(s)
- Georgios Charizanos
  - Mathematical Sciences, School of Science, RMIT University, La Trobe St, Melbourne, 3000, Victoria, Australia
- Haydar Demirhan
  - Mathematical Sciences, School of Science, RMIT University, La Trobe St, Melbourne, 3000, Victoria, Australia
- Duygu İçen
  - Department of Statistics, Hacettepe University, Çankaya, Ankara, 06800, Ankara, Türkiye

4
Gutiérrez-Mondragón MA, Vellido A, König C. A Study on the Robustness and Stability of Explainable Deep Learning in an Imbalanced Setting: The Exploration of the Conformational Space of G Protein-Coupled Receptors. Int J Mol Sci 2024; 25:6572. [PMID: 38928278 PMCID: PMC11203844 DOI: 10.3390/ijms25126572] [Received: 04/18/2024] [Revised: 06/03/2024] [Accepted: 06/12/2024] [Indexed: 06/28/2024]
Abstract
G-protein coupled receptors (GPCRs) are transmembrane proteins that transmit signals from the extracellular environment to the inside of the cells. Their ability to adopt various conformational states, which influence their function, makes them crucial in pharmacoproteomic studies. While many drugs target specific GPCR states to exert their effects-thereby regulating the protein's activity-unraveling the activation pathway remains challenging due to the multitude of intermediate transformations occurring throughout this process, and intrinsically influencing the dynamics of the receptors. In this context, computational modeling, particularly molecular dynamics (MD) simulations, may offer valuable insights into the dynamics and energetics of GPCR transformations, especially when combined with machine learning (ML) methods and techniques for achieving model interpretability for knowledge generation. The current study builds upon previous work in which the layer relevance propagation (LRP) technique was employed to interpret the predictions in a multi-class classification problem concerning the conformational states of the β2-adrenergic (β2AR) receptor from MD simulations. Here, we address the challenges posed by class imbalance and extend previous analyses by evaluating the robustness and stability of deep learning (DL)-based predictions under different imbalance mitigation techniques. By meticulously evaluating explainability and imbalance strategies, we aim to produce reliable and robust insights.
Affiliation(s)
- Mario A. Gutiérrez-Mondragón
  - Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
- Alfredo Vellido
  - Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
  - Centro de Investigacion Biomédica en Red (CIBER), 28029 Madrid, Spain
- Caroline König
  - Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain

5
Peters SJ, Schmitz-Buhl M, Zielasek J, Gouzoulis-Mayfrank E. Involuntary psychiatric hospitalisation - differences and similarities between patients detained under the mental health act and according to the legal guardianship legislation. BMC Psychiatry 2024; 24:442. [PMID: 38872132 DOI: 10.1186/s12888-024-05892-z] [Received: 09/20/2023] [Accepted: 06/05/2024] [Indexed: 06/15/2024]
Abstract
BACKGROUND Involuntary psychiatric hospitalisation occurs under different legal premises. According to German law, detention under the Mental Health Act (MHA) is possible in cases of imminent danger of self-harm or harm to others, while detention according to the legal guardianship legislation (LGL) serves to prevent self-harm if there is considerable but not necessarily imminent danger. This study aims to compare clinical, sociodemographic and environmental socioeconomic differences and similarities between patients hospitalised under either the MHA or LGL. METHODS We conducted a retrospective health records analysis of all involuntarily hospitalised cases in the four psychiatric hospitals of the city of Cologne, Germany, in 2011. Of the 1,773 cases, 87.3% were detained under the MHA of the federal state of North Rhine-Westphalia and 6.4% were hospitalised according to the federal LGL. Another 6.3% of the cases were originally admitted under the MHA, but the legal basis of detention was converted to LGL during the inpatient psychiatric stay (MHA→LGL cases). We compared sociodemographic, clinical, systemic and environmental socioeconomic (ESED) variables of the three groups by means of descriptive statistics. We also trained and tested a machine learning-based algorithm to predict class membership of the involuntary modes of psychiatric inpatient care. RESULTS Cases with an admission under the premises of LGL lived less often on their own, and they were more often retired compared to MHA cases. They more often had received previous outpatient or inpatient treatment than MHA cases, they were more often diagnosed with a psychotic disorder and they lived in neighbourhoods that were on average more socially advantaged. MHA→LGL cases were on average older and more often retired than MHA cases. More often, they had a main diagnosis of an organic mental disorder compared to both MHA and LGL cases. 
Also, they less often received previous psychiatric inpatient treatment compared to LGL cases. The reason for detention (self-harm or harm to others) did not differ between the three groups. The proportion of LGL and MHA cases differed between the four hospitals. Effect sizes were mostly small and the balanced accuracy of the Random Forest was low. CONCLUSION We found some plausible differences in patient characteristics depending on the legal foundation of the involuntary psychiatric hospitalisation. The differences relate to clinical, sociodemographic and socioeconomical issues. However, the low effect sizes and the limited accuracy of the machine learning models indicate that the investigated variables do not sufficiently explain the respective choice of the legal framework. In addition, we found some indication for possibly different interpretation and handling of the premises of the law in practice. Our findings pose the need for further research in this field.
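The abstract reports balanced accuracy rather than plain accuracy, and the reported group sizes (87.3% MHA, 6.4% LGL, 6.3% MHA→LGL) show why. A minimal sketch, using toy labels in roughly those proportions and a trivial classifier that always predicts the majority group; the labels are synthetic and the function name is an assumption, not the authors' code.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)

# Synthetic labels in approximately the proportions given in the abstract.
y_true = ["MHA"] * 873 + ["LGL"] * 64 + ["MHA->LGL"] * 63
y_pred = ["MHA"] * len(y_true)  # always predict the majority class

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(plain_accuracy, 3), round(balanced_accuracy(y_true, y_pred), 3))
```

The majority-only classifier looks strong on plain accuracy (about 0.87) but scores one third on balanced accuracy, the chance level for three classes.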
Affiliation(s)
- Sönke Johann Peters
  - LVR Institute for Healthcare Research, Wilhelm-Griesinger-Strasse 23, 51109, Cologne, Germany
  - LVR Clinics Cologne, Wilhelm-Griesinger-Strasse 23, 51109, Cologne, Germany
- Mario Schmitz-Buhl
  - LVR Clinics Cologne, Wilhelm-Griesinger-Strasse 23, 51109, Cologne, Germany
- Jürgen Zielasek
  - LVR Institute for Healthcare Research, Wilhelm-Griesinger-Strasse 23, 51109, Cologne, Germany
  - Medical Faculty, Heinrich Heine University Düsseldorf, Universitätsstraße 1, 40225, Düsseldorf, Germany
- Euphrosyne Gouzoulis-Mayfrank
  - LVR Institute for Healthcare Research, Wilhelm-Griesinger-Strasse 23, 51109, Cologne, Germany
  - LVR Clinics Cologne, Wilhelm-Griesinger-Strasse 23, 51109, Cologne, Germany

6
Captur G, Doykov I, Chung SC, Field E, Barnes A, Zhang E, Heenan I, Norrish G, Moon JC, Elliott PM, Heywood WE, Mills K, Kaski JP. Novel Multiplexed Plasma Biomarker Panel Has Diagnostic and Prognostic Potential in Children With Hypertrophic Cardiomyopathy. CIRCULATION. GENOMIC AND PRECISION MEDICINE 2024; 17:e004448. [PMID: 38847081 PMCID: PMC11188636 DOI: 10.1161/circgen.123.004448] [Received: 10/11/2023] [Accepted: 04/16/2024] [Indexed: 06/20/2024]
Abstract
BACKGROUND Hypertrophic cardiomyopathy (HCM) is defined clinically by pathological left ventricular hypertrophy. We have previously developed a plasma proteomics biomarker panel that correlates with clinical markers of disease severity and sudden cardiac death risk in adult patients with HCM. The aim of this study was to investigate the utility of adult biomarkers and perform new discoveries in proteomics for childhood-onset HCM. METHODS Fifty-nine protein biomarkers were identified from an exploratory plasma proteomics screen in children with HCM and augmented into our existing multiplexed targeted liquid chromatography-tandem/mass spectrometry-based assay. The association of these biomarkers with clinical phenotypes and outcomes was prospectively tested in plasma collected from 148 children with HCM and 50 healthy controls. Machine learning techniques were used to develop novel pediatric plasma proteomic biomarker panels. RESULTS Four previously identified adult HCM markers (aldolase fructose-bisphosphate A, complement C3a, talin-1, and thrombospondin 1) and 3 new markers (glycogen phosphorylase B, lipoprotein a and profilin 1) were elevated in pediatric HCM. Using supervised machine learning applied to training (n=137) and validation cohorts (n=61), this 7-biomarker panel differentiated HCM from healthy controls with an area under the curve of 1.0 in the training data set (sensitivity 100% [95% CI, 95-100]; specificity 100% [95% CI, 96-100]) and 0.82 in the validation data set (sensitivity 75% [95% CI, 59-86]; specificity 88% [95% CI, 75-94]). Reduced circulating levels of 4 other peptides (apolipoprotein L1, complement 5b, immunoglobulin heavy constant epsilon, and serum amyloid A4) found in children with high sudden cardiac death risk provided complete separation from the low and intermediate risk groups and predicted mortality and adverse arrhythmic outcomes (hazard ratio, 2.04 [95% CI, 1.0-4.2]; P=0.044). 
CONCLUSIONS In children, a 7-biomarker proteomics panel can distinguish HCM from controls with high sensitivity and specificity, and another 4-biomarker panel identifies those at high risk of adverse arrhythmic outcomes, including sudden cardiac death.
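The area under the curve quoted in this abstract (1.0 in training, 0.82 in validation) has a simple rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen control. A minimal sketch with invented panel scores; the numbers are not from the study and the function name `auc` is an assumption.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of (case, control) pairs
    in which the case has the higher score; ties count as half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))

# Hypothetical panel scores: perfectly separated vs. partly overlapping cohorts.
train_auc = auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
valid_auc = auc([0.9, 0.8, 0.4, 0.7, 0.3, 0.2], [1, 1, 1, 0, 0, 0])
print(train_auc, round(valid_auc, 3))
```

A gap of this kind between training and validation values is the usual reason a held-out cohort is reported alongside the training result.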
Affiliation(s)
- Gabriella Captur
  - UCL MRC Unit for Lifelong Health & Ageing, UCL, London, United Kingdom (G.C.)
  - UCL Institute of Cardiovascular Science, UCL, London, United Kingdom (G.C., J.C.M., P.M.E.)
  - The Royal Free Hospital, Centre for Inherited Heart Muscle Conditions, Cardiology Department, UCL, London, United Kingdom (G.C.)
- Ivan Doykov
  - Translational Mass Spectrometry Research Group, UCL Institute of Child Health, London, United Kingdom (I.D., E.Z., W.E.H., K.M.)
- Sheng-Chia Chung
  - UCL Institute of Health Informatics Research, Division of Infection and Immunity, London, United Kingdom (S.-C.C.)
- Ella Field
  - Centre for Paediatric Inherited & Rare Cardiovascular Disease, Institute of Cardiovascular Science, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
  - Centre for Inherited Cardiovascular Diseases, Great Ormond Street Hospital, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
- Annabelle Barnes
  - Centre for Paediatric Inherited & Rare Cardiovascular Disease, Institute of Cardiovascular Science, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
  - Centre for Inherited Cardiovascular Diseases, Great Ormond Street Hospital, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
- Enpei Zhang
  - Translational Mass Spectrometry Research Group, UCL Institute of Child Health, London, United Kingdom (I.D., E.Z., W.E.H., K.M.)
  - UCL Medical School, University College London, London, United Kingdom (E.Z.)
- Imogen Heenan
  - Centre for Paediatric Inherited & Rare Cardiovascular Disease, Institute of Cardiovascular Science, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
  - Centre for Inherited Cardiovascular Diseases, Great Ormond Street Hospital, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
- Gabrielle Norrish
  - Centre for Paediatric Inherited & Rare Cardiovascular Disease, Institute of Cardiovascular Science, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
  - Centre for Inherited Cardiovascular Diseases, Great Ormond Street Hospital, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
- James C. Moon
  - Barts Heart Centre, the Cardiovascular Magnetic Resonance Unit, London, United Kingdom (J.C.M.)
- Perry M. Elliott
  - Barts Heart Centre, the Inherited Cardiovascular Diseases Unit, St Bartholomew’s Hospital, London, United Kingdom (P.M.E.)
- Wendy E. Heywood
  - Translational Mass Spectrometry Research Group, UCL Institute of Child Health, London, United Kingdom (I.D., E.Z., W.E.H., K.M.)
- Kevin Mills
  - Translational Mass Spectrometry Research Group, UCL Institute of Child Health, London, United Kingdom (I.D., E.Z., W.E.H., K.M.)
- Juan Pablo Kaski
  - Centre for Paediatric Inherited & Rare Cardiovascular Disease, Institute of Cardiovascular Science, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)
  - Centre for Inherited Cardiovascular Diseases, Great Ormond Street Hospital, London, United Kingdom (E.F., A.B., I.H., G.N., J.P.K.)

7
Blanco LE, Wilcox JH, Hughes MS, Lal RA. Development of a Real-time Force-based Algorithm for Infusion Failure Detection. J Diabetes Sci Technol 2024:19322968241247530. [PMID: 38654491 DOI: 10.1177/19322968241247530] [Indexed: 04/26/2024]
Abstract
BACKGROUND Continuous subcutaneous insulin infusion (CSII) is a common treatment option for people with diabetes (PWD), but insulin infusion failures pose a significant challenge, leading to hyperglycemia, diabetes burnout, and increased hospitalizations. Current CSII pumps' occlusion alarm systems are limited in detecting infusion failures; therefore, a more effective detection method is needed. METHODS We conducted five preclinical animal studies to collect data on infusion failures, utilizing both insulin and non-insulin boluses. Data were captured using in-line pressure and flow rate sensors, with additional force data from CSII pumps' onboard sensors in one study. A novel classifier model was developed using this dataset, aimed at detecting different types of infusion failures through direct utilization of force sensor data. Performance was compared against various occlusion alarm thresholds from commercially available CSII pumps. RESULTS The testing dataset included 251 boluses. The Bagging classifier model showed the highest performance metrics among the models tested, exhibiting high accuracy (96%), sensitivity (94%), and specificity (98%), with lower false-positive and false-negative rate compared with traditional occlusion alarm pressure thresholds. CONCLUSIONS Our study developed a novel non-threshold classifier that outperforms current occlusion alarm systems in CSII pumps in detecting infusion failures. This advancement has the potential to reduce the risk of hyperglycemia and hospitalizations due to undetected infusion failures, offering a more reliable and effective CSII therapy for PWD. Further studies involving human participants are recommended to validate these findings and assess the classifier's performance in a real-world setting.
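The abstract names a Bagging classifier but does not state its base learner or features, so the following is only a generic sketch of bagging: fit simple models on bootstrap resamples and take a majority vote. The one-feature threshold "stump", the force values, and all function names are assumptions made for illustration.

```python
import random

def fit_stump(xs, ys):
    """Pick the force threshold that misclassifies the fewest training boluses.
    A bolus is predicted as a failure when its force is >= the threshold."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x >= t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict(stumps, x):
    """Majority vote over the ensemble: 1 = infusion failure, 0 = normal delivery."""
    votes = sum(x >= t for t in stumps)
    return int(votes * 2 > len(stumps))

# Hypothetical peak force per bolus (arbitrary units) and failure labels.
forces = [1.0, 1.2, 1.1, 1.4, 1.3, 3.0, 3.2, 2.9, 3.5, 3.1]
failed = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

stumps = fit_bagging(forces, failed)
print(predict(stumps, 4.0), predict(stumps, 0.5))
```

Unlike a single fixed alarm threshold, the ensemble's decision boundary is learned from data and averaged over resamples, which is the property the abstract contrasts with threshold-based occlusion alarms.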

8
Hu J, Sheng Y, Ma J, Tang Y, Liu D, Zhang J, Wei X, Yang Y, Liu Y, Zhang Y, Wang G. Construction and validation of a progression prediction model for locally advanced rectal cancer patients received neoadjuvant chemoradiotherapy followed by total mesorectal excision based on machine learning. Front Oncol 2024; 13:1231508. [PMID: 38328435 PMCID: PMC10849061 DOI: 10.3389/fonc.2023.1231508] [Received: 06/20/2023] [Accepted: 12/28/2023] [Indexed: 02/09/2024]
Abstract
Background We attempted to develop a progression prediction model for locally advanced rectal cancer (LARC) patients who received preoperative neoadjuvant chemoradiotherapy (NCRT) and operative treatment, in order to identify high-risk patients in advance. Methods Data from 272 LARC patients who received NCRT and total mesorectal excision (TME) from 2011 to 2018 at the Fourth Hospital of Hebei Medical University were collected. Data from 161 patients with rectal cancer were included, each sample having one target variable (progression) and 145 characteristic variables. One-hot encoding was applied to represent some characteristics numerically. The K-Nearest Neighbor (KNN) filling method was used to impute the missing values, and SMOTETomek combined sampling was used to address the data imbalance. Eventually, data from 135 patients with 45 characteristic clinical variables were obtained. Random forest, decision tree, support vector machine (SVM), and XGBoost were used to predict whether patients with rectal cancer will exhibit progression. LASSO regression was used to further filter the variables, and a Venn diagram was used to narrow down the list of variables. Eventually, the prediction model was constructed by multivariate logistic regression, and the performance of the model was confirmed in the validation set. Results Eventually, data from 135 patients including 45 clinical characteristic variables were included in the study. Data were randomly divided in an 8:2 ratio into a data set and a validation set, respectively. Area Under Curve (AUC) values of 0.72 for the decision tree, 0.97 for the random forest, 0.89 for SVM, and 0.94 for XGBoost were obtained from the data set. Similar results were obtained from the validation set. Twenty-three variables were obtained from LASSO regression, and eight variables were obtained by considering the intersection of the variables obtained using the previous four machine learning methods.
Furthermore, a multivariate logistic regression model was constructed using the data set; the ROC indicated its good performance. The ROC curve also verified the good predictive performance in the validation set. Conclusions We constructed a logistic regression model with good predictive performance, which allowed us to accurately predict whether patients who received NCRT and TME will exhibit disease progression.
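The abstract mentions a KNN filling method for missing values without giving details. One common reading is sketched here: a missing entry is replaced by the mean of that variable over the k most similar patients, with similarity measured on the variables both patients have recorded. The data, the choice k=2, and the function name are assumptions, not the study's configuration.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.
    Distance uses only columns observed in both rows. Assumes every column has
    at least one observed value in some other row."""
    def dist(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = sorted(
                (dist(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled

# Hypothetical patients: [feature A, feature B]; the last patient lacks feature B.
rows = [
    [1.0, 10.0],
    [1.1, 12.0],
    [5.0, 50.0],
    [1.05, None],
]
filled = knn_impute(rows)
print(filled[3])
```

In practice features are usually scaled before computing distances, since otherwise variables with large numeric ranges dominate the neighbour search.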
Affiliation(s)
- Jitao Hu
  - Department of General Surgery, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
- Yuanyuan Sheng
  - School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, China
- Jinlong Ma
  - School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, China
- Yujie Tang
  - Department of General Surgery, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
- Dong Liu
  - Department of Gastrointestinal Surgery, The Third Hospital of Hebei Medical University, Shijiazhuang, China
- Jianqing Zhang
  - Department of General Surgery, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
- Xudong Wei
  - Department of General Surgery, The Third Hospital of Hebei Medical University, Shijiazhuang, China
- Yang Yang
  - Department of Gastrointestinal Surgery, The Third Hospital of Hebei Medical University, Shijiazhuang, China
- Yueping Liu
  - Department of Pathology, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
- Yongqiang Zhang
  - School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang, China
- Guiying Wang
  - Department of General Surgery, The Fourth Hospital of Hebei Medical University, Shijiazhuang, China
  - The Second Hospital of Hebei Medical University, Shijiazhuang, Hebei, China

9
Löhle M, Timpka J, Bremer A, Khodakarami H, Gandor F, Horne M, Ebersbach G, Odin P, Storch A. Application of single wrist-wearable accelerometry for objective motor diary assessment in fluctuating Parkinson's disease. NPJ Digit Med 2023; 6:194. [PMID: 37848531 PMCID: PMC10582031 DOI: 10.1038/s41746-023-00937-1] [Received: 03/30/2023] [Accepted: 09/29/2023] [Indexed: 10/19/2023]
Abstract
Advanced Parkinson's disease (PD) is characterized by motor fluctuations including unpredictable oscillations remarkably impairing quality of life. Effective management and development of novel therapies for these response fluctuations largely depend on clinical rating instruments such as the widely-used PD home diary, which are associated with biases and errors. Recent advancements in digital health technologies provide user-friendly wearables that can be tailored for continuous monitoring of motor fluctuations. Their criterion validity under real-world conditions using clinical examination as the gold standard remains to be determined. We prospectively examined this validity of a wearable accelerometer-based digital Parkinson's Motor Diary (adPMD) using the Parkinson's Kinetigraph (PKG®) in an alternative application by converting its continuous data into one of the three motor categories of the PD home diary (Off, On and Dyskinetic state). Sixty-three out of 91 eligible participants with fluctuating PD (46% men, average age 66) had predefined sufficient adPMD datasets (>70% of half-hour periods) from 2 consecutive days. 92% of per-protocol assessments were completed. adPMD monitoring of daily times in motor states showed moderate validity for Off and Dyskinetic state (ICC = 0.43-0.51), while inter-rating methods agreements on half-hour-level can be characterized as poor (median Cohen's κ = 0.13-0.21). Individualization of adPMD thresholds for transferring accelerometer data into diary categories improved temporal agreements up to moderate level for Dyskinetic state detection (median Cohen's κ = 0.25-0.41). Here we report that adPMD real-world-monitoring captures daily times in Off and Dyskinetic state in advanced PD with moderate validities, while temporal agreement of adPMD and clinical observer diary data is limited.
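The half-hour-level agreement in this abstract is reported as Cohen's κ, which corrects raw agreement for the agreement expected by chance. A minimal sketch with invented diary entries for eight half-hour periods; the ratings are illustrative only and the function name is an assumption.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa for two raters labelling the same sequence of periods."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each rater's marginal frequency per category.
    expected = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical motor states per half-hour period from the two methods.
wearable = ["Off", "On", "On", "Dyskinetic", "On", "Off", "On", "On"]
observer = ["Off", "On", "Off", "On", "On", "On", "On", "Dyskinetic"]

kappa = cohens_kappa(wearable, observer)
print(round(kappa, 3))
```

Here the two methods agree on half of the periods, yet κ is close to zero because "On" dominates both diaries and most of that agreement is expected by chance. This is how daily totals can match moderately well while period-by-period agreement stays poor.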
Affiliation(s)
- Matthias Löhle
- Department of Neurology, University Medical Center Rostock, Rostock, Germany
- German Center for Neurodegenerative Diseases (DZNE) Rostock-Greifswald, Rostock, Germany
- Jonathan Timpka
- Division of Neurology, Department of Clinical Sciences Lund, Lund University, Lund, Sweden
- Department of Neurology, Skåne University Hospital, Lund, Sweden
- Alexander Bremer
- Department of Neurology, University Medical Center Rostock, Rostock, Germany
- Florin Gandor
- Movement Disorders Hospital, Beelitz-Heilstätten, Beelitz, Germany
- Department of Neurology, Otto-von-Guericke University, Magdeburg, Germany
- Malcom Horne
- Bionics Institute, Melbourne, VIC, Australia
- The Department of Medicine, The University of Melbourne, St Vincent's Hospital, Fitzroy, VIC, 3010, Australia
- Georg Ebersbach
- Movement Disorders Hospital, Beelitz-Heilstätten, Beelitz, Germany
- Per Odin
- Division of Neurology, Department of Clinical Sciences Lund, Lund University, Lund, Sweden
- Department of Neurology, Skåne University Hospital, Lund, Sweden
- Alexander Storch
- Department of Neurology, University Medical Center Rostock, Rostock, Germany
- German Center for Neurodegenerative Diseases (DZNE) Rostock-Greifswald, Rostock, Germany
|
10
|
Ghavidel A, Pazos P. Machine learning (ML) techniques to predict breast cancer in imbalanced datasets: a systematic review. J Cancer Surviv 2023:10.1007/s11764-023-01465-3. [PMID: 37749361 DOI: 10.1007/s11764-023-01465-3] [Received: 08/02/2023] [Accepted: 09/09/2023] [Indexed: 09/27/2023]
Abstract
Knowledge discovery in databases (KDD) is crucial in analyzing data to extract valuable insights. In medical outcome prediction, KDD is increasingly applied, particularly in diseases with high incidence, mortality, and costs, like cancer. ML techniques can develop more accurate predictive models for cancer patients' clinical outcomes, aiding informed healthcare decision-making. However, cancer prediction modeling faces challenges because of the unbalanced nature of the datasets, where there is a small minority category of patients with a cancer diagnosis compared to a majority category of cancer-free patients. Imbalanced datasets pose statistical hurdles like bias and overfitting when developing accurate prediction models. This systematic review focuses on breast cancer prediction articles published from 2008 to 2023. The objective is to examine ML methods used in three critical steps of KDD: preprocessing, data mining, and interpretation which address the imbalanced data problem in breast cancer prediction. This work synthesizes prior research in ML methods for breast cancer prediction. The findings help identify effective preprocessing strategies, including balancing and feature selection methods, robust predictive models, and evaluation metrics of those models. The study aims to inform healthcare providers and researchers about effective techniques for accurate breast cancer prediction.
Affiliation(s)
- Arman Ghavidel
- Engineering Management and Systems Engineering, Old Dominion University, Norfolk, VA, USA
- Pilar Pazos
- Engineering Management and Systems Engineering, Old Dominion University, Norfolk, VA, USA
|
11
|
Shi Y, Shen Z, Zeng W, Luo S, Zhou L, Wang N. A schizophrenia study based on multi-frequency dynamic functional connectivity analysis of fMRI. Front Hum Neurosci 2023; 17:1164685. [PMID: 37250690 PMCID: PMC10213427 DOI: 10.3389/fnhum.2023.1164685] [Received: 02/13/2023] [Accepted: 04/17/2023] [Indexed: 05/31/2023] Open
Abstract
At present, fMRI studies mainly focus on the entire low-frequency band (0.01-0.08 Hz). However, neuronal activity is dynamic, and different frequency bands may contain different information. Therefore, a novel multi-frequency-based dynamic functional connectivity (dFC) analysis method was proposed in this study, which was then applied to a schizophrenia study. First, three frequency bands (Conventional: 0.01-0.08 Hz, Slow-5: 0.0111-0.0302 Hz, and Slow-4: 0.0302-0.0820 Hz) were obtained using the Fast Fourier Transform. Next, the fractional amplitude of low-frequency fluctuations was used to identify abnormal regions of interest (ROIs) in schizophrenia, and dFC among these abnormal ROIs was computed by the sliding time window method at four window widths. Finally, recursive feature elimination was employed to select features, and a support vector machine was applied to classify patients with schizophrenia and healthy controls. The experimental results showed that the proposed multi-frequency method (Combined: Slow-5 and Slow-4) had better classification performance than the conventional method at shorter sliding window widths. In conclusion, our results revealed that the dFCs among the abnormal ROIs varied across frequency bands and that combining multiple features from different frequency bands can improve classification performance. Therefore, it would be a promising approach for identifying brain alterations in schizophrenia.
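The sliding time window step mentioned in this abstract can be sketched as follows, assuming Pearson correlation as the connectivity measure between two ROI time series (the abstract does not state which measure was used). The names `pearson` and `sliding_window_fc`, the window width, and the synthetic signals are illustrative assumptions, not the authors' pipeline.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for constant windows
    return cov / math.sqrt(var_x * var_y)

def sliding_window_fc(ts_a, ts_b, width, step=1):
    """One correlation value per window position: a dynamic connectivity series."""
    if len(ts_a) != len(ts_b) or width < 2 or width > len(ts_a):
        raise ValueError("series must match in length and width must fit")
    starts = range(0, len(ts_a) - width + 1, step)
    return [pearson(ts_a[s:s + width], ts_b[s:s + width]) for s in starts]

# Synthetic ROI signals (illustration only): the second drifts out of phase.
roi_a = [math.sin(0.3 * t) for t in range(40)]
roi_b = [math.sin(0.3 * t + 0.02 * t) for t in range(40)]
dfc = sliding_window_fc(roi_a, roi_b, width=10)
```

Shorter windows give more time points in the connectivity series but noisier estimates per window, which is one reason to compare several window widths.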
Affiliation(s)
- Yuhu Shi
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
- Zehao Shen
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
- Weiming Zeng
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
- Sizhe Luo
- College of Information Engineering, Shanghai Maritime University, Shanghai, China
- Lili Zhou
- Surgery Department of Tongji University Affiliated Yangpu Central Hospital, Shanghai, China
- Nizhuan Wang
- School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
|
12
|
Emmons KM, Mendez S, Lee RM, Erani D, Mascioli L, Abreu M, Adams S, Daly J, Bierer BE. Data sharing in the context of community-engaged research partnerships. Soc Sci Med 2023; 325:115895. [PMID: 37062144 PMCID: PMC10308954 DOI: 10.1016/j.socscimed.2023.115895] [Received: 11/27/2022] [Revised: 03/04/2023] [Accepted: 04/06/2023] [Indexed: 04/18/2023]
Abstract
Over the past 20 years, the National Institutes of Health (NIH) has implemented several policies designed to improve sharing of research data, such as the NIH public access policy for publications, NIH genomic data sharing policy, and National Cancer Institute (NCI) Cancer Moonshot public access and data sharing policy. In January 2023, a new NIH data sharing policy went into effect, requiring researchers to submit a Data Management and Sharing Plan in proposals for NIH funding (NIH. Supplemental information to the, 2020b; NIH. Final policy for data, 2020a). These policies are based on the idea that sharing data is a key component of the scientific method, as it enables the creation of larger data repositories that can lead to research questions that may not be possible in individual studies (Alter and Gonzalez, 2018; Jwa and Poldrack, 2022), allows enhanced collaboration, and maximizes the federal investment in research. Important questions that we must consider as data sharing is expanded are to whom the benefits of data sharing accrue and to whom they do not. In an era of growing efforts to engage diverse communities in research, we must consider the impact of data sharing for all research participants and the communities that they represent. We examine the issue of data sharing through a community-engaged research lens, informed by a long-standing partnership between community-engaged researchers and a key community health organization (Kruse et al., 2022). We contend that without effective community engagement and rich contextual knowledge, biases resulting from data sharing can remain unchecked. We provide several recommendations that would allow better community engagement related to data sharing to ensure both community and researcher understanding of the issues involved and move toward shared benefits. 
By identifying good models for evaluating the impact of data sharing on communities that contribute data, and then using those models systematically, we will advance the consideration of the community perspective and increase the likelihood of benefits for all.
Affiliation(s)
- Karen M Emmons
- Department of Social and Behavioral Science, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
- Samuel Mendez
- Department of Social and Behavioral Science, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
- Rebekka M Lee
- Department of Social and Behavioral Science, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
- Diana Erani
- Massachusetts League of Community Health Centers, 40 Court Street, 10th Floor, Boston, MA, 02108, USA
- Lynette Mascioli
- Massachusetts League of Community Health Centers, 40 Court Street, 10th Floor, Boston, MA, 02108, USA
- Marlene Abreu
- Massachusetts League of Community Health Centers, 40 Court Street, 10th Floor, Boston, MA, 02108, USA
- Susan Adams
- Massachusetts League of Community Health Centers, 40 Court Street, 10th Floor, Boston, MA, 02108, USA
- James Daly
- Department of Social and Behavioral Science, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, 02115, USA
- Barbara E Bierer
- Division of Global Health Equity, Department of Medicine, Brigham and Women's Hospital, 75 Francis St., Boston, MA, 02115, USA; Department of Medicine, Harvard Medical School, 25 Shattuck St., Boston, MA, 02115, USA
|
13
|
Hassanzadeh R, Farhadian M, Rafieemehr H. Hospital mortality prediction in traumatic injuries patients: comparing different SMOTE-based machine learning algorithms. BMC Med Res Methodol 2023; 23:101. [PMID: 37087425 PMCID: PMC10122327 DOI: 10.1186/s12874-023-01920-w] [Received: 07/06/2022] [Accepted: 04/13/2023] [Indexed: 04/24/2023] Open
Abstract
BACKGROUND Trauma is one of the most critical public health issues worldwide, leading to death and disability and influencing all age groups. Therefore, there is great interest in models for predicting mortality in trauma patients admitted to the ICU. The main objective of the present study is to develop and evaluate SMOTE-based machine-learning tools for predicting hospital mortality in trauma patients with imbalanced data. METHODS This retrospective cohort study was conducted on 126 trauma patients admitted to an intensive care unit at Besat hospital in Hamadan Province, western Iran, from March 2020 to March 2021. Data were extracted from the medical information records of patients. According to the imbalanced property of the data, SMOTE techniques, namely SMOTE, Borderline-SMOTE1, Borderline-SMOTE2, SMOTE-NC, and SVM-SMOTE, were used for primary preprocessing. Then, the Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Extreme Gradient Boosting (XGBoost) methods were used to predict patients' hospital mortality with traumatic injuries. The performance of the methods used was evaluated by sensitivity, specificity, Positive Predictive Value (PPV), Negative Predictive Value (NPV), accuracy, Area Under the Curve (AUC), Geometric Mean (G-means), F1 score, and P-value of McNemar's test. RESULTS Of the 126 patients admitted to an ICU, 117 (92.9%) survived and 9 (7.1%) died. The mean follow-up time from the date of trauma to the date of outcome was 3.98 ± 4.65 days. The performance of ML algorithms is not good with imbalanced data, whereas the performance of SMOTE-based ML algorithms is significantly improved. The mean area under the ROC curve (AUC) of all SMOTE-based models was more than 91%. F1-score and G-means before balancing the dataset were below 70% for all ML models except ANN. In contrast, F1-score and G-means for the balanced datasets reached more than 90% for all SMOTE-based models. 
Among all SMOTE-based ML methods, RF and ANN based on SMOTE and XGBoost based on SMOTE-NC achieved the highest values for all evaluation criteria. CONCLUSIONS This study has shown that SMOTE-based ML algorithms predict outcomes in traumatic injuries better than ML algorithms trained on the imbalanced data. They have the potential to assist ICU physicians in making clinical decisions.
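The core idea behind basic SMOTE, on which the variants named in this abstract build, is to synthesize minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified, stdlib-only illustration of that idea; the function name `smote_like_oversample`, the feature values, and the parameters are hypothetical, and it does not reproduce Borderline-SMOTE, SMOTE-NC, SVM-SMOTE, or the study's code.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority points, sorted by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Nine hypothetical minority samples with two made-up numeric features.
minority = [(1.0, 7.0), (1.5, 6.5), (2.0, 8.0), (2.2, 7.4), (3.0, 6.0),
            (3.1, 7.9), (1.2, 6.1), (2.8, 8.3), (2.5, 6.8)]
# 117 majority vs 9 minority cases would need 108 synthetic points to balance.
new_points = smote_like_oversample(minority, n_new=117 - 9, k=3, seed=42)
```

Because every synthetic point lies on a segment between two real minority samples, oversampling of this kind is normally applied to the training split only, so that evaluation data contain no interpolated cases.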
Affiliation(s)
- Roghayyeh Hassanzadeh
- Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Maryam Farhadian
- Research Center for Health Sciences, Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran
- Hassan Rafieemehr
- Department of Medical Laboratory Sciences, School of Paramedicine, Hamadan University of Medical Sciences, Hamadan, Iran
|
14
|
Chen F, Yin G, Dong Y, Li G, Zhang W. KHGCN: Knowledge-Enhanced Recommendation with Hierarchical Graph Capsule Network. ENTROPY (BASEL, SWITZERLAND) 2023; 25:e25040697. [PMID: 37190485 PMCID: PMC10137578 DOI: 10.3390/e25040697] [Received: 03/27/2023] [Revised: 04/13/2023] [Accepted: 04/17/2023] [Indexed: 05/17/2023]
Abstract
Using knowledge graphs as external information has become one of the mainstream directions in current recommendation systems. Various knowledge-graph-representation methods have been proposed to promote the development of knowledge graphs in related fields. Knowledge-graph-embedding methods can learn entity information and complex relationships between the entities in knowledge graphs. Furthermore, recently proposed graph neural networks can learn higher-order representations of entities and relationships in knowledge graphs. Therefore, the complete representation in the knowledge graph enriches the item information and alleviates the cold-start and data-sparsity problems of the recommendation process. However, using the knowledge graph's entire entity and relation representation in personalized recommendation tasks introduces unnecessary noise for different users. To learn the entity-relationship representation in the knowledge graph while effectively removing noise, we propose a model named knowledge-enhanced hierarchical graph capsule network (KHGCN), which can extract node embeddings in graphs while learning the hierarchical structure of graphs. Our model eliminates noisy entity and relationship representations in the knowledge graph through entity disentangling for the recommendation and introduces an attention mechanism to strengthen the knowledge-graph aggregation. Our model learns the representation of entity relationships with an original graph capsule network. The capsule neural networks represent the structured information between the entities more completely. We validate the proposed model on real-world datasets, and the validation results demonstrate the model's effectiveness.
Affiliation(s)
- Fukun Chen
- School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
- Guisheng Yin
- School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
- Yuxin Dong
- School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
- Gesu Li
- School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
- Weiqi Zhang
- School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
|
15
|
Terhorst Y, Sander LB, Ebert DD, Baumeister H. Optimizing the predictive power of depression screenings using machine learning. Digit Health 2023; 9:20552076231194939. [PMID: 37654715 PMCID: PMC10467308 DOI: 10.1177/20552076231194939] [Received: 02/17/2023] [Accepted: 07/28/2023] [Indexed: 09/02/2023] Open
Abstract
Objective Mental health self-report and clinician-rating scales with diagnoses defined by sum-score cut-offs are often used for depression screening. This study investigates whether machine learning (ML) can detect major depressive episodes (MDE) based on screening scales with higher accuracy than best-practice clinical sum-score approaches. Methods Primary data were obtained from two RCTs on the treatment of depression. Ground truth was DSM-5 MDE diagnoses based on structured clinical interviews (SCID); the PHQ-9 self-report, clinician-rated QIDS-16, and HAM-D-17 were predictors. ML models were trained using 10-fold cross-validation. Performance was compared against best-practice sum-score cut-offs. Primary outcome was the Area Under the Curve (AUC) of the Receiver Operating Characteristic curve. DeLong's test with bootstrapping was used to test for differences in AUC. Secondary outcomes were balanced accuracy, precision, recall, F1-score, and number needed to diagnose (NND). Results A total of k = 1030 diagnoses (no diagnosis: k = 775; MDE: k = 255) were included. In the testing set, ML models achieved an AUC of 0.94 for QIDS-16, 0.88 for HAM-D-17, and 0.83 for PHQ-9. ML AUC was significantly higher than sum-score cut-offs for QIDS-16 and PHQ-9 (ps ≤ 0.01; HAM-D-17: p = 0.847). Applying optimal prediction thresholds, the QIDS-16 classifier achieved clinically relevant improvements (Δbalanced accuracy = 8%, ΔF1-score = 14%, ΔNND = 21%). Differences for PHQ-9 and HAM-D-17 were marginal. Conclusions ML-augmented depression screenings could potentially make a major contribution to improving MDE diagnosis, depending on the questionnaire (e.g., QIDS-16). Confirmatory studies are needed before ML-enhanced screening can be implemented in routine care practice.
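The secondary outcomes listed in this abstract all follow from a 2×2 confusion matrix. The sketch below shows the standard definitions, with the number needed to diagnose taken as 1 / (sensitivity + specificity − 1). The function name `screening_metrics` is hypothetical, and the confusion counts are invented: only the class totals (255 MDE, 775 no diagnosis) come from the abstract.

```python
def screening_metrics(tp, fn, tn, fp):
    """Standard screening metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall on the positive class
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    youden_j = sensitivity + specificity - 1.0
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
        "precision": precision,
        "recall": sensitivity,
        "f1": 2 * tp / (2 * tp + fp + fn),
        # NND is undefined (infinite) when the test is no better than chance.
        "nnd": 1.0 / youden_j if youden_j > 0 else float("inf"),
    }

# Hypothetical split of 255 MDE and 775 non-MDE diagnoses (illustration only).
metrics = screening_metrics(tp=200, fn=55, tn=700, fp=75)
```

Balanced accuracy averages sensitivity and specificity, so unlike plain accuracy it is not inflated by the larger no-diagnosis class.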
Affiliation(s)
- Yannik Terhorst
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
- Lasse B Sander
- Medical Psychology and Medical Sociology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- David D Ebert
- Department for Sport and Health Sciences, Chair for Psychology & Digital Mental Health Care, Technical University of Munich, Munich, Germany
- Harald Baumeister
- Department of Clinical Psychology and Psychotherapy, Institute of Psychology and Education, University Ulm, Ulm, Germany
|
16
|
Association between the Systolic Blood Pressure Trajectory and Risk of Stroke in a Health-Management Population in Jiaozuo, China. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:7472188. [PMID: 36619241 PMCID: PMC9812623 DOI: 10.1155/2022/7472188] [Received: 09/07/2022] [Revised: 12/02/2022] [Accepted: 12/05/2022] [Indexed: 12/31/2022]
Abstract
The trajectories of systolic blood pressure (SBP) in a screening population in Jiaozuo were examined, and the association between the different types of SBP trajectories and the risk of stroke was evaluated. Data of a fixed cohort population from the Jiaozuo Stroke Prevention and Control Project Management Special Database System that underwent community screening in 2015, 2017, 2019, and 2021 were collected. Ultimately, a total of 1,451 participants who met the inclusion criteria for this study were included in the analysis, which was performed using group trajectory modeling. The baseline SBP for each trajectory subgroup was characterized at follow-up. Kaplan-Meier analysis for each trajectory group was also performed, and the relationship between the SBP trajectory and risk of stroke onset during follow-up was validated using a Cox proportional hazards model. Based on the SBP from 2015 to 2021, this cohort population was divided into three groups based on the trajectory development patterns: the low-stable group (37.6%), the moderate-increasing group (53.4%), and the high-acutely increasing group (9%). Gender, age, body mass index, diastolic blood pressure, and fasting blood glucose level were predictive factors for the SBP trajectory group. The cumulative survival risk in the high-acutely increasing group was higher than that of the other two groups. After adjusting for potential confounding factors and using the low-stable group as a reference, the hazard ratios (95% confidence interval) for the risk of stroke onset in the moderate-increasing and high-acutely increasing groups were 1.38 (0.91-2.07) and 1.51 (0.82-2.76), respectively. The results of the analysis demonstrate that higher blood pressure trajectories are associated with a higher risk of stroke and that the risk of stroke can be reduced by better control and management of the SBP.
|
17
|
Predicting graft failure in pediatric liver transplantation based on early biomarkers using machine learning models. Sci Rep 2022; 12:22411. [PMID: 36575218 PMCID: PMC9794703 DOI: 10.1038/s41598-022-25900-0] [Received: 05/25/2022] [Accepted: 12/06/2022] [Indexed: 12/28/2022] Open
Abstract
The early detection of graft failure in pediatric liver transplantation is crucial for appropriate intervention. Graft failure is associated with numerous perioperative risk factors. This study aimed to develop an individualized predictive model for 90-day graft failure in pediatric liver transplantation using machine learning methods. We conducted a single-center retrospective cohort study. A total of 87 liver transplantation cases performed in patients aged < 12 years at the Severance Hospital between January 2010 and September 2020 were included as data samples. Preoperative conditions of recipients and donors, intraoperative care, postoperative serial laboratory parameters, and events observed within seven days of surgery were collected as features. A least absolute shrinkage and selection operator (LASSO)-based method was used for feature selection to overcome the high dimensionality and collinearity of variables. Among 146 features, four variables were selected as the resultant features, namely, preoperative hepatic encephalopathy, sodium level at the end of surgery, hepatic artery thrombosis, and total bilirubin level on postoperative day 7. These features were selected from different times and represent distinct clinical aspects. The model with logistic regression demonstrated the best prediction performance among the various machine learning methods tested (area under the receiver operating characteristic curve (AUROC) = 0.898 and area under the precision-recall curve (AUPR) = 0.882). The risk scoring system developed based on the logistic regression model showed an AUROC of 0.910 and an AUPR of 0.830. Together, the prediction of graft failure in pediatric liver transplantation using the proposed machine learning model exhibited superior discrimination power and, therefore, can provide valuable information to clinicians for their decision making during the postoperative management of the patients.
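The AUROC reported in this abstract has a direct probabilistic reading: it is the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. A minimal pairwise sketch follows; the function name `auroc`, the labels, and the risk scores are hypothetical and unrelated to the study's data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the event (e.g. graft failure), 0 otherwise.
    scores: model risk scores, higher meaning higher predicted risk.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))

# Hypothetical outcomes and risk scores for six cases (illustration only).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = auroc(labels, scores)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of this size; rank-based formulas give the same value more efficiently on large datasets.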
|
18
|
Effects of Data Augmentation with the BNNSMOTE Algorithm in Seizure Detection Using 1D-MobileNet. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:4114178. [PMID: 36578313 PMCID: PMC9792253 DOI: 10.1155/2022/4114178] [Received: 06/21/2022] [Revised: 10/19/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022]
Abstract
Automatic seizure detection technology has important implications for reducing the workload of neurologists for epilepsy diagnosis and treatment. Due to the unpredictable nature of seizures, the imbalanced classification of seizure and nonseizure data continues to be challenging. In this work, we first propose a novel algorithm named the borderline nearest neighbor synthetic minority oversampling technique (BNNSMOTE) to address the imbalanced classification problem and improve seizure detection performance. The algorithm uses the nearest neighbor notion to generate nonseizure samples near the boundary, then determines the seizure samples that are difficult to learn at the boundary, and lastly selects seizure samples at random to be used in the synthesis of new samples. In view of the characteristic that electroencephalogram (EEG) signals are one-dimensional signals, we then develop a 1D-MobileNet model to validate the algorithm's performance. Results demonstrate that the proposed algorithm outperforms previous seizure detection methods on the CHB-MIT dataset, achieving an average accuracy of 99.40%, a recall value of 87.46%, a precision of 97.17%, and an F1-score of 91.90%, respectively. We also had considerable success when we used additional datasets for verification at the same time. Our algorithm's data augmentation effects are more pronounced and perform better at seizure detection than the existing imbalanced techniques. Besides, the model's parameters and calculation volume have been significantly reduced, making it more suitable for mobile terminals and embedded devices.
|
19
|
Park JJ, Seok HG, Woo IH, Park CH. Racial differences in prevalence and anatomical distribution of tarsal coalition. Sci Rep 2022; 12:21567. [PMID: 36513745 PMCID: PMC9747905 DOI: 10.1038/s41598-022-26049-6] [Received: 06/11/2022] [Accepted: 12/08/2022] [Indexed: 12/14/2022] Open
Abstract
Previous studies have reported a prevalence of tarsal coalition of 0.03-13%. Calcaneonavicular coalition is known as the main anatomical type, and the bilateral occurrence of tarsal coalition is known to be 50% or more. These results come from studies on Caucasians; few studies so far have targeted large numbers of East Asians. We hypothesized that the prevalence and characteristics of tarsal coalition in East Asians might differ from those in Caucasians. The medical records of 839 patients who underwent bilateral computed tomography of the foot and ankle in our hospital from January 2012 to April 2021 were retrospectively reviewed. The overall prevalence was 6.0%, and talocalcaneal coalition was the most common anatomical type. The overall bilateral occurrence was 56.5%, and talocalcaneal coalition had the highest bilateral occurrence (76.0%) among anatomical types. Isolated union of the posterior facet was the most common subtype of talocalcaneal coalition (43.2%). Talocalcaneal coalition had a significantly higher proportion of coalition-related symptomatic patients than calcaneonavicular coalition (p = 0.019). Our study showed a similar trend to other East Asian studies, confirming the existence of racial differences. The possibility of tarsal coalition should always be considered in East Asian foot and ankle patients, and bilateral examination is essential for diagnosis.
Affiliation(s)
- Jeong Jin Park
- Department of Orthopaedic Surgery, Yeungnam University Hospital, Yeungnam University Medical Center, 170 Hyeonchung-ro, Nam-gu, Daegu, 42415 South Korea
- Hyun Gyu Seok
- Department of Orthopaedic Surgery, Yeungnam University Hospital, Yeungnam University Medical Center, 170 Hyeonchung-ro, Nam-gu, Daegu, 42415 South Korea
- In Ha Woo
- Department of Orthopaedic Surgery, Yeungnam University Hospital, Yeungnam University Medical Center, 170 Hyeonchung-ro, Nam-gu, Daegu, 42415 South Korea
- Chul Hyun Park
- Department of Orthopaedic Surgery, Yeungnam University Medical Center, Yeungnam University College of Medicine, 170 Hyeonchung-ro, Nam-gu, Daegu, 42415 South Korea
|
20
|
Almadhor A, Sattar U, Al Hejaili A, Ghulam Mohammad U, Tariq U, Ben Chikha H. An efficient computer vision-based approach for acute lymphoblastic leukemia prediction. Front Comput Neurosci 2022; 16:1083649. [PMID: 36507304 PMCID: PMC9729282 DOI: 10.3389/fncom.2022.1083649] [Received: 10/29/2022] [Accepted: 11/14/2022] [Indexed: 11/25/2022] Open
Abstract
Leukemia (blood cancer) arises when the number of white blood cells (WBCs) in the human body is imbalanced. Acute lymphocytic leukemia (ALL), in which the bone marrow produces many immature WBCs that kill healthy cells, affects people of all ages. Timely prediction of this disease can therefore increase the chance of survival and allow the patient to start therapy early. Manual prediction is very expensive and time-consuming, so automated prediction techniques are essential. In this research, we propose an ensemble automated prediction approach that uses four machine learning algorithms: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB). The C-NMC leukemia dataset from the Kaggle repository is used to predict leukemia. The dataset is divided into two classes: cancer and healthy cells. We perform data preprocessing steps, such as first cropping the images using minimum and maximum points. Features are extracted using pre-trained Convolutional Neural Network-based Deep Neural Network (DNN) architectures (VGG19, ResNet50, or ResNet101). Data scaling is performed using the MinMaxScaler normalization technique. Analysis of Variance (ANOVA), Recursive Feature Elimination (RFE), and Random Forest (RF) are used as feature selection techniques. Classification machine learning algorithms and ensemble voting are applied to the selected features. Results reveal that SVM, with 90.0% accuracy, outperforms the other algorithms.
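Two of the steps named in this abstract, min-max scaling and ensemble voting, are simple enough to sketch without any library. The sketch assumes scaling each feature to the range [0, 1] and hard (majority) voting over per-model class predictions; the abstract does not say which voting scheme was used, and the function names, feature values, and predictions below are hypothetical.

```python
from collections import Counter

def min_max_scale(column):
    """Rescale one feature column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # convention chosen here for constant features
    return [(v - lo) / (hi - lo) for v in column]

def hard_vote(predictions_per_model):
    """Majority label per sample across models.

    predictions_per_model: one list of predicted labels per model, all the
    same length. Ties go to the label seen first among the tied ones.
    """
    voted = []
    for sample_votes in zip(*predictions_per_model):
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted

# Hypothetical feature column and per-model predictions for three images.
scaled = min_max_scale([2, 4, 6])
ensemble = hard_vote([
    ["cancer", "healthy", "cancer"],   # e.g. KNN
    ["cancer", "healthy", "healthy"],  # e.g. SVM
    ["healthy", "healthy", "cancer"],  # e.g. RF
    ["cancer", "cancer", "cancer"],    # e.g. NB
])
```

With an even number of models, ties are possible, so a real pipeline would need an explicit tie-breaking rule or probability-weighted (soft) voting.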
Affiliation(s)
- Ahmad Almadhor
- Department of Computer Engineering and Networks, College of Computer and Information Sciences, Jouf University, Sakaka, Saudi Arabia
- Usman Sattar
- Department of Management Science, Beaconhouse National University, Lahore, Pakistan
- Abdullah Al Hejaili
- Computer Science Department, Faculty of Computers & Information Technology, University of Tabuk, Tabuk, Saudi Arabia
- Uzma Ghulam Mohammad
- Department of Computer Science and Software Engineering, International Islamic University, Islamabad, Pakistan
- Usman Tariq
- Department of Management Information Systems, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
- Haithem Ben Chikha
- Department of Computer Engineering and Networks, College of Computer and Information Sciences, Jouf University, Sakaka, Saudi Arabia
|