26
|
Othman MO, Forsmark C, Yadav D, Singh VK, Lara LF, Park W, Zhang Z, Yu J, Kort JJ. Development of clinical screening tool for exocrine pancreatic insufficiency in patients with definite chronic pancreatitis. Pancreatology 2024:S1424-3903(24)00102-9. [PMID: 38693039 DOI: 10.1016/j.pan.2024.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 04/05/2024] [Accepted: 04/12/2024] [Indexed: 05/03/2024]
Abstract
BACKGROUND/OBJECTIVES No simple, accurate diagnostic tests exist for exocrine pancreatic insufficiency (EPI), and EPI remains underdiagnosed in chronic pancreatitis (CP). We sought to develop a digital screening tool to help clinicians predict EPI in patients with definite CP. METHODS This was a retrospective case-control study of patients with definite CP with/without EPI. Overall, 49 candidate predictor variables were used to train a Classification and Regression Tree (CART) model to rank all predictors and select a parsimonious set of predictors for EPI status. Five-fold cross-validation was used to assess generalizability, and the full CART model was compared with 4 additional predictive models. EPI misclassification rate (mRate) served as the primary endpoint metric. RESULTS A total of 274 patients with definite CP from 6 pancreatitis centers across the United States were included, of whom 58% had EPI based on predetermined criteria. The optimal CART decision tree included 10 variables. The mRate of the CART without/with 5-fold cross-validation was 0.153 (training error) and 0.314 (prediction error), and the area under the receiver operating characteristic curve was 0.889 and 0.682, respectively. Sensitivity and specificity without/with 5-fold cross-validation were 0.888/0.789 and 0.794/0.535, respectively. A second CART, trained without the pancreas imaging variables (n = 6), yielded 8 variables. Its training error/prediction error was 0.190/0.351; sensitivity was 0.869/0.650, and specificity was 0.728/0.649, each without/with 5-fold cross-validation. CONCLUSION We developed two CART models that were integrated into one digital screening tool to assess for EPI in patients with definite CP, with two to six input variables needed to predict EPI status.
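The endpoint metrics named in this abstract (misclassification rate, sensitivity, specificity) all derive from a binary confusion matrix. The following is a minimal sketch of those definitions only; the function names and the toy labels are illustrative assumptions and do not come from the study, which does not publish its code in the abstract.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = EPI, 0 = no EPI)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def screening_metrics(y_true, y_pred):
    """Return misclassification rate, sensitivity, and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "mRate": (fp + fn) / total,        # share of patients classified wrongly
        "sensitivity": tp / (tp + fn),     # share of true EPI cases detected
        "specificity": tn / (tn + fp),     # share of non-EPI cases correctly cleared
    }


# Invented toy labels, not study data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

Computing these on the data used to fit the tree gives the training error; computing them on held-out folds gives the cross-validated prediction error, which is why the two sets of figures in the abstract differ.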
|
27
|
Zhu K, Chang J, Zhang S, Li Y, Zuo J, Ni H, Xie B, Yao J, Xu Z, Bian S, Yan T, Wu X, Chen S, Jin W, Wang Y, Xu P, Song P, Wu Y, Shen C, Zhu J, Yu Y, Dong F. The enhanced connectivity between the frontoparietal, somatomotor network and thalamus as the most significant network changes of chronic low back pain. Neuroimage 2024; 290:120558. [PMID: 38437909 DOI: 10.1016/j.neuroimage.2024.120558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Revised: 02/22/2024] [Accepted: 02/27/2024] [Indexed: 03/06/2024] Open
Abstract
The prolonged duration of chronic low back pain (cLBP) inevitably leads to changes in the cognitive, attentional, sensory and emotional processing brain regions. Currently, it remains unclear how these alterations are manifested in the interplay between brain functional and structural networks. This study aimed to predict the Oswestry Disability Index (ODI) in cLBP patients using multimodal brain magnetic resonance imaging (MRI) data and to identify the most significant features within the multimodal networks to aid in distinguishing patients from healthy controls (HCs). We constructed dynamic functional connectivity (dFC) and structural connectivity (SC) networks for all participants (n = 112) and employed the Connectome-based Predictive Modeling (CPM) approach to predict ODI scores, using various feature selection thresholds to identify the most significant network change features in dFC and SC outcomes. Subsequently, we used these significant features for optimal classifier selection and the integration of multimodal features. The results revealed enhanced connectivity among the frontoparietal network (FPN), somatomotor network (SMN) and thalamus in cLBP patients compared to HCs. The thalamus transmits pain-related sensations and emotions to the cortical areas through the dorsolateral prefrontal cortex (dlPFC) and primary somatosensory cortex (SI), leading to alterations in whole-brain network functionality and structure. Regarding the model selection for the classifier, we found that Support Vector Machine (SVM) best fit these significant network features. The combined model based on dFC and SC features significantly improved classification performance between cLBP patients and HCs (AUC = 0.9772). Finally, the results from an external validation set support our hypotheses and provide insights into the potential applicability of the model in real-world scenarios.
Our discovery of enhanced connectivity between the thalamus and both the dlPFC (FPN) and SI (SMN) provides a valuable supplement to prior research on cLBP.
|
28
|
Cherblanc J, Gaboury S, Maître J, Côté I, Cadell S, Bergeron-Leclerc C. Predicting levels of prolonged grief disorder symptoms during the COVID-19 pandemic: An integrated approach of classical data exploration, predictive machine learning, and explainable AI. J Affect Disord 2024; 351:746-754. [PMID: 38290589 DOI: 10.1016/j.jad.2024.01.236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 01/11/2024] [Accepted: 01/26/2024] [Indexed: 02/01/2024]
Abstract
BACKGROUND Prior studies on Prolonged Grief Disorder (PGD) primarily employed classical approaches to link bereaved individuals' characteristics with PGD symptom levels. This study used machine learning to identify key factors influencing PGD symptoms during the COVID-19 pandemic. METHODS We analyzed data from 479 participants collected through an online survey, employing classical data exploration, predictive machine learning, and SHapley Additive exPlanations (SHAP) to determine, from 19 variables, the key factors influencing PGD symptoms measured with the Traumatic Grief Inventory - Self Report (TGI-SR), and comparing five predictive models. RESULTS The classical approach identified eight variables associated with possible PGD (TGI-SR score ≥ 59): unexpected causes of death, living alone, seeking professional support, taking anxiety and/or depression medications, using more grief services (telephone or online supports) and more confrontation-oriented coping strategies, and higher levels of depression and anxiety. Using machine learning techniques, the CatBoost algorithm provided the best predictive model of the TGI-SR score (r2 = 0.6479). The three variables that most influenced the level of PGD symptoms were anxiety and the levels of avoidance and confrontation coping strategies used. CONCLUSIONS This pioneering approach within the field of grief research enabled us to leverage the extensive dataset collected during the pandemic, facilitating a deeper comprehension of the predominant factors influencing the grieving process for individuals who experienced loss during this period. LIMITATIONS This study is limited by self-selection bias and limited sample diversity; further research is needed to fully understand the predictors of PGD symptoms.
|
29
|
Hu WJ, Bai G, Wang Y, Hong DM, Jiang JH, Li JX, Hua Y, Wang XY, Chen Y. Predictive modeling for postoperative delirium in elderly patients with abdominal malignancies using synthetic minority oversampling technique. World J Gastrointest Oncol 2024; 16:1227-1235. [PMID: 38660665 PMCID: PMC11037067 DOI: 10.4251/wjgo.v16.i4.1227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 01/12/2024] [Accepted: 02/20/2024] [Indexed: 04/10/2024] Open
Abstract
BACKGROUND Postoperative delirium, particularly prevalent in elderly patients after abdominal cancer surgery, presents significant challenges in clinical management. AIM To develop a synthetic minority oversampling technique (SMOTE)-based model for predicting postoperative delirium in elderly abdominal cancer patients. METHODS In this retrospective cohort study, we analyzed data from 611 elderly patients who underwent abdominal malignant tumor surgery at our hospital between September 2020 and October 2022. The incidence of postoperative delirium was recorded for 7 d post-surgery. Patients were divided into delirium and non-delirium groups according to whether postoperative delirium occurred. A multivariate logistic regression model was used to identify risk factors and develop a predictive model for postoperative delirium. The SMOTE technique was applied to enhance the model by oversampling the delirium cases. The model's predictive accuracy was then validated. RESULTS Among the 611 elderly patients with abdominal malignant tumors, multivariate logistic regression analysis identified the following significant risk factors for postoperative delirium: the Charlson comorbidity index, American Society of Anesthesiologists classification, history of cerebrovascular disease, surgical duration, perioperative blood transfusion, and postoperative pain score. The incidence rate of postoperative delirium in our study was 22.91%. The original predictive model (P1) exhibited an area under the receiver operating characteristic curve of 0.862. In comparison, the SMOTE-based logistic early warning model (P2) showed a slightly lower but comparable area under the curve of 0.856, suggesting no significant difference in performance between the two predictive approaches.
CONCLUSION This study confirms that the SMOTE-enhanced predictive model for postoperative delirium in elderly abdominal tumor patients shows performance equivalent to that of traditional methods, effectively addressing data imbalance.
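SMOTE balances a dataset by creating synthetic minority-class samples on the line segment between a minority sample and one of its nearest minority-class neighbours. The sketch below illustrates that interpolation step in plain Python; the function name, the choice of `k`, and the toy feature vectors are assumptions for illustration, not the study's implementation or data.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    minority: list of equal-length numeric tuples (at least two samples).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority-class neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Invented two-feature minority samples (e.g. scaled risk scores), not study data.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(minority, n_synthetic=6)
print(new_points)
```

Because each synthetic point is a convex combination of two real minority samples, it always stays inside the range already spanned by the minority class. In common practice the oversampling is applied to the training portion only, so that evaluation data remain untouched.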
|
30
|
Salazar JK, Fay ML, Khouja BA, Mate M, Zhou X, Lingareddygari P, Liggans G. Dynamics of Listeria monocytogenes and Salmonella enterica on Cooked Vegetables During Storage. J Food Prot 2024; 87:100259. [PMID: 38447927 DOI: 10.1016/j.jfp.2024.100259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 02/25/2024] [Accepted: 02/28/2024] [Indexed: 03/08/2024]
Abstract
Fresh vegetables have been linked to multiple foodborne outbreaks in the U.S., with Listeria monocytogenes and Salmonella enterica identified as leading causes. Beyond raw vegetables, cooked vegetables can also pose food safety concerns due to improper cooking temperature and time combinations or postcooking contamination. Cooked vegetables, having had their native microbiota reduced through heat inactivation, might provide an environment that favors the growth of pathogens due to diminished microbial competition. While the risks associated with raw vegetables are recognized, the survival and growth of pathogens on cooked vegetables remain inadequately studied. This study investigated the growth kinetics of both L. monocytogenes and S. enterica on various cooked vegetables (carrot, corn, onion, green bell pepper, and potato). Vegetables were cooked at 177°C until the internal temperature reached 90°C and then cooled to 5°C. Cooled vegetables were inoculated with a four-strain cocktail of either L. monocytogenes or S. enterica at 3 log CFU/g, then stored at different temperatures (5, 10, or 25°C) for up to 7 days. Both pathogens survived on all vegetables when stored at 5°C. At 10°C, both pathogens proliferated on all vegetables, with the exception of L. monocytogenes on pepper. At 25°C, both pathogens showed their highest growth rates on carrot (5.55 ± 0.22 and 6.42 ± 0.23 log CFU/g/d for L. monocytogenes and S. enterica, respectively). S. enterica displayed higher growth rates than L. monocytogenes at 25°C on all vegetables. Overall, these results bridge the knowledge gap concerning the growth kinetics of both S. enterica and L. monocytogenes on various cooked vegetables, offering insights to further enhance food safety.
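A growth rate in log CFU/g/d is the slope of log-transformed counts against storage time. The abstract does not say how the rates were estimated, so the sketch below uses a plain least-squares line over the exponential phase as a stand-in; the function name and the counts are invented for illustration, and a full primary growth model with lag and stationary phases would be fitted differently.

```python
def growth_rate(days, log_counts):
    """Least-squares slope of log CFU/g versus time, in log CFU/g per day."""
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(log_counts) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, log_counts))
    return sxy / sxx


# Invented counts starting at the 3 log CFU/g inoculum level; not study data.
days = [0.0, 0.25, 0.5, 0.75, 1.0]
log_counts = [3.0, 4.5, 6.0, 7.5, 9.0]
rate = growth_rate(days, log_counts)
print(f"{rate:.2f} log CFU/g/d")
```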
|
31
|
Chamberlain AM, Bergeron NP, Al-Abcha AK, Weston SA, Jiang R, Attia ZI, Friedman PA, Gersh BJ, Noseworthy PA, Siontis KC. Postoperative atrial fibrillation: Prediction of subsequent recurrences with clinical risk modeling and artificial intelligence electrocardiography. CARDIOVASCULAR DIGITAL HEALTH JOURNAL 2024; 5:111-114. [PMID: 38765621 PMCID: PMC11096649 DOI: 10.1016/j.cvdhj.2024.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024] Open
|
32
|
Long X, Huangfu X, Huang R, Liang Y, Wu S, Wang J. The application of machine learning methods for prediction of heavy metal by activated carbons, biochars, and carbon nanotubes. CHEMOSPHERE 2024; 354:141584. [PMID: 38460852 DOI: 10.1016/j.chemosphere.2024.141584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 01/11/2024] [Accepted: 02/28/2024] [Indexed: 03/11/2024]
Abstract
Carbonaceous materials are commonly used as adsorbents for heavy metals. The determination of the adsorption capacity needs time and energy, and the key factors affecting the adsorption capacity have not been determined. Therefore, a new and efficient method is needed to predict the adsorption capacity and explore the decisive factors in the adsorption process. In this study, three tree-based machine learning models (i.e., random forest, gradient boosting decision tree, and extreme gradient boosting) were developed to predict the adsorption capacity of eight heavy metals (i.e., As, Cd, Cr, Cu, Hg, Ni, Pb, and Zn) on activated carbons, biochars, and carbon nanotubes using 3674 data points extracted from 151 journal articles. After a comprehensive comparison, the gradient boosting decision tree had the best performance for a combined model based on all data (R2 = 0.9707, RMSE = 0.1420). Moreover, independent models were developed for three datasets classified by the adsorbent and eight datasets classified by the heavy metals. In addition, a graphical user interface was built to predict the adsorption capacity of heavy metals. This study provides a novel strategy and convenient tool for the removal of heavy metals and can help to improve the removal efficiency of heavy metals to build a healthier world.
|
33
|
Scott Wang HH, Li M, Cahill D, Panagides J, Logvinenko T, Chow J, Nelson C. A machine learning algorithm predicting risk of dilating VUR among infants with hydronephrosis using UTD classification. J Pediatr Urol 2024; 20:271-278. [PMID: 37993352 DOI: 10.1016/j.jpurol.2023.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 10/17/2023] [Accepted: 11/04/2023] [Indexed: 11/24/2023]
Abstract
BACKGROUND Urinary Tract Dilation (UTD) classification was designed as a more objective grading system for evaluating antenatal and post-natal UTD. Because the association between UTD classification and specific anomalies such as vesico-ureteral reflux (VUR) is unclear, management recommendations tend to be subjective. OBJECTIVE We sought to develop a model to reliably predict VUR from early post-natal ultrasound. STUDY DESIGN Radiology records from a single institution were reviewed to identify infants aged 0-90 days undergoing early ultrasound for antenatal UTD. Medical records were reviewed to confirm the diagnosis of VUR. The primary outcome was defined as dilating (≥Gr3) VUR. Exclusion criteria included major congenital urologic anomalies (bilateral renal agenesis, horseshoe kidney, cross fused ectopia, exstrophy) and the absence of a VCUG. Data were split into training/testing sets at a 4:1 ratio. Machine learning (ML) algorithm hyperparameters were tuned on the validation set. RESULTS In total, 280 patients (540 renal units) were included in the study (73% male). Median (IQR) age at ultrasound was 27 (18-38) days. Sixty-six renal units were found to have ≥ grade 3 VUR. The final model included gender, ureteral dilation, parenchymal appearance, parenchymal thickness, and central calyceal dilation. The model predicted VUR with an AUC of 0.81 (0.73-0.88) on out-of-sample testing data. The model is shown in the figure. DISCUSSION We developed an ML model that can predict dilating VUR among patients with hydronephrosis on early ultrasound. The study is limited by the retrospective, single-institution nature of the data source. This is one of the first studies demonstrating high performance for future diagnosis prediction in an early hydronephrosis cohort.
CONCLUSIONS By predicting dilating VUR, our predictive model using machine learning algorithm provides promising performance to facilitate individualized management of children with prenatal hydronephrosis, and identify those most likely to benefit from VCUG. This would allow more selective use of this test, increasing the yield while also minimizing overutilization.
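The AUC reported on out-of-sample data has a direct interpretation: it is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as half. A minimal sketch of that pairwise definition follows; the function name, labels, and scores are invented for illustration and are not the study's model outputs.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    labels: 1 for the positive class (e.g. dilating VUR), 0 otherwise.
    scores: model-predicted risk, higher meaning more likely positive.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Invented held-out labels and predicted risks, not study data.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in sample size, which is fine for a test set of this scale; rank-based formulas give the same value more efficiently on large datasets.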
|
34
|
Scheer JK, Ames CP. Artificial Intelligence in Spine Surgery. Neurosurg Clin N Am 2024; 35:253-262. [PMID: 38423741 DOI: 10.1016/j.nec.2023.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
The amount and quality of data being used in our everyday lives continue to advance at an unprecedented pace. This digital revolution has permeated healthcare, specifically spine surgery, allowing for very advanced and complex computational analytics, such as artificial intelligence (AI) and machine learning (ML). The integration of these methods into clinical practice has just begun, and the following review article will describe AI/ML, demonstrate how it has been applied in adult spinal deformity surgery, and show its potential to improve patient care, touching on future directions.
|
35
|
Charest N, Lowe CN, Ramsland C, Meyer B, Samano V, Williams AJ. Improving predictions of compound amenability for liquid chromatography-mass spectrometry to enhance non-targeted analysis. Anal Bioanal Chem 2024:10.1007/s00216-024-05229-5. [PMID: 38530399 DOI: 10.1007/s00216-024-05229-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 02/14/2024] [Accepted: 02/16/2024] [Indexed: 03/28/2024]
Abstract
Mass-spectrometry-based non-targeted analysis (NTA), in which mass spectrometric signals are assigned chemical identities based on a systematic collation of evidence, is a growing area of interest for toxicological risk assessment. Successful NTA results in better identification of potentially hazardous pollutants within the environment, facilitating the development of targeted analytical strategies to best characterize risks to human and ecological health. A supporting component of the NTA process involves assessing whether suspected chemicals are amenable to the mass spectrometric method, which is necessary in order to assign an observed signal to the chemical structure. Prior work from this group involved the development of a random forest model for predicting the amenability of 5517 unique chemical structures to liquid chromatography-mass spectrometry (LC-MS). This work improves the interpretability of the group's prior model of the same endpoint, as well as integrating 1348 more data points across negative and positive ionization modes. We enhance interpretability by feature engineering, a machine learning practice that reduces the input dimensionality while attempting to preserve performance statistics. We emphasize the importance of interpretable machine learning models within the context of building confidence in NTA identification. The novel data were curated by the labeling of compounds as amenable or unamenable by expert curators, resulting in an enhanced set of chemical compounds to expand the applicability domain of the prior model. The balanced accuracy benchmark of the newly developed model is comparable to performance previously reported (mean CV BA is 0.84 vs. 0.82 in positive mode, and 0.85 vs. 0.82 in negative mode), while on a novel external set, derived from this work's data, the Matthews correlation coefficients (MCC) for the novel models are 0.66 and 0.68 for positive and negative mode, respectively. 
Our group's prior published models scored MCC of 0.55 and 0.54 on the same external sets. This demonstrates appreciable improvement over the chemical space captured by the expanded dataset. This work forms part of our ongoing efforts to develop models with higher interpretability and higher performance to support NTA efforts.
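Balanced accuracy (BA) and the Matthews correlation coefficient (MCC) are both computed from the four cells of a binary confusion matrix; MCC is the stricter of the two because it uses all four cells in one correlation-style ratio. The sketch below shows the standard formulas with invented counts; the function names are illustrative, and nothing here reproduces the authors' models or data.

```python
import math


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC in [-1, 1]; returns 0.0 when any marginal total is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Invented counts for an amenable / unamenable classifier, not study data.
tp, fp, tn, fn = 40, 10, 45, 5
ba = balanced_accuracy(tp, fp, tn, fn)
mcc = matthews_corrcoef(tp, fp, tn, fn)
print(f"BA = {ba:.3f}, MCC = {mcc:.3f}")
```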
|
36
|
Li R, Xiong Z, Ma Y, Li Y, Yang Y, Ma S, Ha C. Enhancing precision medicine: a nomogram for predicting platinum resistance in epithelial ovarian cancer. World J Surg Oncol 2024; 22:81. [PMID: 38509620 PMCID: PMC10956367 DOI: 10.1186/s12957-024-03359-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 03/08/2024] [Indexed: 03/22/2024] Open
Abstract
BACKGROUND This study aimed to develop a novel nomogram that can accurately estimate platinum resistance to enhance precision medicine in epithelial ovarian cancer (EOC). METHODS EOC patients who received primary therapy at the General Hospital of Ningxia Medical University between January 31, 2019, and June 30, 2021, were included. LASSO analysis was used to screen the variables, which comprised clinical features and platinum-resistance gene immunohistochemistry scores. A nomogram was created after logistic regression analysis to develop the prediction model. The consistency index (C-index), calibration curve, receiver operating characteristic (ROC) curve, and decision curve analysis (DCA) were used to assess the nomogram's performance. RESULTS The logistic regression analysis created a prediction model based on 11 factors filtered down by LASSO regression. The predictors were the immunohistochemical scores of CXCL1, CXCL2, IL6, ABCC1, LRP, and BCL2, together with vascular tumor thrombus, ascites cancer cells, maximum tumor diameter, neoadjuvant chemotherapy, and HE4. The C-index of the nomogram was 0.975. At a cut-off value of 165.6, the ROC curve showed a specificity of 95.35% and a sensitivity of 92.59%. After the nomogram was externally validated in the test cohort, the coincidence rate was 84%, and the ROC curve indicated an AUC of 0.949. CONCLUSION A nomogram containing clinical characteristics and platinum gene IHC scores was developed and validated to predict the risk of EOC platinum resistance.
|
37
|
Allred AR, Clark TK. A computational model of motion sickness dynamics during passive self-motion in the dark. Exp Brain Res 2024:10.1007/s00221-024-06804-z. [PMID: 38489025 DOI: 10.1007/s00221-024-06804-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 02/08/2024] [Indexed: 03/17/2024]
Abstract
Predicting the time course of motion sickness symptoms enables the evaluation of provocative stimuli and the development of countermeasures for reducing symptom severity. In pursuit of this goal, we present an Observer-driven model of motion sickness for passive motions in the dark. Constructed in two stages, this model predicts motion sickness symptoms by bridging sensory conflict (i.e., differences between actual and expected sensory signals) arising from the Observer model of spatial orientation perception (stage 1) to Oman's model of motion sickness symptom dynamics (stage 2; presented in 1982 and 1990) through a proposed "Normalized Innovation Squared" statistic. The model outputs the expected temporal development of human motion sickness symptom magnitudes (mapped to the Misery Scale) at a population level, due to arbitrary, 6-degree-of-freedom, self-motion stimuli. We trained model parameters using individual subject responses collected during fore-aft translations and off-vertical axis of rotation motions. Improving on prior efforts, we only used datasets with experimental conditions congruent with the perceptual stage (i.e., adequately provided passive motions without visual cues) to inform the model. We assessed model performance by predicting an unseen validation dataset, producing a Q2 value of 0.91. Demonstrating this model's broad applicability, we formulate predictions for a host of stimuli, including translations, earth-vertical rotations, and altered gravity, and we provide our implementation for other users. Finally, to guide future research efforts, we suggest how to rigorously advance this model (e.g., incorporating visual cues, active motion, responses to motion of different frequency, etc.).
|
38
|
Wen YR, Lin XW, Zhou YW, Xu L, Zhang JL, Chen CY, He J. N-glycan biosignatures as a potential diagnostic biomarker for early-stage pancreatic cancer. World J Gastrointest Oncol 2024; 16:659-669. [PMID: 38577461 PMCID: PMC10989390 DOI: 10.4251/wjgo.v16.i3.659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 12/21/2023] [Accepted: 01/18/2024] [Indexed: 03/12/2024] Open
Abstract
BACKGROUND Pancreatic ductal adenocarcinoma (PDAC) has a poor prognosis, with a 5-year survival rate of less than 10%, owing to its late-stage diagnosis. Early detection of pancreatic cancer (PC) can significantly increase survival rates. AIM To identify the serum biomarker signatures associated with early-stage PDAC by serum N-glycan analysis. METHODS An extensive patient cohort was used to determine a biomarker signature, including patients with PDAC that was well-defined at an early stage (stages I and II). The biomarker signature was derived from a case-control study using a case-cohort design consisting of 29 patients with stage I, 22 with stage II, 4 with stage III, 16 with stage IV PDAC, and 88 controls. We used multiparametric analysis to identify early-stage PDAC N-glycan signatures and developed an N-glycan signature-based diagnosis model called the "Glyco-model". RESULTS The biomarker signature was created to discriminate samples derived from patients with PC from those of controls, with a receiver operating characteristic area under the curve of 0.86. In addition, the biomarker signature combined with cancer antigen 19-9 could discriminate patients with PDAC from controls, with a receiver operating characteristic area under the curve of 0.919. Glyco-model demonstrated favorable diagnostic performance in all stages of PC. The diagnostic sensitivity for stage I PDAC was 89.66%. CONCLUSION In a prospective validation study, this serum biomarker signature may offer a viable method for detecting early-stage PDAC.
|
39
|
Kleinstreuer N, Hartung T. Artificial intelligence (AI)-it's the end of the tox as we know it (and I feel fine). Arch Toxicol 2024; 98:735-754. [PMID: 38244040 PMCID: PMC10861653 DOI: 10.1007/s00204-023-03666-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 12/12/2023] [Indexed: 01/22/2024]
Abstract
The rapid progress of AI impacts diverse scientific disciplines, including toxicology, and has the potential to transform chemical safety evaluation. Toxicology has evolved from an empirical science focused on observing apical outcomes of chemical exposure, to a data-rich field ripe for AI integration. The volume, variety and velocity of toxicological data from legacy studies, literature, high-throughput assays, sensor technologies and omics approaches create opportunities but also complexities that AI can help address. In particular, machine learning is well suited to handle and integrate large, heterogeneous datasets that are both structured and unstructured-a key challenge in modern toxicology. AI methods like deep neural networks, large language models, and natural language processing have successfully predicted toxicity endpoints, analyzed high-throughput data, extracted facts from literature, and generated synthetic data. Beyond automating data capture, analysis, and prediction, AI techniques show promise for accelerating quantitative risk assessment by providing probabilistic outputs to capture uncertainties. AI also enables explanation methods to unravel mechanisms and increase trust in modeled predictions. However, issues like model interpretability, data biases, and transparency currently limit regulatory endorsement of AI. Multidisciplinary collaboration is needed to ensure development of interpretable, robust, and human-centered AI systems. Rather than just automating human tasks at scale, transformative AI can catalyze innovation in how evidence is gathered, data are generated, hypotheses are formed and tested, and tasks are performed to usher new paradigms in chemical safety assessment. Used judiciously, AI has immense potential to advance toxicology into a more predictive, mechanism-based, and evidence-integrated scientific discipline to better safeguard human and environmental wellbeing across diverse populations.
|
40
|
Pérez-Millan A, Borrego-Écija S, Falgàs N, Juncà-Parella J, Bosch B, Tort-Merino A, Antonell A, Bargalló N, Rami L, Balasa M, Lladó A, Sala-Llonch R, Sánchez-Valle R. Cortical thickness modeling and variability in Alzheimer's disease and frontotemporal dementia. J Neurol 2024; 271:1428-1438. [PMID: 38012398 PMCID: PMC10896866 DOI: 10.1007/s00415-023-12087-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 09/29/2023] [Accepted: 10/31/2023] [Indexed: 11/29/2023]
Abstract
BACKGROUND AND OBJECTIVE Alzheimer's disease (AD) and frontotemporal dementia (FTD) show different patterns of cortical thickness (CTh) loss compared with healthy controls (HC), even though there is relevant heterogeneity between individuals suffering from each of these diseases. Thus, we developed CTh models to study individual variability in AD, FTD, and HC. METHODS We used the baseline CTh measures of 379 participants obtained from the structural MRI processed with FreeSurfer. A total of 169 AD patients (63 ± 9 years, 65 men), 88 FTD patients (64 ± 9 years, 43 men), and 122 HC (62 ± 10 years, 47 men) were studied. We fitted region-wise temporal models of CTh using Support Vector Regression. Then, we studied associations of individual deviations from the model with cerebrospinal fluid levels of neurofilament light chain (NfL) and 14-3-3 protein and Mini-Mental State Examination (MMSE). Furthermore, we used real longitudinal data from 144 participants to test model predictivity. RESULTS We defined CTh spatiotemporal models for each group with a reliable fit. Individual deviation correlated with MMSE for AD and with NfL for FTD. AD patients with higher deviations from the trend presented higher MMSE values. In FTD, lower NfL levels were associated with higher deviations from the CTh prediction. For AD and HC, we could predict longitudinal visits with the presented model trained with baseline data. For FTD, the longitudinal visits had more variability. CONCLUSION We highlight the value of CTh models for studying AD and FTD longitudinal changes and variability and their relationships with cognitive features and biomarkers.
|
41
|
Mercea PV, Ossberger M, Wyrwich R, Herburger M, Barge V, Aluri R, Toşa V. Modeling the Migration Behavior of Extractables from Mono- and Multilayer Polyolefin Films to Mathematically Predict the Concentration of Leachable Impurities in Pharmaceutical Drug Products. Part 2: Conservative Diffusion and Partition Coefficient Determinations. PDA J Pharm Sci Technol 2024; 78:33-44. [PMID: 37580130 DOI: 10.5731/pdajpst.2022.012817] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 07/17/2023] [Indexed: 08/16/2023]
Abstract
In the development of pharmaceutical drug product packaging, an important step is to demonstrate acceptable levels of leachable impurities migrating from the packaging material into the drug product during its shelf life and therapeutic use. Such migration processes can be quantified either by analytical methods (which are often challenging and labor intensive) or (in many cases) through theoretical modeling, which is a reliable, quick, and cost-effective method to forecast the level of leachable impurities in the packaged drug when the diffusion and partition coefficients are known. In the previous part, it was shown how these parameters can be determined experimentally, and subsequent theoretical fitting of the results for a series of low- and high-molecular-weight organic compounds (known leachables) in a series of polyolefin materials was performed. One interpretation of these results is that a theoretical calculation can be made only for organic compounds and materials whose diffusion/partition/solubility coefficients were determined experimentally and for which theoretical fitting was achieved. However, in practice, there will be situations in which other leachable compounds may have to be investigated. In such cases, strictly speaking, it would be necessary to perform the whole experimental and fitting procedure for the new compound before proper theoretical modeling is possible. But this would make the theoretical calculation of a leaching process from a pharmaceutical packaging material a cumbersome and cost-intensive procedure. To address this problem, the pools of diffusion and partition coefficients were used to develop an approach that allows the estimation, without any additional experimentation, of so-called "conservative" diffusion and partition coefficients for a much wider range of potential leachables in the polyolefin pharmaceutical packaging materials and aqueous solutions investigated previously.
|
42
|
Mercea PV, Ossberger M, Wyrwich R, Herburger M, Barge V, Aluri R, Toşa V. Modeling the Migration Behavior of Extractables from Mono- and Multilayer Polyolefin Films to Mathematically Predict the Concentration of Leachable Impurities in Pharmaceutical Drug Products. Part 1: Experimental Details and Modeling Experimental Results. PDA J Pharm Sci Technol 2024; 78:3-32. [PMID: 37580127 DOI: 10.5731/pdajpst.2022.012816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2022] [Accepted: 07/17/2023] [Indexed: 08/16/2023]
Abstract
An important step in the development of a pharmaceutical drug product is to demonstrate acceptable levels of leachable impurities during the shelf life and therapeutic use of the drug product. If the diffusion and partition coefficients are known, the concentration profile of a leachable impurity in the drug product can be predicted theoretically at a given temperature and time. With this objective in mind, kinetic experiments were performed to study the migration of low- to high-molecular-weight organic compounds from mono- and multilayer polyolefin films. Migration curves at different temperatures were generated for each compound when these films were brought into contact with aqueous solutions of varying pH or with another plastic film made from a different polyolefin material. "Best fit" migration curves and the corresponding diffusion and partition coefficients (about 300 values) were obtained by using numerical software developed by FABES. The results show that, in general, the correlation between the calculated diffusion and partition coefficients and temperature, between 30°C and 85°C, obeys the Arrhenius and van't Hoff equations. In this temperature range, the diffusion and partition coefficients can be used to model and predict migration of the investigated compounds from the same pharmaceutical packaging materials. A comparison of these coefficient values with those of other polyolefin films also provides insights into the chemistry of the mono- and multilayers and its impact on the migration behavior of the compounds. In a subsequent paper, an approach to overestimate the diffusion and partition coefficients to account for the variability in experimental data is explained and, finally, the use of these overestimated parameters to predict the concentrations of other compounds leaching from the multilayer films into aqueous drug product formulations is discussed.
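The Arrhenius temperature dependence mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard form D(T) = D0 · exp(−Ea / (R·T)) and fits it to two points; the function names and the numeric values in the usage note are hypothetical and are not taken from the paper or from the FABES software.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol*K)


def arrhenius_fit(t1_c, d1, t2_c, d2):
    """Return (D0, Ea) for D(T) = D0 * exp(-Ea / (R * T)).

    Inputs are two (temperature in deg C, diffusion coefficient) points.
    Hypothetical helper for illustration only.
    """
    t1 = t1_c + 273.15
    t2 = t2_c + 273.15
    # ln(D2 / D1) = (Ea / R) * (1/T1 - 1/T2)
    ea = R_GAS * math.log(d2 / d1) / (1.0 / t1 - 1.0 / t2)
    d0 = d1 * math.exp(ea / (R_GAS * t1))
    return d0, ea


def diffusion_at(t_c, d0, ea):
    """Evaluate the fitted Arrhenius expression at t_c (deg C)."""
    return d0 * math.exp(-ea / (R_GAS * (t_c + 273.15)))
```

For example, `d0, ea = arrhenius_fit(30, 1e-10, 85, 1e-8)` followed by `diffusion_at(60, d0, ea)` interpolates between two made-up measurements. The van't Hoff relation for a partition coefficient has the same exponential form in 1/T, so the same two-point approach would apply under that assumption.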
|
43
|
Gallo E. Revolutionizing Synthetic Antibody Design: Harnessing Artificial Intelligence and Deep Sequencing Big Data for Unprecedented Advances. Mol Biotechnol 2024:10.1007/s12033-024-01064-2. [PMID: 38308755 DOI: 10.1007/s12033-024-01064-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 01/02/2024] [Indexed: 02/05/2024]
Abstract
Synthetic antibodies (Abs) represent a category of engineered proteins meticulously crafted to replicate the functions of their natural counterparts. Such Abs are generated in vitro, enabling advanced molecular alterations associated with antigen recognition, paratope site engineering, and biochemical refinements. In a parallel realm, deep sequencing has brought about a paradigm shift in molecular biology. It facilitates the prompt and cost-effective high-throughput sequencing of DNA and RNA molecules, enabling the comprehensive big data analysis of Ab transcriptomes, including specific regions of interest. Significantly, the integration of artificial intelligence (AI), based on machine- and deep- learning approaches, has fundamentally transformed our capacity to discern patterns hidden within deep sequencing big data, including distinctive Ab features and protein folding free energy landscapes. Ultimately, current AI advances can generate approximations of the most stable Ab structural configurations, enabling the prediction of de novo synthetic Abs. As a result, this manuscript comprehensively examines the latest and relevant literature concerning the intersection of deep sequencing big data and AI methodologies for the design and development of synthetic Abs. Together, these advancements have accelerated the exploration of antibody repertoires, contributing to the refinement of synthetic Ab engineering and optimizations, and facilitating advancements in the lead identification process.
|
44
|
Freundlich RE, Clifton JC, Epstein RH, Pandharipande PP, Grogan TR, Moore RP, Byrne DW, Fabbro M, Hofer IS. External validation of a predictive model for reintubation after cardiac surgery: A retrospective, observational study. J Clin Anesth 2024; 92:111295. [PMID: 37883900 PMCID: PMC10872431 DOI: 10.1016/j.jclinane.2023.111295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 08/24/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023]
Abstract
STUDY OBJECTIVE Explore validation of a model to predict patients' risk of failing extubation, to help providers make informed, data-driven decisions regarding the optimal timing of extubation. DESIGN We performed temporal, geographic, and domain validations of a model for the risk of reintubation after cardiac surgery by assessing its performance on data sets from three academic medical centers, with temporal validation using data from the institution where the model was developed. SETTING Three academic medical centers in the United States. PATIENTS Adult patients arriving in the cardiac intensive care unit with an endotracheal tube in place after cardiac surgery. INTERVENTIONS Receiver operating characteristic (ROC) curves and concordance statistics were used as measures of discriminative ability, and calibration curves and Brier scores were used to assess the model's predictive ability. MEASUREMENTS Temporal validation was performed in 1642 patients with a reintubation rate of 4.8%, with the model demonstrating strong discrimination (optimism-corrected c-statistic 0.77) and low predictive error (Brier score 0.044) but poor model precision and recall (Optimal F1 score 0.29). Combined domain and geographic validation were performed in 2041 patients with a reintubation rate of 1.5%. The model displayed solid discriminative ability (optimism-corrected c-statistic = 0.73) and low predictive error (Brier score = 0.0149) but low precision and recall (Optimal F1 score = 0.13). Geographic validation was performed in 2489 patients with a reintubation rate of 1.6%, with the model displaying good discrimination (optimism-corrected c-statistic = 0.71) and predictive error (Brier score = 0.0152) but poor precision and recall (Optimal F1 score = 0.13). MAIN RESULTS The reintubation model displayed strong discriminative ability and low predictive error within each validation cohort. CONCLUSIONS Future work is needed to explore how to optimize models before local implementation.
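The three metrics reported in this abstract (c-statistic, Brier score, F1 score) can be computed as in the following minimal sketch. The function names and toy data are hypothetical; the sketch does not reproduce the study's optimism correction or its choice of the "optimal" F1 threshold.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def c_statistic(y_true, y_prob):
    """Probability that a random event case gets a higher predicted risk
    than a random non-event case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def f1_at_threshold(y_true, y_prob, threshold):
    """F1 score when predicted risks >= threshold are labelled positive."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

One point worth keeping in mind when reading the figures above: for a rare outcome, a constant prediction equal to the event rate p already yields a Brier score of p(1 − p), which is about 0.0148 at a 1.5% event rate. A low Brier score in such a cohort is therefore compatible with low precision and recall.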
|
45
|
Jiang Y, Huang D, Chen Q, Yu Y, Hu Y, Wang Y, Chen R, Yao L, Zhong X, Kong L, Yu Q, Lu J, Li Y, Shi Y. A novel online calculator based on clinical features and hematological parameters to predict total skin clearance in patients with moderate to severe psoriasis. J Transl Med 2024; 22:121. [PMID: 38297242 PMCID: PMC10829231 DOI: 10.1186/s12967-023-04847-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 12/29/2023] [Indexed: 02/02/2024] Open
Abstract
BACKGROUND Treatment responses to biologic agents vary between patients with moderate to severe psoriasis; while some patients achieve total skin clearance (TSC), a proportion of patients may only experience partial improvement. OBJECTIVE This study was designed to identify potential predictors for achieving TSC in psoriasis patients treated with IL-17 inhibitors. It also aimed to develop an easy-to-use calculator incorporating these factors in a nomogram to predict TSC response. METHODS A total of 381 patients with psoriasis receiving ixekizumab were included in the development cohort and 229 psoriasis patients who initiated secukinumab treatment were included in the validation cohort. The study endpoint was achieving TSC after 12 weeks of IL-17 inhibitor treatment, defined as a 100% improvement in Psoriasis Area and Severity Index (PASI 100). Multivariate Cox regression analyses and LASSO analysis were performed to identify clinical predictors and blood predictors, respectively. RESULTS The following parameters were identified as predictive factors associated with TSC: previous biologic treatment, joint involvement, genital area affected, early response (PASI 60 at week 4), neutrophil counts and uric acid levels. The nomogram model incorporating these factors achieved good discrimination in the development cohort (AUC, 0.721; 95% CI 0.670-0.773) and validation cohort (AUC, 0.715; 95% CI 0.665-0.760). The calibration curves exhibited a satisfactory fit, indicating the accuracy of the model. Furthermore, the decision curve analysis confirmed the clinical utility of the nomogram, highlighting its favorable value for practical application. A web-based online calculator has been developed to enhance the efficiency of clinical applications. CONCLUSIONS This study developed a practical and clinically applicable nomogram model for the prediction of TSC in patients with moderate to severe psoriasis. The nomogram model demonstrated robust predictive performance and exhibited significant clinical utility. TRIAL REGISTRATION A multi-center clinical study of systemic treatment strategies for psoriasis in Chinese population; ChiCTR2000036186; Registered 31 August 2020; https://www.chictr.org.cn/showproj.html?proj=58256
|
46
|
Zou ZH, Liu XQ, Li WH, Zhou XT, Li XF. Development and validation of multiple linear regression models for predicting total hip arthroplasty acetabular prosthesis. J Orthop Surg Res 2024; 19:73. [PMID: 38233875 DOI: 10.1186/s13018-024-04526-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 01/01/2024] [Indexed: 01/19/2024] Open
Abstract
PURPOSE To establish a multivariate linear equation to predict the diameter (outer diameter) of the acetabular prosthesis used in total hip arthroplasty (THA). METHODS A cohort of 258 individuals who underwent THA at our medical facility was included in this study. The independent variables were the patients' height, weight, foot length, sex, age, and surgical access. The dependent variable was the diameter of the acetabular prosthesis used during the surgical procedure. The entire cohort dataset was randomly partitioned into a training cohort and a validation cohort at a ratio of 7:3 using SPSS 26.0 software. Pearson correlation analysis was conducted to examine the relationships between the patients' height, weight, foot length, sex, age, surgical access, and the diameter of the acetabular prosthesis in the training cohort. Additionally, a multiple linear regression equation was developed using the independent variables from the training cohort and the diameter of the acetabular prosthesis as the dependent variable. This equation aimed to predict the diameter of the acetabular prosthesis based on the patients' characteristics. The accuracy of the equation was evaluated by substituting the data of the validation cohort into the multiple linear regression equation. The predicted acetabular prosthesis diameters were then compared with the actual diameters used in the operation. RESULTS The correlation analysis conducted on the training cohort revealed that surgical access (r = 0.054) and age (r = -0.120) exhibited no significant correlation with the diameter of the acetabular prosthesis used intraoperatively. Conversely, height (r = 0.687), weight (r = 0.654), foot length (r = 0.687), and sex (r = 0.354) demonstrated a significant correlation with the diameter of the acetabular prosthesis used intraoperatively. Furthermore, a predictive equation, denoted as Y (acetabular prosthesis diameter in mm) = 20.592 + 0.548 × foot length (cm) + 0.083 × height (cm) + 0.077 × weight (kg), was derived. This equation predicted the diameter within one size with an accuracy rate of 64.94% and within two sizes with an accuracy rate of 94.81%. CONCLUSION Anthropometric data can accurately predict the diameter of the acetabular prosthesis during total hip arthroplasty.
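The regression equation quoted in this abstract can be evaluated directly, as in the short sketch below. The coefficients come from the abstract; the function names, the example patient, and the 2 mm spacing between implant sizes are assumptions made for illustration, since the abstract does not state the size increment.

```python
def predict_cup_diameter_mm(foot_length_cm, height_cm, weight_kg):
    """Evaluate the regression equation reported in the abstract."""
    return (20.592
            + 0.548 * foot_length_cm
            + 0.083 * height_cm
            + 0.077 * weight_kg)


def nearest_size_mm(diameter_mm, step_mm=2):
    """Round to the nearest implant size.

    ASSUMPTION: sizes are spaced step_mm apart (2 mm here); the abstract
    does not specify the increment.
    """
    return step_mm * round(diameter_mm / step_mm)


def within_n_sizes(predicted_mm, actual_mm, n=1, step_mm=2):
    """True if the rounded prediction is at most n sizes from the actual cup."""
    return abs(nearest_size_mm(predicted_mm, step_mm) - actual_mm) <= n * step_mm
```

For a hypothetical patient with a 25 cm foot length, 170 cm height, and 70 kg weight, `predict_cup_diameter_mm(25, 170, 70)` gives 53.792 mm, which rounds to a 54 mm cup under the assumed 2 mm spacing.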
|
47
|
Mosaid H, Barakat A, John K, Faouzi E, Bustillo V, El Garnaoui M, Heung B. Improved soil carbon stock spatial prediction in a Mediterranean soil erosion site through robust machine learning techniques. Environ Monit Assess 2024; 196:130. [PMID: 38198014 DOI: 10.1007/s10661-024-12294-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 01/01/2024] [Indexed: 01/11/2024]
Abstract
Soil serves as a reservoir for organic carbon stock, which indicates soil quality and fertility within the terrestrial ecosystem. Therefore, it is crucial to comprehend the spatial distribution of soil organic carbon stock (SOCS) and the factors influencing it to achieve sustainable practices and ensure soil health. Thus, the present study aimed to apply four machine learning (ML) models, namely, random forest (RF), k-nearest neighbors (kNN), support vector machine (SVM), and Cubist model tree (Cubist), to improve the prediction of SOCS in the Srou catchment located in the Upper Oum Er-Rbia watershed, Morocco. From an inventory of 120 sample points, 80% were used for training the model, with the remaining 20% set aside for model testing. Boruta's algorithm and the multicollinearity test identified only nine factors as the controlling factors selected as input data for predicting SOCS. As a result, spatial distribution maps for SOCS were generated for all models, then compared, and further validated using statistical metrics. Among the models tested, the RF model exhibited the best performance (R² = 0.76, RMSE = 0.52 Mg C/ha, NRMSE = 0.13, and MAE = 0.34 Mg C/ha), followed closely by the SVM model (R² = 0.68, RMSE = 0.59 Mg C/ha, NRMSE = 0.15, and MAE = 0.34 Mg C/ha) and Cubist model (R² = 0.64, RMSE = 0.63 Mg C/ha, NRMSE = 0.16, and MAE = 0.43 Mg C/ha), while the kNN model had the lowest performance (R² = 0.31, RMSE = 0.94 Mg C/ha, NRMSE = 0.24, and MAE = 0.63 Mg C/ha). However, bulk density, pH, electrical conductivity, and calcium carbonate were the most important factors for spatially predicting SOCS in this semi-arid region. Hence, the methodology used in this study, which relies on ML algorithms, holds the potential for modeling and mapping SOCS and soil properties in comparable contexts elsewhere.
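The validation metrics listed in this abstract can be computed from paired observed and predicted values, as in the minimal sketch below. The function name and toy data are hypothetical. The abstract does not say how NRMSE was normalized; dividing RMSE by the range of the observations is an assumption here (dividing by the mean is another common convention).

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, NRMSE and MAE for paired observed/predicted values.

    ASSUMPTION: NRMSE = RMSE / (max(observed) - min(observed)).
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    nrmse = rmse / (max(observed) - min(observed))
    return {"r2": r2, "rmse": rmse, "nrmse": nrmse, "mae": mae}
```

Applied to a held-out test split such as the 20% described above, each model's predictions would be passed through the same function so that the four models are compared on identical terms.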
|
48
|
Holster T, Ji S, Marttinen P. Risk adjustment for regional healthcare funding allocations with ensemble methods: an empirical study and interpretation. Eur J Health Econ 2024:10.1007/s10198-023-01656-w. [PMID: 38170332 DOI: 10.1007/s10198-023-01656-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 11/24/2023] [Indexed: 01/05/2024]
Abstract
We experiment with recent ensemble machine learning methods in estimating healthcare costs, utilizing Finnish data containing rich individual-level information on healthcare costs, socioeconomic status and diagnostic data from multiple registries. Our data are a random 10% sample (553,675 observations) from the Finnish population in 2017. Using annual healthcare cost in 2017 as a response variable, we compare the performance of Random forest, Gradient Boosting Machine (GBM) and eXtreme Gradient Boosting (XGBoost) to linear regression. As machine learning methods are often seen as unsuitable in risk adjustment applications because of their relative opaqueness, we also introduce visualizations from the machine learning literature to help interpret the contribution of individual variables to the prediction. Our results show that ensemble machine learning methods can improve predictive performance, with all of them significantly outperforming linear regression, and that a certain level of interpretation can be provided for them. We also find individual-level socioeconomic variables to improve prediction accuracy and that their effect is larger for machine learning methods. However, we find that the predictions used for funding allocations are sensitive to model selection, highlighting the need for comprehensive robustness testing when estimating risk adjustment models used in applications.
|
49
|
Hong C, Liu M, Wojdyla DM, Hickey J, Pencina M, Henao R. Trans-Balance: Reducing demographic disparity for prediction models in the presence of class imbalance. J Biomed Inform 2024; 149:104532. [PMID: 38070817 PMCID: PMC10850917 DOI: 10.1016/j.jbi.2023.104532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Revised: 10/21/2023] [Accepted: 10/28/2023] [Indexed: 12/21/2023]
Abstract
INTRODUCTION Risk prediction, including early disease detection, prevention, and intervention, is essential to precision medicine. However, systematic bias in risk estimation caused by heterogeneity across different demographic groups can lead to inappropriate or misinformed treatment decisions. In addition, low incidence (class-imbalance) outcomes negatively impact the classification performance of many standard learning algorithms which further exacerbates the racial disparity issues. Therefore, it is crucial to improve the performance of statistical and machine learning models in underrepresented populations in the presence of heavy class imbalance. METHOD To address demographic disparity in the presence of class imbalance, we develop a novel framework, Trans-Balance, by leveraging recent advances in imbalance learning, transfer learning, and federated learning. We consider a practical setting where data from multiple sites are stored locally under privacy constraints. RESULTS We show that the proposed Trans-Balance framework improves upon existing approaches by explicitly accounting for heterogeneity across demographic subgroups and cohorts. We demonstrate the feasibility and validity of our methods through numerical experiments and a real application to a multi-cohort study with data from participants of four large, NIH-funded cohorts for stroke risk prediction. CONCLUSION Our findings indicate that the Trans-Balance approach significantly improves predictive performance, especially in scenarios marked by severe class imbalance and demographic disparity. Given its versatility and effectiveness, Trans-Balance offers a valuable contribution to enhancing risk prediction in biomedical research and related fields.
|
50
|
Li T, Li Z, Guo S, Jiang S, Sun Q, Wu Y, Tian J. The value of using left ventricular pressure-strain loops to evaluate myocardial work in predicting heart failure with improved ejection fraction. Int J Cardiol 2024; 394:131366. [PMID: 37734490 DOI: 10.1016/j.ijcard.2023.131366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 08/25/2023] [Accepted: 09/15/2023] [Indexed: 09/23/2023]
Abstract
BACKGROUND The ultrasound left ventricular pressure-strain loop (LV PSL) was applied to evaluate myocardial work in patients with heart failure with improved ejection fraction (HFimpEF) versus patients with persistent heart failure with reduced ejection fraction (HFrEF) to investigate the value of myocardial work parameters in predicting HFimpEF. METHODS We enrolled 120 patients with HFrEF and recorded their clinical characteristics and echocardiographic parameters (PSL technique). Patients were divided into an HFimpEF group or a persistent HFrEF group according to the outcome of follow-up. Clinical and echocardiographic parameters that differed between groups were determined by Student's t-test. We identified the important echocardiographic parameters for predicting whether patients would recover to HFimpEF using univariate logistic regression analysis and ROC curves. In addition, multivariate logistic regression models were constructed and evaluated using the DeLong test and decision curve analysis. RESULTS Firstly, the HFimpEF group had a higher prevalence of hypertension and higher systolic blood pressure (P-values <0.05). In terms of echocardiographic parameters, the HFimpEF group also had higher LVEF, LV GLS, GCW, GWE, and GWI and lower LVEDD (P-values <0.01). In particular, LVEF, LVEDD, GLS, GWI, and GCW were robust predictors of the conversion of HFrEF patients to HFimpEF (AUC >0.70, P-values <0.05). Finally, we determined that the predictive Model 4 (LVEF, LVEDD, GLS, and GCW) had the optimal diagnostic power. CONCLUSION The model combining GCW with LVEF, LVEDD, and GLS has important predictive value for HFimpEF and is an effective clinical decision-making tool for disease assessment.
|