Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhou QM, Zhe L, Brooke RJ, Hudson MM, Yuan Y. A relationship between the incremental values of area under the ROC curve and of area under the precision-recall curve. Diagn Progn Res 2021;5:13. [PMID: 34261544 PMCID: PMC8278775 DOI: 10.1186/s41512-021-00102-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/23/2020] [Accepted: 06/08/2021] [Indexed: 11/17/2022]



Cited by Other Article(s)

Kubiak KB, Więckowska B, Jodłowska-Siewert E, Guzik P. Visualising and quantifying the usefulness of new predictors stratified by outcome class: The U-smile method. PLoS One 2024;19:e0303276. [PMID: 38768166 PMCID: PMC11104627 DOI: 10.1371/journal.pone.0303276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Accepted: 04/22/2024] [Indexed: 05/22/2024]

Shi J, Tang J, Liu L, Zhang C, Chen W, Qi M, Han Z, Chen X. Integrative Analyses of Bulk and Single-Cell RNA Seq Identified the Shared Genes in Acute Respiratory Distress Syndrome and Rheumatoid Arthritis. Mol Biotechnol 2024:10.1007/s12033-024-01141-6. [PMID: 38656728 DOI: 10.1007/s12033-024-01141-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Accepted: 03/06/2024] [Indexed: 04/26/2024]

Abstract

Acute respiratory distress syndrome (ARDS), a progressive status of acute lung injury (ALI), is primarily caused by an immune-mediated inflammatory disorder, which can be an acute pulmonary complication of rheumatoid arthritis (RA). As a chronic inflammatory disease regulated by the immune system, RA is closely associated with the occurrence and progression of respiratory diseases. However, it remains elusive whether there are shared genes between the molecular mechanisms underlying RA and ARDS. The objective of this study is to identify potential shared genes for further clinical drug discovery through integrated analysis of bulk RNA sequencing datasets obtained from the Gene Expression Omnibus database, employing differentially expressed genes (DEGs) analysis and weighted gene co-expression network analysis (WGCNA). The hub genes were identified through the intersection of common DEGs and WGCNA-derived genes. The Random Forest (RF) and least absolute shrinkage and selection operator (LASSO) algorithms were subsequently employed to identify key shared target genes associated with the two diseases. Additionally, RA immune infiltration analysis and COVID-19 single-cell transcriptome analysis revealed the correlation between these key genes and immune cells. A total of 59 shared genes were identified from the intersection of DEGs and gene clusters obtained through WGCNA, which analyzed the integrated gene matrix of ALI/ARDS and RA. The RF and LASSO algorithms were employed to screen for target genes specific to ALI/ARDS and RA, respectively. The final set of overlapping genes (FCMR, ADAM28, HK3, GRB10, UBE2J1, HPSE, DDX24, BATF, and CST7) all exhibited a strong predictive effect, with an area under the curve (AUC) value greater than 0.8. The immune infiltration analysis then revealed a strong correlation between UBE2J1 and plasma cells in RA. Furthermore, scRNA-seq analysis demonstrated differential expression of these nine target genes primarily in T cells and NK cells, with CST7 showing a significant positive correlation specifically with NK cells. In addition, transcriptome sequencing was conducted on lung tissue collected from ALI mice, confirming the substantial differential expression of FCMR, HK3, UBE2J1, and BATF. This study provides unprecedented evidence linking the pathophysiological mechanisms of ALI/ARDS and RA to immune regulation, which offers novel understanding for future clinical treatment and experimental research.


Li Q, Tang X, Li W. Potential diagnostic markers and biological mechanism for osteoarthritis with obesity based on bioinformatics analysis. PLoS One 2023;18:e0296033. [PMID: 38127891 PMCID: PMC10735003 DOI: 10.1371/journal.pone.0296033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 12/04/2023] [Indexed: 12/23/2023]

Chen SF, Su CC, Huang CC, Ogink PT, Yen HK, Groot OQ, Hu MH. External validation of machine learning algorithm predicting prolonged opioid prescriptions in opioid-naïve lumbar spine surgery patients using a Taiwanese cohort. J Formos Med Assoc 2023;122:1321-1330. [PMID: 37453900 DOI: 10.1016/j.jfma.2023.06.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 06/26/2023] [Accepted: 06/30/2023] [Indexed: 07/18/2023]

Kurosawa R, Iida K, Ajiro M, Awaya T, Yamada M, Kosaki K, Hagiwara M. PDIVAS: Pathogenicity predictor for Deep-Intronic Variants causing Aberrant Splicing. BMC Genomics 2023;24:601. [PMID: 37817060 PMCID: PMC10563346 DOI: 10.1186/s12864-023-09645-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Accepted: 09/01/2023] [Indexed: 10/12/2023]

Ebrahimi A, Wiil UK, Baskaran R, Peimankar A, Andersen K, Nielsen AS. AUD-DSS: a decision support system for early detection of patients with alcohol use disorder. BMC Bioinformatics 2023;24:329. [PMID: 37658294 PMCID: PMC10474761 DOI: 10.1186/s12859-023-05450-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 08/21/2023] [Indexed: 09/03/2023]

Abstract

BACKGROUND

Alcohol use disorder (AUD) causes significant morbidity, mortality, and injuries. According to reports, approximately 5% of all registered deaths in Denmark could be due to AUD. The problem is compounded by the late identification of patients with AUD, a situation that can cause enormous problems, from psychological to physical to economic problems. Many individuals suffering from AUD never undergo specialist treatment during their addiction due to obstacles such as taboo and the poor performance of current screening tools. Therefore, there is a lack of rapid intervention. This can be mitigated by the early detection of patients with AUD. A clinical decision support system (DSS) powered by machine learning (ML) methods can be used to diagnose patients' AUD status earlier.

METHODS

This study proposes an effective AUD prediction model (AUDPM), which can be used in a DSS. The proposed model consists of four distinct components: (1) imputation to address missing values using the k-nearest neighbours approach, (2) recursive feature elimination with cross validation to select the most relevant subset of features, (3) a hybrid synthetic minority oversampling technique-edited nearest neighbour approach to remove noise and balance the distribution of the training data, and (4) an ML model for the early detection of patients with AUD. Two data sources, including a questionnaire and electronic health records of 2571 patients, were collected from Odense University Hospital in the Region of Southern Denmark for the AUD-Dataset. Then, the AUD-Dataset was used to build ML models. The results of different ML models, such as support vector machine, K-nearest neighbour, decision tree, random forest, and extreme gradient boosting, were compared. Finally, a combination of all these models in an ensemble learning approach was selected for the AUDPM.

RESULTS

The results revealed that the proposed ensemble AUDPM outperformed other single models and our previous study results, achieving 0.96, 0.94, 0.95, and 0.97 precision, recall, F1-score, and accuracy, respectively. In addition, we designed and developed an AUD-DSS prototype.
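As a quick consistency check on the figures above, the F1-score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The function name below is illustrative and not taken from the cited study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.96, 0.94)
print(round(f1, 2))
```

The recomputed value rounds to the reported F1-score of 0.95.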

CONCLUSION

It was shown that our proposed AUDPM achieved high classification performance. In addition, we identified clinical factors related to the early detection of patients with AUD. The designed AUD-DSS is intended to be integrated into the existing Danish health care system to provide novel information to clinical staff if a patient shows signs of harmful alcohol use; in other words, it gives staff a good reason for having a conversation with patients for whom a conversation is relevant.


Li B, Zhang Y, Peng H, Fan Q, He S, Zhang Y, Shi S, Zhang Y, Ma A. Multi-semantic feature fusion attention network for binary code similarity detection. Sci Rep 2023;13:4096. [PMID: 36907937 PMCID: PMC10008825 DOI: 10.1038/s41598-023-31280-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Accepted: 03/09/2023] [Indexed: 03/14/2023]

Ebrahimi A, Wiil UK, Naemi A, Mansourvar M, Andersen K, Nielsen AS. Identification of clinical factors related to prediction of alcohol use disorder from electronic health records using feature selection methods. BMC Med Inform Decis Mak 2022;22:304. [PMID: 36424597 PMCID: PMC9686074 DOI: 10.1186/s12911-022-02051-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2021] [Accepted: 11/16/2022] [Indexed: 11/25/2022]

Abstract

BACKGROUND

High dimensionality in electronic health records (EHR) causes a significant computational problem for any systematic search for predictive, diagnostic, or prognostic patterns. Feature selection (FS) methods have been indicated to be effective in feature reduction as well as in identifying risk factors related to prediction of clinical disorders. This paper examines the prediction of patients with alcohol use disorder (AUD) using machine learning (ML) and attempts to identify risk factors related to the diagnosis of AUD.

METHODS

The FS framework consists of two operational levels: base selectors and ensemble selectors. The first level consists of five FS methods: three filter methods, one wrapper method, and one embedded method. Base selector outputs are aggregated to develop four ensemble FS methods. The outputs of the FS methods were then fed into three ML algorithms, support vector machine (SVM), K-nearest neighbor (KNN), and random forest (RF), to compare and identify the best feature subset for the prediction of AUD from EHRs.
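The aggregation step can be pictured with a minimal sketch. The abstract does not specify how the four ensemble methods combine base-selector outputs beyond naming a union variant, so the selector names, feature names, and the intersection rule below are all hypothetical and used only for illustration.

```python
# Hypothetical base-selector outputs; selector and feature names are invented.
base_selections = {
    "filter_a": {"age", "gender", "length_of_stay"},
    "filter_b": {"age", "digestive_dx"},
    "wrapper": {"gender", "nervous_dx", "age"},
}


def union_fs(selections):
    """Keep every feature chosen by at least one base selector."""
    return set().union(*selections.values())


def intersection_fs(selections):
    """Keep only features chosen by every base selector (assumed variant)."""
    return set.intersection(*selections.values())


print(sorted(union_fs(base_selections)))
print(sorted(intersection_fs(base_selections)))
```

A union keeps the largest candidate set and so favours recall of relevant features over compactness, while an intersection is the most conservative choice.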

RESULTS

In terms of feature reduction, the embedded FS method could significantly reduce the number of features from 361 to 131. In terms of classification performance, RF based on the 272 features selected by our proposed ensemble method (Union FS) achieved the highest accuracy in predicting patients with AUD (96%) and outperformed all other models in terms of AUROC, AUPRC, Precision, Recall, and F1-Score. Considering the limitations of embedded and wrapper methods, the best overall performance was achieved by our proposed Union Filter FS, which reduced the number of features to 223 and improved Precision, Recall, and F1-Score in RF from 0.77, 0.65, and 0.71 to 0.87, 0.81, and 0.84, respectively. Our findings indicate that, besides gender, age, and length of stay at the hospital, diagnoses related to digestive organs, bones, muscles and connective tissue, and the nervous system are important clinical factors related to the prediction of patients with AUD.

CONCLUSION

Our proposed FS method could improve the classification performance significantly. It could identify clinical factors related to prediction of AUD from EHRs, thereby effectively helping clinical staff to identify and treat AUD patients and improving medical knowledge of the AUD condition. Moreover, the diversity of features among female and male patients as well as gender disparity were investigated using FS methods and ML techniques.


Wang Y, Huang Z, Xiao Y, Wan W, Yang X. The shared biomarkers and pathways of systemic lupus erythematosus and metabolic syndrome analyzed by bioinformatics combining machine learning algorithm and single-cell sequencing analysis. Front Immunol 2022;13:1015882. [PMID: 36341378 PMCID: PMC9627509 DOI: 10.3389/fimmu.2022.1015882] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/10/2022] [Accepted: 10/03/2022] [Indexed: 11/24/2022]

Abstract

Background

Systemic lupus erythematosus (SLE) is one of the most prevalent systemic autoimmune diseases, and metabolic syndrome (MetS) is the most common metabolic disorder, encompassing hypertension, dyslipidemia, and obesity. Although clinical evidence suggests potential associations between SLE and MetS, the underlying pathogenesis remains unclear.

Methods

The microarray data sets of SLE and MetS were obtained from the Gene Expression Omnibus (GEO) database. To identify the shared genes between SLE and MetS, the Differentially Expressed Genes (DEGs) analysis and the weighted gene co-expression network analysis (WGCNA) were conducted. Then, the GO and KEGG analyses were performed, and the protein-protein interaction (PPI) network was constructed. Next, Random Forest and LASSO algorithms were used to screen shared hub genes, and a diagnostic model was built using the machine learning technique XG-Boost. Subsequently, CIBERSORT and GSVA were used to estimate the correlation between shared hub genes and immune infiltration as well as metabolic pathways. Finally, the significant hub genes were verified using single-cell RNA sequencing (scRNA-seq) data.

Results

Using limma and WGCNA, we identified 153 shared feature genes, which were enriched in immune- and metabolic-related pathways. Further, 20 shared hub genes were screened and successfully used to build a prognostic model. Those shared hub genes were associated with immunological and metabolic processes in peripheral blood. The scRNA-seq results verified that TNFSF13B and OAS1, possessing the highest diagnostic efficacy, were mainly expressed by monocytes. Additionally, they showed positive correlations with the pathways for the metabolism of xenobiotics and cholesterol, both of which were proven to be active in this comorbidity, and shown to be concentrated in monocytes.

Conclusion

This study identified shared hub genes and constructed an effective diagnostic model in SLE and MetS. TNFSF13B and OAS1 had a positive correlation with cholesterol and xenobiotic metabolism. Both biomarkers and these metabolic pathways were potentially linked to monocytes, which provides novel insights into the pathogenesis and combined therapy of SLE comorbid with MetS.


Yang J, Zhang L, Tang X, Han M. CodnNet: A lightweight CNN architecture for detection of COVID-19 infection. Appl Soft Comput 2022;130:109656. [PMID: 36188336 PMCID: PMC9508701 DOI: 10.1016/j.asoc.2022.109656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2021] [Revised: 08/17/2022] [Accepted: 09/20/2022] [Indexed: 11/26/2022]

A machine learning algorithm for predicting prolonged postoperative opioid prescription after lumbar disc herniation surgery. An external validation study using 1,316 patients from a Taiwanese cohort. Spine J 2022;22:1119-1130. [PMID: 35202784 DOI: 10.1016/j.spinee.2022.02.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/02/2022] [Revised: 01/31/2022] [Accepted: 02/14/2022] [Indexed: 02/03/2023]

Abstract

BACKGROUND CONTEXT

Preoperative prediction of prolonged postoperative opioid prescription helps identify patients for increased surveillance after surgery. The SORG machine learning model has been developed and successfully tested using 5,413 patients from the United States (US) to predict the risk of prolonged opioid prescription after surgery for lumbar disc herniation. However, external validation is an often-overlooked element in the process of incorporating prediction models in current clinical practice. This cannot be stressed enough in prediction models where medicolegal and cultural differences may play a major role.

PURPOSE

The authors aimed to investigate the generalizability of the SORG prediction model, developed in US patients, to a Taiwanese patient cohort.

STUDY DESIGN

Retrospective study at a large academic medical center in Taiwan.

PATIENT SAMPLE

A total of 1,316 patients aged 20 years or older who underwent initial operative management for lumbar disc herniation between 2010 and 2018.

OUTCOME MEASURES

The primary outcome of interest was prolonged opioid prescription defined as continuing opioid prescription to at least 90 to 180 days after the first surgery for lumbar disc herniation at our institution.

METHODS

Baseline characteristics were compared between the external validation cohort and the original developmental cohorts. Discrimination (area under the receiver operating characteristic curve and the area under the precision-recall curve), calibration, overall performance (Brier score), and decision curve analysis were used to assess the performance of the SORG ML algorithm in the validation cohort. This study had no funding source or conflict of interests.
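The two discrimination measures named above can be illustrated with a small pure-Python sketch on toy data. This is not the SORG implementation: the labels and scores are invented, and the precision-recall area is computed as step-wise average precision, which is one common estimator among several.

```python
def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve; assumes no tied scores."""
    ranked = sorted(zip(y_score, y_true), reverse=True)
    n_pos = sum(y_true)
    true_pos, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            total += true_pos / rank  # precision at each recalled positive
    return total / n_pos


# Invented toy labels and predicted risks, for illustration only.
labels = [0, 0, 1, 0, 1, 0, 0, 1]
risks = [0.1, 0.2, 0.8, 0.3, 0.4, 0.05, 0.6, 0.9]
print(auroc(labels, risks), average_precision(labels, risks))
```

Unlike AUROC, the precision-recall area depends on outcome prevalence: a classifier that ranks at random scores about 0.5 on AUROC but only about the event rate on AUPRC, which matters when interpreting AUPRC values in rare-outcome cohorts.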

RESULTS

Overall, 1,316 patients were identified, of whom 41 (3.1%) had sustained postoperative opioid prescription. The validation cohort differed from the development cohort on several variables, including 93% of Taiwanese patients receiving NSAIDs preoperatively compared with 22% of US patients, while 30% of Taiwanese patients received opioids versus 25% in the US. Despite these differences, the SORG prediction model retained good discrimination (area under the receiver operating characteristic curve of 0.76 and the area under the precision-recall curve of 0.33) and good overall performance (Brier score of 0.028 compared with null model Brier score of 0.030) while somewhat overestimating the chance of prolonged opioid use (calibration slope of 1.07 and calibration intercept of -0.87). Decision-curve analysis showed the SORG model was suitable for clinical use.
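The null-model Brier score quoted above can be reproduced from the reported counts. The sketch assumes the null model assigns every patient the cohort prevalence as the predicted risk, a common convention that the abstract does not state explicitly.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


n_patients, n_events = 1316, 41  # counts reported in the abstract
prevalence = n_events / n_patients

# Assumed null model: every patient receives the prevalence as predicted risk.
outcomes = [1] * n_events + [0] * (n_patients - n_events)
null_brier = brier_score(outcomes, [prevalence] * n_patients)
print(round(null_brier, 3))
```

Under this assumption the score reduces to prevalence × (1 − prevalence), so with a 3.1% event rate even an uninformative model scores about 0.030, and the model's 0.028 should be read against that baseline rather than against zero.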

CONCLUSIONS

Despite differences at baseline and a very strict opioid policy, the SORG algorithm for prolonged opioid use after surgery for lumbar disc herniation has good discriminative abilities and good overall performance in a Han Chinese patient group in Taiwan. This freely available digital application can be used to identify high-risk patients and tailor prevention policies for these patients that may mitigate the long-term adverse consequence of opioid dependence: https://sorg-apps.shinyapps.io/lumbardiscopioid/.


Mao N, Shi Y, Lian C, Wang Z, Zhang K, Xie H, Zhang H, Chen Q, Cheng G, Xu C, Dai Y. Intratumoral and peritumoral radiomics for preoperative prediction of neoadjuvant chemotherapy effect in breast cancer based on contrast-enhanced spectral mammography. Eur Radiol 2022;32:3207-3219. [DOI: 10.1007/s00330-021-08414-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/12/2021] [Revised: 09/26/2021] [Accepted: 10/13/2021] [Indexed: 12/14/2022]

Qi J, Lei J, Li N, Huang D, Liu H, Zhou K, Dai Z, Sun C. Machine learning models to predict in-hospital mortality in septic patients with diabetes. Front Endocrinol (Lausanne) 2022;13:1034251. [PMID: 36465642 PMCID: PMC9709414 DOI: 10.3389/fendo.2022.1034251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2022] [Accepted: 10/25/2022] [Indexed: 11/17/2022]

Abstract

BACKGROUND

Sepsis is a leading cause of morbidity and mortality in hospitalized patients. To date, there are no well-established longitudinal networks from molecular mechanisms to clinical phenotypes in sepsis. Adding to the problem, about one in five patients presents with diabetes. For this subgroup, management is difficult, and prognosis is hard to evaluate.

METHODS

From three databases, a total of 7,001 patients were enrolled on the basis of the Sepsis-3 criteria and a diabetes diagnosis. Input variables were selected manually based on the results of correlation analysis, leaving 53 variables. A total of 5,727 records were collected from the Medical Information Mart for Intensive Care database and randomly split into a training set and an internal validation set at a ratio of 7:3. Then, logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost were used to build predictive models on the training set. The models were then tested on the internal validation set. The data from the eICU Collaborative Research Database (n = 815) and the dtChina critical care database (n = 459) were used to test model performance as external validation sets.
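A random 7:3 split of the kind described above might look like the following sketch. The function name, the fixed seed, and the rounding rule for the cut point are assumptions; the abstract does not report the resulting set sizes or how the split was implemented.

```python
import random


def train_validation_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and cut it at the requested fraction."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integer stand-ins for the 5,727 records mentioned in the abstract.
train, valid = train_validation_split(range(5727))
print(len(train), len(valid))
```

With this rounding rule the two sets are disjoint and together cover every record; a stratified split on the outcome would be a reasonable alternative when events are rare.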

RESULTS

In the internal validation set, the accuracy values of logistic regression with lasso regularization, Bayes logistic regression, decision tree, random forest, and XGBoost were 0.878, 0.883, 0.865, 0.883, and 0.882, respectively. Likewise, in the external validation set 1, lasso regularization = 0.879, Bayes logistic regression = 0.877, decision tree = 0.865, random forest = 0.886, and XGBoost = 0.875. In the external validation set 2, lasso regularization = 0.715, Bayes logistic regression = 0.745, decision tree = 0.763, random forest = 0.760, and XGBoost = 0.699.
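The accuracy figures above can be ranked programmatically to see which models lead on each validation set. The dictionary layout and function name are illustrative; the numbers are copied from the abstract, and ties are kept in listing order.

```python
# Accuracy values as reported in the abstract, keyed by validation set.
accuracy = {
    "internal": {"lasso LR": 0.878, "Bayes LR": 0.883, "decision tree": 0.865,
                 "random forest": 0.883, "XGBoost": 0.882},
    "external 1": {"lasso LR": 0.879, "Bayes LR": 0.877, "decision tree": 0.865,
                   "random forest": 0.886, "XGBoost": 0.875},
    "external 2": {"lasso LR": 0.715, "Bayes LR": 0.745, "decision tree": 0.763,
                   "random forest": 0.760, "XGBoost": 0.699},
}


def top_models(scores, k=3):
    """Return the k best model names; sorted() is stable, so ties keep listing order."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


for name, scores in accuracy.items():
    print(name, top_models(scores))
```

Random forest is the only model that appears in the top three of all three validation sets, and every model loses accuracy on the second external set.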

CONCLUSION

The top three models for internal validation set were Bayes logistic regression, random forest, and XGBoost, whereas the top three models for external validation set 1 were random forest, logistic regression, and Bayes logistic regression. In addition, the top three models for the external validation set 2 were decision tree, random forest, and Bayes logistic regression. Random forest model performed well with the training and three validation sets. The most important features are age, albumin, and lactate.
