1
Zheng Y, Wang J, Ling Z, Zhang J, Zeng Y, Wang K, Zhang Y, Nong L, Sang L, Xu Y, Liu X, Li Y, Huang Y. A diagnostic model for sepsis-induced acute lung injury using a consensus machine learning approach and its therapeutic implications. J Transl Med 2023; 21:620. [PMID: 37700323 PMCID: PMC10498641 DOI: 10.1186/s12967-023-04499-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 09/01/2023] [Indexed: 09/14/2023]
Abstract
BACKGROUND A significant proportion of septic patients with acute lung injury (ALI) are recognized late because no efficient diagnostic test exists, which delays treatment and consequently raises mortality. Identifying diagnostic biomarkers may improve screening so that septic patients at high risk of ALI are identified earlier, and may point to potentially effective therapeutic drugs. Machine learning is a powerful approach for making sense of complex gene expression data to find robust ALI diagnostic biomarkers.
METHODS The datasets were obtained from the GEO and ArrayExpress databases. Following quality control and normalization, three datasets (GSE66890, GSE10474 and GSE32707) were merged as the training set, and four machine learning feature selection methods (elastic net, SVM, random forest and XGBoost) were applied to construct the diagnostic model. The other datasets served as validation sets. To further evaluate the performance and predictive value of the diagnostic model, a nomogram, Decision Curve Analysis (DCA) and a Clinical Impact Curve (CIC) were constructed. Finally, potential small-molecule compounds interacting with the selected features were explored in the CTD database.
RESULTS GSEA showed that immune response and metabolism might play an important role in the pathogenesis of sepsis-induced ALI. Then, 52 genes were identified as putative biomarkers by consensus feature selection from all four methods. Among them, 5 genes (ARHGDIB, ALDH1A1, TACR3, TREM1 and PI3) were selected by all methods and used to predict ALI diagnosis with high accuracy. On the external datasets (E-MTAB-5273 and E-MTAB-5274), the diagnostic model achieved AUC values of 0.725 and 0.833, respectively. In addition, the nomogram, DCA and CIC supported the performance and predictive value of the diagnostic model. Finally, the small-molecule compounds curcumin, tretinoin, acetaminophen, estradiol and dexamethasone were screened as potential therapeutic agents for sepsis-induced ALI.
CONCLUSION This consensus of multiple machine learning algorithms identified 5 genes that were able to distinguish ALI among septic patients. The diagnostic model could identify septic patients at high risk of ALI and provide potential therapeutic targets for sepsis-induced ALI.
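The consensus step described in this abstract (keeping the genes chosen by every feature selection method) can be sketched as a vote count over per-method selections. This is only an illustration under stated assumptions: the function name `consensus_features` and the placeholder genes `GENE_A`, `GENE_B` and `GENE_C` are hypothetical, and the per-method lists are invented for the example; only the five consensus genes come from the abstract.

```python
from collections import Counter


def consensus_features(selections, min_votes=None):
    """Return features selected by at least `min_votes` methods.

    `selections` maps a method name to the features it selected.
    By default a feature must be selected by every method.
    """
    if min_votes is None:
        min_votes = len(selections)
    # Count each feature at most once per method.
    votes = Counter(f for feats in selections.values() for f in set(feats))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical selections: GENE_A, GENE_B and GENE_C are placeholders,
# not genes reported by the paper.
core = ["ARHGDIB", "ALDH1A1", "TACR3", "TREM1", "PI3"]
selections = {
    "elastic_net": core + ["GENE_A"],
    "svm": core + ["GENE_B"],
    "random_forest": core + ["GENE_A", "GENE_C"],
    "xgboost": core + ["GENE_B"],
}

consensus = consensus_features(selections)
print(consensus)  # ['ALDH1A1', 'ARHGDIB', 'PI3', 'TACR3', 'TREM1']
```

Lowering `min_votes` relaxes the consensus; `min_votes=1` returns the union of all selections.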
Affiliation(s)
- Yongxin Zheng, Zhaoyi Ling, Jiamei Zhang, Yuan Zeng, Ke Wang, Yu Zhang, Lingbo Nong, Ling Sang, Yonghao Xu, Xiaoqing Liu, Yimin Li, Yongbo Huang
- Department of Critical Care Medicine, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510120, China
- Guangzhou Institute of Respiratory Health, Guangzhou, 510120, China
- State Key Laboratory of Respiratory Diseases, Guangzhou, 510120, China
- Jinping Wang
- Department of Cardiovascular Medicine, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510120, Guangdong, China
4
Lévano M, Nowak H. New aspects of the elastic net algorithm for cluster analysis. Neural Comput Appl 2011; 20:835-850. [PMID: 21949468 PMCID: PMC3155750 DOI: 10.1007/s00521-010-0498-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/21/2010] [Accepted: 11/16/2010] [Indexed: 12/03/2022]
Abstract
The elastic net algorithm, formulated by Durbin and Willshaw as a heuristic method and initially applied to the traveling salesman problem, can be used as a tool for data clustering in n-dimensional space. With the help of statistical mechanics, it is formulated as a deterministic annealing method, where a chain with a fixed number of nodes interacts at different temperatures with the data cloud. From a given temperature on, the nodes are found to be the optimal centroids of fuzzy clusters, provided the number of nodes is much smaller than the number of data points. We show in this contribution that at this temperature the centroids of the hard clusters, defined by the nearest-neighbor clusters of every node, are in the same position as the optimal centroids of the fuzzy clusters. The same is true for the standard deviations. This result can be used as a stopping criterion for the annealing process. The stopping temperature and the number and sizes of the hard clusters depend on the number of nodes in the chain. Tests were made with homogeneous and nonhomogeneous artificial clusters in two dimensions. A medical application is given: localizing tumors and estimating their size in images from a combined measurement of X-ray computed tomography and positron emission tomography.
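The comparison between fuzzy and hard centroids that this abstract proposes as a stopping criterion can be illustrated with a small sketch. It is not the authors' algorithm: the elastic chain term is omitted, the Gibbs membership exp(-d²/2T) is an assumed convention for the temperature, and the data points, node positions and function names are hypothetical.

```python
import math


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def memberships(points, nodes, temperature):
    """Softmax (Gibbs) membership of every point in every node."""
    table = []
    for x in points:
        d = [sq_dist(x, y) for y in nodes]
        d_min = min(d)  # shift by the minimum for numerical stability
        w = [math.exp(-(dj - d_min) / (2.0 * temperature)) for dj in d]
        total = sum(w)
        table.append([wj / total for wj in w])
    return table


def fuzzy_centroids(points, nodes, temperature):
    """Membership-weighted mean of the data for each node.

    Assumes every node keeps a nonzero total membership.
    """
    p = memberships(points, nodes, temperature)
    dim = len(points[0])
    result = []
    for j in range(len(nodes)):
        mass = sum(row[j] for row in p)
        result.append(tuple(
            sum(row[j] * x[k] for row, x in zip(p, points)) / mass
            for k in range(dim)
        ))
    return result


def hard_centroids(points, nodes):
    """Mean of the points whose nearest node is j (None if no point is)."""
    groups = [[] for _ in nodes]
    for x in points:
        j = min(range(len(nodes)), key=lambda k: sq_dist(x, nodes[k]))
        groups[j].append(x)
    return [tuple(sum(c) / len(g) for c in zip(*g)) if g else None
            for g in groups]


# Hypothetical data: two well-separated square clusters and two nodes.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
nodes = [(0.5, 0.5), (10.5, 10.5)]

hard = hard_centroids(points, nodes)
cold = fuzzy_centroids(points, nodes, temperature=0.5)
hot = fuzzy_centroids(points, nodes, temperature=1000.0)

# Largest coordinate gap between fuzzy and hard centroids at each temperature.
gap_cold = max(abs(c - h) for fc, hc in zip(cold, hard) for c, h in zip(fc, hc))
gap_hot = max(abs(c - h) for fc, hc in zip(hot, hard) for c, h in zip(fc, hc))
print(gap_cold, gap_hot)
```

At the low temperature the fuzzy and hard centroids coincide to within rounding, while at the high temperature the fuzzy centroids drift toward the global mean of the data; a check of this kind is what the abstract suggests using to stop the annealing.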
Affiliation(s)
- Marcos Lévano
- Escuela de Ingeniería Informática, Facultad de Ingeniería, Universidad Católica de Temuco, Av. Manuel Montt 56, Casilla 15-D, Temuco, Chile
5
Shi G, Boerwinkle E, Morrison AC, Gu CC, Chakravarti A, Rao DC. Mining gold dust under the genome wide significance level: a two-stage approach to analysis of GWAS. Genet Epidemiol 2010; 35:111-8. [PMID: 21254218 DOI: 10.1002/gepi.20556] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 06/14/2010] [Revised: 10/27/2010] [Accepted: 11/17/2010] [Indexed: 12/14/2022]
Abstract
We propose a two-stage approach to analyze genome-wide association data in order to identify a set of promising single-nucleotide polymorphisms (SNPs). In stage one, we select a list of top signals from single-SNP analyses by controlling the false discovery rate. In stage two, we use least absolute shrinkage and selection operator (LASSO) regression to reduce false positives. The proposed approach was evaluated using simulated quantitative traits based on genome-wide SNP data on 8,861 Caucasian individuals from the Atherosclerosis Risk in Communities (ARIC) Study. Our first stage, targeted at controlling false negatives, yields better power than using a Bonferroni-corrected significance level. LASSO regression reduces the number of significant SNPs in stage two: it removes false-positive SNPs and, because of linkage disequilibrium, also removes some true-positive SNPs at simulated causal loci. Interestingly, LASSO regression preserves the power from stage one, i.e., the number of causal loci detected by LASSO regression in stage two is almost the same as in stage one, while false positives are reduced further. Real data on systolic blood pressure in the ARIC study were analyzed using our two-stage approach, which identified two significant SNPs, one of which was reported to be genome-wide significant in a meta-analysis with a much larger sample size. In contrast, a single-SNP association scan did not yield any significant results.
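The two-stage idea in this abstract (a false discovery rate screen followed by LASSO) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the Benjamini-Hochberg procedure is assumed as the FDR method, the LASSO is a bare coordinate descent without an intercept, and the p-values, predictor columns, trait values and penalty `lam` are all made up for the example.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up rule at FDR q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            cutoff = rank  # keep the largest rank that passes
    return sorted(order[:cutoff])


def soft_threshold(z, t):
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(columns, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1."""
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - X b, with b = 0 initially
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(c * r for c, r in zip(col, resid)) / n + norm * beta[j]
            new = soft_threshold(rho, lam) / norm
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


# Hypothetical stage-one p-values for four SNPs (illustrative only).
pvalues = [0.001, 0.012, 0.2, 0.9]
# Hypothetical standardized predictors (one column per SNP) and trait.
columns = [
    [-1.0, -1.0, 1.0, 1.0],
    [-1.0, 1.0, -1.0, 1.0],
    [1.0, -1.0, -1.0, 1.0],
    [1.0, 1.0, -1.0, -1.0],
]
y = [-2.1, -1.9, 1.9, 2.1]  # equals 2.0*SNP0 + 0.1*SNP1

selected = benjamini_hochberg(pvalues, q=0.05)               # stage one
beta = lasso_cd([columns[i] for i in selected], y, lam=0.5)  # stage two
kept = [snp for snp, b in zip(selected, beta) if b != 0.0]
print(selected, beta, kept)
```

In this toy setting stage one passes SNPs 0 and 1, and the LASSO penalty then shrinks the weak effect of SNP 1 to zero while keeping SNP 0, mirroring the false-positive reduction the abstract describes. A real analysis would more likely rely on an established statistics library than on hand-written solvers.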
Affiliation(s)
- Gang Shi
- Division of Biostatistics, Washington University School of Medicine, Saint Louis, Missouri 63110-1093, USA.
7
Guo YZ, Feng EM, Wang Y. Optimal HP configurations of proteins by combining local search with elastic net algorithm. J Biochem Biophys Methods 2007; 70:335-40. [PMID: 16982100 DOI: 10.1016/j.jbbm.2006.08.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 04/08/2006] [Revised: 07/31/2006] [Accepted: 08/03/2006] [Indexed: 11/30/2022]
Abstract
The prediction of protein conformation from its amino-acid sequence is one of the most prominent problems in computational biology, but it is NP-hard. Here, we focus on a widely studied abstraction of this problem, the two-dimensional hydrophobic-polar protein folding problem (2D HP PFP). A mathematical optimization model of the free energy of the protein is established. Native conformations are often sought using stochastic sampling methods, which are slow. The elastic net (EN) algorithm is one of the fast deterministic strategies for the travelling salesman problem (TSP). However, it cannot be applied directly to the protein folding problem because of fundamental differences between the two types of problems. In this paper, we show how the 2D HP protein folding problem can be framed in terms of the TSP. A combination of a modified elastic net algorithm and a novel local search method is adopted to solve this problem. To our knowledge, this is the first application of the EN algorithm to the 2D HP model. The results indicate that our approach can find more optimal conformations and is simple to implement, computationally efficient and fast.
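To make the 2D HP model concrete, the sketch below scores a conformation on a square lattice. It assumes the commonly used HP energy (minus one for every pair of hydrophobic residues that are lattice neighbors but not consecutive in the chain); the abstract does not spell out the paper's exact free-energy model, so this convention and the function name `hp_energy` are assumptions.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D HP conformation on the square lattice.

    `sequence` is a string of 'H' and 'P'; `coords` lists one (x, y)
    lattice site per residue. Each non-consecutive H-H contact adds -1.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for a, b in zip(coords, coords[1:]):
        if abs(a[0] - b[0]) + abs(a[1] - b[1]) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")

    position = {xy: i for i, xy in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look right and up only, so each neighboring pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


# The same short chain, folded into a square and stretched out straight.
folded = [(0, 0), (1, 0), (1, 1), (0, 1)]
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]

print(hp_energy("HPPH", folded))    # -1: the two H residues touch
print(hp_energy("HPPH", straight))  # 0: no H-H contact
```

A search method such as the one described in the abstract would look for the self-avoiding walk that minimizes this kind of energy.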
Affiliation(s)
- Yu-Zhen Guo
- Department of Applied Mathematics, Dalian University of Technology 116024 Dalian, China.