1
Lin H, Zhang X, Feng Y, Gong Z, Li J, Wang W, Fan J. Advancing lung adenocarcinoma prognosis and immunotherapy prediction with a multi-omics consensus machine learning approach. J Cell Mol Med 2024; 28:e18520. [PMID: 38958523 PMCID: PMC11221067 DOI: 10.1111/jcmm.18520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 05/16/2024] [Accepted: 06/04/2024] [Indexed: 07/04/2024] Open
Abstract
Lung adenocarcinoma (LUAD) is a tumour characterized by high heterogeneity. Although numerous prognostic and immunotherapeutic options are available for LUAD, precise, individualized treatment plans remain scarce. We integrated mRNA, lncRNA, microRNA, methylation and mutation data for LUAD from the TCGA database. Using ten clustering algorithms, we identified stable multi-omics consensus clusters (MOCs). These data were then combined with ten machine learning approaches to develop a robust model capable of reliably identifying patient prognosis and predicting immunotherapy outcomes. Two prognostically relevant MOCs were identified, with MOC2 showing more favourable outcomes. We subsequently constructed a MOCs-associated machine learning model (MOCM) based on eight MOCs-specific hub genes. Patients with a lower MOCM score exhibited better overall survival and better responses to immunotherapy. These findings were consistent across multiple datasets, and the MOCM score demonstrated superior predictive performance compared with many previously published LUAD biomarkers. Notably, the low MOCM group was more inclined towards 'hot' tumours, characterized by higher levels of immune cell infiltration. A significant positive correlation between GJB3 and the MOCM score (R = 0.77, p < 0.01) was also found. Further experiments confirmed that GJB3 significantly enhances LUAD proliferation, invasion and migration, indicating its potential as a key target for LUAD treatment. The MOCM score accurately predicts the prognosis of LUAD patients and identifies potential beneficiaries of immunotherapy, offering broad clinical applicability.
Affiliation(s)
- Haoran Lin
  - Department of Thoracic Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Xiao Zhang
  - Department of Thoracic Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Yanlong Feng
  - Department of Thoracic Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Zetian Gong
  - Department of Thoracic Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Jun Li
  - Department of Thoracic Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Wei Wang
  - Department of Thoracic Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
- Jun Fan
  - Department of Thoracic Surgery, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
2
Han T, Bai Y, Liu Y, Dong Y, Liang C, Gao L, Zhou J, Guo J, Wu J, Hu D. Integrated multi-omics analysis and machine learning to refine molecular subtypes, prognosis, and immunotherapy in lung adenocarcinoma. Funct Integr Genomics 2024; 24:118. [PMID: 38935217 DOI: 10.1007/s10142-024-01388-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 04/01/2024] [Accepted: 05/17/2024] [Indexed: 06/28/2024]
Abstract
Lung adenocarcinoma (LUAD) is highly aggressive and prone to metastasis. Suitable biomarkers to help refine precision-based therapeutic regimens are still lacking. We combined 10 known clustering algorithms with omics data from 4 dimensions to identify high-resolution molecular subtypes of LUAD. Subsequently, a consensus machine learning-related prognostic signature (CMRS) was developed based on subtype-related genes and an integrated program framework containing 10 machine learning algorithms. The efficiency of CMRS was analyzed from the perspectives of tumor microenvironment, genomic landscape, immunotherapy, drug sensitivity, and single-cell analysis. Through multi-omics clustering, we identified 2 comprehensive omics subtypes (CSs), of which CS1 patients had worse survival outcomes and higher aggressiveness, mRNAsi and mutation frequency. We then developed CMRS based on 13 key genes up-regulated in CS1. The prognostic predictive efficiency of CMRS was superior to most established LUAD prognostic signatures. CMRS demonstrated a strong correlation with tumor microenvironmental feature variants and genomic instability generation. Regarding clinical performance, patients in the high CMRS group were more likely to benefit from immunotherapy, whereas those in the low CMRS group were more likely to benefit from chemotherapy and targeted drug therapy. In addition, we found that drugs such as neratinib and oligomycin A may be candidates for patients in the high CMRS group. Single-cell analysis revealed that CMRS-related genes were mainly expressed in epithelial cells. The novel molecular subtypes identified in this study based on multi-omics data could provide new insights into the stratified treatment of LUAD, while CMRS could serve as a candidate indicator of the degree of benefit of precision therapy and immunotherapy for LUAD.
Affiliation(s)
- Tao Han
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Ying Bai
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Yafeng Liu
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Yunjia Dong
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Chao Liang
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Lu Gao
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Jiawei Zhou
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Jianqiang Guo
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
- Jing Wu
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
  - Key Laboratory of Industrial Dust Deep Reduction and Occupational Health and Safety of Anhui Higher Education Institute, Huainan, Anhui, China
  - Key Laboratory of Industrial Dust Prevention and Control & Occupational Safety and Health of the Ministry of Education, Anhui University of Science and Technology, Huainan, Anhui, China
- Dong Hu
  - School of Medicine, Anhui University of Science and Technology, Huainan, Anhui, China
  - Anhui Occupational Health and Safety Engineering Laboratory, Huainan, Anhui, China
  - Key Laboratory of Industrial Dust Deep Reduction and Occupational Health and Safety of Anhui Higher Education Institute, Huainan, Anhui, China
  - Key Laboratory of Industrial Dust Prevention and Control & Occupational Safety and Health of the Ministry of Education, Anhui University of Science and Technology, Huainan, Anhui, China
3
Ji Q, Zheng Y, Zhou L, Chen F, Li W. Unveiling divergent treatment prognoses in IDHwt-GBM subtypes through multiomics clustering: a swift dual MRI-mRNA model for precise subtype prediction. J Transl Med 2024; 22:578. [PMID: 38890658 PMCID: PMC11186189 DOI: 10.1186/s12967-024-05401-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2024] [Accepted: 06/13/2024] [Indexed: 06/20/2024] Open
Abstract
BACKGROUND: IDH1-wildtype glioblastoma multiforme (IDHwt-GBM) is a highly heterogeneous and aggressive brain tumour characterised by a dismal prognosis and significant challenges in accurately predicting patient outcomes. To address these issues and personalise treatment approaches, we aimed to develop and validate robust multiomics molecular subtypes of IDHwt-GBM. Through this, we sought to uncover the distinct molecular signatures underlying these subtypes, paving the way for improved diagnosis and targeted therapy for this challenging disease.
METHODS: To identify stable molecular subtypes among 184 IDHwt-GBM patients from TCGA, we used the consensus clustering method to consolidate the results from ten advanced multiomics clustering approaches based on mRNA, lncRNA, and mutation data. We developed subtype prediction models using the PAM and machine learning algorithms based on mRNA and MRI data for enhanced clinical utility. These models were validated in five independent datasets, and an online interactive system was created. We conducted a comprehensive assessment of the clinical impact, drug treatment response, and molecular associations of the IDHwt-GBM subtypes.
RESULTS: In the TCGA cohort, two molecular subtypes, class 1 and class 2, were identified through multiomics clustering of IDHwt-GBM patients. There was a significant difference in survival between class 1 and class 2 patients, with a hazard ratio (HR) of 1.68 [1.15-2.47]. This difference was validated in other datasets (CGGA: HR = 1.75 [1.04-2.94]; CPTAC: HR = 1.79 [1.09-2.91]; GLASS: HR = 1.66 [1.09-2.54]; UCSF: HR = 1.33 [1.00-1.77]; UPENN: HR = 1.29 [1.04-1.58]). Additionally, class 2 was more sensitive to treatment with radiotherapy combined with temozolomide, and this sensitivity was validated in the GLASS cohort. Correspondingly, class 2 and class 1 exhibited significant differences in mutation patterns, enriched pathways, programmed cell death (PCD), and the tumour immune microenvironment. Class 2 had more mutation signatures associated with defective DNA mismatch repair (P = 0.0021). Enriched pathways of differentially expressed genes in class 1 and class 2 (P-adjust < 0.05) were mainly related to ferroptosis, the PD-1 checkpoint pathway, the JAK-STAT signalling pathway, and other programmed cell death and immune-related pathways. The different cell death modes and immune microenvironments were validated across multiple datasets. Finally, our survival prediction model, which integrates molecular subtypes, age, and sex, demonstrated clinical benefits based on the decision curve in the test set. We deployed the molecular subtyping prediction model and the survival prediction model online for interactive use.
CONCLUSIONS: Molecular subtypes were identified and verified through multiomics clustering in IDHwt-GBM patients. These subtypes are linked to specific mutation patterns, the immune microenvironment, prognoses, and treatment responses.
Affiliation(s)
- Qiang Ji
  - Department of Neuro-Oncology, Cancer Center, China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
  - National Institute for Data Science in Health and Medicine, Capital Medical University, Beijing, China
- Yi Zheng
  - Department of Neuro-Oncology, Cancer Center, China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
  - China National Clinical Research Center for Neurological Diseases, Beijing, China
- Lili Zhou
  - Department of Neuro-Oncology, Cancer Center, China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Feng Chen
  - Department of Neuro-Oncology, Cancer Center, China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Wenbin Li
  - Department of Neuro-Oncology, Cancer Center, China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
  - National Institute for Data Science in Health and Medicine, Capital Medical University, Beijing, China
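The consensus step mentioned in the entry 3 abstract (consolidating the results of several clustering approaches into one stable partition) can be illustrated with a co-association matrix. The following is only a minimal sketch, not the authors' pipeline: the function names `co_association` and `consensus_groups`, the 0.5 threshold, and the toy labelings are all assumptions made for illustration, and the final grouping uses simple connected components rather than whatever method the paper applied.

```python
def co_association(labelings):
    """Fraction of labelings in which each pair of samples shares a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


def consensus_groups(matrix, threshold=0.5):
    """Group samples linked by co-association above the threshold.

    Hypothetical simplification: connected components of the thresholded
    co-association graph stand in for the final consensus partition.
    """
    n = len(matrix)
    group = [-1] * n
    next_id = 0
    for start in range(n):
        if group[start] != -1:
            continue
        group[start] = next_id
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if group[j] == -1 and matrix[i][j] > threshold:
                    group[j] = next_id
                    stack.append(j)
        next_id += 1
    return group


# Three toy labelings of six samples; cluster names differ between
# algorithms, and the third algorithm disagrees about sample 2.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
matrix = co_association(labelings)
groups = consensus_groups(matrix)
print(groups)
```

Because the matrix only records whether two samples share a cluster, it is unaffected by how each algorithm names its clusters, which is why the second labeling above agrees with the first despite swapped labels.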
4
Nolin SJ, Siegel PB, Ashwell CM. Differences in the microbiome of the small intestine of Leghorn lines divergently selected for antibody titer to sheep erythrocytes suggest roles for commensals in host humoral response. Front Physiol 2024; 14:1304051. [PMID: 38260103 PMCID: PMC10800846 DOI: 10.3389/fphys.2023.1304051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 12/18/2023] [Indexed: 01/24/2024] Open
Abstract
For forty generations, two lines of White Leghorn chickens have been selected for high (HAS) or low (LAS) antibody response to a low-dose injection of sheep red blood cells (SRBCs). Their gut is home to billions of microorganisms and the largest number of immune cells in the body; therefore, the objective of this experiment was to understand how the microbiome may influence the differential antibody response observed in these lines. We achieved this by characterizing the small intestinal microbiome of HAS and LAS chickens, determining their functional microbiome profiles, using machine learning to identify the microbes that best differentiate HAS from LAS, and associating the abundance of those microbes with host gene expression. Microbiome sequencing revealed greater diversity in LAS but statistically higher abundance of several strains, particularly those of Lactobacillus, in HAS. Enrichment of microbial metabolites implicated in immune response, such as lactic acid, short chain fatty acids, amino acids, and vitamins, differed between HAS and LAS. The abundance of several microbial strains corresponds to enriched host gene expression pathways related to immune response. These data provide a compelling argument that the microbiome is likely both affected by divergent host genetic selection and an influence on host antibody response through various mechanisms.
Affiliation(s)
- Shelly J. Nolin
  - Prestage Department of Poultry Science, North Carolina State University, Raleigh, NC, United States
- Paul B. Siegel
  - School of Animal Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Christopher M. Ashwell
  - Davis College of Agriculture, Natural Resources, and Design, West Virginia University, Morgantown, WV, United States
5
Wang X, Wu S, Sun L, Jin P, Zhang J, Liu W, Zhan Z, Wang Z, Liu X, He L. Pan-cancer analysis revealing that PTPN2 is an indicator of risk stratification for acute myeloid leukemia. Sci Rep 2023; 13:18372. [PMID: 37884566 PMCID: PMC10603079 DOI: 10.1038/s41598-023-44892-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 10/13/2023] [Indexed: 10/28/2023] Open
Abstract
The non-receptor protein tyrosine phosphatase gene family (PTPNs) is involved in the tumorigenesis and development of many cancers, but the role of PTPNs in acute myeloid leukemia (AML) remains unclear. After a comprehensive evaluation of the expression patterns and immunological effects of PTPNs using a pan-cancer analysis based on RNA sequencing data from The Cancer Genome Atlas, PTPN2 emerged as the most valuable gene. Further investigation of the expression patterns of PTPN2 in different tissues and cells showed a robust correlation with AML. PTPN2 was then systematically correlated with immunological signatures in the AML tumor microenvironment, and its differential expression was verified using clinical samples. In addition, a prediction model was developed, validated, and compared with other models. The systematic analysis of the PTPN family reveals that the effect of PTPNs on cancer may be related to the mediation of cell cycle-related pathways. PTPN2 was highly expressed in hematologic diseases and bone marrow tissues, and its differential expression between AML patients and healthy individuals was verified in clinical samples. Based on its correlation with immune infiltrates, immunomodulators, and immune checkpoints, PTPN2 was found to be a reliable biomarker in the immunotherapy cohort and a prognostic predictor of AML. The PTPN2 risk score can accurately predict prognosis and response to cancer immunotherapy. These findings reveal a correlation between PTPNs and immunophenotype, which may be related to the cell cycle. PTPN2 was differentially expressed between AML patients and healthy individuals; it is a diagnostic biomarker and potential therapeutic target, providing targeted guidance for clinical treatment.
Affiliation(s)
- Xuanyu Wang
  - Department of Urology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071, China
- Sanyun Wu
  - Department of Hematology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Le Sun
  - Department of Urology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071, China
- Peipei Jin
  - Department of Hematology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Jianmin Zhang
  - Department of Hematology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Wen Liu
  - Department of Hematology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Zhuo Zhan
  - Department of Hematology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
- Zisong Wang
  - School of Basic Medical Sciences, Wuhan University, Wuhan, 430071, Hubei Province, China
- Xiaoping Liu
  - Department of Pathology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071, China
- Li He
  - Department of Urology, Zhongnan Hospital of Wuhan University, 169 Donghu Road, Wuhan, 430071, China
  - Department of Hematology, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
6
Chu G, Ji X, Wang Y, Niu H. Integrated multiomics analysis and machine learning refine molecular subtypes and prognosis for muscle-invasive urothelial cancer. Mol Ther Nucleic Acids 2023; 33:110-126. [PMID: 37449047 PMCID: PMC10336357 DOI: 10.1016/j.omtn.2023.06.001] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 02/02/2023] [Accepted: 06/01/2023] [Indexed: 07/18/2023]
Abstract
Muscle-invasive urothelial cancer (MUC), characterized by high aggressiveness and significant heterogeneity, is currently lacking highly precise individualized treatment options. We used a computational pipeline to synthesize multiomics data from MUC patients using 10 clustering algorithms, which were then combined with 10 machine learning algorithms to identify molecular subgroups of high resolution and develop a robust consensus machine learning-driven signature (CMLS). Through multiomics clustering, we identified three cancer subtypes (CSs) that are related to prognosis, with CS2 exhibiting the most favorable prognostic outcome. Subsequent screening enabled identification of 12 hub genes that constitute a CMLS with robust predictive power for prognosis. The low-CMLS group exhibited a more favorable prognosis and greater responsiveness to immunotherapy and was more likely to exhibit the "hot tumor" phenotype. The high-CMLS group had a poor prognosis and lower likelihood of benefitting from immunotherapy, but dasatinib and romidepsin may serve as promising treatments for them. Comprehensive analysis of multiomics data can offer important insights and further refine the molecular classification of MUC. Identification of CMLS represents a valuable tool for early prediction of patient prognosis and for screening potential candidates likely to benefit from immunotherapy, with broad implications for clinical practice.
Affiliation(s)
- Guangdi Chu
  - Department of Urology, The Affiliated Hospital of Qingdao University, Qingdao 266003, China
- Xiaoyu Ji
  - Department of Gynecology Minimally Invasive Center, Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing, China
- Yonghua Wang
  - Department of Urology, The Affiliated Hospital of Qingdao University, Qingdao 266003, China
- Haitao Niu
  - Department of Urology, The Affiliated Hospital of Qingdao University, Qingdao 266003, China
7
Chen C, Wang J, Pan D, Wang X, Xu Y, Yan J, Wang L, Yang X, Yang M, Liu G. Applications of multi-omics analysis in human diseases. MedComm (Beijing) 2023; 4:e315. [PMID: 37533767 PMCID: PMC10390758 DOI: 10.1002/mco2.315] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 02/22/2023] [Revised: 05/25/2023] [Accepted: 05/31/2023] [Indexed: 08/04/2023] Open
Abstract
Multi-omics usually refers to the combined application of multiple high-throughput screening technologies, including genomics, transcriptomics, single-cell transcriptomics, proteomics, metabolomics, and spatial transcriptomics, which play a great role in promoting the study of human diseases. Most current reviews focus on describing the development of multi-omics technologies, data integration, and application to a particular disease; few provide a comprehensive and systematic introduction to multi-omics. This review outlines the existing technical categories of multi-omics and cautions for experimental design; focuses on integrated analysis methods for multi-omics, especially machine learning and deep learning approaches to multi-omics data integration and the corresponding tools; covers the application of multi-omics in medical research (e.g., cancer, neurodegenerative diseases, aging, and drug target discovery) together with the corresponding open-source analysis tools and databases; and finally discusses the challenges and future directions of multi-omics integration and application in precision medicine. Single-cell multi-omics and spatial multi-omics, which have become important directions for future disease research with the development of high-throughput technologies and data integration algorithms, are also introduced in detail. This review will provide important guidance for researchers, especially those who are just entering multi-omics medical research.
Affiliation(s)
- Chongyang Chen
  - Key Laboratory of Nuclear Medicine, Ministry of Health, Jiangsu Key Laboratory of Molecular Nuclear Medicine, Jiangsu Institute of Nuclear Medicine, Wuxi, China
  - Co‐innovation Center of Neurodegeneration, Nantong University, Nantong, China
- Jing Wang
  - Shenzhen Key Laboratory of Modern Toxicology, Shenzhen Medical Key Discipline of Health Toxicology (2020–2024), Shenzhen Center for Disease Control and Prevention, Shenzhen, China
- Donghui Pan
  - Key Laboratory of Nuclear Medicine, Ministry of Health, Jiangsu Key Laboratory of Molecular Nuclear Medicine, Jiangsu Institute of Nuclear Medicine, Wuxi, China
- Xinyu Wang
  - Key Laboratory of Nuclear Medicine, Ministry of Health, Jiangsu Key Laboratory of Molecular Nuclear Medicine, Jiangsu Institute of Nuclear Medicine, Wuxi, China
- Yuping Xu
  - Key Laboratory of Nuclear Medicine, Ministry of Health, Jiangsu Key Laboratory of Molecular Nuclear Medicine, Jiangsu Institute of Nuclear Medicine, Wuxi, China
- Junjie Yan
  - Key Laboratory of Nuclear Medicine, Ministry of Health, Jiangsu Key Laboratory of Molecular Nuclear Medicine, Jiangsu Institute of Nuclear Medicine, Wuxi, China
- Lizhen Wang
  - Key Laboratory of Nuclear Medicine, Ministry of Health, Jiangsu Key Laboratory of Molecular Nuclear Medicine, Jiangsu Institute of Nuclear Medicine, Wuxi, China
- Xifei Yang
  - Shenzhen Key Laboratory of Modern Toxicology, Shenzhen Medical Key Discipline of Health Toxicology (2020–2024), Shenzhen Center for Disease Control and Prevention, Shenzhen, China
- Min Yang
  - Key Laboratory of Nuclear Medicine, Ministry of Health, Jiangsu Key Laboratory of Molecular Nuclear Medicine, Jiangsu Institute of Nuclear Medicine, Wuxi, China
- Gong‐Ping Liu
  - Co‐innovation Center of Neurodegeneration, Nantong University, Nantong, China
  - Department of Pathophysiology, School of Basic Medicine, Key Laboratory of Ministry of Education of China and Hubei Province for Neurological Disorders, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
8
Erdem C, Gross SM, Heiser LM, Birtwistle MR. MOBILE pipeline enables identification of context-specific networks and regulatory mechanisms. Nat Commun 2023; 14:3991. [PMID: 37414767 PMCID: PMC10326020 DOI: 10.1038/s41467-023-39729-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Accepted: 06/27/2023] [Indexed: 07/08/2023] Open
Abstract
Robust identification of context-specific network features that control cellular phenotypes remains a challenge. We here introduce MOBILE (Multi-Omics Binary Integration via Lasso Ensembles) to nominate molecular features associated with cellular phenotypes and pathways. First, we use MOBILE to nominate mechanisms of interferon-γ (IFNγ) regulated PD-L1 expression. Our analyses suggest that IFNγ-controlled PD-L1 expression involves BST2, CLIC2, FAM83D, ACSL5, and HIST2H2AA3 genes, which were supported by prior literature. We also compare networks activated by related family members transforming growth factor-beta 1 (TGFβ1) and bone morphogenetic protein 2 (BMP2) and find that differences in ligand-induced changes in cell size and clustering properties are related to differences in laminin/collagen pathway activity. Finally, we demonstrate the broad applicability and adaptability of MOBILE by analyzing publicly available molecular datasets to investigate breast cancer subtype specific networks. Given the ever-growing availability of multi-omics datasets, we envision that MOBILE will be broadly useful for identification of context-specific molecular features and pathways.
Affiliation(s)
- Cemal Erdem
  - Department of Chemical and Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Sean M Gross
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Laura M Heiser
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Marc R Birtwistle
  - Department of Chemical and Biomolecular Engineering, Clemson University, Clemson, SC, USA
  - Department of Bioengineering, Clemson University, Clemson, SC, USA
9
Bakr S, Brennan K, Mukherjee P, Argemi J, Hernaez M, Gevaert O. Identifying key multifunctional components shared by critical cancer and normal liver pathways via SparseGMM. Cell Rep Methods 2023; 3:100392. [PMID: 36814838 PMCID: PMC9939431 DOI: 10.1016/j.crmeth.2022.100392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 09/16/2022] [Accepted: 12/21/2022] [Indexed: 01/19/2023]
Abstract
Despite the abundance of multimodal data, suitable statistical models that can improve our understanding of diseases with genetic underpinnings are challenging to develop. Here, we present SparseGMM, a statistical approach for gene regulatory network discovery. SparseGMM uses latent variable modeling with sparsity constraints to learn Gaussian mixtures from multiomic data. By combining coexpression patterns with a Bayesian framework, SparseGMM quantitatively measures confidence in regulators and uncertainty in target gene assignment by computing gene entropy. We apply SparseGMM to liver cancer and normal liver tissue data and evaluate discovered gene modules in an independent single-cell RNA sequencing (scRNA-seq) dataset. SparseGMM identifies PROCR as a regulator of angiogenesis and PDCD1LG2 and HNF4A as regulators of immune response and blood coagulation in cancer. Furthermore, we show that more genes have significantly higher entropy in cancer compared with normal liver. Among high-entropy genes are key multifunctional components shared by critical pathways, including p53 and estrogen signaling.
Affiliation(s)
- Shaimaa Bakr
  - Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
  - Department of Radiology, Stanford University, Stanford, CA 94305, USA
- Kevin Brennan
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Pritam Mukherjee
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Josepmaria Argemi
  - Liver Unit, Clinica Universidad de Navarra, Hepatology Program, Center for Applied Medical Research, 31008 Pamplona, Navarra, Spain
- Mikel Hernaez
  - Center for Applied Medical Research, University of Navarra, 31009 Pamplona, Navarra, Spain
- Olivier Gevaert
  - Stanford Center for Biomedical Informatics Research, Department of Medicine and Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
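The entry 9 abstract describes measuring uncertainty in target gene assignment by computing gene entropy. As a rough illustration of that idea only, the sketch below computes the Shannon entropy of a gene's module-membership probabilities. The function name `assignment_entropy` and the example probabilities are hypothetical, and the abstract does not state that SparseGMM uses exactly this formula.

```python
import math


def assignment_entropy(probs):
    """Shannon entropy (in bits) of a gene's module-membership probabilities.

    Assumption: `probs` are non-negative weights over candidate modules;
    they are normalized here so that they sum to 1.
    """
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must have a positive sum")
    normalized = [p / total for p in probs]
    # Terms with p == 0 contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in normalized if p > 0)


# A gene assigned confidently to one module has zero entropy; a gene
# split evenly across four modules has the maximum, two bits.
confident = assignment_entropy([1.0, 0.0, 0.0, 0.0])
split = assignment_entropy([0.25, 0.25, 0.25, 0.25])
print(confident, split)
```

Under this reading, a high-entropy gene is one whose membership is spread across several modules, which matches the abstract's description of high-entropy genes as multifunctional components shared by several pathways.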
10
Gimeno M, San José-Enériz E, Villar S, Agirre X, Prosper F, Rubio A, Carazo F. Explainable artificial intelligence for precision medicine in acute myeloid leukemia. Front Immunol 2022; 13:977358. [PMID: 36248800 PMCID: PMC9556772 DOI: 10.3389/fimmu.2022.977358] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/24/2022] [Accepted: 09/13/2022] [Indexed: 12/02/2022] Open
Abstract
Artificial intelligence (AI) can unveil novel personalized treatments based on drug screening and whole-exome sequencing (WES) experiments. However, the "black box" nature of AI limits the potential of this approach to be translated into clinical practice. In contrast, explainable AI (XAI) focuses on making AI results understandable to humans. Here, we present a novel XAI method, called multi-dimensional module optimization (MOM), that associates drug screening with genetic events while guaranteeing that predictions are interpretable and robust. We applied MOM to an acute myeloid leukemia (AML) cohort of 319 ex-vivo tumor samples with 122 screened drugs and WES. MOM returned a therapeutic strategy based on the FLT3, CBFβ-MYH11, and NRAS status, which predicted AML patient response to Quizartinib, Trametinib, Selumetinib, and Crizotinib. We successfully validated the results in three different large-scale screening experiments. We believe that XAI will help healthcare providers and drug regulators better understand AI medical decisions.
Affiliation(s)
- Marian Gimeno
- Departamento de Ingeniería Biomédica y Ciencias, TECNUN, Universidad de Navarra, San Sebastián, Spain
- Edurne San José-Enériz
- Programa Hemato-Oncología, Centro de Investigación Médica Aplicada, Instituto de Investigación Sanitaria de Navarra (IDISNA), Universidad de Navarra, Pamplona, Spain
- Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Sara Villar
- Departamento de Hematología and CCUN (Cancer Center University of Navarra), Clínica Universidad de Navarra, Universidad de Navarra, Pamplona, Spain
- Xabier Agirre
- Programa Hemato-Oncología, Centro de Investigación Médica Aplicada, Instituto de Investigación Sanitaria de Navarra (IDISNA), Universidad de Navarra, Pamplona, Spain
- Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Felipe Prosper
- Programa Hemato-Oncología, Centro de Investigación Médica Aplicada, Instituto de Investigación Sanitaria de Navarra (IDISNA), Universidad de Navarra, Pamplona, Spain
- Centro de Investigación Biomédica en Red de Cáncer (CIBERONC), Madrid, Spain
- Departamento de Hematología and CCUN (Cancer Center University of Navarra), Clínica Universidad de Navarra, Universidad de Navarra, Pamplona, Spain
- Angel Rubio
- Departamento de Ingeniería Biomédica y Ciencias, TECNUN, Universidad de Navarra, San Sebastián, Spain
- Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), Universidad de Navarra, Pamplona, Spain
- *Correspondence: Angel Rubio; Fernando Carazo
- Fernando Carazo
- Departamento de Ingeniería Biomédica y Ciencias, TECNUN, Universidad de Navarra, San Sebastián, Spain
- Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), Universidad de Navarra, Pamplona, Spain
- *Correspondence: Angel Rubio; Fernando Carazo
11
|
Xiang J, Meng X, Zhao Y, Wu FX, Li M. HyMM: hybrid method for disease-gene prediction by integrating multiscale module structure. Brief Bioinform 2022; 23:6547263. [PMID: 35275996 DOI: 10.1093/bib/bbac072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2021] [Revised: 01/18/2022] [Accepted: 02/13/2022] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Identifying disease-related genes is an important issue in computational biology. Module structure widely exists in biomolecule networks, and complex diseases are usually thought to be caused by perturbations of local neighborhoods in the networks, which can provide useful insights for the study of disease-related genes. However, mining and effectively utilizing the module structure remains challenging in problems such as disease-gene prediction. RESULTS: We propose a hybrid disease-gene prediction method integrating multiscale module structure (HyMM), which can utilize multiscale information from local to global structure to more effectively predict disease-related genes. HyMM extracts module partitions from local to global scales by multiscale modularity optimization with exponential sampling, and estimates the disease relatedness of genes in partitions by the abundance of disease-related genes within modules. Then, a probabilistic model for integration of gene rankings is designed in order to integrate multiple predictions derived from multiscale module partitions and network propagation, and a parameter estimation strategy based on functional information is proposed to further enhance HyMM's predictive power. Through a series of experiments, we reveal the importance of module partitions at different scales, and verify the stable and good performance of HyMM compared with eight other state-of-the-art methods, as well as the further performance improvement derived from the parameter estimation. CONCLUSIONS: The results confirm that HyMM is an effective framework for integrating multiscale module structure to enhance the ability to predict disease-related genes, which may provide useful insights for the study of the multiscale module structure and its application in problems such as disease-gene prediction.
Affiliation(s)
- Ju Xiang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China; Department of Basic Medical Sciences & Academician Workstation, Changsha Medical University, Changsha, Hunan 410219, China
- Xiangmao Meng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yichao Zhao
- School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Fang-Xiang Wu
- Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, S7N 5A9, Canada
- Min Li
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
12
|
Shen C, Li H, Li M, Niu Y, Liu J, Zhu L, Gui H, Han W, Wang H, Zhang W, Wang X, Luo X, Sun Y, Yan J, Guan F. DLRAPom: a hybrid pipeline of Optimized XGBoost-guided integrative multiomics analysis for identifying targetable disease-related lncRNA-miRNA-mRNA regulatory axes. Brief Bioinform 2022; 23:6537347. [PMID: 35224615 PMCID: PMC8921741 DOI: 10.1093/bib/bbac046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/05/2021] [Revised: 01/13/2022] [Accepted: 01/29/2022] [Indexed: 12/12/2022]
Abstract
The lack of a reliable and easy-to-operate screening pipeline for disease-related noncoding RNA regulatory axes is a problem that urgently needs to be solved. To address this, we designed a hybrid pipeline, disease-related lncRNA-miRNA-mRNA regulatory axis prediction from multiomics (DLRAPom), to identify risk biomarkers and disease-related lncRNA-miRNA-mRNA regulatory axes by adding a novel machine learning model to conventional analysis and combining it with experimental validation. The pipeline consists of four parts: selecting hub biomarkers by conventional bioinformatics analysis, discovering the most essential protein-coding biomarkers by a novel machine learning model, extracting the key lncRNA-miRNA-mRNA axis, and validating it experimentally. Our study is the first to propose a pipeline predicting the interactions among lncRNA, miRNA and mRNA by combining WGCNA and XGBoost. Compared with previously reported methods, we developed an Optimized XGBoost model to reduce the degree of overfitting in multiomics data, thereby improving the generalization ability of the overall model for the integrated analysis of multiomics data. Applying the pipeline to gestational diabetes mellitus (GDM), we predicted nine risk protein-coding biomarkers and some potential lncRNA-miRNA-mRNA regulatory axes, all of which correlated with GDM. Among those regulatory axes, the MALAT1/hsa-miR-144-3p/IRS1 axis was predicted to be the key axis and was identified as being associated with GDM for the first time. In short, as a flexible pipeline, DLRAPom can contribute to molecular pathogenesis research of diseases, effectively predicting potential disease-related noncoding RNA regulatory networks and providing promising candidates for functional research on disease pathogenesis.
Affiliation(s)
- Chen Shen
- Shanghai Key Laboratory of Forensic Medicine, Academy of Forensic Science; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Huiyu Li
- Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Miao Li
- Department of Ultrasound, the Second Affiliated Hospital, Xi'an Jiaotong University, Xi'an, China
- Yu Niu
- Department of Endocrinology and Metabolism, Ninth Hospital of Xi'an City, Xi'an, China
- Jing Liu
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Li Zhu
- Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Hongsheng Gui
- Center for Behavior Health and Psychiatry Research, Henry Ford Health System, Detroit, MI, USA
- Wei Han
- Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Huiying Wang
- Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Wenpei Zhang
- Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Xiaochen Wang
- Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Xiao Luo
- Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Yu Sun
- Department of Endocrinology and Metabolism, Qilu Hospital of Shandong University, Ji'nan, China
- Jiangwei Yan
- Department of Genetics, School of Medicine & Forensics, Shanxi Medical University, Taiyuan, China
- Fanglin Guan
- Shanghai Key Laboratory of Forensic Medicine, Academy of Forensic Science; Key Laboratory of National Ministry of Health for Forensic Sciences, School of Medicine & Forensics, Health Science Center, Xi'an Jiaotong University, Xi'an, China
13
|
Abstract
Multi-omics data analysis is an important aspect of cancer molecular biology studies and has led to ground-breaking discoveries. Many efforts have been made to develop machine learning methods that automatically integrate omics data. Here, we review machine learning tools categorized as either general-purpose or task-specific, covering both supervised and unsupervised learning for integrative analysis of multi-omics data. We benchmark the performance of five machine learning approaches using data from the Cancer Cell Line Encyclopedia, reporting accuracy on cancer type classification and mean absolute error on drug response prediction, and evaluating runtime efficiency. This review provides recommendations to researchers regarding suitable machine learning method selection for their specific applications. It should also promote the development of novel machine learning methodologies for data integration, which will be essential for drug discovery, clinical trial design, and personalized treatments.
Affiliation(s)
- Zhaoxiang Cai
- ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia
- Rebecca C. Poulos
- ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia
- Jia Liu
- ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia
- Faculty of Medicine, Western Sydney University, Campbelltown, NSW, Australia
- Qing Zhong
- ProCan®, Children’s Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, 214 Hawkesbury Rd, Westmead, NSW 2145, Australia
14
|
Li R, Li L, Xu Y, Yang J. Machine learning meets omics: applications and perspectives. Brief Bioinform 2021; 23:6425809. [PMID: 34791021 DOI: 10.1093/bib/bbab460] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 08/23/2021] [Revised: 09/29/2021] [Accepted: 10/07/2021] [Indexed: 02/07/2023]
Abstract
The innovation of biotechnologies has allowed the accumulation of omics data at an alarming rate, thus introducing the era of 'big data'. Extracting inherent valuable knowledge from various omics data remains a daunting problem in bioinformatics. Better solutions often require more innovative methods for efficient handling and effective results. Recent advancements in integrated analysis and computational modeling of multi-omics data have helped address such needs in an increasingly harmonious manner. The development and application of machine learning have largely advanced our insights into biology and biomedicine and greatly promoted the development of therapeutic strategies, especially for precision medicine. Here, we present a comprehensive survey and discussion of what happened, is happening and will happen when machine learning meets omics. Specifically, we describe how artificial intelligence can be applied to omics studies and review recent advancements at the interface between machine learning and an ever-widening range of omics, including genomics, transcriptomics, proteomics, metabolomics and radiomics, as well as those at single-cell resolution. We also discuss and provide a synthesis of ideas, new insights, current challenges and perspectives of machine learning in omics.
Affiliation(s)
- Rufeng Li
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an 710061, P. R. China
- Lixin Li
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an 710061, P. R. China
- Yungang Xu
- School of Electronics and Information, Northwestern Polytechnical University, Xi'an, 710129, China
- Juan Yang
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Xi'an Jiaotong University Health Science Center, Xi'an 710061, P. R. China; Key Laboratory of Environment and Genes Related to Diseases (Xi'an Jiaotong University), Ministry of Education of China, Xi'an 710061, P. R. China
15
|
Li F, Dong S, Leier A, Han M, Guo X, Xu J, Wang X, Pan S, Jia C, Zhang Y, Webb GI, Coin LJM, Li C, Song J. Positive-unlabeled learning in bioinformatics and computational biology: a brief review. Brief Bioinform 2021; 23:6415313. [PMID: 34729589 DOI: 10.1093/bib/bbab461] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 08/30/2021] [Revised: 09/27/2021] [Accepted: 10/07/2021] [Indexed: 12/14/2022]
Abstract
Conventional supervised binary classification algorithms have been widely applied to address significant research questions using biological and biomedical data. This classification scheme requires two fully labeled classes of data (e.g. positive and negative samples) to train a classification model. However, in many bioinformatics applications, labeling data is laborious, and the negative samples might be mislabeled due to the limited sensitivity of the experimental equipment. The positive-unlabeled (PU) learning scheme was therefore proposed to enable the classifier to learn directly from limited positive samples and a large number of unlabeled samples (i.e. a mixture of positive and negative samples). To date, several PU learning algorithms have been developed to address various biological questions, such as sequence identification, functional site characterization and interaction prediction. In this paper, we revisit a collection of 29 state-of-the-art PU learning bioinformatic applications that address various biological questions. Various important aspects are extensively discussed, including PU learning methodology, biological application, classifier design and evaluation strategy. We also comment on the existing issues of PU learning and offer our perspectives for the future development of PU learning applications. We anticipate that our work will serve as an instrumental guideline for a better understanding of the PU learning framework in bioinformatics and for further developing next-generation PU learning frameworks for critical biological applications.
Affiliation(s)
- Fuyi Li
- Monash University, Australia
- André Leier
- Department of Genetics, UAB School of Medicine, USA
- Meiya Han
- Department of Biochemistry and Molecular Biology, Monash University, Australia
- Jing Xu
- Computer Science and Technology, Nankai University, China
- Xiaoyu Wang
- Department of Biochemistry and Molecular Biology and Biomedicine Discovery Institute, Monash University, Australia
- Shirui Pan
- University of Technology Sydney (UTS), Ultimo, NSW, Australia
- Cangzhi Jia
- College of Science, Dalian Maritime University, China
- Yang Zhang
- Northwestern Polytechnical University, China
- Geoffrey I Webb
- Faculty of Information Technology, Monash University, Australia
- Lachlan J M Coin
- Department of Clinical Pathology, University of Melbourne, Australia
- Chen Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Australia
- Jiangning Song
- Monash Biomedicine Discovery Institute, Monash University, Melbourne, Australia
16
|
Weissler EH, Naumann T, Andersson T, Ranganath R, Elemento O, Luo Y, Freitag DF, Benoit J, Hughes MC, Khan F, Slater P, Shameer K, Roe M, Hutchison E, Kollins SH, Broedl U, Meng Z, Wong JL, Curtis L, Huang E, Ghassemi M. The role of machine learning in clinical research: transforming the future of evidence generation. Trials 2021; 22:537. [PMID: 34399832 PMCID: PMC8365941 DOI: 10.1186/s13063-021-05489-x] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 04/30/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022]
Abstract
Background: Interest in the application of machine learning (ML) to the design, conduct, and analysis of clinical trials has grown, but the evidence base for such applications has not been surveyed. This manuscript reviews the proceedings of a multi-stakeholder conference to discuss the current and future state of ML for clinical research. Key areas of clinical trial methodology in which ML holds particular promise and priority areas for further investigation are presented alongside a narrative review of evidence supporting the use of ML across the clinical trial spectrum. Results: Conference attendees included stakeholders such as biomedical and ML researchers, representatives from the US Food and Drug Administration (FDA), artificial intelligence technology and data analytics companies, non-profit organizations, patient advocacy groups, and pharmaceutical companies. ML contributions to clinical research were highlighted in the pre-trial phase, cohort selection and participant management, and data collection and analysis. Particular attention was paid to the operational and philosophical barriers to ML in clinical research. Peer-reviewed evidence was noted to be lacking in several areas. Conclusions: ML holds great promise for improving the efficiency and quality of clinical research, but substantial barriers remain, the surmounting of which will require addressing significant gaps in evidence.
Affiliation(s)
- E Hope Weissler
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Rajesh Ranganath
- Courant Institute of Mathematical Science, New York University, New York, NY, USA
- Olivier Elemento
- Englander Institute for Precision Medicine, Weill Cornell Medical College, New York, NY, USA
- Yuan Luo
- Northwestern University Clinical and Translational Sciences Institute, Northwestern University, Chicago, IL, USA
- Daniel F Freitag
- Division Pharmaceuticals, Open Innovation and Digital Technologies, Bayer AG, Wuppertal, Germany
- James Benoit
- University of Alberta, Edmonton, Alberta, Canada
- Michael C Hughes
- Department of Computer Science, Tufts University, Medford, MA, USA
- Scott H Kollins
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Uli Broedl
- Boehringer-Ingelheim, Burlington, Canada
- Lesley Curtis
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA
- Erich Huang
- Duke Clinical Research Institute, Duke University School of Medicine, Box 2834, Durham, NC, 27701, USA; Duke Forge, Durham, NC, USA
- Marzyeh Ghassemi
- Vector Institute, University of Toronto, Toronto, Ontario, Canada; Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, 02139, USA; CIFAR AI Chair, Vector Institute, Toronto, Ontario, Canada
17
|
Tarazona S, Arzalluz-Luque A, Conesa A. Undisclosed, unmet and neglected challenges in multi-omics studies. Nat Comput Sci 2021; 1:395-402. [PMID: 38217236 DOI: 10.1038/s43588-021-00086-z] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 03/18/2021] [Accepted: 05/17/2021] [Indexed: 01/15/2024]
Abstract
Multi-omics approaches have become a reality in both large genomics projects and small laboratories. However, the multi-omics research community still faces a number of issues that have either not been sufficiently discussed or for which current solutions are still limited. In this Perspective, we elaborate on these limitations and suggest points of attention for future research. We finally discuss new opportunities and challenges brought to the field by the rapid development of single-cell high-throughput molecular technologies.
Affiliation(s)
- Sonia Tarazona
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Angeles Arzalluz-Luque
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Ana Conesa
- Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA
- Genetics Institute, University of Florida, Gainesville, FL, USA
- Institute for Integrative Systems Biology, Spanish National Research Council, Valencia, Spain
18
|
Yee NS. Machine intelligence for precision oncology. World J Transl Med 2021; 9:1-10. [DOI: 10.5528/wjtm.v9.i1.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2020] [Revised: 12/22/2020] [Accepted: 03/02/2021] [Indexed: 02/06/2023]
Abstract
Despite various advances in cancer research, the incidence and mortality rates of malignant diseases have remained high. Accurate risk assessment, prevention, detection, and treatment of cancer tailored to the individual are major challenges in clinical oncology. Artificial intelligence (AI), a field of applied computer science, has shown promise in accelerating the evolution of healthcare towards precision oncology. This article highlights the application of data-driven machine learning (ML) and deep learning (DL) in translational research for cancer diagnosis, prognosis, treatment, and clinical outcomes. ML-based algorithms applied to radiological and histological images have been demonstrated to improve detection and diagnosis of cancer. DL-based prediction models applied to molecular or multi-omics cancer datasets for biomarkers and targets enable drug discovery and treatment. ML approaches combining radiomics with genomics and other omics data enhance the power of AI in improving diagnosis, prognostication, and treatment of cancer. Ethical and regulatory issues involving patient confidentiality and data security impose certain limitations on practical implementation of ML in clinical oncology. However, the ultimate goal of applying AI in cancer research is to develop and implement multi-modal machine intelligence for improving clinical decision-making on individualized management of patients.
Affiliation(s)
- Nelson S Yee
- Department of Medicine, The Pennsylvania State University College of Medicine, Penn State Cancer Institute, Penn State Health Milton S. Hershey Medical Center, Hershey, PA 17033-0850, United States
19
|
Kanwar MK, Kilic A, Mehra MR. Machine learning, artificial intelligence and mechanical circulatory support: A primer for clinicians. J Heart Lung Transplant 2021; 40:414-425. [PMID: 33775520 DOI: 10.1016/j.healun.2021.02.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/06/2020] [Revised: 01/26/2021] [Accepted: 02/22/2021] [Indexed: 12/11/2022]
Abstract
Artificial intelligence (AI) refers to the ability of machines to perform intelligent tasks, and machine learning (ML) is a subset of AI describing the ability of machines to learn independently and make accurate predictions. The application of AI, combined with "big data" from electronic health records, is poised to impact how we take care of patients. In recent years, an expanding body of literature has been published using ML in cardiovascular health care, including mechanical circulatory support (MCS). This primer article provides an overview for clinicians of relevant concepts of ML and AI, reviews predictive modeling concepts in ML and provides contextual reference to how AI is being adapted in the field of MCS. Lastly, it explains how these methods could be incorporated into the practice of medicine to improve patient outcomes.
Affiliation(s)
- Manreet K Kanwar
- Cardiovascular Institute at Allegheny Health Network, Pittsburgh, Pennsylvania
- Arman Kilic
- Division of Cardiac Surgery, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
- Mandeep R Mehra
- Brigham and Women's Hospital Heart and Vascular Center and Harvard Medical School, Boston, Massachusetts
20
|
Subramanian M, Wojtusciszyn A, Favre L, Boughorbel S, Shan J, Letaief KB, Pitteloud N, Chouchane L. Precision medicine in the era of artificial intelligence: implications in chronic disease management. J Transl Med 2020; 18:472. [PMID: 33298113 PMCID: PMC7725219 DOI: 10.1186/s12967-020-02658-5] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 11/05/2020] [Accepted: 12/02/2020] [Indexed: 02/07/2023]
Abstract
Aberrant metabolism is the root cause of several serious health issues, creating a huge burden to health and leading to diminished life expectancy. A dysregulated metabolism induces the secretion of several molecules which in turn trigger the inflammatory pathway. Inflammation is the natural reaction of the immune system to a variety of stimuli, such as pathogens, damaged cells, and harmful substances. Metabolically triggered inflammation, also called metaflammation or low-grade chronic inflammation, is the consequence of a synergic interaction between the host and the exposome, a combination of environmental drivers including diet, lifestyle, pollutants and other factors throughout the life span of an individual. Various levels of chronic inflammation are associated with several lifestyle-related diseases such as diabetes, obesity, metabolic associated fatty liver disease (MAFLD), cancers, cardiovascular disorders (CVDs), autoimmune diseases, and chronic lung diseases. Chronic diseases are a growing concern worldwide, placing a heavy burden on individuals, families, governments, and health-care systems. New strategies are needed to empower communities worldwide to prevent and treat these diseases. Precision medicine provides a model for the next generation of lifestyle modification. This will capitalize on the dynamic interaction between an individual's biology, lifestyle, behavior, and environment. The aim of precision medicine is to design and improve diagnosis, therapeutics and prognostication through the use of large complex datasets that incorporate individual gene, function, and environmental variations. The implementation of high-performance computing (HPC) and artificial intelligence (AI) can predict risks with greater accuracy based on available multidimensional clinical and biological datasets. AI-powered precision medicine provides clinicians with an opportunity to specifically tailor early interventions to each individual.
In this article, we discuss the strengths and limitations of existing and recently evolving data-driven technologies, such as AI, in preventing, treating and reversing lifestyle-related diseases.
Affiliation(s)
- Murugan Subramanian
- Department of Microbiology and Immunology, Weill Cornell Medicine, New York, USA; Genetic Intelligence Laboratory, Weill Cornell Medicine-Qatar, Qatar Foundation, Doha, Qatar
- Anne Wojtusciszyn
- Service of Endocrinology, Diabetology and Metabolism, Lausanne University Hospital, Lausanne, Switzerland
- Lucie Favre
- Service of Endocrinology, Diabetology and Metabolism, Lausanne University Hospital, Lausanne, Switzerland
- Sabri Boughorbel
- Clinical Bioinformatics Section, Research Division, Sidra Medicine, Doha, Qatar
- Jingxuan Shan
- Genetic Intelligence Laboratory, Weill Cornell Medicine-Qatar, Qatar Foundation, Doha, Qatar; Department of Genetic Medicine, Weill Cornell Medicine, 45 E 69th Street, Suite 432, New York, NY, 10021, USA
- Khaled B Letaief
- Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Nelly Pitteloud
- Service of Endocrinology, Diabetology and Metabolism, Lausanne University Hospital, Lausanne, Switzerland
- Lotfi Chouchane
- Department of Microbiology and Immunology, Weill Cornell Medicine, New York, USA; Genetic Intelligence Laboratory, Weill Cornell Medicine-Qatar, Qatar Foundation, Doha, Qatar; Department of Genetic Medicine, Weill Cornell Medicine, 45 E 69th Street, Suite 432, New York, NY, 10021, USA
21
|
Oh M, Park S, Lee S, Lee D, Lim S, Jeong D, Jo K, Jung I, Kim S. DRIM: A Web-Based System for Investigating Drug Response at the Molecular Level by Condition-Specific Multi-Omics Data Integration. Front Genet 2020; 11:564792. [PMID: 33281870 PMCID: PMC7689278 DOI: 10.3389/fgene.2020.564792] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/22/2020] [Accepted: 08/14/2020] [Indexed: 12/11/2022]
Abstract
Pharmacogenomics is the study of how genes affect a person's response to drugs. Thus, understanding the effect of a drug at the molecular level can be helpful in both drug discovery and personalized medicine. Over the years, transcriptome data upon drug treatment have been collected, and several databases have compiled pre-treatment cancer cell multi-omics data with drug sensitivity (IC50, AUC) or time-series transcriptomic data after drug treatment. However, analyzing transcriptome data upon drug treatment is challenging since more than 20,000 genes interact in complex ways. In addition, due to the difficulty of both time-series analysis and multi-omics integration, current methods can hardly analyze databases with different data characteristics. One effective way is to interpret transcriptome data in terms of well-characterized biological pathways. Another way is to leverage state-of-the-art methods for multi-omics data integration. In this paper, we developed Drug Response analysis Integrating Multi-omics and time-series data (DRIM), an integrative multi-omics and time-series data analysis framework that identifies perturbed sub-pathways and regulation mechanisms upon drug treatment. The system takes as input either a drug name and cell line identification numbers or the user's control/treated time-series gene expression data. Then, analysis of multi-omics data upon drug treatment is performed from two perspectives. For the multi-omics perspective, IC50-related multi-omics potential mediator genes are determined by embedding multi-omics data into a gene-centric vector space using a tensor decomposition method and an autoencoder deep learning model. Then, perturbed pathway analysis of the potential mediator genes is performed. For the time-series perspective, time-varying perturbed sub-pathways upon drug treatment are constructed.
Additionally, a network involving transcription factors (TFs), multi-omics potential mediator genes, and perturbed sub-pathways is constructed, and paths from TFs to perturbed pathways are determined by an influence maximization method. To demonstrate the utility of our system, we provide analysis results of sub-pathway regulatory mechanisms in breast cancer cell lines of different drug sensitivity. DRIM is available at: http://biohealth.snu.ac.kr/software/DRIM/.
Affiliation(s)
- Minsik Oh
- Department of Computer Science and Engineering, Seoul National University, Seoul, South Korea
- Sungjoon Park
- Department of Computer Science and Engineering, Seoul National University, Seoul, South Korea
- Sangseon Lee
- Bioinformatics Institute, Seoul National University, Seoul, South Korea
- Dohoon Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- Sangsoo Lim
- Bioinformatics Institute, Seoul National University, Seoul, South Korea
- Dabin Jeong
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- Kyuri Jo
- Department of Computer Engineering, Chungbuk National University, Cheongju, South Korea
- Inuk Jung
- Department of Computer Science and Engineering, Kyungpook National University, Daegu, South Korea
- Sun Kim
- Bioinformatics Institute, Seoul National University, Seoul, South Korea
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- Department of Computer Science and Engineering, Institute of Engineering Research, Seoul National University, Seoul, South Korea