1
Wang FA, Li Y, Zeng T. Deep Learning of radiology-genomics integration for computational oncology: A mini review. Comput Struct Biotechnol J 2024; 23:2708-2716. [PMID: 39035833 PMCID: PMC11260400 DOI: 10.1016/j.csbj.2024.06.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 06/18/2024] [Accepted: 06/18/2024] [Indexed: 07/23/2024] Open
Abstract
In computational oncology, patient status is often assessed using radiology-genomics, which combines two key technologies and their data: radiology and genomics. Recent advances in deep learning have facilitated the integration of radiology-genomics data, and even new omics data, significantly improving the robustness and accuracy of clinical predictions. These factors are driving artificial intelligence (AI) closer to practical clinical applications. In particular, deep learning models are crucial in identifying new radiology-genomics biomarkers and therapeutic targets, supported by explainable AI (xAI) methods. This review focuses on recent developments in deep learning for radiology-genomics integration, highlights current challenges, and outlines some research directions for multimodal integration and biomarker discovery of radiology-genomics or radiology-omics that are urgently needed in computational oncology.
Affiliation(s)
- Feng-ao Wang
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Yixue Li
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
  - Guangzhou National Laboratory, Guangzhou, China
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
- Tao Zeng
  - Guangzhou National Laboratory, Guangzhou, China
  - GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China
2
Tang M, Antić Ž, Fardzadeh P, Pietzsch S, Schröder C, Eberhardt A, van Bömmel A, Escherich G, Hofmann W, Horstmann MA, Illig T, McCrary JM, Lentes J, Metzler M, Nejdl W, Schlegelberger B, Schrappe M, Zimmermann M, Miarka-Walczyk K, Pastorczak A, Cario G, Renard BY, Stanulla M, Bergmann AK. An artificial intelligence-assisted clinical framework to facilitate diagnostics and translational discovery in hematologic neoplasia. EBioMedicine 2024; 104:105171. [PMID: 38810562 PMCID: PMC11154115 DOI: 10.1016/j.ebiom.2024.105171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 05/10/2024] [Accepted: 05/15/2024] [Indexed: 05/31/2024] Open
Abstract
BACKGROUND: The increasing volume and intricacy of sequencing data, along with other clinical and diagnostic data such as drug responses and measurable residual disease, create challenges for efficient clinical comprehension and interpretation. Using paediatric B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) as a use case, we present an artificial intelligence (AI)-assisted clinical framework, clinALL, that integrates genomic and clinical data into a user-friendly interface to support routine diagnostics and reveal translational insights for hematologic neoplasia.
METHODS: We performed targeted RNA sequencing in 1365 cases with haematological neoplasms, primarily paediatric B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) from the AIEOP-BFM ALL study. We carried out fluorescence in situ hybridization (FISH), karyotyping and arrayCGH as part of the routine diagnostics. The results of these assays, as well as additional clinical information, were integrated into an interactive web interface using Bokeh, where the main graph is based on Uniform Manifold Approximation and Projection (UMAP) analysis of the gene expression data. At the backend of clinALL, we built both shallow machine learning models and a deep neural network, using Scikit-learn and PyTorch respectively.
FINDINGS: By applying clinALL, 78% of undetermined patients under the current diagnostic protocol were stratified, and ambiguous cases were investigated. Translational insights were discovered, including IKZF1plus status dependent subpopulations of BCR::ABL1 positive patients, and a subpopulation within ETV6::RUNX1 positive patients that has a high relapse frequency. Our best machine learning models, LDA and PASNET-like neural network models, achieve F1 scores above 97% in predicting patients' subgroups.
INTERPRETATION: An AI-assisted clinical framework that integrates both genomic and clinical data can take full advantage of the available data, improve point-of-care decision-making and reveal clinically relevant insights promptly. Such a lightweight and easily transferable framework works for both whole-transcriptome data and the cost-effective targeted RNA-seq, enabling efficient and equitable delivery of personalized medicine in small clinics in developing countries.
FUNDING: German Ministry of Education and Research (BMBF), German Research Foundation (DFG) and Foundation for Polish Science.
Affiliation(s)
- Ming Tang
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany; L3S Research Centre, Leibniz University Hannover, Germany
- Željko Antić
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Stefan Pietzsch
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Charlotte Schröder
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Alena van Bömmel
  - Leibniz Institute on Aging - Fritz Lipmann Institute (FLI), Jena, Germany
- Gabriele Escherich
  - Clinic of Paediatric Haematology and Oncology, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Winfried Hofmann
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Martin A Horstmann
  - Clinic of Paediatric Haematology and Oncology, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany; Research Institute Children's Cancer Centre Hamburg, Hamburg, Germany
- Thomas Illig
  - Hannover Unified Bio Bank, Hannover Medical School, Hannover, Germany
- J Matt McCrary
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Jana Lentes
  - Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Markus Metzler
  - Department of Paediatrics, University Hospital Erlangen, Erlangen, Germany
- Wolfgang Nejdl
  - L3S Research Centre, Leibniz University Hannover, Germany
- Martin Schrappe
  - Department of Paediatrics, University Medical Centre Schleswig-Holstein, Campus Kiel, Kiel, Germany
- Martin Zimmermann
  - Department of Paediatric Haematology and Oncology, Hannover Medical School, Hannover, Germany
- Karolina Miarka-Walczyk
  - Department of Paediatrics, Oncology and Haematology, Medical University of Lodz, Lodz, Poland
- Agata Pastorczak
  - Department of Paediatrics, Oncology and Haematology, Medical University of Lodz, Lodz, Poland
- Gunnar Cario
  - Department of Paediatrics, University Medical Centre Schleswig-Holstein, Campus Kiel, Kiel, Germany
- Bernhard Y Renard
  - Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- Martin Stanulla
  - Department of Paediatric Haematology and Oncology, Hannover Medical School, Hannover, Germany
3
Ko E, Kim Y, Shokoohi F, Mersha TB, Kang M. SPIN: sex-specific and pathway-based interpretable neural network for sexual dimorphism analysis. Brief Bioinform 2024; 25:bbae239. [PMID: 38807262 PMCID: PMC11133003 DOI: 10.1093/bib/bbae239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 03/29/2024] [Accepted: 04/26/2024] [Indexed: 05/30/2024] Open
Abstract
Sexual dimorphism in prevalence, severity and genetic susceptibility exists for most common diseases. However, most genetic and clinical outcome studies are designed in a sex-combined framework that treats sex as a covariate. The few sex-specific studies have analyzed males and females separately and so have failed to identify gene-by-sex interactions. Here, we propose a novel unified biologically interpretable deep learning-based framework (named SPIN) for sexual dimorphism analysis. We demonstrate that SPIN significantly improved the C-index by up to 23.6% in TCGA cancer datasets, and it was further validated using asthma datasets. In addition, SPIN identifies sex-specific and -shared risk loci that are often missed in previous sex-combined/-separate analyses. We also show that SPIN is interpretable for explaining how biological pathways contribute to sexual dimorphism and improve risk prediction at the individual level, which can support the development of precision medicine tailored to a specific individual's characteristics.
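Note on the metric: the C-index (concordance index) reported above measures how often a model ranks pairs of patients in the right order of risk. The following is a minimal pure-Python sketch of Harrell's C-index, given only as an illustration; it is not taken from SPIN, the function name and the toy data are hypothetical, and pairs with tied observed times are simply skipped here.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if times differ and the earlier one is an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, made up for illustration only.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a relative gain in C-index is a gain in how well patients are ordered by risk.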
Affiliation(s)
- Euiseong Ko
  - Department of Computer Science, University of Nevada, Las Vegas, Las Vegas, NV, USA
- Youngsoon Kim
  - Department of Information and Statistics and Department of Bio&Medical Bigdata (BK21 Four program), Gyeongsang National University, Jinju, Republic of Korea
- Farhad Shokoohi
  - Department of Mathematical Sciences, University of Nevada, Las Vegas, Las Vegas, NV, USA
- Tesfaye B Mersha
  - Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
- Mingon Kang
  - Department of Computer Science, University of Nevada, Las Vegas, Las Vegas, NV, USA
4
Liu X, Tao Y, Cai Z, Bao P, Ma H, Li K, Li M, Zhu Y, Lu ZJ. Pathformer: a biological pathway informed transformer for disease diagnosis and prognosis using multi-omics data. Bioinformatics 2024; 40:btae316. [PMID: 38741230 PMCID: PMC11139513 DOI: 10.1093/bioinformatics/btae316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/29/2024] [Accepted: 05/11/2024] [Indexed: 05/16/2024] Open
Abstract
MOTIVATION: Multi-omics data provide a comprehensive view of gene regulation at multiple levels, which is helpful in achieving accurate diagnosis of complex diseases like cancer. However, conventional integration methods rarely utilize prior biological knowledge and lack interpretability.
RESULTS: To integrate various multi-omics data of tissue and liquid biopsies for disease diagnosis and prognosis, we developed a biological pathway informed Transformer, Pathformer. It embeds multi-omics input with a compacted multi-modal vector and a pathway-based sparse neural network. Pathformer also leverages a criss-cross attention mechanism to capture the crosstalk between different pathways and modalities. We first benchmarked Pathformer with 18 comparable methods on multiple cancer datasets, where Pathformer outperformed all the other methods, with an average improvement of 6.3%-14.7% in F1 score for cancer survival prediction, 5.1%-12% for cancer stage prediction, and 8.1%-13.6% for cancer drug response prediction. Subsequently, for cancer prognosis prediction based on tissue multi-omics data, we used a case study to demonstrate the biological interpretability of Pathformer by identifying key pathways and their biological crosstalk. Then, for cancer early diagnosis based on liquid biopsy data, we used plasma and platelet datasets to demonstrate Pathformer's potential for clinical application in cancer screening. Moreover, we revealed deregulation of interesting pathways (e.g. scavenger receptor pathway) and their crosstalk in cancer patients' blood, providing potential candidate targets for cancer microenvironment study.
AVAILABILITY AND IMPLEMENTATION: Pathformer is implemented and freely available at https://github.com/lulab/Pathformer.
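Note on the architecture: a "pathway-based sparse neural network" generally means a layer whose connections are restricted by gene-to-pathway membership. The sketch below shows that general idea with a single masked layer in pure Python. It is an assumption-laden illustration, not Pathformer's code: the gene-to-pathway assignment, the function names and the random weights are all hypothetical.

```python
import math
import random

genes = ["TP53", "EGFR", "KRAS", "BRCA1"]
# Hypothetical membership, for illustration only (not from a real database).
pathways = {
    "pathway_A": {"TP53", "BRCA1"},
    "pathway_B": {"EGFR", "KRAS"},
}


def build_mask(genes, pathways):
    """mask[p][g] is 1.0 if gene g belongs to pathway p, else 0.0."""
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in pathways.values()]


def sparse_forward(x, weights, mask):
    """One masked dense layer: each pathway node only sees its member genes."""
    out = []
    for w_row, m_row in zip(weights, mask):
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        out.append(math.tanh(z))
    return out


rng = random.Random(0)
mask = build_mask(genes, pathways)
weights = [[rng.uniform(-1, 1) for _ in genes] for _ in pathways]

x = [0.5, -1.2, 0.3, 2.0]            # one sample's gene-level features
base = sparse_forward(x, weights, mask)

# Perturb EGFR, which is outside pathway_A: pathway_A must not change.
x_perturbed = [0.5, 9.9, 0.3, 2.0]
perturbed = sparse_forward(x_perturbed, weights, mask)
```

Because masked weights never contribute, each hidden node can be read as one named pathway, which is what makes this family of models easier to interpret than a fully connected layer.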
Affiliation(s)
- Xiaofan Liu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Yuhuan Tao
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Zilin Cai
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Pengfei Bao
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Hongli Ma
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Kexing Li
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Mengtao Li
  - Department of Rheumatology and Clinical Immunology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC-DID), MST State Key Laboratory of Complex Severe and Rare Diseases, MOE Key Laboratory of Rheumatology and Clinical Immunology, Beijing 100730, China
- Yunping Zhu
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Zhi John Lu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
5
Shannon CP, Lee AH, Tebbutt SJ, Singh A. A Commentary on Multi-omics Data Integration in Systems Vaccinology. J Mol Biol 2024; 436:168522. [PMID: 38458605 DOI: 10.1016/j.jmb.2024.168522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 03/04/2024] [Accepted: 03/04/2024] [Indexed: 03/10/2024]
Affiliation(s)
- Amy Hy Lee
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, Canada
- Scott J Tebbutt
  - PROOF Centre of Excellence, Vancouver, Canada; Department of Medicine, The University of British Columbia, Vancouver, Canada; Centre for Heart Lung Innovation, Vancouver, Canada
- Amrit Singh
  - Centre for Heart Lung Innovation, Vancouver, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, The University of British Columbia, Vancouver, Canada.
6
Kumar S, Sarmah DT, Paul A, Chatterjee S. Exploration of functional relations among differentially co-expressed genes identifies regulators in glioblastoma. Comput Biol Chem 2024; 109:108024. [PMID: 38335855 DOI: 10.1016/j.compbiolchem.2024.108024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 12/15/2023] [Accepted: 02/02/2024] [Indexed: 02/12/2024]
Abstract
Conventional computational approaches to investigating a disease face inherent constraints, as they often fall short in delving beyond protein functional associations and grasping their deeper contextual significance within the disease framework. Such context-specificity can be explored using clinical data by evaluating how the interactions between biological entities change across conditions, that is, by investigating differential co-expression relationships. We believe that the integration and analysis of differential co-expression and functional relationships, primarily focusing on the source nodes, will offer novel insights into disease progression, as the source proteins could trigger signaling cascades, mostly because they are transcription factors, cell surface receptors, or enzymes that respond instantly to a particular stimulus. A thorough contextual investigation of these nodes could provide a helpful starting point for identifying potential causal linkages and guiding subsequent scientific investigations to uncover mechanisms underlying observed associations. Our methodology includes functional protein-protein interaction (PPI) data and co-expression information and filters functional linkages through a series of critical steps, culminating in the identification of a robust set of regulators. Our analysis identified eleven key regulators in glioblastoma: AKT1, BRCA1, CAMK2G, CUL1, FGFR3, KIF3A, NUP210, PRKACB, RAB8A, RPS6KA2 and TGFB3. These regulators play a pivotal role in disease classification, cell growth control, and patient survivability and exhibit associations with immune infiltrations and disease hallmarks. This underscores the importance of assessing correlation towards causality in unraveling complex biological insights.
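Note on the method: differential co-expression, as described above, compares how strongly two genes are correlated in one condition versus another. A minimal pure-Python sketch of that comparison follows. It is an illustration only and not the authors' pipeline: the gene labels G1 to G3, the expression values and the threshold of 1.0 are all made-up assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def differential_coexpression(expr_a, expr_b, threshold=1.0):
    """Gene pairs whose correlation differs by at least `threshold`
    between condition A and condition B.

    expr_a, expr_b: dict mapping gene name -> list of expression values.
    The default threshold is an arbitrary choice for this sketch.
    """
    genes = sorted(expr_a)
    hits = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r_a = pearson(expr_a[g1], expr_a[g2])
            r_b = pearson(expr_b[g1], expr_b[g2])
            if abs(r_a - r_b) >= threshold:
                hits.append((g1, g2, r_a, r_b))
    return hits


# Toy expression values, made up for illustration only.
control = {"G1": [1, 2, 3, 4], "G2": [2, 4, 6, 8], "G3": [1, 3, 2, 4]}
disease = {"G1": [1, 2, 3, 4], "G2": [8, 6, 4, 2], "G3": [1, 3, 2, 4]}
hits = differential_coexpression(control, disease)
```

In this toy data only pairs involving G2 are flagged, because G2 is the only gene whose relationship to the others flips between conditions. A real analysis would also need a significance test and multiple-testing correction, which this sketch omits.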
Affiliation(s)
- Shivam Kumar
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Dipanka Tanu Sarmah
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Abhijit Paul
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India
- Samrat Chatterjee
  - Complex Analysis Group, Translational Health Science and Technology Institute, NCR Biotech Science Cluster, Faridabad 121001, India.
7
Zhao X, Singhal A, Park S, Kong J, Bachelder R, Ideker T. Cancer Mutations Converge on a Collection of Protein Assemblies to Predict Resistance to Replication Stress. Cancer Discov 2024; 14:508-523. [PMID: 38236062 PMCID: PMC10905674 DOI: 10.1158/2159-8290.cd-23-0641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 10/25/2023] [Accepted: 12/21/2023] [Indexed: 01/19/2024]
Abstract
Rapid proliferation is a hallmark of cancer associated with sensitivity to therapeutics that cause DNA replication stress (RS). Many tumors exhibit drug resistance, however, via molecular pathways that are incompletely understood. Here, we develop an ensemble of predictive models that elucidate how cancer mutations impact the response to common RS-inducing (RSi) agents. The models implement recent advances in deep learning to facilitate multidrug prediction and mechanistic interpretation. Initial studies in tumor cells identify 41 molecular assemblies that integrate alterations in hundreds of genes for accurate drug response prediction. These cover roles in transcription, repair, cell-cycle checkpoints, and growth signaling, of which 30 are shown by loss-of-function genetic screens to regulate drug sensitivity or replication restart. The model translates to cisplatin-treated cervical cancer patients, highlighting an RTK-JAK-STAT assembly governing resistance. This study defines a compendium of mechanisms by which mutations affect therapeutic responses, with implications for precision medicine.
SIGNIFICANCE: Zhao and colleagues use recent advances in machine learning to study the effects of tumor mutations on the response to common therapeutics that cause RS. The resulting predictive models integrate numerous genetic alterations distributed across a constellation of molecular assemblies, facilitating a quantitative and interpretable assessment of drug response. This article is featured in Selected Articles from This Issue, p. 384.
Affiliation(s)
- Xiaoyu Zhao
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
- Akshat Singhal
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California
- Sungjoon Park
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
- JungHo Kong
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
  - Moores Cancer Center, School of Medicine, University of California, San Diego, La Jolla, California
- Robin Bachelder
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
- Trey Ideker
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California, San Diego, La Jolla, California
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California
  - Moores Cancer Center, School of Medicine, University of California, San Diego, La Jolla, California
  - Department of Bioengineering, University of California, San Diego, La Jolla, California
8
Antić Ž, Lentes J, Bergmann AK. Cytogenetics and genomics in pediatric acute lymphoblastic leukaemia. Best Pract Res Clin Haematol 2023; 36:101511. [PMID: 38092485 DOI: 10.1016/j.beha.2023.101511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 07/24/2023] [Accepted: 08/15/2023] [Indexed: 12/18/2023]
Abstract
The last five decades have witnessed significant improvement in diagnostics, treatment and management of children with acute lymphoblastic leukaemia (ALL). These advancements have become possible through progress in our understanding of the genetic and biological background of ALL, resulting in the introduction of risk-adapted treatment and novel therapeutic targets, e.g., tyrosine kinase inhibitors for BCR::ABL1-positive ALL. Further advances in the taxonomy of ALL and the discovery of new genetic biomarkers and therapeutic targets, as well as the introduction of targeted and immunotherapies into the frontline treatment protocols, may improve management and outcome of children with ALL. In this review we describe the current developments in the (cyto)genetic diagnostics and management of children with ALL, and provide an overview of the most important advances in the genetic classification of ALL. Furthermore, we discuss perspectives resulting from the development of new techniques, including artificial intelligence (AI).
Affiliation(s)
- Željko Antić
  - Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
- Jana Lentes
  - Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany
- Anke K Bergmann
  - Department of Human Genetics, Hannover Medical School (MHH), Hannover, Germany.
9
Carrion SA, Michal JJ, Jiang Z. Alternative Transcripts Diversify Genome Function for Phenome Relevance to Health and Diseases. Genes (Basel) 2023; 14:2051. [PMID: 38002994 PMCID: PMC10671453 DOI: 10.3390/genes14112051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023] Open
Abstract
Manipulation using alternative exon splicing (AES), alternative transcription start (ATS), and alternative polyadenylation (APA) sites are key to transcript diversity underlying health and disease. All three are pervasive in organisms, present in at least 50% of human protein-coding genes. In fact, ATS and APA site use has the highest impact on protein identity, with their ability to alter which first and last exons are utilized as well as impacting stability and translation efficiency. These RNA variants have been shown to be highly specific, both in tissue type and stage, with demonstrated importance to cell proliferation, differentiation and the transition from fetal to adult cells. While alternative exon splicing has a limited effect on protein identity, its ubiquity highlights the importance of these minor alterations, which can alter other features such as localization. The three processes are also highly interwoven, with overlapping, complementary, and competing factors, RNA polymerase II and its CTD (C-terminal domain) chief among them. Their role in development means dysregulation leads to a wide variety of disorders and cancers, with some forms of disease disproportionately affected by specific mechanisms (AES, ATS, or APA). Challenges associated with the genome-wide profiling of RNA variants and their potential solutions are also discussed in this review.
Affiliation(s)
- Zhihua Jiang
  - Department of Animal Sciences and Center for Reproductive Biology, Washington State University, Pullman, WA 99164-7620, USA; (S.A.C.); (J.J.M.)
10
Esser-Skala W, Fortelny N. Reliable interpretability of biology-inspired deep neural networks. NPJ Syst Biol Appl 2023; 9:50. [PMID: 37816807 PMCID: PMC10564878 DOI: 10.1038/s41540-023-00310-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 09/15/2023] [Indexed: 10/12/2023] Open
Abstract
Deep neural networks display impressive performance but suffer from limited interpretability. Biology-inspired deep learning, where the architecture of the computational graph is based on biological knowledge, enables unique interpretability where real-world concepts are encoded in hidden nodes, which can be ranked by importance and thereby interpreted. In such models trained on single-cell transcriptomes, we previously demonstrated that node-level interpretations lack robustness upon repeated training and are influenced by biases in biological knowledge. Similar studies are missing for related models. Here, we test and extend our methodology for reliable interpretability in P-NET, a biology-inspired model trained on patient mutation data. We observe variability of interpretations and susceptibility to knowledge biases, and identify the network properties that drive interpretation biases. We further present an approach to control the robustness and biases of interpretations, which leads to more specific interpretations. In summary, our study reveals the broad importance of methods to ensure robust and bias-aware interpretability in biology-inspired deep learning.
Affiliation(s)
- Wolfgang Esser-Skala
  - Computational Systems Biology Group, Department of Biosciences and Medical Biology, University of Salzburg, Hellbrunner Straße 34, 5020, Salzburg, Austria
- Nikolaus Fortelny
  - Computational Systems Biology Group, Department of Biosciences and Medical Biology, University of Salzburg, Hellbrunner Straße 34, 5020, Salzburg, Austria.
11
Zhang L, Cao L, Li S, Wang L, Song Y, Huang Y, Xu Z, He J, Wang M, Li K. Biologically Interpretable Deep Learning To Predict Response to Immunotherapy In Advanced Melanoma Using Mutations and Copy Number Variations. J Immunother 2023; 46:221-231. [PMID: 37220017 DOI: 10.1097/cji.0000000000000475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Accepted: 04/13/2023] [Indexed: 05/25/2023]
Abstract
Only 30-40% of advanced melanoma patients respond effectively to immunotherapy in clinical practice, so it is necessary to accurately identify the response of patients to immunotherapy pre-clinically. Here, we develop KP-NET, a deep learning model that is sparse on KEGG pathways, and combine it with transfer learning to accurately predict the response of advanced melanomas to immunotherapy using KEGG pathway-level information enriched from gene mutation and copy number variation data. KP-NET achieves its best performance, with an AUROC of 0.886 on the testing set and 0.803 on an unseen evaluation set, when predicting responders (CR/PR/SD with PFS ≥6 mo) versus non-responders (PD/SD with PFS <6 mo) in anti-CTLA-4 treated melanoma patients. The model also achieves AUROCs of 0.917 and 0.833 in predicting CR/PR versus PD, respectively. Meanwhile, the AUROC is 0.913 when predicting responders versus non-responders in anti-PD-1/PD-L1 melanomas. Moreover, KP-NET reveals some genes and pathways associated with response to anti-CTLA-4 treatment, such as the genes PIK3CA, AOX1 and CBLB, and the ErbB signaling pathway and T cell receptor signaling pathway, among others. In conclusion, KP-NET can accurately predict the response of melanomas to immunotherapy and screen related biomarkers pre-clinically, which can contribute to precision medicine of melanoma.
Affiliation(s)
- Liuchao Zhang
  - Department of Epidemiology and Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
12
Beaude A, Rafiee Vahid M, Augé F, Zehraoui F, Hanczar B. AttOmics: attention-based architecture for diagnosis and prognosis from omics data. Bioinformatics 2023; 39:i94-i102. [PMID: 37387182 DOI: 10.1093/bioinformatics/btad232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: The increasing availability of high-throughput omics data allows for considering a new medicine centered on individual patients. Precision medicine relies on exploiting these high-throughput data with machine-learning models, especially the ones based on deep-learning approaches, to improve diagnosis. Due to the high-dimensional small-sample nature of omics data, current deep-learning models end up with many parameters and have to be fitted with a limited training set. Furthermore, interactions between molecular entities inside an omics profile are not patient specific but are the same for all patients.
RESULTS: In this article, we propose AttOmics, a new deep-learning architecture based on the self-attention mechanism. First, we decompose each omics profile into a set of groups, where each group contains related features. Then, by applying the self-attention mechanism to the set of groups, we can capture the different interactions specific to a patient. The results of different experiments carried out in this article show that our model can accurately predict the phenotype of a patient with fewer parameters than deep neural networks. Visualizing the attention maps can provide new insights into the essential groups for a particular phenotype.
AVAILABILITY AND IMPLEMENTATION: The code and data are available at https://forge.ibisc.univ-evry.fr/abeaude/AttOmics. TCGA data can be downloaded from the Genomic Data Commons Data Portal.
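Note on the mechanism: the two steps described above (split a profile into feature groups, then apply self-attention across the groups) can be sketched in a few lines of pure Python. This is a simplified illustration and not the AttOmics implementation: the grouping by fixed size, the identity query/key/value projections, and the toy profile are all assumptions made for the example.

```python
import math


def softmax(row):
    """Numerically stable softmax of a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(groups):
    """Scaled dot-product self-attention over a list of group vectors.

    Queries, keys and values are the group vectors themselves (identity
    projections), a simplification for illustration.
    Returns (attention_weights, attended_vectors).
    """
    d = len(groups[0])
    scores = [[sum(q * k for q, k in zip(gq, gk)) / math.sqrt(d)
               for gk in groups]
              for gq in groups]
    weights = [softmax(row) for row in scores]
    attended = [[sum(w * g[c] for w, g in zip(w_row, groups))
                 for c in range(d)]
                for w_row in weights]
    return weights, attended


# One patient's omics profile, split into three hypothetical feature groups.
profile = [0.2, 1.0, -0.5, 0.3, 0.8, -1.1]
group_size = 2
groups = [profile[i:i + group_size]
          for i in range(0, len(profile), group_size)]

attn, out = self_attention(groups)
```

Here `attn` is a 3 by 3 matrix whose rows sum to 1. Because it is computed from the patient's own values, it differs from patient to patient, which is the property the abstract contrasts with fixed, patient-independent interactions.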
Affiliation(s)
- Aurélien Beaude
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
  - Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
- Milad Rafiee Vahid
  - Sanofi R&D Data and Data Science, Artificial Intelligence & Deep Analytics, Omics Data Science, 450 Water Street, Cambridge, MA 02142, United States
- Franck Augé
  - Artificial Intelligence & Deep Analytics, Omics Data Science, Sanofi R&D Data and Data Science, 1 Av. Pierre Brossolette, Chilly-Mazarin 91385, France
- Farida Zehraoui
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
- Blaise Hanczar
  - IBISC, Université Paris-Saclay, Univ Evry, 23 Boulevard de France, Evry-Courcouronnes 91020, France
13
Wysocka M, Wysocki O, Zufferey M, Landers D, Freitas A. A systematic review of biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data. BMC Bioinformatics 2023; 24:198. [PMID: 37189058 DOI: 10.1186/s12859-023-05262-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/17/2022] [Accepted: 03/30/2023] [Indexed: 05/17/2023]
Abstract
BACKGROUND There is an increasing interest in the use of Deep Learning (DL) based methods as a supporting analytical framework in oncology. However, most direct applications of DL will deliver models with limited transparency and explainability, which constrain their deployment in biomedical settings. METHODS This systematic review discusses DL models used to support inference in cancer biology with a particular emphasis on multi-omics analysis. It focuses on how existing models address the need for better dialogue with prior knowledge, biological plausibility and interpretability, fundamental properties in the biomedical domain. For this, we retrieved and analyzed 42 studies focusing on emerging architectural and methodological advances, the encoding of biological domain knowledge and the integration of explainability methods. RESULTS We discuss the recent evolutionary arc of DL models in the direction of integrating prior biological relational and network knowledge (e.g. pathways or Protein-Protein-Interaction networks) to support better generalisation and interpretability. This represents a fundamental functional shift towards models which can integrate mechanistic and statistical inference aspects. We introduce a concept of bio-centric interpretability and, according to its taxonomy, we discuss representational methodologies for the integration of domain prior knowledge in such models. CONCLUSIONS The paper provides a critical outlook into contemporary methods for explainability and interpretability used in DL for cancer. The analysis points in the direction of a convergence between encoding prior knowledge and improved interpretability. We introduce bio-centric interpretability, which is an important step towards formalisation of biological interpretability of DL models and developing methods that are less problem- or application-specific.
Affiliation(s)
- Magdalena Wysocka
  - Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
  - Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
- Oskar Wysocki
  - Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
  - Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
  - Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH-1920, Martigny, Switzerland
- Marie Zufferey
  - Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH-1920, Martigny, Switzerland
- Dónal Landers
  - DeLondra Oncology Ltd, 38 Carlton Avenue, Wilmslow, SK9 4EP, UK
- André Freitas
  - Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
  - Department of Computer Science, University of Manchester, Oxford Rd, Manchester, M13 9PL, UK
  - Idiap Research Institute, National University of Sciences, Rue Marconi 19, CH-1920, Martigny, Switzerland
14
Janizek JD, Spiro A, Celik S, Blue BW, Russell JC, Lee TI, Kaeberlein M, Lee SI. PAUSE: principled feature attribution for unsupervised gene expression analysis. Genome Biol 2023; 24:81. [PMID: 37076856 PMCID: PMC10114348 DOI: 10.1186/s13059-023-02901-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Accepted: 03/17/2023] [Indexed: 04/21/2023]
Abstract
As interest in using unsupervised deep learning models to analyze gene expression data has grown, an increasing number of methods have been developed to make these models more interpretable. These methods can be separated into two groups: post hoc analyses of black box models through feature attribution methods and approaches to build inherently interpretable models through biologically-constrained architectures. We argue that these approaches are not mutually exclusive, but can in fact be usefully combined. We propose PAUSE (https://github.com/suinleelab/PAUSE), an unsupervised pathway attribution method that identifies major sources of transcriptomic variation when combined with biologically-constrained neural network models.
Affiliation(s)
- Joseph D Janizek
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
  - Medical Scientist Training Program, University of Washington, Seattle, USA
- Anna Spiro
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
- Ben W Blue
  - Department of Pathology, University of Washington, Seattle, USA
- John C Russell
  - Department of Pathology, University of Washington, Seattle, USA
- Ting-I Lee
  - Department of Pathology, University of Washington, Seattle, USA
- Matt Kaeberlein
  - Department of Pathology, University of Washington, Seattle, USA
  - Department of Genome Sciences, University of Washington, Seattle, USA
- Su-In Lee
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
15
Huang Y, Rong Z, Zhang L, Xu Z, Ji J, He J, Liu W, Hou Y, Li K. HiRAND: A novel GCN semi-supervised deep learning-based framework for classification and feature selection in drug research and development. Front Oncol 2023; 13:1047556. [PMID: 36776339 PMCID: PMC9909422 DOI: 10.3389/fonc.2023.1047556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Accepted: 01/03/2023] [Indexed: 01/28/2023]
Abstract
Predicting drug response from transcriptome data before therapy begins is a major challenge, and obtaining reliable drug response labels costs time and resources. Available methods often predict poorly and fail to identify robust biomarkers because of the curse of dimensionality: high dimensionality and low sample size. This necessitates predictive models that use limited labeled data effectively while remaining interpretable. In this study, we report a novel Hierarchical Graph Random Neural Networks (HiRAND) framework that predicts drug response from transcriptome data using a few labeled samples and additional unlabeled samples. HiRAND integrates the information in the gene graph and the sample graph with a graph convolutional network (GCN). The innovation of our model is to use a data augmentation strategy to address the shortage of labeled data, and consistency regularization to make predictions on unlabeled data consistent across different data augmentations. HiRAND achieved better performance than competing methods in various prediction scenarios, including both simulated data and multiple drug response datasets. Across all 62 drugs, HiRAND's predictive performance was best for vorinostat. In addition, interpreting HiRAND identified the key genes most important to vorinostat response, highlighting critical roles for ribosomal protein-related genes in the response to histone deacetylase inhibition. HiRAND could serve as an efficient framework for improving drug response prediction with few labeled data.
Affiliation(s)
- Yue Huang
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Zhiwei Rong
  - Department of Biostatistics, School of Public Health, Peking University, Beijing, China
- Liuchao Zhang
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Zhenyi Xu
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Jianxin Ji
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Jia He
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Weisha Liu
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
- Yan Hou
  - Department of Biostatistics, School of Public Health, Peking University, Beijing, China
- Kang Li
  - Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin, China
*Correspondence: Kang Li; Yan Hou
16
Prelaj A, Galli EG, Miskovic V, Pesenti M, Viscardi G, Pedica B, Mazzeo L, Bottiglieri A, Provenzano L, Spagnoletti A, Marinacci R, De Toma A, Proto C, Ferrara R, Brambilla M, Occhipinti M, Manglaviti S, Galli G, Signorelli D, Giani C, Beninato T, Pircher CC, Rametta A, Kosta S, Zanitti M, Di Mauro MR, Rinaldi A, Di Gregorio S, Antonia M, Garassino MC, de Braud FGM, Restelli M, Lo Russo G, Ganzinelli M, Trovò F, Pedrocchi ALG. Real-world data to build explainable trustworthy artificial intelligence models for prediction of immunotherapy efficacy in NSCLC patients. Front Oncol 2023; 12:1078822. [PMID: 36755856 PMCID: PMC9899835 DOI: 10.3389/fonc.2022.1078822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Accepted: 12/14/2022] [Indexed: 01/24/2023]
Abstract
Introduction Artificial Intelligence (AI) methods are being increasingly investigated as a means to generate predictive models applicable in clinical practice. In this study, we developed a model to predict the efficacy of immunotherapy (IO) in patients with advanced non-small cell lung cancer (NSCLC) using eXplainable AI (XAI) Machine Learning (ML) methods. Methods We prospectively collected real-world data from patients with advanced NSCLC receiving immune-checkpoint inhibitors (ICIs) either as a single agent or in combination with chemotherapy. For six different outcomes, namely Disease Control Rate (DCR), Objective Response Rate (ORR), 6- and 24-month Overall Survival (OS6 and OS24), 3-month Progression-Free Survival (PFS3) and Time to Treatment Failure (TTF3), we evaluated five different classification ML models: CatBoost (CB), Logistic Regression (LR), Neural Network (NN), Random Forest (RF) and Support Vector Machine (SVM). We used Shapley Additive Explanation (SHAP) values to explain model predictions. Results Of the 480 patients included in the study, 407 received immunotherapy and 73 received chemo-immunotherapy. Among all the ML models, CB performed best for OS6 and TTF3 (accuracy 0.83 and 0.81, respectively). CB and LR reached accuracies of 0.75 and 0.73 for the outcome DCR. SHAP for CB demonstrated that the feature that strongly influenced the model's predictions for all three outcomes was the Neutrophil to Lymphocyte Ratio (NLR). Performance Status (ECOG-PS) was an important feature for the outcomes OS6 and TTF3, while PD-L1, line of IO and chemo-immunotherapy appeared to be more important in predicting DCR. Conclusions In this study we developed an ML algorithm based on real-world data, explained by SHAP techniques, that is able to accurately predict the efficacy of immunotherapy in sets of NSCLC patients.
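To make the idea behind Shapley-value explanations concrete, the sketch below computes exact Shapley values for a two-feature toy model by averaging marginal contributions over all feature orderings. It is not the SHAP library's algorithm and not the study's model: the scoring function, its coefficients, the input values and the all-zero baseline are hypothetical, and only the feature names NLR and ECOG-PS are borrowed from the abstract.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a single baseline.

    Features are switched from their baseline value to their value in x one
    at a time, in every possible order; each feature's marginal change in the
    prediction is averaged over all orders. Cost grows factorially with the
    number of features, so this only suits tiny examples.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [p / len(orders) for p in phi]


def toy_risk_score(features):
    """Hypothetical risk score with an interaction term (made-up coefficients)."""
    nlr, ecog_ps = features
    return 2.0 * nlr + 1.0 * ecog_ps + 1.0 * nlr * ecog_ps


patient = [1.0, 1.0]      # hypothetical standardised NLR and ECOG-PS
reference = [0.0, 0.0]    # hypothetical baseline patient
phi = shapley_values(toy_risk_score, patient, reference)
print(phi)  # [2.5, 1.5]
```

The two attributions sum to the difference between the patient's score (4.0) and the baseline score (0.0), and the interaction term is split evenly between the two features.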
Affiliation(s)
- Arsela Prelaj
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Edoardo Gregorio Galli
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Niguarda Cancer Center, Grande Ospedale Metropolitano Niguarda, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Vanja Miskovic
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Mattia Pesenti
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Giuseppe Viscardi
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Medical Oncology Unit, Department of Precision Medicine, University of Campania “Luigi Vanvitelli”, Naples, Italy
- Benedetta Pedica
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Laura Mazzeo
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Achille Bottiglieri
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Leonardo Provenzano
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Andrea Spagnoletti
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Roberto Marinacci
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Alessandro De Toma
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Claudia Proto
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Roberto Ferrara
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Marta Brambilla
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Mario Occhipinti
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Sara Manglaviti
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Giulia Galli
  - Medical Oncology Unit, Policlinico San Matteo Fondazione IRCCS, Pavia, Italy
- Diego Signorelli
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Niguarda Cancer Center, Grande Ospedale Metropolitano Niguarda, Milan, Italy
- Claudia Giani
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Teresa Beninato
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Chiara Carlotta Pircher
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Alessandro Rametta
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Sokol Kosta
  - Department of Electronic System, Aalborg University, Copenhagen, Aalborg, Denmark
- Michele Zanitti
  - Department of Electronic System, Aalborg University, Copenhagen, Aalborg, Denmark
- Maria Rosa Di Mauro
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Arturo Rinaldi
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Settimio Di Gregorio
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Martinetti Antonia
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Marina Chiara Garassino
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Thoracic Oncology Program, Section of Hematology/Oncology, University of Chicago, Chicago, IL, United States
- Filippo G. M. de Braud
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
  - Oncology Department, University of Milan, Milan, Italy
- Marcello Restelli
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
- Giuseppe Lo Russo
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Monica Ganzinelli
  - Medical Oncology Department, Fondazione IRCCS Istituto Nazionale Tumori, Milan, Italy
- Francesco Trovò
  - Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy
*Correspondence: Arsela Prelaj
17
Assessing Metabolic Markers in Glioblastoma Using Machine Learning: A Systematic Review. Metabolites 2023; 13:metabo13020161. [PMID: 36837779 PMCID: PMC9958885 DOI: 10.3390/metabo13020161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/14/2023] [Accepted: 01/18/2023] [Indexed: 01/24/2023]
Abstract
Glioblastoma (GBM) is a common and deadly brain tumor with late diagnoses and poor prognoses. Machine learning (ML) is an emerging tool that can create highly accurate diagnostic and prognostic prediction models. This paper aimed to systematically search the literature on ML for GBM metabolism and assess recent advancements. A literature search was performed using predetermined search terms. Articles describing the use of an ML algorithm for GBM metabolism were included. Ten studies met the inclusion criteria for analysis: diagnostic (n = 3, 30%), prognostic (n = 6, 60%), or both (n = 1, 10%). Most studies analyzed data from multiple databases, while 50% (n = 5) included additional original samples. At least 2536 data samples were run through an ML algorithm. Twenty-seven ML algorithms were recorded, with a mean of 2.8 algorithms per study. Algorithms were supervised (n = 24, 89%) or unsupervised (n = 3, 11%), and continuous (n = 19, 70%) or categorical (n = 8, 30%). The mean reported accuracy and AUC of ROC were 95.63% and 0.779, respectively. One hundred six metabolic markers were identified, but only EMP3 was reported in multiple studies. Many studies have identified potential biomarkers for GBM diagnosis and prognostication. These algorithms show promise; however, a consensus on even a handful of biomarkers has not yet been reached.
18
Huang J, Zhao C, Zhang X, Zhao Q, Zhang Y, Chen L, Dai G. Hepatitis B virus pathogenesis relevant immunosignals uncovering amino acids utilization related risk factors guide artificial intelligence-based precision medicine. Front Pharmacol 2022; 13:1079566. [PMID: 36569318 PMCID: PMC9780394 DOI: 10.3389/fphar.2022.1079566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Accepted: 11/30/2022] [Indexed: 12/14/2022]
Abstract
Background: Although immune microenvironment-related chemokines, extracellular matrix (ECM), and intrahepatic immune cells are reported to be highly involved in hepatitis B virus (HBV)-related diseases, their roles in diagnosis, prognosis, and drug sensitivity evaluation remain unclear. Here, we aimed to study their clinical use to provide a basis for precision medicine in hepatocellular carcinoma (HCC) through the integration of artificial intelligence. Methods: High-throughput liver transcriptomes from Gene Expression Omnibus (GEO), NODE (https://www.bio.sino.org/node), The Cancer Genome Atlas (TCGA), and our in-house hepatocellular carcinoma patients were collected in this study. Core immunosignals that participated in the entire disease course of hepatitis B were explored using the "Gene set variation analysis" R package. Using ROC curve analysis, the impact of core immunosignals and amino acid utilization-related genes on the clinical outcome of hepatocellular carcinoma patients was calculated. The utility of core immunosignals as a classifier for hepatocellular carcinoma tumor tissue was evaluated using explainable machine-learning methods. A novel deep residual neural network model based on immunosignals was constructed for the long-term overall survival (LS) analysis. In vivo drug sensitivity was calculated by the "oncoPredict" R package. Results: We identified nine genes comprising chemokines and ECM related to hepatitis B virus-induced inflammation and fibrosis as CLST signals. Moreover, CLST was co-enriched with activated CD4+ T cells bearing harmful factors (aCD4) during all stages of hepatitis B virus pathogenesis, which was also verified by our hepatocellular carcinoma data. Unexpectedly, we found that hepatitis B virus-hepatocellular carcinoma patients in the CLST-high/aCD4-high subgroup had the shortest overall survival (OS) and were characterized by a risk gene signature associated with amino acid utilization. Importantly, characteristic genes specific to CLST/aCD4 showed promising clinical relevance in identifying patients with early-stage hepatocellular carcinoma via explainable machine learning. In addition, the 5-year long-term overall survival of hepatocellular carcinoma patients can be effectively classified by the CLST/aCD4-based GeneSet-ResNet model. Subgroups defined by CLST and aCD4 were significantly involved in the sensitivity of hepatitis B virus-hepatocellular carcinoma patients to chemotherapy treatments. Conclusion: CLST and aCD4 are hepatitis B virus pathogenesis-relevant immunosignals that are highly involved in hepatitis B virus-induced inflammation, fibrosis, and hepatocellular carcinoma. Immunogenomic signatures derived from gene set variation analysis enabled efficient diagnostic and prognostic model construction. The clinical application of CLST and aCD4 as indicators would be beneficial for the precision management of hepatocellular carcinoma.
Affiliation(s)
- Jun Huang
  - School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
- Chunbei Zhao
  - School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
- Xinhe Zhang
  - School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
- Qiaohui Zhao
  - School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
- Yanting Zhang
  - School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
- Liping Chen
  - Key Laboratory of Gastroenterology and Hepatology, State Key Laboratory for Oncogenes and Related Genes, Department of Gastroenterology and Hepatology, Ministry of Health, Shanghai Institute of Digestive Disease, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
  - Shanghai Public Health Clinical Center, Fudan University, Shanghai, China
- Guifu Dai
  - School of Life Sciences, Zhengzhou University, Zhengzhou, Henan, China
*Correspondence: Jun Huang; Liping Chen; Guifu Dai
19
Ding W, Abdel-Basset M, Hawash H, Ali AM. Explainability of artificial intelligence methods, applications and challenges: A comprehensive survey. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.10.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
20
Liang B, Gong H, Lu L, Xu J. Risk stratification and pathway analysis based on graph neural network and interpretable algorithm. BMC Bioinformatics 2022; 23:394. [PMID: 36167504 PMCID: PMC9516820 DOI: 10.1186/s12859-022-04950-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/23/2022] [Accepted: 09/19/2022] [Indexed: 12/01/2022]
Abstract
Background Pathway-based analysis of transcriptomic data has shown greater stability and better performance than traditional gene-based analysis. Until now, some pathway-based deep learning models have been developed for bioinformatic analysis, but these models have not fully considered the topological features of pathways, which limits the performance of the final prediction result. Results To address this issue, we propose a novel model, called PathGNN, which constructs a Graph Neural Networks (GNNs) model that can capture topological features of pathways. As a case, PathGNN was applied to predict long-term survival of four types of cancer and achieved promising predictive performance when compared to other common methods. Furthermore, the adoption of an interpretation algorithm enabled the identification of plausible pathways associated with survival. Conclusion PathGNN demonstrates that GNN can be effectively applied to build a pathway-based model, resulting in promising predictive power. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04950-1.
Affiliation(s)
- Bilin Liang
  - Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
- Haifan Gong
  - Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
- Lu Lu
  - Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
- Jie Xu
  - Shanghai Artificial Intelligence Laboratory, Yunjing Road 701, Shanghai, China
21
A Deep Neural Network for Gastric Cancer Prognosis Prediction Based on Biological Information Pathways. J Oncol 2022; 2022:2965166. [PMID: 36117847 PMCID: PMC9481367 DOI: 10.1155/2022/2965166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 07/09/2022] [Accepted: 07/22/2022] [Indexed: 11/18/2022]
Abstract
Background Gastric cancer (GC) is one of the deadliest cancers in the world, with a 5-year overall survival rate of lower than 20% for patients with advanced GC. Genomic information is now frequently employed for precision cancer treatment due to the rapid advancements of high-throughput sequencing technologies. As a result, integrating multiomics data to construct predictive models for the GC patient prognosis is critical for tailored medical care. Results In this study, we integrated multiomics data to design a biological pathway-based gastric cancer sparse deep neural network (GCS-Net) by modifying the P-NET model for long-term survival prediction of GC. The GCS-Net showed higher accuracy (accuracy = 0.844), area under the curve (AUC = 0.807), and F1 score (F1 = 0.913) than traditional machine learning models. Furthermore, the GCS-Net not only enables accurate patient survival prognosis but also provides model interpretability capabilities lacking in most traditional deep neural networks to describe the complex biological process of prognosis. The GCS-Net suggested the importance of genes (UBE2C, JAK2, RAD21, CEP250, NUP210, PTPN1, CDC27, NINL, NUP188, and PLK4) and biological pathways (Mitotic Anaphase, Resolution of Sister Chromatid Cohesion, and SUMO E3 ligases) to GC, which is consistent with the results revealed in biological- and medical-related studies of GC. Conclusion The GCS-Net is an interpretable deep neural network built using biological pathway information whose structure represents a nonlinear hierarchical representation of genes and biological pathways. It can not only accurately predict the prognosis of GC patients but also suggest the importance of genes and biological pathways. The GCS-Net opens up new avenues for biological research and could be adapted for other cancer prediction and discovery activities as well.
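The phrase "sparse deep neural network built using biological pathway information" can be illustrated with a minimal sketch of a pathway-masked layer, in which a pathway node receives input only from genes annotated to it. This is not the GCS-Net or P-NET code: the gene and pathway names, the membership sets, the weights and the choice of tanh activation are all hypothetical.

```python
import math


def build_mask(genes, pathways):
    """Return mask[p][g] = 1.0 if gene g is annotated to pathway p, else 0.0.

    `pathways` maps a pathway name to the set of its member genes; rows follow
    the dictionary's insertion order.
    """
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in pathways.values()]


def sparse_pathway_layer(x, weights, mask, bias):
    """Forward pass of a masked dense layer.

    Each weight is multiplied by its mask entry, so connections between a
    gene and a pathway it does not belong to contribute nothing.
    """
    outputs = []
    for w_row, m_row, b in zip(weights, mask, bias):
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        outputs.append(math.tanh(z))  # activation chosen arbitrarily here
    return outputs


# Hypothetical toy annotation: four genes, two non-overlapping pathways.
genes = ["g1", "g2", "g3", "g4"]
pathways = {"pathway_A": {"g1", "g2"}, "pathway_B": {"g3", "g4"}}
mask = build_mask(genes, pathways)

weights = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]]  # made-up weights
bias = [0.0, 0.0]
expression = [0.5, -0.5, 1.0, 1.0]                       # made-up input
activations = sparse_pathway_layer(expression, weights, mask, bias)
```

Because each hidden node corresponds to a named pathway and only sees its member genes, an importance score assigned to that node can be read as a statement about the pathway, which is the kind of interpretability the abstract describes.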
22
Park C, Kim B, Park T. DeepHisCoM: deep learning pathway analysis using hierarchical structural component models. Brief Bioinform 2022; 23:6590446. [DOI: 10.1093/bib/bbac171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 04/04/2022] [Accepted: 04/18/2022] [Indexed: 11/13/2022]
Abstract
Many statistical methods for pathway analysis have been used to identify pathways associated with the disease along with biological factors such as genes and proteins. However, most pathway analysis methods neglect the complex nonlinear relationship between biological factors and pathways. In this study, we propose a Deep-learning pathway analysis using Hierarchical structured CoMponent models (DeepHisCoM) that utilize deep learning to consider a nonlinear complex contribution of biological factors to pathways by constructing a multilayered model which accounts for hierarchical biological structure. Through simulation studies, DeepHisCoM was shown to have a higher power in the nonlinear pathway effect and comparable power for the linear pathway effect when compared to the conventional pathway methods. Application to hepatocellular carcinoma (HCC) omics datasets, including metabolomic, transcriptomic and metagenomic datasets, demonstrated that DeepHisCoM successfully identified three well-known pathways that are highly associated with HCC, such as lysine degradation, valine, leucine and isoleucine biosynthesis and phenylalanine, tyrosine and tryptophan. Application to the coronavirus disease-2019 (COVID-19) single-nucleotide polymorphism (SNP) dataset also showed that DeepHisCoM identified four pathways that are highly associated with the severity of COVID-19, such as mitogen-activated protein kinase (MAPK) signaling pathway, gonadotropin-releasing hormone (GnRH) signaling pathway, hypertrophic cardiomyopathy and dilated cardiomyopathy. Codes are available at https://github.com/chanwoo-park-official/DeepHisCoM.
Affiliation(s)
- Chanwoo Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
- Boram Kim
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul 08826, Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
23
Li W, Sun T, Li M, He Y, Li L, Wang L, Wang H, Li J, Wen H, Liu Y, Chen Y, Fan Y, Xin B, Zhang J. GNIFdb: a neoantigen intrinsic feature database for glioma. Database (Oxford) 2022; 2022:6527499. [PMID: 35150127 PMCID: PMC9216533 DOI: 10.1093/database/baac004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/08/2021] [Revised: 01/06/2022] [Accepted: 01/29/2022] [Indexed: 12/24/2022]
Abstract
Neoantigens are mutation-containing immunogenic peptides from tumor cells. Neoantigen intrinsic features are sequence-associated features of neoantigens, characterized by different amino acid descriptors and physical-chemical properties, which play a crucial role in prioritizing neoantigens with immunogenic potential and in predicting patients with better survival. Different intrinsic features may contribute to varying degrees when evaluating the immunogenic potential of neoantigens. Identification and comparison of intrinsic features among neoantigens are particularly important for developing neoantigen-based personalized immunotherapy. However, there is still no public repository to host the intrinsic features of neoantigens. Therefore, we developed GNIFdb, a glioma neoantigen intrinsic feature database specifically designed for hosting, exploring and visualizing neoantigens and their intrinsic features. The database provides a comprehensive repository of computationally predicted human leukocyte antigen class I (HLA-I) restricted neoantigens and their intrinsic features; a systematic annotation of neoantigens including sequence, neoantigen-associated mutation, gene expression, glioma prognosis, HLA-I subtype and binding affinity between neoantigens and HLA-I; and a genome browser to visualize them in an interactive manner. It represents a valuable resource for the neoantigen research community and is publicly available at http://www.oncoimmunobank.cn/index.php.
Affiliation(s)
- Wendong Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Ting Sun
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Muyang Li
- Department of Plant Genetics and Breeding, State Key Laboratory of Plant Physiology and Biochemistry & National Maize Improvement Center, China Agricultural University, No.17 Qinghua East Road, Haidian District, Beijing 100193, P. R. China
| | - Yufei He
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Lin Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Lu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Haoyu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Jing Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Hao Wen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Yong Liu
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Yifan Chen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Yubo Fan
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
| | - Beibei Xin
- Department of Plant Genetics and Breeding, State Key Laboratory of Plant Physiology and Biochemistry & National Maize Improvement Center, China Agricultural University, No.17 Qinghua East Road, Haidian District, Beijing 100193, P. R. China
| | - Jing Zhang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing 100083, P. R. China
|
24
|
Kang M, Ko E, Mersha TB. A roadmap for multi-omics data integration using deep learning. Brief Bioinform 2022; 23:bbab454. [PMID: 34791014 PMCID: PMC8769688 DOI: 10.1093/bib/bbab454] [Citation(s) in RCA: 81] [Impact Index Per Article: 40.5] [Received: 07/27/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/18/2022] Open
Abstract
High-throughput next-generation sequencing now makes it possible to generate a vast amount of multi-omics data for various applications. These data have revolutionized biomedical research by providing a more comprehensive understanding of the biological systems and molecular mechanisms of disease development. Recently, deep learning (DL) algorithms have become one of the most promising methods in multi-omics data analysis, due to their predictive performance and capability of capturing nonlinear and hierarchical features. While integrating and translating multi-omics data into useful functional insights remain the biggest bottleneck, there is a clear trend towards incorporating multi-omics analysis in biomedical research to help explain the complex relationships between molecular layers. Multi-omics data have a role to improve prevention, early detection and prediction; monitor progression; interpret patterns and endotyping; and design personalized treatments. In this review, we outline a roadmap of multi-omics integration using DL and offer a practical perspective into the advantages, challenges and barriers to the implementation of DL in multi-omics data.
Affiliation(s)
- Mingon Kang
- Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
| | - Euiseong Ko
- Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
| | - Tesfaye B Mersha
- Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
|
25
|
Gundogdu P, Loucera C, Alamo-Alvarez I, Dopazo J, Nepomuceno I. Integrating pathway knowledge with deep neural networks to reduce the dimensionality in single-cell RNA-seq data. BioData Min 2022; 15:1. [PMID: 34980200 PMCID: PMC8722116 DOI: 10.1186/s13040-021-00285-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/30/2021] [Accepted: 12/04/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Single-cell RNA sequencing (scRNA-seq) data provide valuable insights into cellular heterogeneity, which is significantly improving the current knowledge on biology and human disease. One of the main applications of scRNA-seq data analysis is the identification of new cell types and cell states. Deep neural networks (DNNs) are among the best methods to address this problem. However, this performance comes at the cost of a lack of interpretability in the results. In this work we propose an intelligible pathway-driven neural network to correctly solve cell-type related problems at single-cell resolution while providing a biologically meaningful representation of the data. Results: In this study, we explored deep neural networks constrained by several types of prior biological information, e.g. signaling pathway information, as a way to reduce the dimensionality of the scRNA-seq data. We have tested the proposed biologically-based architectures on thousands of cells of human and mouse origin across a collection of public datasets in order to check the performance of the model. Specifically, we tested the architecture across different validation scenarios that try to mimic how unknown cell types are clustered by the DNN and how it correctly annotates cell types by querying a database in a retrieval problem. Moreover, our approach proved to be comparable to other less interpretable DNN approaches constrained by using protein-protein interactions gene regulation data. Finally, we show how the latent structure learned by the network could be used to visualize and to interpret the composition of human single cell datasets. Conclusions: Here we demonstrate how the integration of pathways, which convey fundamental information on functional relationships between genes, with DNNs, which provide an excellent classification framework, results in an excellent alternative to learn a biologically meaningful representation of scRNA-seq data. In addition, the introduction of prior biological knowledge in the DNN reduces the size of the network architecture. Comparative results demonstrate a superior performance of this approach with respect to other similar approaches. As an additional advantage, the use of pathways within the DNN structure enables easy interpretability of the results by connecting features to cell functionalities by means of the pathway nodes, as demonstrated with an example with human melanoma tumor cells. Supplementary Information: The online version contains supplementary material available at 10.1186/s13040-021-00285-4.
Affiliation(s)
- Pelin Gundogdu
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
| | - Carlos Loucera
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
| | - Inmaculada Alamo-Alvarez
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
| | - Joaquin Dopazo
- Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, 41013, Sevilla, Spain
- FPS/ELIXIR-es, Hospital Virgen del Rocío, 42013, Sevilla, Spain
| | - Isabel Nepomuceno
- Department of Computer Languages and Systems, Universidad de Sevilla, Sevilla, Spain.
|
26
|
Yang G, Ye Q, Xia J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Inf Fusion 2022; 77:29-52. [PMID: 34980946 PMCID: PMC8459787 DOI: 10.1016/j.inffus.2021.07.016] [Citation(s) in RCA: 135] [Impact Index Per Article: 67.5] [Received: 01/27/2021] [Revised: 05/25/2021] [Accepted: 07/25/2021] [Indexed: 05/04/2023]
Abstract
Explainable Artificial Intelligence (XAI) is an emerging research topic of machine learning aimed at unboxing how AI systems' black-box choices are made. This research field inspects the measures and models involved in decision-making and seeks solutions to explain them explicitly. Many of the machine learning algorithms cannot manifest how and why a decision has been cast. This is particularly true of the most popular deep neural network approaches currently in use. Consequently, our confidence in AI systems can be hindered by the lack of explainability in these black-box models. The XAI becomes more and more crucial for deep learning powered applications, especially for medical and healthcare studies, although in general these deep neural networks can return an arresting dividend in performance. The insufficient explainability and transparency in most existing AI systems can be one of the major reasons that successful implementation and integration of AI tools into routine clinical practice are uncommon. In this study, we first surveyed the current progress of XAI and in particular its advances in healthcare applications. We then introduced our solutions for XAI leveraging multi-modal and multi-centre data fusion, and subsequently validated in two showcases following real clinical scenarios. Comprehensive quantitative and qualitative analyses can prove the efficacy of our proposed XAI solutions, from which we can envisage successful applications in a broader range of clinical questions.
Affiliation(s)
- Guang Yang
- National Heart and Lung Institute, Imperial College London, London, UK
- Royal Brompton Hospital, London, UK
- Imperial Institute of Advanced Technology, Hangzhou, China
| | - Qinghao Ye
- Hangzhou Ocean’s Smart Boya Co., Ltd, China
- University of California, San Diego, La Jolla, CA, USA
| | - Jun Xia
- Radiology Department, Shenzhen Second People’s Hospital, Shenzhen, China
|
27
|
Scherer P, Trębacz M, Simidjievski N, Viñas R, Shams Z, Terre HA, Jamnik M, Liò P. Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases. Bioinformatics 2021; 38:1320-1327. [PMID: 34888618 PMCID: PMC8826027 DOI: 10.1093/bioinformatics/btab830] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2021] [Revised: 09/29/2021] [Accepted: 12/03/2021] [Indexed: 01/05/2023] Open
Abstract
Motivation: Gene expression data are commonly used at the intersection of cancer research and machine learning for better understanding of the molecular status of tumour tissue. Deep learning predictive models have been employed for gene expression data due to their ability to scale and remove the need for manual feature engineering. However, gene expression data are often very high dimensional, noisy and presented with a low number of samples. This poses significant problems for learning algorithms: models often overfit, learn noise and struggle to capture biologically relevant information. In this article, we utilize external biological knowledge embedded within structures of gene interaction graphs such as protein-protein interaction (PPI) networks to guide the construction of predictive models. Results: We present Gene Interaction Network Constrained Construction (GINCCo), an unsupervised method for automated construction of computational graph models for gene expression data that are structurally constrained by prior knowledge of gene interaction networks. We employ this methodology in a case study on incorporating a PPI network in cancer phenotype prediction tasks. Our computational graphs are structurally constructed using topological clustering algorithms on the PPI networks which incorporate inductive biases stemming from network biology research on protein complex discovery. Each of the entities in the GINCCo computational graph represents biological entities such as genes, candidate protein complexes and phenotypes instead of arbitrary hidden nodes of a neural network. This provides a biologically relevant mechanism for model regularization yielding strong predictive performance while drastically reducing the number of model parameters and enabling guided post-hoc enrichment analyses of influential gene sets with respect to target phenotypes. Our experiments analysing a variety of cancer phenotypes show that GINCCo often outperforms support vector machine, Fully Connected Multi-layer Perceptrons (MLP) and Randomly Connected MLPs despite greatly reduced model complexity. Availability and implementation: https://github.com/paulmorio/gincco contains the source code for our approach. We also release a library with algorithms for protein complex discovery within PPI networks at https://github.com/paulmorio/protclus. This repository contains implementations of the clustering algorithms used in this article. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Paul Scherer
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK. To whom correspondence should be addressed.
| | - Maja Trębacz
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Nikola Simidjievski
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Ramon Viñas
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Zohreh Shams
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Helena Andres Terre
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Mateja Jamnik
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
| | - Pietro Liò
- Department of Computer Science and Technology, University of Cambridge, Cambridge, CB3 0FD, UK
|
28
|
Deep Learning in Cancer Diagnosis and Prognosis Prediction: A Minireview on Challenges, Recent Trends, and Future Directions. Comput Math Methods Med 2021; 2021:9025470. [PMID: 34754327 PMCID: PMC8572604 DOI: 10.1155/2021/9025470] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 08/20/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/30/2022]
Abstract
Deep learning (DL) is a branch of machine learning and artificial intelligence that has been applied to many areas in different domains such as health care and drug design. Cancer prognosis estimates the ultimate fate of a cancer subject and provides survival estimation of the subjects. An accurate and timely diagnostic and prognostic decision will greatly benefit cancer subjects. DL has emerged as a technology of choice due to the availability of high computational resources. The main components in a standard computer-aided design (CAD) system are preprocessing, feature recognition, extraction and selection, categorization, and performance assessment. Reduction of costs associated with sequencing systems offers a myriad of opportunities for building precise models for cancer diagnosis and prognosis prediction. In this survey, we provided a summary of current works where DL has helped to determine the best models for the cancer diagnosis and prognosis prediction tasks. DL is a generic model requiring minimal data manipulations and achieves better results while working with enormous volumes of data. Aims are to scrutinize the influence of DL systems using histopathology images, present a summary of state-of-the-art DL methods, and give directions to future researchers to refine the existing methods.
|
29
|
Classification and Functional Analysis between Cancer and Normal Tissues Using Explainable Pathway Deep Learning through RNA-Sequencing Gene Expression. Int J Mol Sci 2021; 22:ijms222111531. [PMID: 34768960 PMCID: PMC8584109 DOI: 10.3390/ijms222111531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2021] [Revised: 10/21/2021] [Accepted: 10/21/2021] [Indexed: 11/24/2022] Open
Abstract
Deep learning has proven advantageous in solving cancer diagnostic or classification problems. However, it cannot explain the rationale behind human decisions. Biological pathway databases provide well-studied relationships between genes and their pathways. As pathways comprise knowledge frameworks widely used by human researchers, representing gene-to-pathway relationships in deep learning structures may aid in their comprehension. Here, we propose a deep neural network (PathDeep), which implements gene-to-pathway relationships in its structure. We also provide an application framework measuring the contribution of pathways and genes in deep neural networks in a classification problem. We applied PathDeep to classify cancer and normal tissues based on the publicly available, large gene expression dataset. PathDeep showed higher accuracy than fully connected neural networks in distinguishing cancer from normal tissues (accuracy = 0.994) in 32 tissue samples. We identified 42 pathways related to 32 cancer tissues and 57 associated genes contributing highly to the biological functions of cancer. The most significant pathway was G-protein-coupled receptor signaling, and the most enriched function was the G1/S transition of the mitotic cell cycle, suggesting that these biological functions were the most common cancer characteristics in the 32 tissues.
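The idea of encoding gene-to-pathway relationships in a network's structure can be illustrated with a masked layer, in which a weight is kept only where a gene belongs to a pathway. The following is a minimal sketch under stated assumptions, not the PathDeep implementation: the gene names, pathway memberships, weights and the `build_mask` and `masked_layer` helpers are all hypothetical, and the tanh activation is an arbitrary choice.

```python
import math

# Hypothetical toy data: 4 genes and 2 pathways (names are illustrative only).
GENES = ["g1", "g2", "g3", "g4"]
PATHWAYS = {"pathway_A": ["g1", "g2"], "pathway_B": ["g2", "g3", "g4"]}


def build_mask(genes, pathways):
    """Return a pathways x genes 0/1 matrix; 1 where the gene is a pathway member."""
    return [[1.0 if g in members else 0.0 for g in genes]
            for members in pathways.values()]


def masked_layer(x, weights, mask, bias):
    """Compute tanh((W * M) x + b); weights outside the mask contribute nothing."""
    out = []
    for w_row, m_row, b in zip(weights, mask, bias):
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        out.append(math.tanh(z))
    return out


mask = build_mask(GENES, PATHWAYS)
# Arbitrary illustrative weights; in practice these would be learned.
weights = [[0.5, -0.2, 0.9, 0.9], [0.3, 0.1, -0.4, 0.2]]
bias = [0.0, 0.0]
x = [1.0, 2.0, 3.0, 4.0]  # expression values for g1..g4

pathway_activations = masked_layer(x, weights, mask, bias)
print(dict(zip(PATHWAYS, pathway_activations)))
```

Because each hidden unit only sees its member genes, changing the expression of g3 or g4 leaves the pathway_A activation unchanged, which is what makes per-pathway contributions readable from such a layer.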
|
30
|
Tang B, Chen Y, Wang Y, Nie J. A Wavelet-Based Learning Model Enhances Molecular Prognosis in Pancreatic Adenocarcinoma. Biomed Res Int 2021; 2021:7865856. [PMID: 34697591 PMCID: PMC8541860 DOI: 10.1155/2021/7865856] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/16/2021] [Accepted: 09/21/2021] [Indexed: 12/24/2022]
Abstract
Genome-wide omics technology boosts deep interrogation into the clinical prognosis and inherent mechanism of pancreatic oncology. Classic LASSO methods coequally treat all candidates, ignoring individual characteristics, thus frequently deteriorating performance with comparatively more predictors. Here, we propose a wavelet-based deep learning method in variable selection and prognosis formulation for PAAD with small samples and multisource information. With the genomic, epigenomic, and clinical cohort information from The Cancer Genome Atlas, the constructed five-molecule model is validated via Kaplan-Meier survival estimate, rendering significant prognosis capability on high- and low-risk subcohorts (p value < 0.0001), together with three predictors manifesting the individual prognosis significance (p value: 0.0012~0.024). Moreover, the performance of the prognosis model has been benchmarked against the traditional LASSO and wavelet-based methods in the 3- and 5-year prediction AUC items, respectively. Specifically, the proposed model with discrete stationary wavelet base (bior1.5) overwhelmingly outperformed traditional LASSO and wavelet-based methods (AUC: 0.787 vs. 0.782 and 0.721 for the 3-year case; AUC: 0.937 vs. 0.802 and 0.859 for the 5-year case). Thus, the proposed model provides a more accurate perspective, but with less predictor burden for clinical prognosis in the pancreatic carcinoma study.
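The Kaplan-Meier survival estimate mentioned above multiplies, at each observed event time, the fraction of at-risk subjects who survive that time. The following is a minimal sketch of the estimator only; the `kaplan_meier` helper and the toy cohort are hypothetical and are not data or code from the study.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per subject
    events: 1 if the event (e.g. death) was observed, 0 if censored
    """
    event_times = sorted({t for t, e in zip(durations, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Observed events exactly at time t (censored subjects do not count).
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e == 1)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort (months of follow-up), for illustration only.
durations = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Computing one such curve per risk group (for example, high-risk and low-risk subcohorts) and comparing them is the usual way a prognostic model's stratification is inspected.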
Affiliation(s)
- Binhua Tang
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
| | - Yu Chen
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
| | - Yuqi Wang
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
| | - Jiafei Nie
- Epigenetics & Function Group, Hohai University, Jiangsu 213022, China
|
31
|
Tran KA, Kondrashova O, Bradley A, Williams ED, Pearson JV, Waddell N. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med 2021; 13:152. [PMID: 34579788 PMCID: PMC8477474 DOI: 10.1186/s13073-021-00968-x] [Citation(s) in RCA: 221] [Impact Index Per Article: 73.7] [Received: 12/13/2020] [Accepted: 09/12/2021] [Indexed: 12/13/2022] Open
Abstract
Deep learning is a subdiscipline of artificial intelligence that uses a machine learning technique called artificial neural networks to extract patterns and make predictions from large data sets. The increasing adoption of deep learning across healthcare domains together with the availability of highly characterised cancer datasets has accelerated research into the utility of deep learning in the analysis of the complex biology of cancer. While early results are promising, this is a rapidly evolving field with new knowledge emerging in both cancer biology and deep learning. In this review, we provide an overview of emerging deep learning techniques and how they are being applied to oncology. We focus on the deep learning applications for omics data types, including genomic, methylation and transcriptomic data, as well as histopathology-based genomic inference, and provide perspectives on how the different data types can be integrated to develop decision support tools. We provide specific examples of how deep learning may be applied in cancer diagnosis, prognosis and treatment management. We also assess the current limitations and challenges for the application of deep learning in precision oncology, including the lack of phenotypically rich data and the need for more explainable deep learning models. Finally, we conclude with a discussion of how current obstacles can be overcome to enable future clinical utilisation of deep learning.
Affiliation(s)
- Khoa A. Tran
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
| | - Olga Kondrashova
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
| | - Andrew Bradley
- Faculty of Engineering, Queensland University of Technology (QUT), Brisbane, 4000 Australia
| | - Elizabeth D. Williams
- School of Biomedical Sciences, Faculty of Health, Queensland University of Technology (QUT), Brisbane, 4059 Australia
- Australian Prostate Cancer Research Centre - Queensland (APCRC-Q) and Queensland Bladder Cancer Initiative (QBCI), Brisbane, 4102 Australia
| | - John V. Pearson
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
| | - Nicola Waddell
- Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, 4006 Australia
|
32
|
Elmarakeby HA, Hwang J, Arafeh R, Crowdis J, Gang S, Liu D, AlDubayan SH, Salari K, Kregel S, Richter C, Arnoff TE, Park J, Hahn WC, Van Allen EM. Biologically informed deep neural network for prostate cancer discovery. Nature 2021; 598:348-352. [PMID: 34552244 PMCID: PMC8514339 DOI: 10.1038/s41586-021-03922-4] [Citation(s) in RCA: 121] [Impact Index Per Article: 40.3] [Received: 11/06/2020] [Accepted: 08/17/2021] [Indexed: 12/20/2022]
Abstract
The determination of molecular features that mediate clinically aggressive phenotypes in prostate cancer remains a major biological and clinical challenge [1,2]. Recent advances in interpretability of machine learning models as applied to biomedical problems may enable discovery and prediction in clinical cancer genomics [3-5]. Here we developed P-NET, a biologically informed deep learning model, to stratify patients with prostate cancer by treatment-resistance state and evaluate molecular drivers of treatment resistance for therapeutic targeting through complete model interpretability. We demonstrate that P-NET can predict cancer state using molecular data with a performance that is superior to other modelling approaches. Moreover, the biological interpretability within P-NET revealed established and novel molecularly altered candidates, such as MDM4 and FGFR1, which were implicated in predicting advanced disease and validated in vitro. Broadly, biologically informed fully interpretable neural networks enable preclinical discovery and clinical prediction in prostate cancer and may have general applicability across cancer types.
Affiliation(s)
- Haitham A Elmarakeby
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Al-Azhar University, Cairo, Egypt
| | - Justin Hwang
- University of Minnesota, Division of Hematology, Oncology and Transplantation, Minneapolis, MN, USA
| | - Rand Arafeh
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jett Crowdis
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Sydney Gang
- Dana-Farber Cancer Institute, Boston, MA, USA
| | - David Liu
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Saud H AlDubayan
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Keyan Salari
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Urology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Steven Kregel
- Department of Pathology, University of Illinois at Chicago, Chicago, IL, USA
| | | | - Taylor E Arnoff
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jihye Park
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - William C Hahn
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Eliezer M Van Allen
- Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
33
|
Bourgeais V, Zehraoui F, Ben Hamdoune M, Hanczar B. Deep GONet: self-explainable deep neural network based on Gene Ontology for phenotype prediction from gene expression data. BMC Bioinformatics 2021; 22:455. [PMID: 34551707 PMCID: PMC8456586 DOI: 10.1186/s12859-021-04370-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/31/2021] [Accepted: 09/08/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND With the rapid advancement of genomic sequencing techniques, massive production of gene expression data is becoming possible, which prompts the development of precision medicine. Deep learning is a promising approach for phenotype prediction (clinical diagnosis, prognosis, and drug response) based on gene expression profile. Existing deep learning models are usually considered as black-boxes that provide accurate predictions but are not interpretable. However, accuracy and interpretation are both essential for precision medicine. In addition, most models do not integrate the knowledge of the domain. Hence, making deep learning models interpretable for medical applications using prior biological knowledge is the main focus of this paper. RESULTS In this paper, we propose a new self-explainable deep learning model, called Deep GONet, integrating the Gene Ontology into the hierarchical architecture of the neural network. This model is based on a fully-connected architecture constrained by the Gene Ontology annotations, such that each neuron represents a biological function. The experiments on cancer diagnosis datasets demonstrate that Deep GONet is both easily interpretable and highly performant to discriminate cancer and non-cancer samples. CONCLUSIONS Our model provides an explanation to its predictions by identifying the most important neurons and associating them with biological functions, making the model understandable for biologists and physicians.
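The idea of a layer "constrained by the Gene Ontology annotations, such that each neuron represents a biological function" can be illustrated with a masked dense layer. The following is a minimal sketch under stated assumptions, not the Deep GONet implementation: the gene names, term names, weights, and the tanh activation are all hypothetical placeholders chosen for illustration.

```python
import math

# Hypothetical toy annotation: which genes belong to which ontology term.
# These gene and term names are placeholders, not taken from the paper.
GENES = ["g1", "g2", "g3", "g4"]
TERMS = {"term_A": {"g1", "g2"}, "term_B": {"g3", "g4"}}


def build_mask(genes, terms):
    """Return a terms x genes 0/1 matrix: 1 where the gene is annotated to the term."""
    return [[1.0 if g in members else 0.0 for g in genes] for members in terms.values()]


def masked_layer(x, weights, mask, bias):
    """One constrained layer: each term neuron only sees its annotated genes."""
    out = []
    for w_row, m_row, b in zip(weights, mask, bias):
        # Multiplying by the mask zeroes every connection that has no annotation.
        z = sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        out.append(math.tanh(z))  # tanh is an assumed activation for this sketch
    return out


mask = build_mask(GENES, TERMS)
weights = [[0.5, -0.2, 0.9, 0.9], [0.7, 0.7, 0.1, 0.3]]  # arbitrary illustrative values
bias = [0.0, 0.0]
x = [1.0, 2.0, 3.0, 4.0]  # expression values for g1..g4

activations = masked_layer(x, weights, mask, bias)
```

Because the mask removes unannotated connections, the activation of `term_A` depends only on `g1` and `g2`, which is what lets a neuron be read as a biological function in this kind of architecture.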
Affiliation(s)
- Victoria Bourgeais
- IBISC, Univ Evry, Université Paris-Saclay, 91020 Évry-Courcouronnes, France
| | - Farida Zehraoui
- IBISC, Univ Evry, Université Paris-Saclay, 91020 Évry-Courcouronnes, France
| | | | - Blaise Hanczar
- IBISC, Univ Evry, Université Paris-Saclay, 91020 Évry-Courcouronnes, France
| |
|
34
|
Levy JJ, Chen Y, Azizgolshani N, Petersen CL, Titus AJ, Moen EL, Vaickus LJ, Salas LA, Christensen BC. MethylSPWNet and MethylCapsNet: Biologically Motivated Organization of DNAm Neural Networks, Inspired by Capsule Networks. NPJ Syst Biol Appl 2021; 7:33. [PMID: 34417465 PMCID: PMC8379254 DOI: 10.1038/s41540-021-00193-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/17/2020] [Accepted: 07/01/2021] [Indexed: 02/07/2023] Open
Abstract
DNA methylation (DNAm) alterations have been heavily implicated in carcinogenesis and the pathophysiology of diseases through upstream regulation of gene expression. DNAm deep-learning approaches are able to capture features associated with aging, cell type, and disease progression, but lack incorporation of prior biological knowledge. Here, we present modular, user-friendly deep-learning methodology and software, MethylCapsNet and MethylSPWNet, that group CpGs into biologically relevant capsules (such as gene promoter context, CpG island relationship, or user-defined groupings) and relate them to diagnostic and prognostic outcomes. We demonstrate these models' utility on 3,897 individuals in the classification of central nervous system (CNS) tumors. MethylCapsNet and MethylSPWNet provide an opportunity to increase DNAm deep-learning analyses' interpretability by enabling a flexible organization of DNAm data into biologically relevant capsules.
Affiliation(s)
- Joshua J Levy
- Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, USA.
| | - Youdinghuan Chen
- Program in Quantitative Biomedical Sciences, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Nasim Azizgolshani
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Curtis L Petersen
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
| | - Alexander J Titus
- Department of Life Sciences, University of New Hampshire, Manchester, NH, USA
| | - Erika L Moen
- The Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, NH, USA
- Department of Biomedical Data Science, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Louis J Vaickus
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH, USA
| | - Lucas A Salas
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| | - Brock C Christensen
- Department of Epidemiology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Department of Community and Family Medicine, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
| |
|
35
|
Sun T, He Y, Li W, Liu G, Li L, Wang L, Xiao Z, Han X, Wen H, Liu Y, Chen Y, Wang H, Li J, Fan Y, Zhang W, Zhang J. neoDL: a novel neoantigen intrinsic feature-based deep learning model identifies IDH wild-type glioblastomas with the longest survival. BMC Bioinformatics 2021; 22:382. [PMID: 34301201 PMCID: PMC8299600 DOI: 10.1186/s12859-021-04301-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/26/2021] [Accepted: 07/07/2021] [Indexed: 12/18/2022] Open
Abstract
Background Neoantigen based personalized immune therapies achieve promising results in melanoma and lung cancer, but few neoantigen based models perform well in IDH wild-type GBM, and the association between neoantigen intrinsic features and prognosis remains unclear in IDH wild-type GBM. We presented a novel neoantigen intrinsic feature-based deep learning model (neoDL) to stratify IDH wild-type GBMs into subgroups with different survivals. Results We first derived intrinsic features for each neoantigen associated with survival, followed by applying neoDL in the TCGA data cohort (AUC = 0.988, p value < 0.0001). Leave one out cross validation (LOOCV) in TCGA demonstrated that neoDL successfully classified IDH wild-type GBMs into different prognostic subgroups, which was further validated in an independent data cohort from an Asian population. Long-term survival IDH wild-type GBMs identified by neoDL were found to be characterized by 12 protective neoantigen intrinsic features and enriched in development and cell cycle. Conclusions The model can be therapeutically exploited to identify IDH wild-type GBM with good prognosis who will most likely benefit from neoantigen based personalized immunotherapy. Furthermore, the prognostic intrinsic features of the neoantigens inferred from this study can be used for identifying neoantigens with high potentials of immunogenicity. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04301-6.
Affiliation(s)
- Ting Sun
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Yufei He
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Wendong Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Guang Liu
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Lin Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Lu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Zixuan Xiao
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Xiaohan Han
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Hao Wen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Yong Liu
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Yifan Chen
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Haoyu Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Jing Li
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China
| | - Yubo Fan
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China.
| | - Wei Zhang
- Department of Molecular Neuropathology, Beijing Neurosurgical Institute, Capital Medical University, Beijing, 100070, People's Republic of China; Department of Neurosurgery, Beijing Tiantan Hospital, Capital Medical University, No. 119 South Fourth Ring Road West, Fengtai District, Beijing, 100070, People's Republic of China
| | - Jing Zhang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing Advanced Innovation Centre for Biomedical Engineering, School of Engineering Medicine, School of Biological Science and Medical Engineering, Beihang University, No.37 Xueyuan Road, Haidian District, Beijing, 100083, People's Republic of China.
| |
|
36
|
Li Y, Wang F, Yan M, Cantu E, Yang FN, Rao H, Feng R. Peel Learning for Pathway-Related Outcome Prediction. Bioinformatics 2021; 37:4108-4114. [PMID: 34042937 PMCID: PMC9502230 DOI: 10.1093/bioinformatics/btab402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2021] [Revised: 05/07/2021] [Accepted: 05/26/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Traditional regression models are limited in outcome prediction due to their parametric nature. Current deep learning methods allow for various effects and interactions and have shown improved performance, but they typically need to be trained on a large amount of data to obtain reliable results. Gene expression studies often have small sample sizes but high dimensional correlated predictors so that traditional deep learning methods are not readily applicable. RESULTS In this paper, we proposed peel learning, a novel neural network that incorporates the prior relationship among genes. In each layer of learning, overall structure is peeled into multiple local substructures. Within the substructure, dependency among variables is reduced through linear projections. The overall structure is gradually simplified over layers and weight parameters are optimized through a revised backpropagation. We applied PL to a small lung transplantation study to predict recipients' post-surgery primary graft dysfunction using donors' gene expressions within several immunology pathways, where PL showed improved prediction accuracy compared to conventional penalized regression, classification trees, feed-forward neural network, and a neural network assuming prior network structure. Through simulation studies, we also demonstrated the advantage of adding specific structure among predictor variables in neural network, over no or uniform group structure, which is more favorable in smaller studies. The empirical evidence is consistent with our theoretical proof of improved upper bound of PL's complexity over ordinary neural networks. AVAILABILITY AND IMPLEMENTATION PL algorithm was implemented in Python and the open-source code and instruction will be available at https://github.com/Likelyt/Peel-Learning.
Affiliation(s)
- Yuantong Li
- Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA
| | - Fei Wang
- Department of Healthcare Policy and Research, Cornell University Weill Medical School, New York, NY, 10065, USA
| | - Mengying Yan
- Department of Statistics, George Washington University, Washington, DC, 20052, USA
| | - Edward Cantu
- Department of Surgery, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Fan Nils Yang
- Department of Neuroscience, Georgetown University, Washington, D.C, 20057, USA
| | - Hengyi Rao
- Department of Neurology, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Rui Feng
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, PA, 19104, USA
| |
|
37
|
Zhang H, Chen Y, Li F. Predicting Anticancer Drug Response With Deep Learning Constrained by Signaling Pathways. Frontiers in Bioinformatics 2021; 1:639349. [PMID: 36303766 PMCID: PMC9581064 DOI: 10.3389/fbinf.2021.639349] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/19/2021] [Accepted: 03/25/2021] [Indexed: 12/13/2022] Open
Abstract
Thanks to the availability of multiomics data of individual cancer patients, precision medicine or personalized medicine is becoming a promising treatment for individual cancer patients. However, the association patterns, that is, the mechanism of response (MoR) between large-scale multiomics features and drug response are complex and heterogeneous and remain unclear. Although there are existing computational models for predicting drug response using the high-dimensional multiomics features, it remains challenging to uncover the complex molecular mechanism of drug responses. To reduce the number of predictors/features and make the model more interpretable, in this study, 46 signaling pathways were used to build a deep learning model constrained by signaling pathways, consDeepSignaling, for anticancer drug response prediction. Multiomics data, like gene expression and copy number variation, of individual genes can be integrated naturally in this model. The signaling pathway-constrained deep learning model was evaluated using the multiomics data of ∼1000 cancer cell lines in the Broad Institute Cancer Cell Line Encyclopedia (CCLE) database and the corresponding drug-cancer cell line response data set in the Genomics of Drug Sensitivity in Cancer (GDSC) database. The evaluation results showed that the proposed model outperformed the existing deep neural network models. Also, the model interpretation analysis indicated the distinctive patterns of importance of signaling pathways in anticancer drug response prediction.
Affiliation(s)
- Heming Zhang
- Department of Computer Science, Washington University in St. Louis, St. Louis, MO, United States
- *Correspondence: Heming Zhang; Yixin Chen; Fuhai Li
| | - Yixin Chen
- Department of Computer Science, Washington University in St. Louis, St. Louis, MO, United States
| | - Fuhai Li
- Institute for Informatics, Washington University School of Medicine, St. Louis, MO, United States
- Department of Pediatrics, Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, United States
| |
|
38
|
Pasquini L, Napolitano A, Tagliente E, Dellepiane F, Lucignani M, Vidiri A, Ranazzi G, Stoppacciaro A, Moltoni G, Nicolai M, Romano A, Di Napoli A, Bozzao A. Deep Learning Can Differentiate IDH-Mutant from IDH-Wild GBM. J Pers Med 2021; 11:290. [PMID: 33918828 PMCID: PMC8069494 DOI: 10.3390/jpm11040290] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 03/11/2021] [Revised: 04/02/2021] [Accepted: 04/07/2021] [Indexed: 12/16/2022] Open
Abstract
Isocitrate dehydrogenase (IDH) mutant and wildtype glioblastoma multiforme (GBM) often show overlapping features on magnetic resonance imaging (MRI), representing a diagnostic challenge. Deep learning showed promising results for IDH identification in mixed low/high grade glioma populations; however, a GBM-specific model is still lacking in the literature. Our aim was to develop a GBM-tailored deep-learning model for IDH prediction by applying convolutional neural networks (CNN) on multiparametric MRI. We selected 100 adult patients with pathologically demonstrated WHO grade IV gliomas and IDH testing. MRI sequences included: MPRAGE, T1, T2, FLAIR, rCBV and ADC. The model consisted of a 4-block 2D CNN, applied to each MRI sequence. Probability of IDH mutation was obtained from the last dense layer with a softmax activation function. Model performance was evaluated in the test cohort considering categorical cross-entropy loss (CCEL) and accuracy. Calculated performance was: rCBV (accuracy 83%, CCEL 0.64), T1 (accuracy 77%, CCEL 1.4), FLAIR (accuracy 77%, CCEL 1.98), T2 (accuracy 67%, CCEL 2.41), MPRAGE (accuracy 66%, CCEL 2.55). Lower performance was achieved on ADC maps. We present a GBM-specific deep-learning model for IDH mutation prediction, with a maximal accuracy of 83% on rCBV maps. Highest predictivity achieved on perfusion images possibly reflects the known link between IDH and neoangiogenesis through the hypoxia inducible factor.
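The two reported metrics, accuracy and categorical cross-entropy loss, measure different things, which may help explain how a sequence can reach 77% accuracy with a loss above 1. The sketch below is illustrative only and is not the authors' code: the logits and the two-class ordering are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_index])


# Hypothetical logits for two classes, ordered [IDH-wildtype, IDH-mutant].
# In both cases the true class is IDH-mutant (index 1).
confident_right = softmax([0.0, 3.0])  # most probability on the true class
confident_wrong = softmax([4.0, 0.0])  # most probability on the wrong class

loss_right = categorical_cross_entropy(confident_right, 1)
loss_wrong = categorical_cross_entropy(confident_wrong, 1)
```

Accuracy counts each of these cases as simply right or wrong, whereas the loss penalizes the confidently wrong case far more heavily than it rewards the confidently right one. Averaged over a test set, a few confident errors can therefore raise the mean loss substantially even when most predictions are correct.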
Affiliation(s)
- Luca Pasquini
- Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
- Neuroradiology Service, Department of Radiology, Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY 10065, USA
| | - Antonio Napolitano
- Medical Physics Department, Bambino Gesù Children’s Hospital, IRCCS, Piazza di Sant’Onofrio, 4, 00165 Rome, Italy
| | - Emanuela Tagliente
- Medical Physics Department, Bambino Gesù Children’s Hospital, IRCCS, Piazza di Sant’Onofrio, 4, 00165 Rome, Italy
| | - Francesco Dellepiane
- Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| | - Martina Lucignani
- Medical Physics Department, Bambino Gesù Children’s Hospital, IRCCS, Piazza di Sant’Onofrio, 4, 00165 Rome, Italy
| | - Antonello Vidiri
- Radiology and Diagnostic Imaging Department, Regina Elena National Cancer Institute, IRCCS, Via Elio Chianesi 53, 00144 Rome, Italy
| | - Giulio Ranazzi
- Surgical Pathology Unit, Department of Clinical and Molecular Medicine, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| | - Antonella Stoppacciaro
- Surgical Pathology Unit, Department of Clinical and Molecular Medicine, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| | - Giulia Moltoni
- Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| | - Matteo Nicolai
- Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| | - Andrea Romano
- Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| | - Alberto Di Napoli
- Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| | - Alessandro Bozzao
- Neuroradiology Unit, NESMOS Department, Sant’Andrea Hospital, La Sapienza University, Via di Grottarossa 1035, 00189 Rome, Italy
| |
|
39
|
Bhinder B, Gilvary C, Madhukar NS, Elemento O. Artificial Intelligence in Cancer Research and Precision Medicine. Cancer Discov 2021; 11:900-915. [PMID: 33811123 DOI: 10.1158/2159-8290.cd-21-0090] [Citation(s) in RCA: 168] [Impact Index Per Article: 56.0] [Received: 01/20/2021] [Revised: 02/06/2021] [Accepted: 02/08/2021] [Indexed: 11/16/2022]
Abstract
Artificial intelligence (AI) is rapidly reshaping cancer research and personalized clinical care. Availability of high-dimensionality datasets coupled with advances in high-performance computing, as well as innovative deep learning architectures, has led to an explosion of AI use in various aspects of oncology research. These applications range from detection and classification of cancer, to molecular characterization of tumors and their microenvironment, to drug discovery and repurposing, to predicting treatment outcomes for patients. As these advances start penetrating the clinic, we foresee a shifting paradigm in cancer care becoming strongly driven by AI. SIGNIFICANCE: AI has the potential to dramatically affect nearly all aspects of oncology, from enhancing diagnosis to personalizing treatment and discovering novel anticancer drugs. Here, we review the recent enormous progress in the application of AI to oncology, highlight limitations and pitfalls, and chart a path for adoption of AI in the cancer clinic.
Affiliation(s)
- Bhavneet Bhinder
- Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, New York; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York
| | | | | | - Olivier Elemento
- Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, New York; Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York; OneThree Biotech, New York, New York
| |
|
40
|
Henze L, Walter U, Murua Escobar H, Junghanss C, Jaster R, Köhling R, Lange F, Salehzadeh-Yazdi A, Wolkenhauer O, Hamed M, Barrantes I, Palmer D, Möller S, Kowald A, Heussen N, Fuellen G. Towards biomarkers for outcomes after pancreatic ductal adenocarcinoma and ischaemic stroke, with focus on (co)-morbidity and ageing/cellular senescence (SASKit): protocol for a prospective cohort study. BMJ Open 2020; 10:e039560. [PMID: 33334830 PMCID: PMC7747584 DOI: 10.1136/bmjopen-2020-039560] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022] Open
Abstract
INTRODUCTION Ageing-related processes such as cellular senescence are believed to underlie the accumulation of diseases in time, causing (co)morbidity, including cancer, thromboembolism and stroke. Interfering with these processes may delay, stop or reverse morbidity. The aim of this study is to investigate the link between (co)morbidity and ageing by exploring biomarkers and molecular mechanisms of disease-triggered deterioration in patients with pancreatic ductal adenocarcinoma (PDAC) and (thromboembolic) ischaemic stroke (IS). METHODS AND ANALYSIS We will recruit 50 patients with PDAC, 50 patients with (thromboembolic) IS and 50 controls at Rostock University Medical Center, Germany. We will gather routine blood data, clinical performance measurements and patient-reported outcomes at up to seven points in time, alongside in-depth transcriptomics and proteomics at two of the early time points. Aiming for clinically relevant biomarkers, the primary outcome is a composite of probable sarcopenia, clinical performance (described by ECOG Performance Status for patients with PDAC and the Modified Rankin Scale for patients with stroke) and quality of life. Further outcomes cover other aspects of morbidity such as cognitive decline and of comorbidity such as vascular or cancerous events. The data analysis is comprehensive in that it includes biostatistics and machine learning, both following standard role models and additional explorative approaches. Prognostic and predictive biomarkers for interventions addressing senescence may become available if the biomarkers that we find are specifically related to ageing/cellular senescence. Similarly, diagnostic biomarkers will be explored. Our findings will require validation in independent studies, and our dataset shall be useful to validate the findings of other studies. In some of the explorative analyses, we shall include insights from systems biology modelling as well as insights from preclinical animal models. 
We anticipate that our detailed study protocol and data analysis plan may also guide other biomarker exploration trials. ETHICS AND DISSEMINATION The study was approved by the local ethics committee (Ethikkommission an der Medizinischen Fakultät der Universität Rostock, A2019-0174), registered at the German Clinical Trials Register (DRKS00021184), and results will be published following standard guidelines.
Affiliation(s)
- Larissa Henze
- Department of Medicine, Clinic III, Hematology, Oncology, Palliative Medicine, Rostock University Medical Center and Research Focus Oncology, Rostock, Germany
| | - Uwe Walter
- Department of Neurology, Rostock University Medical Center and Centre for Transdisciplinary Neurosciences Rostock, Rostock, Germany
| | - Hugo Murua Escobar
- Department of Medicine, Clinic III, Hematology, Oncology, Palliative Medicine, Rostock University Medical Center and Research Focus Oncology, Rostock, Germany
| | - Christian Junghanss
- Department of Medicine, Clinic III, Hematology, Oncology, Palliative Medicine, Rostock University Medical Center and Research Focus Oncology, Rostock, Germany
| | - Robert Jaster
- Department of Gastroenterology, Rostock University Medical Center and Research Focus Oncology, Rostock, Germany
| | - Rüdiger Köhling
- Oscar Langendorff Institute of Physiology, Rostock University Medical Center and Centre for Transdisciplinary Neurosciences Rostock and Ageing of Individuals and Society, Interdisciplinary Faculty, Rostock University, Rostock, Germany
| | - Falko Lange
- Oscar Langendorff Institute of Physiology, Rostock University Medical Center, Rostock, Germany
| | - Ali Salehzadeh-Yazdi
- Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
| | - Olaf Wolkenhauer
- Department of Systems Biology and Bioinformatics, University of Rostock and Centre for Transdisciplinary Neurosciences Rostock, Rostock University Medical Center, Rostock, Germany
| | - Mohamed Hamed
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center and Research Focus Oncology, Rostock, Germany
| | - Israel Barrantes
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center and Research Focus Oncology, Rostock, Germany
| | - Daniel Palmer
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany
| | - Steffen Möller
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany
| | - Axel Kowald
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center, Rostock, Germany
| | - Nicole Heussen
- Department of Medical Statistics, RWTH Aachen, Aachen, Germany
| | - Georg Fuellen
- Institute for Biostatistics and Informatics in Medicine and Ageing Research, Rostock University Medical Center and Centre for Transdisciplinary Neurosciences Rostock and Research Focus Oncology, Rostock and Ageing of Individuals and Society, Interdisciplinary Faculty, Rostock University, Rostock, Germany
| |
|
41
|
Payrovnaziri SN, Chen Z, Rengifo-Moreno P, Miller T, Bian J, Chen JH, Liu X, He Z. Explainable artificial intelligence models using real-world electronic health record data: a systematic scoping review. J Am Med Inform Assoc 2020; 27:1173-1185. [PMID: 32417928 PMCID: PMC7647281 DOI: 10.1093/jamia/ocaa053] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Received: 01/22/2020] [Revised: 04/01/2020] [Accepted: 04/07/2020] [Indexed: 01/08/2023]
Abstract
OBJECTIVE To conduct a systematic scoping review of explainable artificial intelligence (XAI) models that use real-world electronic health record data, categorize these techniques according to different biomedical applications, identify gaps of current studies, and suggest future research directions. MATERIALS AND METHODS We searched MEDLINE, IEEE Xplore, and the Association for Computing Machinery (ACM) Digital Library to identify relevant papers published between January 1, 2009 and May 1, 2019. We summarized these studies based on the year of publication, prediction tasks, machine learning algorithm, dataset(s) used to build the models, the scope, category, and evaluation of the XAI methods. We further assessed the reproducibility of the studies in terms of the availability of data and code and discussed open issues and challenges. RESULTS Forty-two articles were included in this review. We reported the research trend and most-studied diseases. We grouped XAI methods into 5 categories: knowledge distillation and rule extraction (N = 13), intrinsically interpretable models (N = 9), data dimensionality reduction (N = 8), attention mechanism (N = 7), and feature interaction and importance (N = 5). DISCUSSION XAI evaluation is an open issue that requires a deeper focus in the case of medical applications. We also discuss the importance of reproducibility of research work in this field, as well as the challenges and opportunities of XAI from 2 medical professionals' point of view. CONCLUSION Based on our review, we found that XAI evaluation in medicine has not been adequately and formally practiced. Reproducibility remains a critical concern. Ample opportunities exist to advance XAI research in medicine.
Affiliation(s)
| | - Zhaoyi Chen
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
| | - Pablo Rengifo-Moreno
- College of Medicine, Florida State University, Tallahassee, Florida, USA
- Tallahassee Memorial Hospital, Tallahassee, Florida, USA
| | - Tim Miller
- School of Computing and Information Systems, The University of Melbourne, Melbourne, Victoria, Australia
| | - Jiang Bian
- Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, Florida, USA
| | - Jonathan H Chen
- Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, California, USA
- Division of Hospital Medicine, Department of Medicine, Stanford University, Stanford, California, USA
| | - Xiuwen Liu
- Department of Computer Science, Florida State University, Tallahassee, Florida, USA
| | - Zhe He
- School of Information, Florida State University, Tallahassee, Florida, USA
| |
|
42
|
Gonçalves FG, Chawla S, Mohan S. Emerging MRI Techniques to Redefine Treatment Response in Patients With Glioblastoma. J Magn Reson Imaging 2020; 52:978-997. [PMID: 32190946 DOI: 10.1002/jmri.27105] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/28/2019] [Revised: 01/28/2020] [Accepted: 01/30/2020] [Indexed: 12/14/2022]
Abstract
Glioblastoma is the most common and most malignant primary brain tumor. Despite aggressive multimodal treatment, its prognosis remains poor. Even with continuous developments in MRI, which has provided us with newer insights into the diagnosis and understanding of tumor biology, response assessment in the posttherapy setting remains challenging. We believe that the integration of additional information from advanced neuroimaging techniques can further improve the diagnostic accuracy of conventional MRI. In this article, we review the utility of advanced neuroimaging techniques such as diffusion-weighted imaging, diffusion tensor imaging, perfusion-weighted imaging, proton magnetic resonance spectroscopy, and chemical exchange saturation transfer in characterizing and evaluating treatment response in patients with glioblastoma. We will also discuss the existing challenges and limitations of using these techniques in clinical settings and possible solutions to avoiding pitfalls in study design, data acquisition, and analysis for future studies. LEVEL OF EVIDENCE: 2 TECHNICAL EFFICACY STAGE: 3 J. Magn. Reson. Imaging 2020;52:978-997.
Affiliation(s)
| | - Sanjeev Chawla
- Department of Radiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| | - Suyash Mohan
- Department of Radiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, USA
| |
|
43
|
Zhu W, Xie L, Han J, Guo X. The Application of Deep Learning in Cancer Prognosis Prediction. Cancers (Basel) 2020; 12:E603. [PMID: 32150991 PMCID: PMC7139576 DOI: 10.3390/cancers12030603] [Citation(s) in RCA: 120] [Impact Index Per Article: 30.0] [Received: 02/04/2020] [Revised: 02/28/2020] [Accepted: 03/02/2020] [Indexed: 12/11/2022]
Abstract
Deep learning has been applied to many areas in health care, including imaging diagnosis, digital pathology, prediction of hospital admission, drug design, classification of cancer and stromal cells, and doctor assistance. Cancer prognosis aims to estimate the fate of a cancer, including the probabilities of recurrence and progression, and to provide survival estimates to patients. Accurate prognosis prediction greatly benefits the clinical management of cancer patients. Improvements in biomedical translational research and the application of advanced statistical analysis and machine learning methods are the driving forces behind better cancer prognosis prediction. In recent years, there has been a significant increase in computational power and rapid advancement in artificial intelligence, particularly in deep learning. In addition, the falling cost of large-scale next-generation sequencing and the availability of such data through open-source databases (e.g., TCGA and GEO) offer opportunities to build more powerful models that predict cancer prognosis more accurately. In this review, we survey the most recently published works that used deep learning to build models for cancer prognosis prediction. Deep learning has been suggested to be a more generic model that requires less data engineering and achieves more accurate prediction when working with large amounts of data. The application of deep learning in cancer prognosis has been shown to be equivalent to or better than current approaches, such as Cox-PH. With the burst of multi-omics data, including genomics data, transcriptomics data, and clinical information in cancer studies, we believe that deep learning could potentially improve cancer prognosis.
Affiliation(s)
- Wan Zhu
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics center, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China;
- Department of Anesthesia, Stanford University, 300 Pasteur Drive, Stanford, CA 94305, USA
| | - Longxiang Xie
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics center, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China;
| | - Jianye Han
- Department of Computer Science, University of Illinois, Urbana-Champaign, IL 61820, USA
| | - Xiangqian Guo
- Department of Preventive Medicine, Institute of Biomedical Informatics, Cell Signal Transduction Laboratory, Bioinformatics center, School of Basic Medical Sciences, Henan University, Kaifeng 475004, China;
| |
|
44
|
Crawford J, Greene CS. Incorporating biological structure into machine learning models in biomedicine. Curr Opin Biotechnol 2020; 63:126-134. [PMID: 31962244 PMCID: PMC7308204 DOI: 10.1016/j.copbio.2019.12.021] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/17/2019] [Revised: 12/17/2019] [Accepted: 12/19/2019] [Indexed: 12/19/2022]
Abstract
In biomedical applications of machine learning, relevant information often has a rich structure that is not easily encoded as real-valued predictors. Examples of such data include DNA or RNA sequences, gene sets or pathways, gene interaction or coexpression networks, ontologies, and phylogenetic trees. We highlight recent examples of machine learning models that use structure to constrain model architecture or incorporate structured data into model training. For machine learning in biomedicine, where sample size is limited and model interpretability is crucial, incorporating prior knowledge in the form of structured data can be particularly useful. The area of research would benefit from performant open source implementations and independent benchmarking efforts.
Affiliation(s)
- Jake Crawford
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Casey S Greene
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, PA, United States
| |
|
45
|
Sotoudeh H, Shafaat O, Bernstock JD, Brooks MD, Elsayed GA, Chen JA, Szerip P, Chagoya G, Gessler F, Sotoudeh E, Shafaat A, Friedman GK. Artificial Intelligence in the Management of Glioma: Era of Personalized Medicine. Front Oncol 2019; 9:768. [PMID: 31475111 PMCID: PMC6702305 DOI: 10.3389/fonc.2019.00768] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 05/31/2019] [Accepted: 07/30/2019] [Indexed: 12/13/2022]
Abstract
Purpose: Artificial intelligence (AI) has accelerated novel discoveries across multiple disciplines including medicine. Clinical medicine suffers from a lack of AI-based applications, potentially due to lack of awareness of AI methodology. Future collaboration between computer scientists and clinicians is critical to maximize the benefits of transformative technology in this field for patients. To illustrate, we describe AI-based advances in the diagnosis and management of gliomas, the most common primary central nervous system (CNS) malignancy. Methods: Presented is a succinct description of foundational concepts of AI approaches and their relevance to clinical medicine, geared toward clinicians without computer science backgrounds. We also review novel AI approaches in the diagnosis and management of glioma. Results: Novel AI approaches in gliomas have been developed to predict the grading and genomics from imaging, automate the diagnosis from histopathology, and provide insight into prognosis. Conclusion: Novel AI approaches offer acceptable performance in gliomas. Further investigation is necessary to improve the methodology and determine the full clinical utility of these novel approaches.
Affiliation(s)
- Houman Sotoudeh
- Department of Neuroradiology, University of Alabama, Birmingham, AL, United States
| | - Omid Shafaat
- Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD, United States
| | - Joshua D Bernstock
- Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States
| | - Michael David Brooks
- Department of Neuroradiology, University of Alabama, Birmingham, AL, United States
| | - Galal A Elsayed
- Department of Neurosurgery, University of Alabama, Birmingham, AL, United States
| | - Jason A Chen
- Medical Scientist Training Program, University of California, Los Angeles, Los Angeles, CA, United States
| | - Paul Szerip
- Senior Research Scientist, Uber AI Labs, San Francisco, CA, United States
| | - Gustavo Chagoya
- Department of Neurosurgery, University of Alabama, Birmingham, AL, United States
| | - Florian Gessler
- Department of Neurosurgery, Goethe University, Frankfurt, Germany
| | - Ehsan Sotoudeh
- Department of Surgery, Iranian Hospital, Dubai, United Arab Emirates
| | - Amir Shafaat
- Department of Mechanical Engineering, Arak University of Technology, Arak, Iran
| | - Gregory K Friedman
- Division of Pediatric Hematology and Oncology, Department of Pediatrics, University of Alabama, Birmingham, AL, United States
| |
|