1
Ballard JL, Wang Z, Li W, Shen L, Long Q. Deep learning-based approaches for multi-omics data integration and analysis. BioData Min 2024; 17:38. [PMID: 39358793 DOI: 10.1186/s13040-024-00391-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 09/06/2024] [Indexed: 10/04/2024] Open
Abstract
BACKGROUND The rapid growth of deep learning, together with the vast and ever-growing amount of available data, has provided ample opportunity for advances in fusion and analysis of complex and heterogeneous data types. Different data modalities provide complementary information that can be leveraged to gain a more complete understanding of each subject. In the biomedical domain, multi-omics data includes molecular (genomics, transcriptomics, proteomics, epigenomics, metabolomics, etc.) and imaging (radiomics, pathomics) modalities which, when combined, have the potential to improve performance on prediction, classification, clustering and other tasks. Deep learning encompasses a wide variety of methods, each of which has certain strengths and weaknesses for multi-omics integration. METHOD In this review, we categorize recent deep learning-based approaches by their basic architectures and discuss their unique capabilities in relation to one another. We also discuss some emerging themes advancing the field of multi-omics integration. RESULTS Deep learning-based multi-omics integration methods were categorized broadly into non-generative (feedforward neural networks, graph convolutional neural networks, and autoencoders) and generative (variational methods, generative adversarial models, and a generative pretrained model). Generative methods have the advantage of being able to impose constraints on the shared representations to enforce certain properties or incorporate prior knowledge. They can also be used to generate or impute missing modalities. Recent advances achieved by these methods include the ability to handle incomplete data as well as going beyond the traditional molecular omics data types to integrate other modalities such as imaging data. CONCLUSION We expect to see further growth in methods that can handle missingness, as this is a common challenge in working with complex and heterogeneous data. Additionally, methods that integrate more data types are expected to improve performance on downstream tasks by capturing a comprehensive view of each sample.
Affiliation(s)
- Jenna L Ballard
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA
- Zexuan Wang
  - Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, 209 S. 33rd Street, Philadelphia, PA, 19104, USA
- Wenrui Li
  - Department of Statistics, University of Connecticut, 215 Glenbrook Road, Storrs, CT, 06269, USA
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA, 19104, USA
- Qi Long
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, PA, 19104, USA
2
Tak D, Garomsa BA, Zapaishchykova A, Ye Z, Vajapeyam S, Mahootiha M, Climent Pardo JC, Smith C, Familiar AM, Chaunzwa T, Liu KX, Prabhu S, Bandopadhayay P, Nabavizadeh A, Mueller S, Aerts HJ, Haas-Kogan D, Poussaint TY, Kann BH. Longitudinal risk prediction for pediatric glioma with temporal deep learning. medRxiv 2024:2024.06.04.24308434. [PMID: 38978642 PMCID: PMC11230342 DOI: 10.1101/2024.06.04.24308434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Pediatric glioma recurrence can cause morbidity and mortality; however, recurrence pattern and severity are heterogeneous and challenging to predict with established clinical and genomic markers. As a result, almost all children undergo frequent, long-term magnetic resonance (MR) brain surveillance regardless of individual recurrence risk. Deep learning analysis of longitudinal MR may be an effective approach for improving individualized recurrence prediction in gliomas and other cancers but has thus far been infeasible with current frameworks. Here, we propose a self-supervised deep learning approach to longitudinal medical imaging analysis, temporal learning, that models the spatiotemporal information from a patient's current and prior brain MRs to predict future recurrence. We apply temporal learning to pediatric glioma surveillance imaging for 715 patients (3,994 scans) from four distinct clinical settings. We find that longitudinal imaging analysis with temporal learning improves recurrence prediction performance by up to 41% compared to traditional approaches, with improvements in performance in both low- and high-grade glioma. We find that recurrence prediction accuracy increases incrementally with the number of historical scans available per patient. Temporal deep learning may enable point-of-care decision-support for pediatric brain tumors and be adaptable more broadly to patients with other cancers and chronic diseases undergoing surveillance imaging.
3
Odusami M, Maskeliūnas R, Damaševičius R, Misra S. Machine learning with multimodal neuroimaging data to classify stages of Alzheimer's disease: a systematic review and meta-analysis. Cogn Neurodyn 2024; 18:775-794. [PMID: 38826669 PMCID: PMC11143094 DOI: 10.1007/s11571-023-09993-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/16/2023] [Revised: 06/23/2023] [Accepted: 07/17/2023] [Indexed: 06/04/2024] Open
Abstract
In recent years, Alzheimer's disease (AD) has been a serious threat to human health. Researchers and clinicians alike encounter a significant obstacle when trying to accurately identify and classify AD stages. Several studies have shown that multimodal neuroimaging input can assist in providing valuable insights into the structural and functional changes in the brain related to AD. Machine learning (ML) algorithms can accurately categorize AD phases by identifying patterns and linkages in multimodal neuroimaging data using powerful computational methods. This study aims to assess the contribution of ML methods to the accurate classification of the stages of AD using multimodal neuroimaging data. A systematic search is carried out in IEEE Xplore, ScienceDirect/Elsevier, ACM Digital Library, and PubMed databases with forward snowballing performed on Google Scholar. The quantitative analysis used 47 studies. The explainable analysis was performed on the classification algorithm and fusion methods used in the selected studies. The pooled sensitivity and specificity, including diagnostic efficiency, were evaluated by conducting a meta-analysis based on a bivariate model with the hierarchical summary receiver operating characteristics (ROC) curve of multimodal neuroimaging data and ML methods in the classification of AD stages. The Wilcoxon signed-rank test is further used to statistically compare the accuracy scores of the existing models. The pooled sensitivity (with 95% confidence interval) for separating participants with mild cognitive impairment (MCI) from healthy control (NC) participants was 83.77% (78.87%, 87.71%); for separating participants with AD from NC, it was 94.60% (90.76%, 96.89%); for separating participants with progressive MCI (pMCI) from stable MCI (sMCI), it was 80.41% (74.73%, 85.06%); and for separating participants with early MCI (EMCI) from NC, it was 86.63% (82.43%, 89.95%). Pooled specificity for differentiating MCI from NC was 79.16% (70.97%, 87.71%), AD from NC was 93.49% (91.60%, 94.90%), pMCI from sMCI was 81.44% (76.32%, 85.66%), and EMCI from NC was 85.68% (81.62%, 88.96%). The Wilcoxon signed-rank test showed a low P-value across all the classification tasks. Multimodal neuroimaging data combined with ML is a promising approach for classifying the stages of AD, but more research is required to increase the validity of its application in clinical practice.
Affiliation(s)
- Modupe Odusami
  - Department of Multimedia Engineering, Kaunas University of Technology, Kaunas, Lithuania
- Rytis Maskeliūnas
  - Department of Multimedia Engineering, Kaunas University of Technology, Kaunas, Lithuania
- Sanjay Misra
  - Department of Applied Data Science, Institute for Energy Technology, Halden, Norway
4
Cheng N, Wang L, Liu Y, Song B, Ding C. HANSynergy: Heterogeneous Graph Attention Network for Drug Synergy Prediction. J Chem Inf Model 2024; 64:4334-4347. [PMID: 38709204 PMCID: PMC11135324 DOI: 10.1021/acs.jcim.4c00003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 05/07/2024]
Abstract
Drug synergy therapy is a promising strategy for cancer treatment. However, the extensive variety of available drugs and the time-intensive process of determining effective drug combinations through clinical trials pose significant challenges. A reliable method is therefore needed for the rapid and precise selection of drug synergies. In response, various computational strategies have been developed for predicting drug synergies, yet the exploitation of heterogeneous biological network features remains underexplored. In this study, we construct a heterogeneous graph that encompasses diverse biological entities and interactions, utilizing rich data sets from sources such as DrugCombDB, PubChem, UniProt, and the Cancer Cell Line Encyclopedia (CCLE). We initialize node feature representations and introduce a novel virtual node to enhance drug representation. Experimental validation of our proposed method, the heterogeneous graph attention network for drug-drug synergy prediction (HANSynergy), demonstrates that the heterogeneous graph attention network can extract key node features, efficiently harness the diversity of information, and further enhance network functionality through the incorporation of a multihead attention mechanism. In the comparative experiment, the highest accuracy (Acc) and area under the curve (AUC) are 0.877 and 0.947, respectively, on the DrugCombDB_early data set, demonstrating the superiority of HANSynergy over the competing methods. Moreover, protein-protein interactions are important in understanding the mechanism of action of drugs. The heterogeneous attention mechanism facilitates protein-protein interaction analysis. By analyzing the changes in attention weights before and after heterogeneous network training, we investigated proteins that may be associated with drug combinations. Additionally, case studies align our findings with existing research, underscoring the potential of HANSynergy in drug synergy prediction. This advancement not only contributes to the burgeoning field of drug synergy prediction but also holds the potential to provide valuable insights and uncover new drug synergies for combating cancer.
Affiliation(s)
- Ning Cheng
  - School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China
- Li Wang
  - Degree Programs in Systems and Information Engineering, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan
- Yiping Liu
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
- Bosheng Song
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China
- Changsong Ding
  - School of Informatics, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China
  - Big Data Analysis Laboratory of Traditional Chinese Medicine, Hunan University of Chinese Medicine, Changsha, Hunan 410208, China
5
Jain S, Safo SE. DeepIDA-GRU: a deep learning pipeline for integrative discriminant analysis of cross-sectional and longitudinal multiview data with applications to inflammatory bowel disease classification. Brief Bioinform 2024; 25:bbae339. [PMID: 39007595 DOI: 10.1093/bib/bbae339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 02/29/2024] [Accepted: 06/28/2024] [Indexed: 07/16/2024] Open
Abstract
Biomedical research now commonly integrates diverse data types or views from the same individuals to better understand the pathobiology of complex diseases, but the challenge lies in meaningfully integrating these diverse views. Existing methods often require the same type of data from all views (cross-sectional data only or longitudinal data only) or do not consider any class outcome in the integration method, which presents limitations. To overcome these limitations, we have developed a pipeline that harnesses the power of statistical and deep learning methods to integrate cross-sectional and longitudinal data from multiple sources. In addition, it identifies key variables that contribute to the association between views and the separation between classes, providing deeper biological insights. This pipeline includes variable selection/ranking using linear and nonlinear methods, feature extraction using functional principal component analysis and Euler characteristics, and joint integration and classification using dense feed-forward networks for cross-sectional data and recurrent neural networks for longitudinal data. We applied this pipeline to cross-sectional and longitudinal multiomics data (metagenomics, transcriptomics and metabolomics) from an inflammatory bowel disease (IBD) study and identified microbial pathways, metabolites and genes that discriminate by IBD status, providing information on the etiology of IBD. We conducted simulations to compare the two feature extraction methods.
Affiliation(s)
- Sarthak Jain
  - Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455, United States
- Sandra E Safo
  - Division of Biostatistics and Health Data Science, University of Minnesota, Minneapolis, MN 55455, United States
6
Bordukova M, Makarov N, Rodriguez-Esteban R, Schmich F, Menden MP. Generative artificial intelligence empowers digital twins in drug discovery and clinical trials. Expert Opin Drug Discov 2024; 19:33-42. [PMID: 37887266 DOI: 10.1080/17460441.2023.2273839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Accepted: 10/18/2023] [Indexed: 10/28/2023]
Abstract
INTRODUCTION The concept of Digital Twins (DTs) translated to drug development and clinical trials describes virtual representations of systems of various complexities, ranging from individual cells to entire humans, and enables in silico simulations and experiments. DTs increase the efficiency of drug discovery and development by digitalizing processes associated with high economic, ethical, or social burden. The impact is multifaceted: DT models sharpen disease understanding, support biomarker discovery and accelerate drug development, thus advancing precision medicine. One way to realize DTs is by generative artificial intelligence (AI), a cutting-edge technology that enables the creation of novel, realistic and complex data with desired properties. AREAS COVERED The authors provide a brief introduction to generative AI and describe how it facilitates the modeling of DTs. In addition, they compare existing implementations of generative AI for DTs in drug discovery and clinical trials. Finally, they discuss technical and regulatory challenges that should be addressed before DTs can transform drug discovery and clinical trials. EXPERT OPINION The current state of DTs in drug discovery and clinical trials does not exploit the entire power of generative AI yet and is limited to simulation of a small number of characteristics. Nonetheless, generative AI has the potential to transform the field by leveraging recent developments in deep learning and customizing models for the needs of scientists, physicians and patients.
Affiliation(s)
- Maria Bordukova
  - Data & Analytics, Pharmaceutical Research and Early Development, Roche Innovation Center Munich (RICM), Penzberg, Germany
  - Institute of Computational Biology, Computational Health Center, Helmholtz Munich, Munich, Germany
  - Department of Biology, Ludwig-Maximilians University Munich, Munich, Germany
- Nikita Makarov
  - Data & Analytics, Pharmaceutical Research and Early Development, Roche Innovation Center Munich (RICM), Penzberg, Germany
  - Institute of Computational Biology, Computational Health Center, Helmholtz Munich, Munich, Germany
  - Department of Biology, Ludwig-Maximilians University Munich, Munich, Germany
- Raul Rodriguez-Esteban
  - Data & Analytics, Pharmaceutical Research and Early Development, Roche Innovation Center Basel (RICB), Basel, Switzerland
- Fabian Schmich
  - Data & Analytics, Pharmaceutical Research and Early Development, Roche Innovation Center Munich (RICM), Penzberg, Germany
- Michael P Menden
  - Institute of Computational Biology, Computational Health Center, Helmholtz Munich, Munich, Germany
  - Department of Biology, Ludwig-Maximilians University Munich, Munich, Germany
  - Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Australia
  - German Center for Diabetes Research (DZD e.V.), Munich, Germany
7
Baum L, Johns M, Poikela M, Möller R, Ananthasubramaniam B, Prasser F. Data integration and analysis for circadian medicine. Acta Physiol (Oxf) 2023; 237:e13951. [PMID: 36790321 DOI: 10.1111/apha.13951] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/22/2022] [Revised: 02/04/2023] [Accepted: 02/12/2023] [Indexed: 02/16/2023]
Abstract
Data integration, data sharing, and standardized analyses are important enablers for data-driven medical research. Circadian medicine is an emerging field with a particularly high need for coordinated and systematic collaboration between researchers from different disciplines. Datasets in circadian medicine are multimodal, ranging from molecular circadian profiles and clinical parameters to physiological measurements and data obtained from (wearable) sensors or reported by patients. Uniquely, data spanning both the time dimension and the spatial dimension (across tissues) are needed to obtain a holistic view of the circadian system. The study of human rhythms in the context of circadian medicine has to confront the heterogeneity of clock properties within and across subjects and our inability to repeatedly obtain relevant biosamples from one subject. This requires informatics solutions for integrating and visualizing relevant data types at various temporal resolutions ranging from milliseconds and seconds to minutes and several hours. Associated challenges range from a lack of standards that can be used to represent all required data in a common interoperable form, to challenges related to data storage, to the need to perform transformations for integrated visualizations, and to privacy issues. The downstream analysis of circadian rhythms requires specialized approaches for the identification, characterization, and discrimination of rhythms. We conclude that circadian medicine research provides an ideal environment for developing innovative methods to address challenges related to the collection, integration, visualization, and analysis of multimodal multidimensional biomedical data.
Affiliation(s)
- Lena Baum
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Marco Johns
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Maija Poikela
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
- Ralf Möller
  - Institute of Information Systems, University of Lübeck, Lübeck, Germany
- Fabian Prasser
  - Berlin Institute of Health at Charité-Universitätsmedizin Berlin, Berlin, Germany
8
Multimodal brain tumor detection using multimodal deep transfer learning. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
9
Yoo J, Yoo I, Youn I, Kim SM, Yu R, Kim K, Kim K, Lee SB. Residual one-dimensional convolutional neural network for neuromuscular disorder classification from needle electromyography signals with explainability. Comput Methods Programs Biomed 2022; 226:107079. [PMID: 36191354 DOI: 10.1016/j.cmpb.2022.107079] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/24/2022] [Revised: 07/25/2022] [Accepted: 08/20/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND AND OBJECTIVE Neuromuscular disorders are diseases that damage our ability to control body movements. Needle electromyography (nEMG), an electrophysiological test that measures electric signals generated from a muscle using an invasive needle, is often used to diagnose neuromuscular disorders. Characteristics of nEMG signals are manually analyzed by an electromyographer to diagnose the types of neuromuscular disorders, and this process is highly dependent on the subjective experience of the electromyographer. Contemporary computer-aided methods classified nEMG signals with deep learning image classification models, which are not optimized for classifying signals. Additionally, model explainability, which is crucial in medical applications, was not addressed. This study aims to improve prediction accuracy and inference time and to explain model predictions in nEMG neuromuscular disorder classification. METHODS This study introduces nEMGNet, a one-dimensional convolutional neural network with residual connections designed to extract features from raw signals with higher accuracy and faster speed compared to image classification models from previous works. Next, the divide-and-vote (DiVote) algorithm was designed to integrate each subject's heterogeneous nEMG signal data structures and to utilize muscle subtype information for higher accuracy. Finally, feature visualization was used to identify the causality of nEMGNet diagnosis predictions, to ensure that nEMGNet made predictions on valid features, not artifacts. RESULTS The proposed method was tested using 376 nEMG signals measured from 57 subjects between June 2015 and July 2020 in Seoul National University Hospital. The results from the three-class classification task demonstrated that nEMGNet's prediction accuracy on nEMG signal segments was 62.35%, and the subject diagnosis prediction accuracy of nEMGNet with the DiVote algorithm was 83.69%, over 5-fold cross-validation. nEMGNet outperformed all models from previous works on nEMG diagnosis classification, and heuristic analysis of feature visualization results indicates that nEMGNet learned relevant nEMG signal characteristics. CONCLUSIONS This study introduced nEMGNet and the DiVote algorithm, which demonstrated fast and accurate performance in predicting neuromuscular disorders based on nEMG signals. The proposed method may be applied in medicine to support real-time electrophysiologic diagnosis.
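The gap between segment-level and subject-level accuracy reported in this abstract can be illustrated with a plain majority vote over each subject's segment predictions. This is only a sketch under stated assumptions: the abstract does not say how DiVote combines votes or uses muscle subtype information, so the code below is not the DiVote algorithm, and the function name `vote_per_subject`, the subject IDs, and the class labels "A", "B" and "C" are hypothetical.

```python
from collections import Counter, defaultdict


def vote_per_subject(segment_predictions):
    """Reduce (subject_id, predicted_label) pairs to one label per subject.

    Plain majority vote, used here as a stand-in for the aggregation step;
    it ignores muscle subtype information entirely.
    """
    votes = defaultdict(Counter)
    for subject_id, label in segment_predictions:
        votes[subject_id][label] += 1
    # most_common(1) yields the (label, count) pair with the highest count.
    return {
        subject_id: counts.most_common(1)[0][0]
        for subject_id, counts in votes.items()
    }


# Hypothetical segment-level predictions for two subjects.
segments = [
    ("subject_1", "A"),
    ("subject_1", "B"),
    ("subject_1", "A"),
    ("subject_2", "C"),
    ("subject_2", "C"),
]
subject_labels = vote_per_subject(segments)
```

In this toy input one of the three segments for `subject_1` is labeled differently from the other two, yet the subject-level label follows the majority, which is one way noisy per-segment predictions can still yield a more reliable per-subject diagnosis.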
Affiliation(s)
- Jaesung Yoo
  - School of Electrical Engineering, Korea University, Seoul, Republic of Korea
- Ilhan Yoo
  - Department of Neurology, Nowon Eulji Medical Center, Eulji University School of Medicine, Seoul, Republic of Korea
- Ina Youn
  - Department of Computer Science, New York University, NY, USA
- Sung-Min Kim
  - Department of Neurology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
- Ri Yu
  - Department of Software and Computer Engineering, Department of Artificial Intelligence, Ajou University
- Kwangsoo Kim
  - Transdisciplinary Department of Medicine and Advanced Technology, Seoul National University Hospital, Seoul, Republic of Korea
- Keewon Kim
  - Department of Rehabilitation Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Seung-Bo Lee
  - Department of Medical Informatics, Keimyung University School of Medicine, Daegu, Republic of Korea
10
Thibeau-Sutre E, Díaz M, Hassanaly R, Routier A, Dormont D, Colliot O, Burgos N. ClinicaDL: An open-source deep learning software for reproducible neuroimaging processing. Comput Methods Programs Biomed 2022; 220:106818. [PMID: 35483271 DOI: 10.1016/j.cmpb.2022.106818] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/07/2021] [Revised: 02/14/2022] [Accepted: 04/14/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND AND OBJECTIVE As deep learning faces a reproducibility crisis and studies on deep learning applied to neuroimaging are contaminated by methodological flaws, there is an urgent need to provide a safe environment for deep learning users to help them avoid common pitfalls that will bias and discredit their results. Several tools have been proposed to help deep learning users design their framework for neuroimaging data sets. Software overview: We present here ClinicaDL, one of these software tools. ClinicaDL interacts with BIDS, a standard format in the neuroimaging field, and its derivatives, so it can be used with a large variety of data sets. Moreover, it checks the absence of data leakage when inferring the results of new data with trained networks, and saves all necessary information to guarantee the reproducibility of results. The combination of ClinicaDL and its companion project Clinica allows performing an end-to-end neuroimaging analysis, from the download of raw data sets to the interpretation of trained networks, including neuroimaging preprocessing, quality check, label definition, architecture search, and network training and evaluation. CONCLUSIONS We implemented ClinicaDL to bring answers to three common issues encountered by deep learning users who are not always familiar with neuroimaging data: (1) the format and preprocessing of neuroimaging data sets, (2) the contamination of the evaluation procedure by data leakage and (3) a lack of reproducibility. We hope that its use by researchers will allow producing more reliable and thus valuable scientific studies in our field.
Affiliation(s)
- Elina Thibeau-Sutre
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, Paris, F-75013, France
- Mauricio Díaz
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, Paris, F-75013, France
- Ravi Hassanaly
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, Paris, F-75013, France
- Alexandre Routier
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, Paris, F-75013, France
- Didier Dormont
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, DMU DIAMENT, Paris, F-75013, France
- Olivier Colliot
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, Paris, F-75013, France
- Ninon Burgos
  - Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, Paris, F-75013, France
11
Rambhatla S, Huang S, Trinh L, Zhang M, Long B, Dong M, Unadkat V, Yenikomshian HA, Gillenwater J, Liu Y. DL4Burn: Burn Surgical Candidacy Prediction using Multimodal Deep Learning. AMIA ... Annual Symposium Proceedings. AMIA Symposium 2022; 2021:1039-1048. [PMID: 35308958 PMCID: PMC8861767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Burn wounds are most commonly evaluated through visual inspection to determine surgical candidacy, taking into account burn depth and individualized patient factors. This process, though cost effective, is subjective and varies by provider experience. Deep learning models can assist in burn wound surgical candidacy with predictions based on the wound and patient characteristics. To this end, we present a multimodal deep learning approach and a complementary mobile application - DL4Burn - for predicting burn surgical candidacy, to emulate the multi-factored approach used by clinicians. Specifically, we propose a ResNet50-based multimodal model and validate it using retrospectively obtained patient burn images, demographic, and injury data.
Affiliation(s)
- Sirisha Rambhatla
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Samantha Huang
  - Keck School of Medicine, University of Southern California, Los Angeles, CA, U.S.A
- Loc Trinh
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Mengfei Zhang
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Boyuan Long
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Mingtao Dong
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Vyom Unadkat
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
- Haig A Yenikomshian
  - Southern California Regional Burn Center at LAC+USC, University of Southern California, Los Angeles, CA
- Justin Gillenwater
  - Southern California Regional Burn Center at LAC+USC, University of Southern California, Los Angeles, CA
- Yan Liu
  - Computer Science Department, University of Southern California, Los Angeles, CA, U.S.A
12
Boehm KM, Khosravi P, Vanguri R, Gao J, Shah SP. Harnessing multimodal data integration to advance precision oncology. Nat Rev Cancer 2022; 22:114-126. [PMID: 34663944 PMCID: PMC8810682 DOI: 10.1038/s41568-021-00408-3] [Citation(s) in RCA: 158] [Impact Index Per Article: 79.0] [Accepted: 09/08/2021] [Indexed: 02/07/2023]
Abstract
Advances in quantitative biomarker development have accelerated new forms of data-driven insights for patients with cancer. However, most approaches are limited to a single mode of data, leaving integrated approaches across modalities relatively underdeveloped. Multimodal integration of advanced molecular diagnostics, radiological and histological imaging, and codified clinical data presents opportunities to advance precision oncology beyond genomics and standard molecular techniques. However, most medical datasets are still too sparse to be useful for the training of modern machine learning techniques, and significant challenges remain before this is remedied. Combined efforts of data engineering, computational methods for analysis of heterogeneous data and instantiation of synergistic data models in biomedical research are required for success. In this Perspective, we offer our opinions on synthesizing complementary modalities of data with emerging multimodal artificial intelligence methods. Advancing along this direction will result in a reimagined class of multimodal biomarkers to propel the field of precision oncology in the coming decade.
Affiliation(s)
- Kevin M Boehm
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Pegah Khosravi
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Rami Vanguri
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Jianjiong Gao
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Sohrab P Shah
- Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
| |
|
13
|
Pena D, Suescun J, Schiess M, Ellmore TM, Giancardo L. Toward a Multimodal Computer-Aided Diagnostic Tool for Alzheimer's Disease Conversion. Front Neurosci 2022; 15:744190. [PMID: 35046766 PMCID: PMC8761739 DOI: 10.3389/fnins.2021.744190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/19/2021] [Accepted: 11/09/2021] [Indexed: 01/21/2023] Open
Abstract
Alzheimer’s disease (AD) is a progressive neurodegenerative disorder. It is one of the leading sources of morbidity and mortality in the aging population. AD cardinal symptoms include memory and executive function impairment that profoundly alters a patient’s ability to perform activities of daily living. People with mild cognitive impairment (MCI) exhibit many of the early clinical symptoms of patients with AD and have a high chance of converting to AD in their lifetime. Diagnostic criteria rely on clinical assessment and brain magnetic resonance imaging (MRI). Many groups are working to help automate this process to improve the clinical workflow. Current computational approaches are focused on predicting whether or not a subject with MCI will convert to AD in the future. To our knowledge, limited attention has been given to the development of automated computer-assisted diagnosis (CAD) systems able to provide an AD conversion diagnosis in MCI patient cohorts followed longitudinally. This is important as these CAD systems could be used by primary care providers to monitor patients with MCI. The method outlined in this paper addresses this gap and presents a computationally efficient pre-processing and prediction pipeline, and is designed for recognizing patterns associated with AD conversion. We propose a new approach that leverages longitudinal data that can be easily acquired in a clinical setting (e.g., T1-weighted magnetic resonance images, cognitive tests, and demographic information) to identify the AD conversion point in MCI subjects with AUC = 84.7. In contrast, cognitive tests and demographics alone achieved AUC = 80.6, a statistically significant difference (n = 669, p < 0.05). We designed a convolutional neural network that is computationally efficient and requires only linear registration between imaging time points.
The model architecture combines Attention and Inception architectures while utilizing both cross-sectional and longitudinal imaging and clinical information. Additionally, the top brain regions and clinical features that drove the model’s decision were investigated. These included the thalamus, caudate, planum temporale, and the Rey Auditory Verbal Learning Test. We believe our method could be easily translated into the healthcare setting as an objective AD diagnostic tool for patients with MCI.
|
14
|
Sanchez-Martinez S, Camara O, Piella G, Cikes M, González-Ballester MÁ, Miron M, Vellido A, Gómez E, Fraser AG, Bijnens B. Machine Learning for Clinical Decision-Making: Challenges and Opportunities in Cardiovascular Imaging. Front Cardiovasc Med 2022; 8:765693. [PMID: 35059445 PMCID: PMC8764455 DOI: 10.3389/fcvm.2021.765693] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 08/27/2021] [Accepted: 12/07/2021] [Indexed: 11/30/2022] Open
Abstract
The use of machine learning (ML) approaches to target clinical problems is poised to revolutionize clinical decision-making in cardiology. The success of these tools depends on understanding the intrinsic processes used in the conventional pathway by which clinicians make decisions. In parallel with this pathway, ML can have an impact at four levels: for data acquisition, predominantly by extracting standardized, high-quality information with the smallest possible learning curve; for feature extraction, by relieving healthcare practitioners of tedious measurements on raw data; for interpretation, by digesting complex, heterogeneous data in order to augment the understanding of the patient status; and for decision support, by leveraging the previous steps to predict clinical outcomes or response to treatment, or to recommend a specific intervention. This paper discusses the state of the art, as well as the current clinical status and challenges associated with the two latter tasks of interpretation and decision support, together with the challenges related to the learning process, the auditability/traceability, the system infrastructure and the integration within clinical processes in cardiovascular imaging.
Affiliation(s)
| | - Oscar Camara
- Department of Information and Communication Technologies, University Pompeu Fabra, Barcelona, Spain
| | - Gemma Piella
- Department of Information and Communication Technologies, University Pompeu Fabra, Barcelona, Spain
| | - Maja Cikes
- Department of Cardiovascular Diseases, University of Zagreb School of Medicine, University Hospital Centre Zagreb, Zagreb, Croatia
| | | | - Marius Miron
- Joint Research Centre, European Commission, Seville, Spain
| | - Alfredo Vellido
- Computer Science Department, Intelligent Data Science and Artificial Intelligence (IDEAI-UPC) Research Center, Universitat Politècnica de Catalunya, Barcelona, Spain
| | - Emilia Gómez
- Department of Information and Communication Technologies, University Pompeu Fabra, Barcelona, Spain
- Joint Research Centre, European Commission, Seville, Spain
| | - Alan G. Fraser
- School of Medicine, Cardiff University, Cardiff, United Kingdom
| | - Bart Bijnens
- August Pi i Sunyer Biomedical Research Institute (IDIBAPS), Barcelona, Spain
- ICREA, Barcelona, Spain
- Department of Cardiovascular Sciences, KU Leuven, Leuven, Belgium
| |
|
15
|
Carrillo-Perez F, Morales JC, Castillo-Secilla D, Molina-Castro Y, Guillén A, Rojas I, Herrera LJ. Non-small-cell lung cancer classification via RNA-Seq and histology imaging probability fusion. BMC Bioinformatics 2021; 22:454. [PMID: 34551733 PMCID: PMC8456075 DOI: 10.1186/s12859-021-04376-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/23/2021] [Accepted: 09/11/2021] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Adenocarcinoma and squamous cell carcinoma are the two most prevalent lung cancer types, and their distinction requires different screenings, such as the visual inspection of histology slides by an expert pathologist, the analysis of gene expression or computed tomography scans, among others. In recent years, a growing volume of biological data has been gathered for decision support systems in diagnosis (e.g. histology imaging, next-generation sequencing data, clinical information, etc.). Using all these sources to design integrative classification approaches may improve the final diagnosis of a patient, in the same way that doctors can use multiple types of screenings to reach a final decision on the diagnosis. In this work, we present a late fusion classification model using histology and RNA-Seq data for adenocarcinoma, squamous-cell carcinoma and healthy lung tissue. RESULTS The classification model improves results over using each source of information separately, reducing the diagnosis error rate by up to 64% relative to the histology-only classifier and by up to 24% relative to the gene-expression-only classifier, reaching a mean F1-Score of 95.19% and a mean AUC of 0.991. CONCLUSIONS These findings suggest that a classification model using a late fusion methodology can considerably help clinicians distinguish between the aforementioned lung cancer subtypes, compared with using each source of information separately. This approach can also be applied to any cancer type or disease with heterogeneous sources of information.
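The late fusion idea described in this abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so a weighted average of per-modality class probabilities is assumed here; the function names `late_fusion` and `predict`, the class labels in `CLASSES`, and the default weight are hypothetical and not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical label order; the abstract names these three tissue classes.
CLASSES = ("adenocarcinoma", "squamous_cell_carcinoma", "healthy")


def late_fusion(
    prob_histology: Sequence[float],
    prob_rnaseq: Sequence[float],
    weight_histology: float = 0.5,
) -> List[float]:
    """Fuse two per-class probability vectors by weighted averaging.

    Each classifier is trained on its own modality; only their output
    probabilities are combined, which is what makes the fusion "late".
    The weighted-average rule is an assumption for illustration.
    """
    if len(prob_histology) != len(prob_rnaseq):
        raise ValueError("probability vectors must have the same length")
    if not 0.0 <= weight_histology <= 1.0:
        raise ValueError("weight_histology must lie in [0, 1]")
    w = weight_histology
    fused = [w * ph + (1.0 - w) * pr for ph, pr in zip(prob_histology, prob_rnaseq)]
    total = sum(fused)
    # Renormalize so the fused scores sum to 1 even if the inputs do not exactly.
    return [p / total for p in fused]


def predict(fused: Sequence[float]) -> str:
    """Return the label with the highest fused probability."""
    best = max(range(len(fused)), key=lambda i: fused[i])
    return CLASSES[best]
```

With equal weights, a histology classifier that slightly favors adenocarcinoma and an RNA-Seq classifier that strongly favors squamous cell carcinoma would yield a fused prediction of squamous cell carcinoma, because the more confident modality dominates the average.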
Affiliation(s)
- Francisco Carrillo-Perez
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain.
| | - Juan Carlos Morales
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| | - Daniel Castillo-Secilla
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| | - Yésica Molina-Castro
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| | - Alberto Guillén
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| | - Ignacio Rojas
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| | - Luis Javier Herrera
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| |
|
16
|
Raveh B, Sun L, White KL, Sanyal T, Tempkin J, Zheng D, Bharath K, Singla J, Wang C, Zhao J, Li A, Graham NA, Kesselman C, Stevens RC, Sali A. Bayesian metamodeling of complex biological systems across varying representations. Proc Natl Acad Sci U S A 2021; 118:e2104559118. [PMID: 34453000 PMCID: PMC8536362 DOI: 10.1073/pnas.2104559118] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/26/2022] Open
Abstract
Comprehensive modeling of a whole cell requires an integration of vast amounts of information on various aspects of the cell and its parts. To divide and conquer this task, we introduce Bayesian metamodeling, a general approach to modeling complex systems by integrating a collection of heterogeneous input models. Each input model can in principle be based on any type of data and can describe a different aspect of the modeled system using any mathematical representation, scale, and level of granularity. These input models are 1) converted to a standardized statistical representation relying on probabilistic graphical models, 2) coupled by modeling their mutual relations with the physical world, and 3) finally harmonized with respect to each other. To illustrate Bayesian metamodeling, we provide a proof-of-principle metamodel of glucose-stimulated insulin secretion by human pancreatic β-cells. The input models include a coarse-grained spatiotemporal simulation of insulin vesicle trafficking, docking, and exocytosis; a molecular network model of glucose-stimulated insulin secretion signaling; a network model of insulin metabolism; a structural model of glucagon-like peptide-1 receptor activation; a linear model of a pancreatic cell population; and ordinary differential equations for systemic postprandial insulin response. Metamodeling benefits from decentralized computing, while often producing a more accurate, precise, and complete model that contextualizes input models as well as resolves conflicting information. We anticipate Bayesian metamodeling will facilitate collaborative science by providing a framework for sharing expertise, resources, data, and models, as exemplified by the Pancreatic β-Cell Consortium.
Affiliation(s)
- Barak Raveh
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158
- Quantitative Biosciences Institute, University of California, San Francisco, CA 94158
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 9190416, Israel
| | - Liping Sun
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China
| | - Kate L White
- Department of Biological Sciences, Bridge Institute, University of Southern California, Los Angeles, CA 90089
| | - Tanmoy Sanyal
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158
- Quantitative Biosciences Institute, University of California, San Francisco, CA 94158
| | - Jeremy Tempkin
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158
- Quantitative Biosciences Institute, University of California, San Francisco, CA 94158
| | - Dongqing Zheng
- Mork Family Department of Chemical Engineering and Materials Science, Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089
| | - Kala Bharath
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158
- Quantitative Biosciences Institute, University of California, San Francisco, CA 94158
| | - Jitin Singla
- Department of Biological Sciences, Bridge Institute, University of Southern California, Los Angeles, CA 90089
- Epstein Department of Industrial and Systems Engineering, The Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089
- Information Science Institute, The Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089
| | - Chenxi Wang
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Jihui Zhao
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China
| | - Angdi Li
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Nicholas A Graham
- Mork Family Department of Chemical Engineering and Materials Science, Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089
| | - Carl Kesselman
- Epstein Department of Industrial and Systems Engineering, The Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089
- Information Science Institute, The Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089
| | - Raymond C Stevens
- iHuman Institute, ShanghaiTech University, Shanghai 201210, China
- Department of Biological Sciences, Bridge Institute, University of Southern California, Los Angeles, CA 90089
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158;
- Quantitative Biosciences Institute, University of California, San Francisco, CA 94158
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94158
| |
|
17
|
Huang M, Lai H, Yu Y, Chen X, Wang T, Feng Q. Deep-gated recurrent unit and diet network-based genome-wide association analysis for detecting the biomarkers of Alzheimer's disease. Med Image Anal 2021; 73:102189. [PMID: 34343841 DOI: 10.1016/j.media.2021.102189] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/28/2021] [Revised: 05/30/2021] [Accepted: 07/16/2021] [Indexed: 01/01/2023]
Abstract
Genome-wide association analysis (GWAS) is a commonly used method to detect the potential biomarkers of Alzheimer's disease (AD). Most existing GWAS methods entail a high computational cost, disregard correlations among imaging data and correlations among genetic data, and ignore various associations between longitudinal imaging and genetic data. A novel GWAS method was proposed to identify potential AD biomarkers and address these problems. A network based on a gated recurrent unit was applied without imputing incomplete longitudinal imaging data to integrate the longitudinal data of variable lengths and extract an image representation. In this study, a modified diet network that can considerably reduce the number of parameters in the genetic network was proposed to perform GWAS between image representation and genetic data. Genetic representation can be extracted in this way. A link between genetic representation and AD was established to detect potential AD biomarkers. The proposed method was tested on a set of simulated data and a real AD dataset. Results of the simulated data showed that the proposed method can accurately detect relevant biomarkers. Moreover, the results of real AD dataset showed that the proposed method can detect some new risk-related genes of AD. Based on previous reports, no research has incorporated a deep-learning model into a GWAS framework to investigate the potential information on super-high-dimensional genetic data and longitudinal imaging data and create a link between imaging genetics and AD for detecting potential AD biomarkers. Therefore, the proposed method may provide new insights into the underlying pathological mechanism of AD.
Affiliation(s)
- Meiyan Huang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
| | - Haoran Lai
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Yuwei Yu
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Xiumei Chen
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Tao Wang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Qianjin Feng
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
| | | |
|
18
|
Kumar S, Oh I, Schindler S, Lai AM, Payne PRO, Gupta A. Machine learning for modeling the progression of Alzheimer disease dementia using clinical data: a systematic literature review. JAMIA Open 2021; 4:ooab052. [PMID: 34350389 PMCID: PMC8327375 DOI: 10.1093/jamiaopen/ooab052] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 04/14/2021] [Revised: 06/21/2021] [Accepted: 06/30/2021] [Indexed: 11/17/2022] Open
Abstract
OBJECTIVE Alzheimer disease (AD) is the most common cause of dementia, a syndrome characterized by cognitive impairment severe enough to interfere with activities of daily life. We aimed to conduct a systematic literature review (SLR) of studies that applied machine learning (ML) methods to clinical data derived from electronic health records in order to model risk for progression of AD dementia. MATERIALS AND METHODS We searched for articles published between January 1, 2010, and May 31, 2020, in PubMed, Scopus, ScienceDirect, IEEE Xplore Digital Library, Association for Computing Machinery Digital Library, and arXiv. We used predefined criteria to select relevant articles and summarized them according to key components of ML analysis such as data characteristics, computational algorithms, and research focus. RESULTS There has been a considerable rise over the past 5 years in the number of research papers using ML-based analysis for AD dementia modeling. We reviewed 64 relevant articles in our SLR. The results suggest that the majority of existing research has focused on predicting progression of AD dementia using publicly available datasets containing both neuroimaging and clinical data (neurobehavioral status exam scores, patient demographics, neuroimaging data, and laboratory test values). DISCUSSION Identifying individuals at risk for progression of AD dementia could potentially help to personalize disease management to plan future care. Clinical data consisting of both structured data tables and clinical notes can be effectively used in ML-based approaches to model risk for AD dementia progression. Data sharing and reproducibility of results can enhance the impact, adaptation, and generalizability of this research.
Affiliation(s)
- Sayantan Kumar
- Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
| | - Inez Oh
- Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
| | - Suzanne Schindler
- Department of Neurology, Washington University School of Medicine, St. Louis, Missouri, USA
| | - Albert M Lai
- Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
| | - Philip R O Payne
- Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
| | - Aditi Gupta
- Institute for Informatics, Washington University School of Medicine, St. Louis, Missouri, USA
| |
|
19
|
Yang S, Zhu F, Ling X, Liu Q, Zhao P. Intelligent Health Care: Applications of Deep Learning in Computational Medicine. Front Genet 2021; 12:607471. [PMID: 33912213 PMCID: PMC8075004 DOI: 10.3389/fgene.2021.607471] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 09/17/2020] [Accepted: 03/05/2021] [Indexed: 12/24/2022] Open
Abstract
With the progress of medical technology, the biomedical field has entered the era of big data; on this basis, and driven by artificial intelligence technology, computational medicine has emerged. The effective information contained in these large biomedical datasets needs to be extracted to promote the development of precision medicine. Traditionally, machine learning methods have been used to mine biomedical data for features; these methods generally rely on feature engineering and expert domain knowledge, requiring tremendous time and human resources. Unlike traditional approaches, deep learning, as a cutting-edge branch of machine learning, can automatically learn complex and robust features from raw data without the need for feature engineering. The applications of deep learning in medical imaging, electronic health records, genomics, and drug development are studied, and the findings suggest that deep learning has a clear advantage in making full use of biomedical data and improving the level of medical care. Deep learning plays an increasingly important role in the field of medical health and has broad prospects for application. However, problems and challenges of deep learning in computational medicine remain, including insufficient data, interpretability, data privacy, and heterogeneity. Analysis and discussion of these problems provide a reference for improving the application of deep learning in medical health.
Affiliation(s)
- Sijie Yang
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Fei Zhu
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Xinghong Ling
- School of Computer Science and Technology, Soochow University, Suzhou, China
- WenZheng College of Soochow University, Suzhou, China
| | - Quan Liu
- School of Computer Science and Technology, Soochow University, Suzhou, China
| | - Peiyao Zhao
- School of Computer Science and Technology, Soochow University, Suzhou, China
| |
|
20
|
Jin C, Yu H, Ke J, Ding P, Yi Y, Jiang X, Duan X, Tang J, Chang DT, Wu X, Gao F, Li R. Predicting treatment response from longitudinal images using multi-task deep learning. Nat Commun 2021; 12:1851. [PMID: 33767170 PMCID: PMC7994301 DOI: 10.1038/s41467-021-22188-y] [Citation(s) in RCA: 89] [Impact Index Per Article: 29.7] [Received: 09/28/2020] [Accepted: 03/02/2021] [Indexed: 12/24/2022] Open
Abstract
Radiographic imaging is routinely used to evaluate treatment response in solid tumors. Current imaging response metrics do not reliably predict the underlying biological response. Here, we present a multi-task deep learning approach that allows simultaneous tumor segmentation and response prediction. We design two Siamese subnetworks that are joined at multiple layers, which enables integration of multi-scale feature representations and in-depth comparison of pre-treatment and post-treatment images. The network is trained using 2568 magnetic resonance imaging scans of 321 rectal cancer patients for predicting pathologic complete response after neoadjuvant chemoradiotherapy. In multi-institution validation, the imaging-based model achieves AUC of 0.95 (95% confidence interval: 0.91–0.98) and 0.92 (0.87–0.96) in two independent cohorts of 160 and 141 patients, respectively. When combined with blood-based tumor markers, the integrated model further improves prediction accuracy with AUC 0.97 (0.93–0.99). Our approach to capturing dynamic information in longitudinal images may be broadly used for screening, treatment response evaluation, disease monitoring, and surveillance.
Affiliation(s)
- Cheng Jin
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Heng Yu
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Jia Ke
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Peirong Ding
- Department of Colorectal Surgery, Sun Yat-sen University Cancer Center, Guangzhou, China; Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
| | - Yongju Yi
- Center for Network Information, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
| | - Xiaofeng Jiang
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Xin Duan
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Jinghua Tang
- Department of Colorectal Surgery, Sun Yat-sen University Cancer Center, Guangzhou, China; Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
| | - Daniel T Chang
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA
| | - Xiaojian Wu
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Feng Gao
- Department of Colorectal Surgery, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China; Guangdong Institute of Gastroenterology, Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, Guangzhou, China
| | - Ruijiang Li
- Department of Radiation Oncology, Stanford University School of Medicine, Stanford, CA, USA.
| |
|
21
|
Kim SY, Choe EK, Shivakumar M, Kim D, Sohn KA. Multi-layered network-based pathway activity inference using directed random walks: application to predicting clinical outcomes in urologic cancer. Bioinformatics 2021; 37:2405-2413. [PMID: 33543748 PMCID: PMC8388033 DOI: 10.1093/bioinformatics/btab086] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/22/2020] [Revised: 12/11/2020] [Accepted: 02/02/2021] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION To better understand the molecular features of cancers, a comprehensive analysis using multiomics data has been conducted. Additionally, a pathway activity inference method has been developed to facilitate the integrative effects of multiple genes. In this respect, we have recently proposed a novel integrative pathway activity inference approach, iDRW, and demonstrated the effectiveness of the method with respect to dichotomizing two survival groups. However, there were several limitations, such as a lack of generality. In this study, we designed a directed gene-gene graph using pathway information by assigning interactions between genes in multiple layers of networks. RESULTS As a proof-of-concept study, it was evaluated using three genomic profiles of urologic cancer patients. The proposed integrative approach achieved improved outcome prediction performance compared with a single genomic profile alone and other existing pathway activity inference methods. The integrative approach also identified common/cancer-specific candidate driver pathways as predictive prognostic features in urologic cancers. Furthermore, it provides better biological insights into the prioritized pathways and genes in an integrated view using a multi-layered gene-gene network. Our framework is not specifically designed for urologic cancers and is generally applicable to various datasets. AVAILABILITY iDRW is implemented as an R software package. The source code is available at https://github.com/sykim122/iDRW. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
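The directed random walk on a gene-gene graph mentioned in this abstract can be sketched as follows. The abstract does not give the update rule, so the standard random-walk-with-restart form is assumed, and the sketch is in Python rather than the R of the iDRW package. The function name `directed_random_walk`, the `restart` default, and the toy gene names are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def directed_random_walk(
    edges: Iterable[Tuple[str, str]],
    initial: Dict[str, float],
    restart: float = 0.7,
    tol: float = 1e-10,
    max_iter: int = 1000,
) -> Dict[str, float]:
    """Iterate w <- (1 - restart) * M^T w + restart * w0 until convergence.

    M is the adjacency matrix of the directed graph with each row divided
    by the out-degree of its source node. Nodes without outgoing edges
    simply stop propagating their weight (an assumption of this sketch).
    """
    edge_list = list(edges)
    nodes = list(initial)
    out_degree = {n: 0 for n in nodes}
    for src, _dst in edge_list:
        out_degree[src] += 1

    weights = dict(initial)
    for _ in range(max_iter):
        # Restart term: every node keeps a share of its initial weight.
        nxt = {n: restart * initial[n] for n in nodes}
        # Propagation term: weight flows along each directed edge.
        for src, dst in edge_list:
            nxt[dst] += (1.0 - restart) * weights[src] / out_degree[src]
        delta = sum(abs(nxt[n] - weights[n]) for n in nodes)
        weights = nxt
        if delta < tol:
            break
    return weights


# Toy graph with hypothetical gene names: geneA regulates geneB and geneC,
# and geneB regulates geneC.
toy_edges = [("geneA", "geneB"), ("geneB", "geneC"), ("geneA", "geneC")]
toy_initial = {"geneA": 1 / 3, "geneB": 1 / 3, "geneC": 1 / 3}
toy_weights = directed_random_walk(toy_edges, toy_initial)
```

In this toy graph the downstream gene accumulates the largest weight because it receives flow from both upstream genes, which is the intuition behind using walk weights to prioritize genes within a pathway.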
Affiliation(s)
- So Yeon Kim
- Department of Software and Computer Engineering, Ajou University, Suwon 16499, South Korea
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Eun Kyung Choe
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Department of Surgery, Seoul National University Hospital Healthcare System Gangnam Center, Seoul 06236, South Korea
| | - Manu Shivakumar
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Dokyoon Kim
- Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA
- To whom correspondence should be addressed.
| | - Kyung-Ah Sohn
- Department of Software and Computer Engineering, Ajou University, Suwon 16499, South Korea
- Department of Artificial Intelligence, Ajou University, Suwon 16499, South Korea
- To whom correspondence should be addressed.
| |
|