1
|
Huang W, Tan K, Zhang Z, Hu J, Dong S. A Review of Fusion Methods for Omics and Imaging Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:74-93. [PMID: 35044920 DOI: 10.1109/tcbb.2022.3143900] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
The development of omics data and biomedical images has greatly advanced the progress of precision medicine in diagnosis, treatment, and prognosis. The fusion of omics and imaging data, i.e., omics-imaging fusion, offers a new strategy for understanding complex diseases. However, due to a variety of issues such as the limited number of samples, high dimensionality of features, and heterogeneity of different data types, efficiently learning complementary or associated discriminative fusion information from omics and imaging data remains a challenge. Recently, numerous machine learning methods have been proposed to alleviate these problems. In this review, from the perspective of fusion levels and fusion methods, we first provide an overview of preprocessing and feature extraction methods for omics and imaging data, and comprehensively analyze and summarize the basic forms and variations of commonly used and newly emerging fusion methods, along with their advantages, disadvantages and the applicable scope. We then describe public datasets and compare experimental results of various fusion methods on the ADNI and TCGA datasets. Finally, we discuss future prospects and highlight remaining challenges in the field.
|
2
|
Wang X, Yu G, Yan Z, Wan L, Wang W, Cui L. Lung Cancer Subtype Diagnosis by Fusing Image-Genomics Data and Hybrid Deep Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:512-523. [PMID: 34855599 DOI: 10.1109/tcbb.2021.3132292] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
Accurate diagnosis of cancer subtypes is crucial for precise treatment, because different cancer subtypes involve different pathology and require different therapies. Although deep learning techniques have achieved great success in computer vision and other fields, they do not work well for lung cancer subtype diagnosis, because the distinction between slide images of different cancer subtypes is ambiguous. Furthermore, they often over-fit to high-dimensional genomics data with limited samples, and do not fuse the image and genomics data in a sensible way. In this paper, we propose LungDIG, a hybrid deep network based approach for Lung cancer subtype Diagnosis by fusing Image-Genomics data. LungDIG first tiles the tissue slide image into small patches and extracts patch-level features by fine-tuning an Inception-V3 model. Since the patches may contain some false positives in non-diagnostic regions, it further designs a patch-level feature combination strategy to integrate the extracted patch features and maintain the diversity between different cancer subtypes. At the same time, it extracts genomics features from Copy Number Variation data with an attention based nonlinear extractor. Next, it fuses the image and genomics features with an attention based multilayer perceptron (MLP) to diagnose the cancer subtype. Experiments on TCGA lung cancer data show that LungDIG not only achieves higher accuracy for cancer subtype diagnosis than state-of-the-art methods, but also has high authenticity and good interpretability.
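The attention-based fusion step described in the abstract can be pictured with a small sketch. This is not LungDIG's architecture: the function names, the fixed `query` vector (which a real model would learn), and the toy feature values are all assumptions made for illustration. It only shows the general idea of scoring each modality, turning the scores into softmax weights, and taking a weighted sum of the modality features.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(features, query):
    """Fuse same-length modality feature vectors with attention weights.

    features: dict mapping a modality name to its feature vector
    query:    hypothetical scoring vector (learned in a real model, fixed here)
    """
    names = list(features)
    # One scalar relevance score per modality: dot product with the query.
    scores = [sum(f * q for f, q in zip(features[n], query)) for n in names]
    weights = softmax(scores)
    # Weighted sum of the modality vectors, dimension by dimension.
    fused = [
        sum(w * features[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return dict(zip(names, weights)), fused

# Toy values only; real inputs would be patch-level image features and
# copy-number-variation features projected to a common dimension.
weights, fused = attention_fuse(
    {"image": [0.2, 0.9, 0.1], "genomics": [0.7, 0.1, 0.4]},
    query=[1.0, 0.5, -0.5],
)
```

In a trained network the fused vector would feed a classifier, and the attention weights give a rough indication of which modality drove a given prediction.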
|
3
|
Patel S, Sharma D, Uniyal A, Gadepalli A, Tiwari V. Recent advancements in biomarker research in schizophrenia: mapping the road from bench to bedside. Metab Brain Dis 2022; 37:2197-2211. [PMID: 35239143 DOI: 10.1007/s11011-022-00926-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/29/2021] [Accepted: 02/04/2022] [Indexed: 10/19/2022]
Abstract
Schizophrenia (SZ) is a severe progressive neurodegenerative as well as disruptive behavior disorder affecting innumerable people throughout the world. The discovery of potential biomarkers in the clinical scenario would lead to the development of effective methods of diagnosis and would provide an understanding of the prognosis of the disease. Moreover, breakthrough inventions for the treatment and prevention of this mysterious disease could evolve as a result of a thorough understanding of the clinical biomarkers. In this review, we discuss specific biomarkers of SZ, with emphasis on delineating (1) diagnostic biomarkers such as neuroimmune biomarkers, metabolic biomarkers, oligodendrocyte biomarkers and biomarkers of negative and cognitive symptoms, (2) therapeutic biomarkers such as various neurotransmitter systems and (3) prognostic biomarkers. All the biomarkers were evaluated in drug-naïve (for at least 4 weeks) patients in order to achieve a clear comparison between schizophrenic patients and healthy controls. An attempt has also been made to elucidate the potential genes that serve as predictors and tools for the determination of biomarkers and would ultimately help in the prevention and treatment of this deadly illness.
Affiliation(s)
- Shivangi Patel
  - Department of Pharmacology, Bombay College of Pharmacy, 400098, Mumbai, India
- Dilip Sharma
  - Rutgers New Jersey Medical School, 07103, Newark, NJ, United States
- Ankit Uniyal
  - Department of Pharmaceutical Engineering, Indian Institute of Technology (Banaras Hindu University), 221005, Varanasi, U.P, India
- Anagha Gadepalli
  - Department of Pharmaceutical Engineering, Indian Institute of Technology (Banaras Hindu University), 221005, Varanasi, U.P, India
- Vinod Tiwari
  - Department of Pharmaceutical Engineering, Indian Institute of Technology (Banaras Hindu University), 221005, Varanasi, U.P, India
|
4
|
Parcerisas A, Ortega-Gascó A, Pujadas L, Soriano E. The Hidden Side of NCAM Family: NCAM2, a Key Cytoskeleton Organization Molecule Regulating Multiple Neural Functions. Int J Mol Sci 2021; 22:10021. [PMID: 34576185 PMCID: PMC8471948 DOI: 10.3390/ijms221810021] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/06/2021] [Revised: 09/12/2021] [Accepted: 09/14/2021] [Indexed: 02/07/2023] [Open Access]
Abstract
Although it has been over 20 years since Neural Cell Adhesion Molecule 2 (NCAM2) was identified as the second member of the NCAM family with high expression in the nervous system, knowledge of NCAM2 is still eclipsed by that of NCAM1. The first studies of NCAM2 focused on the olfactory bulb, where this protein has a key role in axonal projection and axonal/dendritic compartmentalization. In contrast to NCAM1, NCAM2's functions and partners in the brain during development and adulthood remained largely unknown until recently. Recent studies have revealed the importance of NCAM2 in nervous system development. NCAM2 governs neuronal morphogenesis and axodendritic architecture, and controls important neuron-specific processes such as neuronal differentiation, synaptogenesis and memory formation. In the adult brain, NCAM2 is highly expressed in dendritic spines, and it regulates synaptic plasticity and learning processes. NCAM2's functions are related to its ability to adapt to the external inputs of the cell and to modify the cytoskeleton accordingly. Different studies show that NCAM2 interacts with proteins involved in cytoskeleton stability and proteins that regulate calcium influx, which could also modify the cytoskeleton. In this review, we examine the evidence that points to NCAM2 as a crucial cytoskeleton regulation protein during brain development and adulthood. This key function of NCAM2 may offer promising new therapeutic approaches for the treatment of neurodevelopmental diseases and neurodegenerative disorders.
Affiliation(s)
- Antoni Parcerisas
  - Department of Cell Biology, Physiology and Immunology, Institute of Neurosciences, University of Barcelona, 08028 Barcelona, Spain
  - Centro de Investigación Biomédica en Red Sobre Enfermedades Neurodegenerativas (CIBERNED), 28031 Madrid, Spain
  - Department of Basic Sciences, Universitat Internacional de Catalunya, 08195 Sant Cugat del Vallès, Spain
- Alba Ortega-Gascó
  - Department of Cell Biology, Physiology and Immunology, Institute of Neurosciences, University of Barcelona, 08028 Barcelona, Spain
  - Centro de Investigación Biomédica en Red Sobre Enfermedades Neurodegenerativas (CIBERNED), 28031 Madrid, Spain
- Lluís Pujadas
  - Department of Cell Biology, Physiology and Immunology, Institute of Neurosciences, University of Barcelona, 08028 Barcelona, Spain
  - Centro de Investigación Biomédica en Red Sobre Enfermedades Neurodegenerativas (CIBERNED), 28031 Madrid, Spain
- Eduardo Soriano
  - Department of Cell Biology, Physiology and Immunology, Institute of Neurosciences, University of Barcelona, 08028 Barcelona, Spain
  - Centro de Investigación Biomédica en Red Sobre Enfermedades Neurodegenerativas (CIBERNED), 28031 Madrid, Spain
|
5
|
Acharya S, Cui L, Pan Y. Multi-view feature selection for identifying gene markers: a diversified biological data driven approach. BMC Bioinformatics 2020; 21:483. [PMID: 33375940 PMCID: PMC7772934 DOI: 10.1186/s12859-020-03810-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/01/2020] [Accepted: 10/13/2020] [Indexed: 12/02/2022] [Open Access]
Abstract
BACKGROUND: In recent years, the use of multiple genomic and proteomic sources to investigate challenging bioinformatics problems has become immensely popular among researchers. One such problem is feature or gene selection: identifying relevant and non-redundant marker genes from high-dimensional gene expression data sets. In that context, designing an efficient feature selection algorithm that exploits knowledge from multiple biological resources may be an effective way to understand the spectrum of cancer or other diseases, with applications in specific epidemiology for a particular population. RESULTS: In the current article, we formulate feature selection and marker gene detection as a multi-view multi-objective clustering problem. We propose an Unsupervised Multi-View Multi-Objective clustering-based gene selection approach called UMVMO-select. Three important resources of biological data (gene ontology, protein interaction data, protein sequence), along with gene expression values, are collectively used to design two different views. UMVMO-select aims to reduce the gene space with minimal or no compromise of sample classification efficiency, and determines relevant and non-redundant gene markers from three cancer gene expression benchmark data sets. CONCLUSION: A thorough comparative analysis has been performed against five clustering and nine existing feature selection methods with respect to several internal and external validity metrics. The results show the superiority of the proposed method. Reported results are also validated through a biological significance test and heatmap plotting.
Affiliation(s)
- Sudipta Acharya
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, People’s Republic of China
- Laizhong Cui
  - College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, People’s Republic of China
- Yi Pan
  - Department of Computer Science, Georgia State University, Atlanta, USA
|
6
|
Yang J, Ji X, Quan W, Liu Y, Wei B, Wu T. Classification of Schizophrenia by Functional Connectivity Strength Using Functional Near Infrared Spectroscopy. Front Neuroinform 2020; 14:40. [PMID: 33117140 PMCID: PMC7575761 DOI: 10.3389/fninf.2020.00040] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/02/2020] [Accepted: 07/22/2020] [Indexed: 01/21/2023] [Open Access]
Abstract
Functional near-infrared spectroscopy (fNIRS) has been widely employed in the objective diagnosis of patients with schizophrenia during a verbal fluency task (VFT). Most available methods depend on time-domain features extracted from the data of single or multiple channels. The present study proposed an alternative method based on the functional connectivity strength (FCS) derived from an individual channel. Data were measured from 100 patients with schizophrenia and 100 healthy controls and were used to train the classifiers and to evaluate their performance. Different classifiers were evaluated, and the support vector machine achieved the best performance. To reduce the dimensionality of the feature space, principal component analysis (PCA) was applied. The classification results obtained using an individual channel, a combination of several channels, and an ensemble of 52 channels, with and without dimensionality reduction, were compared. The method provides a new approach to identifying schizophrenia, improving the objective diagnosis of this mental disorder. FCS from three channels on the medial prefrontal and left ventrolateral prefrontal cortices rendered accuracy as high as 84.67%, sensitivity of 92.00%, and specificity of 70%. The neurophysiological significance of the changes in these regions was consistent with the major syndromes of schizophrenia.
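A per-channel connectivity feature of the kind the abstract describes can be sketched as follows. The abstract does not give the exact FCS formula, so this sketch assumes one common definition: the mean absolute Pearson correlation between a channel's time series and every other channel's. The function names and the three toy signals are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def connectivity_strength(signals):
    """Return one value per channel: mean |correlation| with all other channels."""
    n = len(signals)
    fcs = []
    for i in range(n):
        others = [abs(pearson(signals[i], signals[j])) for j in range(n) if j != i]
        fcs.append(sum(others) / len(others))
    return fcs

# Three toy channels: the first two move together, the third does not.
signals = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.2, 9.9],
    [5.0, 1.0, 4.0, 2.0, 3.0],
]
fcs = connectivity_strength(signals)
```

Under this assumed definition, the resulting per-channel values would form the feature vector that is optionally reduced with PCA and then passed to a classifier such as a support vector machine.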
Affiliation(s)
- Jiayi Yang
  - China Academy of Information and Communications Technology, Beijing, China
  - Institute of Electrical Engineering, Chinese Academy of Sciences, Beijing, China
- Xiaoyu Ji
  - China Academy of Information and Communications Technology, Beijing, China
- Wenxiang Quan
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing, China
- Yunshan Liu
  - China Academy of Information and Communications Technology, Beijing, China
  - School of Computer Science and Technology, Donghua University, Shanghai, China
- Bowen Wei
  - China Academy of Information and Communications Technology, Beijing, China
  - School of Computer Science and Technology, Xidian University, Xian, China
- Tongning Wu
  - China Academy of Information and Communications Technology, Beijing, China
|
7
|
Kim M, Won JH, Youn J, Park H. Joint-Connectivity-Based Sparse Canonical Correlation Analysis of Imaging Genetics for Detecting Biomarkers of Parkinson's Disease. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:23-34. [PMID: 31144631 DOI: 10.1109/tmi.2019.2918839] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 06/09/2023]
Abstract
Imaging genetics is a method used to detect associations between imaging and genetic variables. Some researchers have used sparse canonical correlation analysis (SCCA) for imaging genetics. This study was conducted to improve the efficiency and interpretability of SCCA. We propose a connectivity-based penalty for incorporating biological prior information. Our proposed approach, named joint connectivity-based SCCA (JCB-SCCA), includes the proposed penalty and can handle multi-modal neuroimaging datasets. Different neuroimaging techniques provide distinct information on the brain and have been used to investigate various neurological disorders, including Parkinson's disease (PD). We applied our algorithm to simulated and real imaging genetics datasets for performance evaluation. Our algorithm was able to select important features in a more robust manner compared with other multivariate methods. The algorithm revealed promising features of single-nucleotide polymorphisms and brain regions related to PD by using a real imaging genetic dataset. The proposed imaging genetics model can be used to improve clinical diagnosis in the form of novel potential biomarkers. We hope to apply our algorithm to cohorts such as Alzheimer's patients or healthy subjects to determine the generalizability of our algorithm.
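For readers unfamiliar with SCCA, the basic (unpenalized-by-connectivity) form can be sketched with alternating soft-thresholded updates on the cross-covariance matrix, in the style of penalized matrix decomposition. This is not JCB-SCCA: the connectivity-based penalty is omitted, the threshold is set as a fraction of the largest entry (an assumption made here for simplicity), and the function names and toy data are hypothetical.

```python
import math

def center(cols):
    """Subtract the mean from each feature column."""
    return [[v - sum(c) / len(c) for v in c] for c in cols]

def cross_cov(xcols, ycols):
    """Unnormalized cross-covariance between two sets of centered columns."""
    return [[sum(a * b for a, b in zip(xc, yc)) for yc in ycols] for xc in xcols]

def soft_threshold(vec, frac):
    """Shrink entries toward zero; threshold is frac * largest |entry|."""
    t = frac * max(abs(v) for v in vec)
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in vec]

def normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec

def sparse_cca(xcols, ycols, frac=0.3, iters=50):
    """Return sparse weight vectors (u, v) for the two feature sets."""
    c = cross_cov(center(xcols), center(ycols))
    p, q = len(c), len(c[0])
    v = normalize([1.0] * q)
    u = [0.0] * p
    for _ in range(iters):
        cv = [sum(c[i][j] * v[j] for j in range(q)) for i in range(p)]
        u = normalize(soft_threshold(cv, frac))
        ctu = [sum(c[i][j] * u[i] for i in range(p)) for j in range(q)]
        v = normalize(soft_threshold(ctu, frac))
    return u, v

# Toy data, one list per feature (6 subjects each). The first imaging
# feature tracks the first genetic feature; the second features are noise.
imaging = [[1, 2, 3, 4, 5, 6], [1, -1, 1, -1, 1, -1]]
genetic = [[1.1, 1.9, 3.2, 3.8, 5.1, 6.0], [1, 1, -1, -1, 1, -1]]
u, v = sparse_cca(imaging, genetic)
```

Features whose weight is shrunk to zero drop out of the association, which is where the interpretability of SCCA comes from; a structured penalty such as the one proposed in the paper would further bias which features survive.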
|
8
|
Deng J, Zeng W, Kong W, Shi Y, Mou X, Guo J. Multi-Constrained Joint Non-Negative Matrix Factorization With Application to Imaging Genomic Study of Lung Metastasis in Soft Tissue Sarcomas. IEEE Trans Biomed Eng 2019; 67:2110-2118. [PMID: 31751222 DOI: 10.1109/tbme.2019.2954989] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
OBJECTIVE: Studying pathogenic mechanisms at the genetic level with imaging genetics methods can effectively reveal the association between histopathology and genetics. However, there is a lack of effective and accurate tools for establishing association models from the macroscopic to the microscopic scale. METHODS: The multi-constrained joint non-negative matrix factorization (MCJNMF) was developed for simultaneous integration of genomic data and image data to identify common modules related to disease. Two types of data matrices were projected onto a common feature space, in which heterogeneous variables with large coefficients in the same projected direction form a common module. Meanwhile, the correlation between original data features was integrated by using regularization constraints to improve the biological relevance. Sparsity constraints and orthogonal constraints were imposed on the decomposition factors to minimize the redundancy between different bases and to reduce algorithm complexity. RESULTS: This algorithm was successfully applied to module identification for lung metastasis in soft tissue sarcomas (STSs) by integrating FDG-PET image and DNA methylation data features. Multilevel analysis of the top extracted modules revealed that these modules were closely related to lung metastasis. In particular, several genes with diagnostic potential for lung metastasis can be discovered from high-score modules. CONCLUSION: This method not only can be applied to the accurate identification of patterns related to the pathogenic mechanisms of diseases, but also has significant implications for discovering protein biomarkers. SIGNIFICANCE: This method provides avenues for further studies that identify complex association patterns of diseases from different types of biological data.
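The projection of two data matrices onto a common feature space can be illustrated with plain joint NMF, in which both matrices share one sample factor W and each has its own coefficient matrix. This sketch deliberately leaves out the regularization, sparsity and orthogonality constraints that define MCJNMF; it uses standard multiplicative updates for the squared-error objective, and all names and the toy matrices are assumptions.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(m, num, den, eps=1e-9):
    """Element-wise multiplicative update: m * num / den."""
    return [[m[i][j] * num[i][j] / (den[i][j] + eps)
             for j in range(len(m[0]))] for i in range(len(m))]

def sq_error(x, w, h):
    wh = matmul(w, h)
    return sum((x[i][j] - wh[i][j]) ** 2
               for i in range(len(x)) for j in range(len(x[0])))

def rand_matrix(rng, rows, cols):
    return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

def joint_nmf(x1, x2, k, iters=200, seed=0):
    """Factor x1 ~ W*H1 and x2 ~ W*H2 with a shared non-negative W."""
    rng = random.Random(seed)
    w = rand_matrix(rng, len(x1), k)
    h1 = rand_matrix(rng, k, len(x1[0]))
    h2 = rand_matrix(rng, k, len(x2[0]))
    start = sq_error(x1, w, h1) + sq_error(x2, w, h2)
    for _ in range(iters):
        # Shared factor: pool the evidence from both data types.
        num = add(matmul(x1, transpose(h1)), matmul(x2, transpose(h2)))
        den = matmul(w, add(matmul(h1, transpose(h1)), matmul(h2, transpose(h2))))
        w = scale(w, num, den)
        wt = transpose(w)
        wtw = matmul(wt, w)
        h1 = scale(h1, matmul(wt, x1), matmul(wtw, h1))
        h2 = scale(h2, matmul(wt, x2), matmul(wtw, h2))
    end = sq_error(x1, w, h1) + sq_error(x2, w, h2)
    return w, h1, h2, start, end

# Toy matrices: 4 samples x 3 image features, 4 samples x 2 methylation features.
x_image = [[5, 4, 0], [4, 5, 1], [0, 1, 5], [1, 0, 4]]
x_methyl = [[3, 0], [4, 1], [0, 4], [1, 3]]
w, h_image, h_methyl, err_start, err_end = joint_nmf(x_image, x_methyl, k=2)
```

For each of the `k` components, the image and methylation features with the largest coefficients in the same row of `h_image` and `h_methyl` would together form one candidate module.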
|
9
|
Hu W, Lin D, Cao S, Liu J, Chen J, Calhoun VD, Wang YP. Adaptive Sparse Multiple Canonical Correlation Analysis With Application to Imaging (Epi)Genomics Study of Schizophrenia. IEEE Trans Biomed Eng 2018; 65:390-399. [PMID: 29364120 PMCID: PMC5826588 DOI: 10.1109/tbme.2017.2771483] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 01/05/2023]
Abstract
Finding correlations across multiple data sets in imaging and (epi)genomics is a common challenge. Sparse multiple canonical correlation analysis (SMCCA) is a multivariate model widely used to extract contributing features from each data set while maximizing the cross-modality correlation. The model is built by combining the pairwise covariances between any two data sets. However, the scales of different pairwise covariances can differ considerably, so directly combining them in SMCCA is unfair. This problem of "unfair combination of pairwise covariances" restricts the power of SMCCA for feature selection. In this paper, we propose a novel formulation of SMCCA, called adaptive SMCCA, which overcomes the problem by introducing adaptive weights when combining pairwise covariances. Both simulation and real-data analysis show that adaptive SMCCA outperforms conventional SMCCA and SMCCA with fixed weights in feature selection. Large-scale numerical experiments show that adaptive SMCCA converges as fast as conventional SMCCA. When applying it to an imaging (epi)genetics study of schizophrenia subjects, we can detect significant (epi)genetic variants and brain regions, which are consistent with other existing reports. In addition, several significant brain-development related pathways, e.g., neural tube development, are detected by our model, demonstrating that imaging epigenetic associations may be overlooked by conventional SMCCA. All these results demonstrate that adaptive SMCCA is well suited for detecting three-way or multiway correlations and thus can find widespread applications in multiple omics and imaging data integration.
Affiliation(s)
- Wenxing Hu
  - Biomedical Engineering Department, Tulane University, New Orleans, LA 70118, USA
- Dongdong Lin
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Shaolong Cao
  - Department of Bioinformatics & Computational Biology, UT MD Anderson Cancer Center, Houston, TX
- Jingyu Liu
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Jiayu Chen
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Vince D. Calhoun
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Yu-Ping Wang
  - Biomedical Engineering Department, Tulane University, New Orleans, LA 70118, USA
|