1
Liu JX, Wang SQ, Jiao CN, Wu TR, Cui XC, Zheng CH. Deep self-representation learning with hyper-laplacian regularization for brain imaging genetics association analysis. Methods 2025; 234:333-341. [PMID: 39837433 DOI: 10.1016/j.ymeth.2025.01.017] [Received: 10/17/2024] [Revised: 12/26/2024] [Accepted: 01/16/2025] [Indexed: 01/23/2025]
Abstract
Brain imaging genetics aims to explore the association between genetic factors such as single nucleotide polymorphisms (SNPs) and brain imaging quantitative traits (QTs). However, most existing methods do not consider the nonlinear correlations between genotypic and phenotypic data, as well as potential higher-order relationships among subjects when identifying bi-multivariate associations. In this paper, a novel method called deep hyper-Laplacian regularized self-representation learning based structured association analysis (DHRSAA) is proposed which can learn genotype-phenotype associations and obtain relevant biomarkers. Specifically, a deep neural network is used first to explore the nonlinear relationships among samples. Secondly, self-representation learning based on hyper-Laplacian regularization is utilized to reconstruct the original data. In particular, the introduction of hyper-Laplacian regularization ensures the local structure of the high-dimensional spatial embedding and explores the higher-order relationships among the samples. Moreover, the structural regularization term in the association analysis uncovers chain relationships among SNPs and graphical relationships among imaging QTs, thus making the obtained markers more interpretable and enhancing the biological significance of the method. The performance of the proposed method is validated on real neuroimaging genetics data. Experimental results show that DHRSAA displays better canonical correlation coefficients and recognizes clearer canonical weight patterns compared to several state-of-the-art methods, which suggests that the proposed DHRSAA achieves better performance and identifies disease-related biomarkers.
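The hyper-Laplacian regularizer mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the hypergraph is built, so the k-nearest-neighbour construction, the unit hyperedge weights, the normalized Laplacian form, and the function names `knn_hypergraph_incidence` and `hypergraph_laplacian` are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def knn_hypergraph_incidence(X, k=3):
    """Incidence matrix H (n_samples x n_hyperedges).

    Assumption: hyperedge j contains sample j and its k nearest neighbours.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    H = np.zeros((n, n))
    for j in range(n):
        order = np.argsort(d2[j])
        nbrs = [i for i in order if i != j][:k]
        H[j, j] = 1.0
        H[nbrs, j] = 1.0
    return H

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}."""
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    dv = H @ w           # vertex degrees
    de = H.sum(axis=0)   # hyperedge degrees
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ dv_inv_sqrt
    return np.eye(n) - theta

def smoothness_penalty(Z, L):
    """tr(Z^T L Z): small when samples sharing hyperedges have similar rows in Z."""
    return float(np.trace(Z.T @ L @ Z))
```

A penalty of this kind is small when samples that share hyperedges receive similar embeddings, which is one common way to encode higher-order relationships among subjects.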
Affiliation(s)
- Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China; School of Health and Life Sciences, University of Health and Rehabilitation Sciences, Qingdao, 266113, China.
| | - Shuang-Qing Wang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Cui-Na Jiao
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Tian-Ru Wu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
| | - Xin-Chun Cui
- School of Foundational Education, University of Health and Rehabilitation Sciences, Qingdao, 266072, China; Qingdao Municipal Hospital, University of Health and Rehabilitation Sciences, Qingdao, 266011, China
| | - Chun-Hou Zheng
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
2
Peng W, Ma Y, Li C, Dai W, Fu X, Liu L, Liu L, Liu J. Fusion of brain imaging genetic data for Alzheimer's disease diagnosis and causal factors identification using multi-stream attention mechanisms and graph convolutional networks. Neural Netw 2024; 184:107020. [PMID: 39721106 DOI: 10.1016/j.neunet.2024.107020] [Received: 06/01/2024] [Revised: 11/03/2024] [Accepted: 12/03/2024] [Indexed: 12/28/2024]
Abstract
Correctly diagnosing Alzheimer's disease (AD) and identifying pathogenic brain regions and genes play a vital role in understanding AD and developing effective prevention and treatment strategies. Recent works combine imaging and genetic data, and leverage the strengths of both modalities to achieve better classification results. In this work, we propose MCA-GCN, a Multi-stream Cross-Attention and Graph Convolutional Network-based classification method for AD patients. It first constructs a brain region-gene association network based on brain region fMRI time series and gene SNP data. It then integrates the absolute and relative positions of the brain region time series to obtain a new brain region time series containing temporal information. Next, long-range and local association features between brain regions and genes are sequentially aggregated by multi-stream cross-attention and graph convolutional networks. Finally, the learned brain region and gene features are input to a fully connected network to predict AD types. Experimental results on the ADNI dataset show that our model outperforms other methods in AD classification tasks. Moreover, MCA-GCN uses a multi-stage feature scoring process to extract high-risk genes and brain regions related to disease classification.
Affiliation(s)
- Wei Peng
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Kunming 650500, PR China; Computer Technology Application Key Lab of Yunnan Province; Kunming 650500, PR China.
| | - Yanhan Ma
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Kunming 650500, PR China; Computer Technology Application Key Lab of Yunnan Province; Kunming 650500, PR China
| | - Chunshan Li
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Kunming 650500, PR China; Computer Technology Application Key Lab of Yunnan Province; Kunming 650500, PR China
| | - Wei Dai
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Kunming 650500, PR China; Computer Technology Application Key Lab of Yunnan Province; Kunming 650500, PR China
| | - Xiaodong Fu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Kunming 650500, PR China; Computer Technology Application Key Lab of Yunnan Province; Kunming 650500, PR China
| | - Li Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Kunming 650500, PR China; Computer Technology Application Key Lab of Yunnan Province; Kunming 650500, PR China
| | - Lijun Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Kunming 650500, PR China; Computer Technology Application Key Lab of Yunnan Province; Kunming 650500, PR China
| | - Jin Liu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, PR China
3
Bi XA, Yang Z, Huang Y, Xing Z, Xu L, Wu Z, Liu Z, Li X, Liu T. CE-GAN: Community Evolutionary Generative Adversarial Network for Alzheimer's Disease Risk Prediction. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:3663-3675. [PMID: 38587958 DOI: 10.1109/tmi.2024.3385756] [Indexed: 04/10/2024]
Abstract
In the studies of neurodegenerative diseases such as Alzheimer's Disease (AD), researchers often focus on the associations among multi-omics pathogeny based on imaging genetics data. However, current studies overlook the communities in brain networks, leading to inaccurate models of disease development. This paper explores the developmental patterns of AD from the perspective of community evolution. We first establish a mathematical model to describe functional degeneration in the brain as the community evolution driven by entropy information propagation. Next, we propose an interpretable Community Evolutionary Generative Adversarial Network (CE-GAN) to predict disease risk. In the generator of CE-GAN, community evolutionary convolutions are designed to capture the evolutionary patterns of AD. The experiments are conducted using functional magnetic resonance imaging (fMRI) data and single nucleotide polymorphism (SNP) data. CE-GAN achieves 91.67% accuracy and 91.83% area under curve (AUC) in AD risk prediction tasks, surpassing advanced methods on the same dataset. In addition, we validated the effectiveness of CE-GAN for pathogeny extraction. The source code of this work is available at https://github.com/fmri123456/CE-GAN.
4
Wang L, Sheng J, Zhang Q, Yang Z, Xin Y, Song Y, Zhang Q, Wang B. A novel sand cat swarm optimization algorithm-based SVM for diagnosis imaging genomics in Alzheimer's disease. Cereb Cortex 2024; 34:bhae329. [PMID: 39147391 DOI: 10.1093/cercor/bhae329] [Received: 06/02/2024] [Revised: 07/14/2024] [Accepted: 07/25/2024] [Indexed: 08/17/2024]
Abstract
In recent years, brain imaging genomics has advanced significantly in revealing underlying pathological mechanisms of Alzheimer's disease (AD) and providing early diagnosis. In this paper, we present a framework for diagnosing AD that integrates functional magnetic resonance imaging (fMRI) and genetic data preprocessing, feature selection, and a support vector machine (SVM) model. In particular, a novel sand cat swarm optimization (SCSO) algorithm, named SS-SCSO, which integrates the spiral search strategy and alert mechanism from the sparrow search algorithm, is proposed to optimize the SVM parameters. The optimization efficacy of the SS-SCSO algorithm is evaluated using CEC2017 benchmark functions, with results compared with other metaheuristic algorithms (MAs). The proposed SS-SCSO-SVM framework has been effectively employed to classify different stages of cognitive impairment in Alzheimer's disease using imaging genetic datasets from the Alzheimer's Disease Neuroimaging Initiative. It has demonstrated excellent classification accuracies for four typical cases, including AD, early mild cognitive impairment, late mild cognitive impairment, and healthy control. Furthermore, experimental results indicate that the SS-SCSO-SVM algorithm has a stronger exploration capability for diagnosing AD compared to other well-established MAs and machine learning techniques.
Affiliation(s)
- Luyun Wang
- School of Computer Science and Technology, Hangzhou Dianzi University, 1158 2nd Street, Hangzhou, Zhejiang 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, 215 6th Street, Hangzhou, Zhejiang 310018, China
- Hangzhou Vocational & Technical College, 68 Xueyuan Street, Hangzhou, Zhejiang 310018, China
| | - Jinhua Sheng
- School of Computer Science and Technology, Hangzhou Dianzi University, 1158 2nd Street, Hangzhou, Zhejiang 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, 215 6th Street, Hangzhou, Zhejiang 310018, China
| | - Qiao Zhang
- Beijing Hospital, 1 Dahua Road, Beijing 100730, China
- National Center of Gerontology, 1 Dahua Road, Beijing 100730, China
- Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, 1 Dahua Road, Beijing 100730, China
| | - Ze Yang
- School of Computer Science and Technology, Hangzhou Dianzi University, 1158 2nd Street, Hangzhou, Zhejiang 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, 215 6th Street, Hangzhou, Zhejiang 310018, China
| | - Yu Xin
- School of Computer Science and Technology, Hangzhou Dianzi University, 1158 2nd Street, Hangzhou, Zhejiang 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, 215 6th Street, Hangzhou, Zhejiang 310018, China
| | - Yan Song
- Beijing Hospital, 1 Dahua Road, Beijing 100730, China
- National Center of Gerontology, 1 Dahua Road, Beijing 100730, China
- Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, 1 Dahua Road, Beijing 100730, China
| | - Qian Zhang
- School of Computer Science and Technology, Hangzhou Dianzi University, 1158 2nd Street, Hangzhou, Zhejiang 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, 215 6th Street, Hangzhou, Zhejiang 310018, China
| | - Binbing Wang
- School of Computer Science and Technology, Hangzhou Dianzi University, 1158 2nd Street, Hangzhou, Zhejiang 310018, China
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, 215 6th Street, Hangzhou, Zhejiang 310018, China
5
Wei K, Qian F, Li Y, Zeng T, Huang T. Integrating multi-omics data of childhood asthma using a deep association model. FUNDAMENTAL RESEARCH 2024; 4:738-751. [PMID: 39156565 PMCID: PMC11330118 DOI: 10.1016/j.fmre.2024.03.022] [Received: 06/23/2023] [Revised: 03/06/2024] [Accepted: 03/17/2024] [Indexed: 08/20/2024]
Abstract
Childhood asthma is one of the most common respiratory diseases with rising mortality and morbidity. Multi-omics data provide a new opportunity to explore collaborative biomarkers and corresponding diagnostic models of childhood asthma. To capture the nonlinear association of multi-omics data and improve the interpretability of the diagnostic model, we proposed a novel deep association model (DAM) and a corresponding efficient analysis framework. First, Deep Subspace Reconstruction was used to fuse the omics data and diagnostic information, thereby correcting the distribution of the original omics data and reducing the influence of unnecessary data noise. Second, Joint Deep Semi-Negative Matrix Factorization was applied to identify different latent sample patterns and extract biomarkers from different omics data levels. Third, our newly proposed Deep Orthogonal Canonical Correlation Analysis can rank features in the collaborative module, which can then be used to construct the diagnostic model considering nonlinear correlation between different omics data levels. Using DAM, we deeply analyzed the transcriptome and methylation data of childhood asthma. The effectiveness of DAM is verified from the perspectives of algorithm performance and biological significance on the independent test dataset, by ablation experiment and comparison with many baseline methods from clinical and biological studies. The DAM-induced diagnostic model can achieve a prediction AUC of 0.912, which is higher than that of many other alternative methods. Meanwhile, relevant pathways and biomarkers of childhood asthma are also recognized to be collectively altered on the gene expression and methylation levels.
As an interpretable machine learning approach, DAM simultaneously considers the non-linear associations among samples and those among biological features, which should help explore interpretative biomarker candidates and efficient diagnostic models from multi-omics data analysis for human complex diseases.
Affiliation(s)
- Kai Wei
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Guoke Ningbo Life Science and Health Industry Research Institute, Ningbo 315000, China
| | - Fang Qian
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
| | - Yixue Li
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Guangzhou National Laboratory, Guangzhou 510000, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou 510000, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
| | - Tao Zeng
- Guangzhou National Laboratory, Guangzhou 510000, China
- GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macau Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou 510000, China
| | - Tao Huang
- Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
6
Wu TR, Jiao CN, Cui X, Wang YL, Zheng CH, Liu JX. Deep Self-Reconstruction Fusion Similarity Hashing for the Diagnosis of Alzheimer's Disease on Multi-Modal Data. IEEE J Biomed Health Inform 2024; 28:3513-3522. [PMID: 38568771 DOI: 10.1109/jbhi.2024.3383885] [Indexed: 04/05/2024]
Abstract
The pathogenesis of Alzheimer's disease (AD) is extremely intricate, which makes AD almost incurable. Recent studies have demonstrated that analyzing multi-modal data can offer a comprehensive perspective on the different stages of AD progression, which is beneficial for early diagnosis of AD. In this paper, we propose a deep self-reconstruction fusion similarity hashing (DS-FSH) method to effectively capture AD-related biomarkers from multi-modal data and leverage them to diagnose AD. Given that most existing methods ignore the topological structure of the data, a deep self-reconstruction model based on random walk graph regularization is designed to reconstruct the multi-modal data, thereby learning the nonlinear relationship between samples. Additionally, a fused similarity hash based on an anchor graph is proposed to generate discriminative binary hash codes for the multi-modal reconstructed data. This allows the fused similarity between samples to be effectively modeled by a fusion similarity matrix based on the anchor graph, while modal correlation can be approximated by Hamming distance. In particular, features extracted from the multi-modal data are classified using a deep sparse autoencoder classifier. Finally, experiments conducted on the AD Neuroimaging Initiative database show that DS-FSH outperforms comparable methods of AD classification. To conclude, DS-FSH identifies multi-modal features closely associated with AD, which are expected to contribute significantly to the understanding of the pathogenesis of AD.
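As a small illustration of the Hamming distance between binary hash codes mentioned in this abstract, the sketch below counts differing bits between codes. The function names and the toy codes are hypothetical; how DS-FSH actually learns its codes is not described here and is not reproduced.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    a = np.asarray(a)
    b = np.asarray(b)
    return int(np.count_nonzero(a != b))

def pairwise_hamming(codes):
    """Matrix of Hamming distances between all rows of a (n_samples, n_bits) array."""
    codes = np.asarray(codes)
    return (codes[:, None, :] != codes[None, :, :]).sum(axis=2)

# Toy codes (hypothetical): samples 0 and 2 share the same code.
codes = np.array([[1, 0, 1, 1],
                  [1, 1, 1, 0],
                  [1, 0, 1, 1]])
distances = pairwise_hamming(codes)
```

Samples with a small Hamming distance are treated as similar, which is why compact binary codes make similarity comparisons cheap.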
7
Lee H, Ma T, Ke H, Ye Z, Chen S. dCCA: detecting differential covariation patterns between two types of high-throughput omics data. Brief Bioinform 2024; 25:bbae288. [PMID: 38888456 PMCID: PMC11184902 DOI: 10.1093/bib/bbae288] [Received: 02/21/2024] [Revised: 05/01/2024] [Accepted: 06/03/2024] [Indexed: 06/20/2024]
Abstract
MOTIVATION The advent of multimodal omics data has provided an unprecedented opportunity to systematically investigate underlying biological mechanisms from distinct yet complementary angles. However, the joint analysis of multi-omics data remains challenging because it requires modeling interactions between multiple sets of high-throughput variables. Furthermore, these interaction patterns may vary across different clinical groups, reflecting disease-related biological processes. RESULTS We propose a novel approach called Differential Canonical Correlation Analysis (dCCA) to capture differential covariation patterns between two multivariate vectors across clinical groups. Unlike classical Canonical Correlation Analysis, which maximizes the correlation between two multivariate vectors, dCCA aims to maximally recover differentially expressed multivariate-to-multivariate covariation patterns between groups. We have developed computational algorithms and a toolkit to sparsely select paired subsets of variables from two sets of multivariate variables while maximizing the differential covariation. Extensive simulation analyses demonstrate the superior performance of dCCA in selecting variables of interest and recovering differential correlations. We applied dCCA to the Pan-Kidney cohort from the Cancer Genome Atlas Program database and identified differentially expressed covariations between noncoding RNAs and gene expressions. AVAILABILITY AND IMPLEMENTATION The R package that implements dCCA is available at https://github.com/hwiyoungstat/dCCA.
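For orientation, the classical Canonical Correlation Analysis that this abstract contrasts with dCCA can be sketched as follows. This is a textbook whitening-plus-SVD formulation, not the dCCA objective and not the authors' R package; the function name `first_canonical_pair` and the small `ridge` term (added only for numerical stability) are assumptions.

```python
import numpy as np

def first_canonical_pair(X, Y, ridge=1e-8):
    """First canonical correlation and weight vectors for data X (n x p) and Y (n x q)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix
        vals, vecs = np.linalg.eigh(S)
        return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    u = Kx @ U[:, 0]   # canonical weights for X
    v = Ky @ Vt[0]     # canonical weights for Y
    return s[0], u, v
```

The projections `X @ u` and `Y @ v` are the first pair of canonical variates; their sample correlation matches the returned value up to the effect of the ridge term.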
Affiliation(s)
- Hwiyoung Lee
- Maryland Psychiatric Research Center, School of Medicine, University of Maryland, Baltimore, MD 21201, United States
- The University of Maryland Institute for Health Computing (UM-IHC), North Bethesda, MD 20852, United States
| | - Tianzhou Ma
- Department of Epidemiology and Biostatistics, University of Maryland, College Park, MD 20742, United States
| | - Hongjie Ke
- Department of Epidemiology and Biostatistics, University of Maryland, College Park, MD 20742, United States
| | - Zhenyao Ye
- The University of Maryland Institute for Health Computing (UM-IHC), North Bethesda, MD 20852, United States
- Division of Biostatistics and Bioinformatics, Department of Epidemiology and Public Health, School of Medicine, University of Maryland, Baltimore, MD 21201, United States
| | - Shuo Chen
- Maryland Psychiatric Research Center, School of Medicine, University of Maryland, Baltimore, MD 21201, United States
- The University of Maryland Institute for Health Computing (UM-IHC), North Bethesda, MD 20852, United States
- Division of Biostatistics and Bioinformatics, Department of Epidemiology and Public Health, School of Medicine, University of Maryland, Baltimore, MD 21201, United States
8
Jiao CN, Shang J, Li F, Cui X, Wang YL, Gao YL, Liu JX. Diagnosis-Guided Deep Subspace Clustering Association Study for Pathogenetic Markers Identification of Alzheimer's Disease Based on Comparative Atlases. IEEE J Biomed Health Inform 2024; 28:3029-3041. [PMID: 38427553 DOI: 10.1109/jbhi.2024.3372294] [Indexed: 03/03/2024]
Abstract
The roles of brain region activities and genotypic functions in the pathogenesis of Alzheimer's disease (AD) remain unclear. Meanwhile, current imaging genetics methods have difficulty identifying potential pathogenetic markers by correlation analysis between brain network and genetic variation. To discover disease-related brain connectome from the specific brain structure and the fine-grained level, based on the Automated Anatomical Labeling (AAL) and human Brainnetome atlases, the functional brain network is first constructed for each subject. Specifically, the upper triangle elements of the functional connectivity matrix are extracted as connectivity features. The clustering coefficient and the average weighted node degree are used to assess the significance of every brain area. Since the constructed brain network and genetic data are characterized by non-linearity, high dimensionality, and few subjects, a deep subspace clustering algorithm is proposed to reconstruct the original data. Our multilayer neural network helps capture the non-linear manifolds, and subspace clustering learns pairwise affinities between samples. Moreover, most approaches in neuroimaging genetics are unsupervised, neglecting the diagnostic information related to diseases. We present a label constraint with diagnostic status to guide the imaging genetics correlation analysis. To this end, a diagnosis-guided deep subspace clustering association (DDSCA) method is developed to discover brain connectome and risk genetic factors by integrating genotypes with functional network phenotypes. Extensive experiments show that DDSCA achieves superior performance to most association methods and effectively selects disease-relevant genetic markers and brain connectome at the coarse-grained and fine-grained levels.
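The feature-extraction step described in this abstract (upper triangle of the functional connectivity matrix, plus a per-region weighted degree) might look roughly like the sketch below. Pearson correlation as the connectivity measure, exclusion of the diagonal, the absolute-value weighting, and the function names are assumptions; the abstract does not give these details.

```python
import numpy as np

def connectivity_features(ts):
    """ts: (n_timepoints, n_regions) regional time series.

    Returns the functional connectivity matrix (assumed here to be Pearson
    correlation) and its strict upper triangle flattened into a feature vector.
    """
    fc = np.corrcoef(ts, rowvar=False)
    iu = np.triu_indices(fc.shape[0], k=1)  # k=1 skips the diagonal
    return fc, fc[iu]

def average_weighted_degree(fc):
    """Mean absolute connection strength of each region to all other regions.

    This definition is one plausible reading of "average weighted node degree".
    """
    w = np.abs(fc).copy()
    np.fill_diagonal(w, 0.0)
    return w.sum(axis=1) / (w.shape[0] - 1)
```

For an atlas with R regions this yields R(R-1)/2 connectivity features per subject, which is why the resulting data are high-dimensional relative to the number of subjects.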
9
Song P, Li X, Yuan X, Pang L, Song X, Wang Y. Identifying frequency-dependent imaging genetic associations via hypergraph-structured multi-task sparse canonical correlation analysis. Comput Biol Med 2024; 171:108051. [PMID: 38335819 DOI: 10.1016/j.compbiomed.2024.108051] [Received: 08/04/2023] [Revised: 01/03/2024] [Accepted: 01/26/2024] [Indexed: 02/12/2024]
Abstract
Identifying complex associations between genetic variations and imaging phenotypes is a challenging task in the research of brain imaging genetics. A previous study has shown that neuronal oscillations within distinct frequency bands are derived from frequency-dependent genetic modulation. Thus it is meaningful to explore frequency-dependent imaging genetic associations, which may give important insights into the pathogenesis of brain disorders. In this work, the hypergraph-structured multi-task sparse canonical correlation analysis (HS-MTSCCA) was developed to explore the associations between multi-frequency imaging phenotypes and single-nucleotide polymorphisms (SNPs). Specifically, we first created a hypergraph for the imaging phenotypes of each frequency and the SNPs, respectively. Then, a new hypergraph-structured constraint was proposed to learn high-order relationships among features in each hypergraph, which can introduce biologically meaningful information into the model. The frequency-shared and frequency-specific imaging phenotypes and SNPs could be identified using the multi-task learning framework. We also proposed a useful strategy to tackle this algorithm and then demonstrated its convergence. The proposed method was evaluated on four simulation datasets and a real schizophrenia dataset. The experimental results on synthetic data showed that HS-MTSCCA outperforms the other competing methods according to canonical correlation coefficients, canonical weights, and cosine similarity. The results on real data showed that HS-MTSCCA could obtain superior canonical coefficients and canonical weights. Furthermore, the identified frequency-shared and frequency-specific biomarkers could provide more interesting and meaningful information, demonstrating that HS-MTSCCA is a powerful method for brain imaging genetics.
Affiliation(s)
- Peilun Song
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, 450001, Henan, China
| | - Xue Li
- Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
| | - Xiuxia Yuan
- Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
| | - Lijuan Pang
- Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
| | - Xueqin Song
- Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
| | - Yaping Wang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, 450001, Henan, China.
10
Choi H, Byeon K, Lee J, Hong S, Park B, Park H. Identifying subgroups of eating behavior traits unrelated to obesity using functional connectivity and feature representation learning. Hum Brain Mapp 2024; 45:e26581. [PMID: 38224537 PMCID: PMC10789215 DOI: 10.1002/hbm.26581] [Received: 08/30/2023] [Revised: 12/13/2023] [Accepted: 12/20/2023] [Indexed: 01/17/2024]
Abstract
Eating behavior is highly heterogeneous across individuals and cannot be fully explained using only the degree of obesity. We utilized unsupervised machine learning and functional connectivity measures to explore the heterogeneity of eating behaviors measured by a self-assessment instrument using 424 healthy adults (mean ± standard deviation [SD] age = 47.07 ± 18.89 years; 67% female). We generated low-dimensional representations of functional connectivity using resting-state functional magnetic resonance imaging and estimated latent features using the feature representation capabilities of an autoencoder by nonlinearly compressing the functional connectivity information. The clustering approaches applied to latent features identified three distinct subgroups. The subgroups exhibited different levels of hunger traits, while their body mass indices were comparable. The results were replicated in an independent dataset consisting of 212 participants (mean ± SD age = 38.97 ± 19.80 years; 35% female). The model interpretation technique of integrated gradients revealed that the between-group differences in the integrated gradient maps were associated with functional reorganization in heteromodal association and limbic cortices and reward-related subcortical structures such as the accumbens, amygdala, and caudate. The cognitive decoding analysis revealed that these systems are associated with reward- and emotion-related systems. Our findings provide insights into the macroscopic brain organization of eating behavior-related subgroups independent of obesity.
Affiliation(s)
- Hyoungshin Choi
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Republic of Korea
- Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea
| | | | - Jong‐eun Lee
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Republic of Korea
- Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea
| | - Seok‐Jun Hong
- Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea
- Center for the Developing Brain, Child Mind Institute, New York, USA
- Department of Biomedical Engineering, Sungkyunkwan University, Suwon, Republic of Korea
| | - Bo‐yong Park
- Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea
- Department of Data Science, Inha University, Incheon, Republic of Korea
- Department of Statistics and Data Science, Inha University, Incheon, Republic of Korea
| | - Hyunjin Park
- Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, Republic of Korea
- School of Electronic and Electrical Engineering, Sungkyunkwan University, Suwon, Republic of Korea
11
Mohammed S, Kurtek S, Bharath K, Rao A, Baladandayuthapani V. Tumor radiogenomics in gliomas with Bayesian layered variable selection. Med Image Anal 2023; 90:102964. [PMID: 37797481 PMCID: PMC10653647 DOI: 10.1016/j.media.2023.102964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Revised: 12/09/2022] [Accepted: 07/03/2023] [Indexed: 10/07/2023]
Abstract
We propose a statistical framework to analyze radiological magnetic resonance imaging (MRI) and genomic data to identify the underlying radiogenomic associations in lower grade gliomas (LGG). We devise a novel imaging phenotype by dividing the tumor region into concentric spherical layers that mimic the tumor evolution process. MRI data within each layer is represented by voxel-intensity-based probability density functions which capture the complete information about tumor heterogeneity. Under a Riemannian-geometric framework these densities are mapped to a vector of principal component scores which act as imaging phenotypes. Subsequently, we build Bayesian variable selection models for each layer with the imaging phenotypes as the response and the genomic markers as predictors. Our novel hierarchical prior formulation incorporates the interior-to-exterior structure of the layers, and the correlation between the genomic markers. We employ a computationally efficient Expectation-Maximization-based strategy for estimation. Simulation studies demonstrate the superior performance of our approach compared to other approaches. With a focus on the cancer driver genes in LGG, we discuss some biologically relevant findings. Genes implicated in survival and oncogenesis are identified as being associated with the spherical layers, which could potentially serve as early-stage diagnostic markers for disease monitoring, prior to routine invasive approaches. We provide an R package that can be used to deploy our framework to identify radiogenomic associations.
Affiliation(s)
- Shariq Mohammed
  - Department of Biostatistics, Boston University, 801 Massachusetts Ave, Boston, MA 02118, United States
  - Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48103, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, United States
- Sebastian Kurtek
  - Department of Statistics, The Ohio State University, 1958 Neil Avenue, Columbus, OH 43210, United States
- Karthik Bharath
  - School of Mathematical Sciences, University Park, Nottingham, NG7 2RD, United Kingdom
- Arvind Rao
  - Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48103, United States
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, United States
  - Department of Radiation Oncology, University of Michigan, 1500 E Medical Center Dr, Ann Arbor, MI 48109, United States
12
Wang M, Shao W, Huang S, Zhang D. Hypergraph-regularized multimodal learning by graph diffusion for imaging genetics based Alzheimer's Disease diagnosis. Med Image Anal 2023; 89:102883. [PMID: 37467641 DOI: 10.1016/j.media.2023.102883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 04/06/2023] [Accepted: 06/28/2023] [Indexed: 07/21/2023]
Abstract
Recent studies show that multi-modal data fusion techniques combining information from diverse sources are helpful to diagnose and predict complex brain disorders. However, most existing diagnosis methods simply employ a feature combination strategy for multiple imaging and genetic data, ignoring the imaging phenotypes associated with the risk gene information. To this end, we present a hypergraph-regularized multimodal learning by graph diffusion (HMGD) method for joint association learning and outcome prediction. Specifically, we first present a graph diffusion method for enhancing similarity measures among subjects derived from multi-modality phenotypes, which fully uses multiple input similarity graphs and integrates them into a unified graph with valuable geometric structures among different imaging phenotypes. Then, we employ the unified graph to represent the high-order similarity relationships among subjects, and enforce a hypergraph-regularized term to incorporate both inter- and cross-modality information for selecting the imaging phenotypes associated with the risk single nucleotide polymorphism (SNP). Finally, a multi-kernel support vector machine (MK-SVM) is adopted to fuse such phenotypic features selected from different modalities for the final diagnosis and prediction. The proposed approach is experimentally evaluated on brain imaging genetic data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets. The results show that the proposed approach outperforms several competing algorithms, yields strong associations, and discovers significant, consistent, and robust ROIs across different imaging phenotypes associated with the genetic risk biomarkers, which can guide disease interpretation and prediction.
Affiliation(s)
- Meiling Wang
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  - MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China
  - Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
- Wei Shao
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  - MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China
  - Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
- Shuo Huang
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  - MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China
  - Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
- Daoqiang Zhang
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  - MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China
  - Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
13
Chegraoui H, Guillemot V, Rebei A, Gloaguen A, Grill J, Philippe C, Frouin V. Integrating multiomics and prior knowledge: a study of the Graphnet penalty impact. Bioinformatics 2023; 39:btad454. [PMID: 37490467 PMCID: PMC10403429 DOI: 10.1093/bioinformatics/btad454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 05/04/2023] [Accepted: 07/24/2023] [Indexed: 07/27/2023] Open
Abstract
MOTIVATION: In the field of oncology, statistical models are used for the discovery of candidate factors that influence the development of the pathology or its outcome. These statistical models can be designed in a multiblock framework to study the relationship between different multiomic data, and variable selection is often achieved by imposing constraints on the model parameters. A priori graph constraints have been used in the literature as a way to improve feature selection in the model, yielding more interpretability. However, it is still unclear how these graphs interact with the models and how they impact the feature selection. Additionally, with the availability of different graphs encoding different information, one can wonder how the choice of the graph meaningfully impacts the results obtained.
RESULTS: We proposed to study the graph penalty impact on a multiblock model. Specifically, we used the SGCCA as the multiblock framework. We studied the effect of the penalty on the model using the TCGA-LGG dataset. Our findings are 3-fold. We showed that the graph penalty increases the number of selected genes from this dataset, while selecting genes already identified in other works as pertinent biomarkers in the pathology. We demonstrated that using different graphs leads to different though consistent results, but that graph density is the main factor influencing the obtained results. Finally, we showed that the graph penalty increases the performance of the survival prediction from the model-derived components and the interpretability of the results.
AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://github.com/neurospin/netSGCCA.
Affiliation(s)
- Hamza Chegraoui
  - Université Paris-Saclay, CEA, Neurospin, 91191 Gif-sur-Yvette, France
- Vincent Guillemot
  - Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, 75015 Paris, France
- Amine Rebei
  - Université Paris-Saclay, CEA, Neurospin, 91191 Gif-sur-Yvette, France
- Arnaud Gloaguen
  - Centre National de Recherche en Génomique Humaine, Institut de Biologie François Jacob, CEA, Université Paris-Saclay, 91000 Evry, France
- Jacques Grill
  - Département Cancérologie de l’enfant et de l’adolescent, Gustave-Roussy, 94800 Villejuif, France
  - Prédicteurs Moléculaires et Nouvelles Cibles en Oncologie—U981, Inserm, Université Paris-Saclay, 94800 Villejuif, France
- Cathy Philippe
  - Université Paris-Saclay, CEA, Neurospin, 91191 Gif-sur-Yvette, France
- Vincent Frouin
  - Université Paris-Saclay, CEA, Neurospin, 91191 Gif-sur-Yvette, France
14
Du L, Zhang J, Zhao Y, Shang M, Guo L, Han J. inMTSCCA: An Integrated Multi-task Sparse Canonical Correlation Analysis for Multi-omic Brain Imaging Genetics. Genomics, Proteomics & Bioinformatics 2023; 21:396-413. [PMID: 37442417 PMCID: PMC10634656 DOI: 10.1016/j.gpb.2023.03.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/13/2021] [Revised: 01/29/2023] [Accepted: 03/14/2023] [Indexed: 07/15/2023]
Abstract
Identifying genetic risk factors for Alzheimer's disease (AD) is an important research topic. To date, different endophenotypes, such as imaging-derived endophenotypes and proteomic expression-derived endophenotypes, have shown great value in uncovering risk genes compared to case-control studies. Biologically, a co-varying pattern of different omics-derived endophenotypes could result from the shared genetic basis. However, existing methods mainly focus on the effect of endophenotypes alone; the effect of cross-endophenotype (CEP) associations remains largely unexploited. In this study, we used both endophenotypes and their CEP associations of multi-omic data to identify genetic risk factors, and proposed two integrated multi-task sparse canonical correlation analysis (inMTSCCA) methods, i.e., pairwise endophenotype correlation-guided MTSCCA (pcMTSCCA) and high-order endophenotype correlation-guided MTSCCA (hocMTSCCA). pcMTSCCA employed pairwise correlations between magnetic resonance imaging (MRI)-derived, plasma-derived, and cerebrospinal fluid (CSF)-derived endophenotypes as an additional penalty. hocMTSCCA used high-order correlations among these multi-omic data for regularization. To identify genetic risk factors at individual and group levels, as well as altered endophenotypic markers, we introduced sparsity-inducing penalties for both models. We compared pcMTSCCA and hocMTSCCA with three related methods on both simulated and real datasets (the latter consisting of neuroimaging data, proteomic analytes, and genetic data). The results showed that our methods obtained better or comparable canonical correlation coefficients (CCCs) and better feature subsets than benchmarks. Most importantly, the identified genetic loci and heterogeneous endophenotypic markers showed high relevance. Therefore, jointly using multi-omic endophenotypes and their CEP associations is promising to reveal genetic risk factors.
The source code and manual of inMTSCCA are available at https://ngdc.cncb.ac.cn/biocode/tools/BT007330.
Affiliation(s)
- Lei Du
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Jin Zhang
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Ying Zhao
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Muheng Shang
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Lei Guo
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Junwei Han
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
15
Leng Y, Cui W, Peng Y, Yan C, Cao Y, Yan Z, Chen S, Jiang X, Zheng J. Multimodal cross enhanced fusion network for diagnosis of Alzheimer's disease and subjective memory complaints. Comput Biol Med 2023; 157:106788. [PMID: 36958233 DOI: 10.1016/j.compbiomed.2023.106788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 02/09/2023] [Accepted: 03/11/2023] [Indexed: 03/15/2023]
Abstract
Deep learning methods using multimodal imaging have been proposed for the diagnosis of Alzheimer's disease (AD) and its early stage, subjective memory complaints (SMC), which may help to slow the progression of the disease through early intervention. However, current fusion methods for multimodal imaging are generally coarse and may lead to suboptimal results through the use of shared extractors or simple downscaling stitching. Another issue with diagnosing brain diseases is that they often affect multiple areas of the brain, making it important to consider potential connections throughout the brain. However, traditional convolutional neural networks (CNNs) may struggle with this issue due to their limited local receptive fields. To address this, many researchers have turned to transformer networks, which can provide global information about the brain but can be computationally intensive and perform poorly on small datasets. In this work, we propose a novel lightweight network called MENet that adaptively recalibrates the multiscale long-range receptive field to localize discriminative brain regions in a computationally efficient manner. Based on this, the network extracts the intensity and location responses between structural magnetic resonance imaging (sMRI) and 18-Fluoro-Deoxy-Glucose Positron Emission computed Tomography (FDG-PET) as an enhancement fusion for AD and SMC diagnosis. Our method is evaluated on the publicly available ADNI datasets and achieves 97.67% accuracy in AD diagnosis tasks and 81.63% accuracy in SMC diagnosis tasks using sMRI and FDG-PET. These results represent state-of-the-art (SOTA) performance in both tasks. To the best of our knowledge, this is one of the first deep learning research methods for SMC diagnosis with FDG-PET.
Affiliation(s)
- Yilin Leng
  - Institute of Biomedical Engineering, School of Communication and Information Engineering, Shanghai University, Shanghai, 200444, China
  - Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Wenju Cui
  - Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
  - School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China
- Yunsong Peng
  - Department of Medical Imaging, International Exemplary Cooperation Base of Precision Imaging for Diagnosis and Treatment, Guizhou Provincial People's Hospital, Guizhou, 550002, China
- Caiying Yan
  - Department of Radiology, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou, 211103, China
- Yuzhu Cao
  - Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
  - School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China
- Zhuangzhi Yan
  - Institute of Biomedical Engineering, School of Communication and Information Engineering, Shanghai University, Shanghai, 200444, China
- Shuangqing Chen
  - Department of Radiology, The Affiliated Suzhou Hospital of Nanjing Medical University, Suzhou, 211103, China
- Xi Jiang
  - Clinical Hospital of Chengdu Brain Science Institute, MOE Key Lab for Neuroinformation, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Jian Zheng
  - Department of Medical Imaging, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
  - School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China
16
Wang S, Zheng K, Kong W, Huang R, Liu L, Wen G, Yu Y. Multimodal data fusion based on IGERNNC algorithm for detecting pathogenic brain regions and genes in Alzheimer's disease. Brief Bioinform 2023; 24:6887308. [PMID: 36502428 DOI: 10.1093/bib/bbac515] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/25/2022] [Revised: 09/28/2022] [Accepted: 10/30/2022] [Indexed: 12/14/2022] Open
Abstract
The study of the pathogenesis of Alzheimer's disease (AD) through multimodal data fusion analysis has attracted wide attention. Multimodal medical data often suffer from small sample sizes and high dimensionality. In view of these characteristics, the existing genetic evolution random neural network cluster (GERNNC) model combines a genetic evolution algorithm and neural networks for the classification of AD patients and the extraction of pathogenic factors. However, the model does not take into account the non-linear relationship between brain regions and genes, and its genetic evolution algorithm can fall into local optima, so the overall performance of the model is unsatisfactory. To solve these two problems, this paper improves the construction of fusion features and the genetic evolution algorithm in the GERNNC model, and proposes an improved genetic evolution random neural network cluster (IGERNNC) model. The IGERNNC model uses a mutual information correlation analysis method to combine resting-state functional magnetic resonance imaging data with single nucleotide polymorphism data for the construction of fusion features. Building on the traditional genetic evolution algorithm, an elite retention strategy and a large variation genetic algorithm are added to keep the model from falling into local optima. In multiple independent experimental comparisons, the IGERNNC model identifies AD patients and extracts relevant pathogenic factors more effectively, and it is expected to become an effective tool in the field of AD research.
Affiliation(s)
- Shuaiqun Wang
  - School of Information Engineering, Shanghai Maritime University, Shanghai, China
- Kai Zheng
  - School of Information Engineering, Shanghai Maritime University, Shanghai, China
- Wei Kong
  - School of Information Engineering, Shanghai Maritime University, Shanghai, China
- Ruiwen Huang
  - School of Information Engineering, Shanghai Maritime University, Shanghai, China
- Lulu Liu
  - School of Information Engineering, Shanghai Maritime University, Shanghai, China
- Gen Wen
  - School of Information Engineering, Shanghai Maritime University, Shanghai, China
- Yaling Yu
  - School of Information Engineering, Shanghai Maritime University, Shanghai, China
17
Ning S, Xie J, Mo J, Pan Y, Huang R, Huang Q, Feng J. Imaging genetic association analysis of triple-negative breast cancer based on the integration of prior sample information. Front Genet 2023; 14:1090847. [PMID: 36911413 PMCID: PMC9992804 DOI: 10.3389/fgene.2023.1090847] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/06/2022] [Accepted: 02/10/2023] [Indexed: 02/25/2023] Open
Abstract
Triple-negative breast cancer (TNBC) is one of the more aggressive subtypes of breast cancer, and the prognosis of TNBC patients remains poor. There is therefore a continuing need to identify novel biomarkers to improve the prognosis and treatment of TNBC patients. Research in recent years has shown that the effective use and integration of information in genomic data and image data contributes to the prediction and prognosis of diseases. Considering that imaging genetics can deeply study the influence of microscopic genetic variation on disease phenotype, this paper proposes a sample prior information-induced multidimensional combined non-negative matrix factorization (SPID-MDJNMF) algorithm to integrate whole-slide image (WSI), mRNA expression, and miRNA expression data. The algorithm effectively fuses high-dimensional data of three modalities through various constraints. In addition, this paper constructs an undirected graph between samples, uses an adjacency matrix to constrain the similarity, and embeds the clinical stage information of patients in the algorithm so that the algorithm can identify the co-expression patterns of samples with different labels. We performed univariate and multivariate Cox regression analysis on the mRNAs and miRNAs in the screened co-expression modules to construct a TNBC-related prognostic model. Finally, we constructed prognostic models for 2-mRNAs (IL12RB2 and CNIH2) and 2-miRNAs (miR-203a-3p and miR-148b-3p), respectively. The prognostic model can predict the survival time of TNBC patients with high accuracy. In conclusion, our proposed SPID-MDJNMF algorithm can efficiently integrate image and genomic data. Furthermore, we evaluated the prognostic value of mRNAs and miRNAs screened by the SPID-MDJNMF algorithm in TNBC, which may provide promising targets for the prognosis of TNBC patients.
Affiliation(s)
- Shipeng Ning
  - Department of Breast Surgery, Guangxi Medical University Cancer Hospital, Nanning, China
- Juan Xie
  - Department of Clinical Laboratory, Guangxi Medical University Cancer Hospital, Nanning, China
- Jianlan Mo
  - Department of Anesthesiology, Maternal and Child Health Hospital of Guangxi Zhuang Autonomous Region, Nanning, China
- You Pan
  - Department of Breast Surgery, Guangxi Medical University Cancer Hospital, Nanning, China
- Rong Huang
  - Department of Breast Surgery, Guangxi Medical University Cancer Hospital, Nanning, China
- Qinghua Huang
  - Department of Breast Surgery, Guangxi Medical University Cancer Hospital, Nanning, China
- Jifeng Feng
  - Department of Anesthesiology, Maternal and Child Health Hospital of Guangxi Zhuang Autonomous Region, Nanning, China
18
Bi XA, Mao Y, Luo S, Wu H, Zhang L, Luo X, Xu L. A novel generation adversarial network framework with characteristics aggregation and diffusion for brain disease classification and feature selection. Brief Bioinform 2022; 23:6762742. [PMID: 36259367 DOI: 10.1093/bib/bbac454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 09/01/2022] [Accepted: 09/23/2022] [Indexed: 12/14/2022] Open
Abstract
Imaging genetics provides unique insights into the pathological studies of complex brain diseases by integrating the characteristics of multi-level medical data. However, most current imaging genetics research performs incomplete data fusion. Also, there is a lack of effective deep learning methods to analyze neuroimaging and genetic data jointly. Therefore, this paper first constructs the brain region-gene networks to intuitively represent the association pattern of pathogenetic factors. Second, a novel feature information aggregation model is constructed to accurately describe the information aggregation process among brain region nodes and gene nodes. Finally, a deep learning method called feature information aggregation and diffusion generative adversarial network (FIAD-GAN) is proposed to efficiently classify samples and select features. We focus on improving the generator with the proposed convolution and deconvolution operations, with which the interpretability of the deep learning framework has been dramatically improved. The experimental results indicate that FIAD-GAN can not only achieve superior results in various disease classification tasks but also extract brain regions and genes closely related to AD. This work provides a novel method for intelligent clinical decisions. The relevant biomedical discoveries provide a reliable reference and technical basis for the clinical diagnosis, treatment and pathological analysis of disease.
Affiliation(s)
- Xia-An Bi
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, and College of Information Science and Engineering in Hunan Normal University, Changsha, P.R. China
- Yuhua Mao
  - Department of Computing, School of Information Science and Engineering, Hunan Normal University, Changsha, China
- Sheng Luo
  - Department of Computing, School of Information Science and Engineering, Hunan Normal University, Changsha, China
- Hao Wu
  - Department of Computing, School of Information Science and Engineering, Hunan Normal University, Changsha, China
- Lixia Zhang
  - School of Information Science and Engineering, Hunan Normal University, Changsha, P.R. China
- Xun Luo
  - College of Information Science and Engineering in Hunan Normal University, Changsha, P.R. China
- Luyun Xu
  - College of Business in Hunan Normal University, Changsha, P.R. China
19
GC-CNNnet: Diagnosis of Alzheimer’s Disease with PET Images Using Genetic and Convolutional Neural Network. Computational Intelligence and Neuroscience 2022; 2022:7413081. [PMID: 35983158 PMCID: PMC9381254 DOI: 10.1155/2022/7413081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Revised: 06/01/2022] [Accepted: 06/10/2022] [Indexed: 11/17/2022]
Abstract
Alzheimer's disease (AD) is a neurodegenerative disease with a wide variety of effects, including cognitive decline, deterioration of daily life, and behavioral and psychological changes. The ε4 polymorphism of the ApoE gene is considered a genetic risk factor for Alzheimer's disease. The purpose of this paper is to demonstrate that single-nucleotide polymorphic markers (SNPs) have a causal relationship with quantitative PET imaging traits. Additionally, the classification of AD is based on the frequency of brain tissue variations in PET images using a combination of k-nearest-neighbor (KNN), support vector machine (SVM), linear discriminant analysis (LDA), and convolutional neural network (CNN) techniques. According to the results, the suggested SNPs appear to be associated with quantitative traits more strongly than the SNPs in the ApoE genes. In classification, the CNN achieves the highest accuracy, 91.1%. These results indicate that the KNN and CNN methods are beneficial in diagnosing AD, whereas LDA and SVM show lower accuracy.
20
Wang S, Chen H, Kong W, Ke F, Wei K. Identify Biomarkers of Alzheimer's Disease Based on Multi-task Canonical Correlation Analysis and Regression Model. J Mol Neurosci 2022; 72:1749-1763. [PMID: 35698015 DOI: 10.1007/s12031-022-02031-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2022] [Accepted: 05/21/2022] [Indexed: 11/29/2022]
Abstract
Imaging genetics uses imaging-derived measures as neuroanatomical phenotypes to evaluate gene single nucleotide polymorphisms and their effects on the structure and function of different brain regions. It plays a vital role in bridging the initial understanding of the genetic basis of brain structure and dysfunction. Sparse canonical correlation analysis (SCCA) has become a widespread technique in this field because of its powerful ability to identify bivariate relationships and perform feature selection. Since most traditional SCCA algorithms assume that the input features are independent, this method obviously cannot be used to analyze genetic image data. The MT-SCCA model is unsupervised and cannot identify the genotype-phenotype associations for diagnostic guidance. Meanwhile, a single biological clinical index cannot fully reflect the physiological process of a comprehensive disease. Therefore, it is necessary to find biomarkers that can reflect Alzheimer's disease and physiological functions that can more comprehensively reflect the development of the disease. This article uses a multi-task sparse canonical correlation analysis and regression (MT-SCCAR) model to combine the annual depression level total score (GDSCALE), clinical dementia assessment scale (GLOBAL CDR), functional activity questionnaire (FAQ), and neuropsychiatric Symptom Questionnaire (NPI-Q). These four clinical measures are used as compensation information and embedded in the algorithm in a linear regression manner. The model also shows superiority and robustness compared with traditional correlation analysis methods on real and simulated data. Meanwhile, compared with MT-SCCA, the model utilized in this paper obtains a higher gene-ROI weight and identifies clearer biomarkers, which provides a practical basis for the study of complex human disease pathology.
Affiliation(s)
- Shuaiqun Wang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai, 201306, People's Republic of China
- Huiqiu Chen
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai, 201306, People's Republic of China
- Wei Kong
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai, 201306, People's Republic of China
- Fengchun Ke
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai, 201306, People's Republic of China
- Kai Wei
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai, 201306, People's Republic of China
| |
Collapse
|
21
|
Wang Y, Fu Y, Luo X. Identification of Pathogenetic Brain Regions via Neuroimaging Data for Diagnosis of Autism Spectrum Disorders. Front Neurosci 2022; 16:900330. [PMID: 35655751 PMCID: PMC9152096 DOI: 10.3389/fnins.2022.900330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2022] [Accepted: 04/11/2022] [Indexed: 11/13/2022] Open
Abstract
Autism spectrum disorder (ASD) is a kind of neurodevelopmental disorder that often occurs in children and has a hidden onset. Patients usually have lagged development of communication ability and social behavior and thus suffer an unhealthy physical and mental state. Evidence has indicated that diseases related to ASD have commonalities in brain imaging characteristics. This study aims to study the pathogenesis of ASD based on brain imaging data to locate the ASD-related brain regions. Specifically, we collected the functional magnetic resonance image data of 479 patients with ASD and 478 normal subjects matched in age and gender and used a machine-learning framework named random support vector machine cluster to extract distinctive brain regions from the preprocessed data. According to the experimental results, compared with other existing approaches, the method used in this study can more accurately distinguish patients from normal individuals based on brain imaging data. At the same time, this study found that the development of ASD was highly correlated with certain brain regions, e.g., lingual gyrus, superior frontal gyrus, medial gyrus, insular lobe, and olfactory cortex. This study explores the effectiveness of a novel machine-learning approach in the study of ASD brain imaging and provides a reference brain area for the medical research and clinical treatment of ASD.
Affiliation(s)
- Yu Wang
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, China
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
  - Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
- Yu Fu
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, China
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
  - Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
  - *Correspondence: Yu Fu
- Xun Luo
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, China
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
  - Hunan Xiangjiang Artificial Intelligence Academy, Changsha, China
|
22
|
Xin Y, Sheng J, Miao M, Wang L, Yang Z, Huang H. A review of imaging genetics in Alzheimer's disease. J Clin Neurosci 2022; 100:155-163. [PMID: 35487021 DOI: 10.1016/j.jocn.2022.04.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/01/2022] [Revised: 03/01/2022] [Accepted: 04/15/2022] [Indexed: 01/18/2023]
Abstract
Determining the association between genetic variation and phenotype is a key step in studying the mechanism of Alzheimer's disease (AD), laying the foundation for studying drug therapies and biomarkers. AD is the most common type of dementia in the aged population. At present, three early-onset AD genes (APP, PSEN1, PSEN2) and one late-onset AD susceptibility gene, apolipoprotein E (APOE), have been determined. However, the pathogenesis of AD remains unknown. Imaging genetics, an emerging interdisciplinary field, is able to reveal the complex mechanisms from the genetic level to human cognition and mental disorders via macroscopic intermediates. This paper reviews methods for establishing genotype-phenotype correlations, including sparse canonical correlation analysis, sparse reduced rank regression, sparse partial least squares and so on. We found that most research work did poorly in supervised learning and in exploring the nonlinear relationship between SNPs and QTs.
Affiliation(s)
- Yu Xin
  - College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
- Jinhua Sheng
  - College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
- Miao Miao
  - Beijing Hospital, Beijing 100730, China; National Center of Gerontology, Beijing 100730, China; Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing 100730, China
- Luyun Wang
  - College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China; Hangzhou Vocational & Technical College, Hangzhou, Zhejiang 310018, China
- Ze Yang
  - College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
- He Huang
  - College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
|
23
|
Bi XA, Zhou W, Luo S, Mao Y, Hu X, Zeng B, Xu L. Feature aggregation graph convolutional network based on imaging genetic data for diagnosis and pathogeny identification of Alzheimer's disease. Brief Bioinform 2022; 23:6572662. [PMID: 35453149 DOI: 10.1093/bib/bbac137] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/03/2022] [Revised: 03/15/2022] [Accepted: 03/23/2022] [Indexed: 12/30/2022] Open
Abstract
The roles of brain region activity and gene expression in the development of Alzheimer's disease (AD) remain unclear. Existing imaging genetic studies usually suffer from inefficient and inadequate data fusion. This study proposes a novel deep learning method to efficiently capture the development pattern of AD. First, we model the interaction between brain regions and genes as node-to-node feature aggregation in a brain region-gene network. Second, we propose a feature aggregation graph convolutional network (FAGCN) to transmit and update the node features. Compared with the standard graph convolutional procedure, we replace the adjacency matrix input with a weight matrix based on correlation analysis and consider common neighbor similarity to discover broader associations among nodes. Finally, we use a full-gradient saliency graph mechanism to score and extract the pathogenetic brain regions and risk genes. According to the results, FAGCN achieved the best performance among both traditional and cutting-edge methods and extracted AD-related brain regions and genes, providing theoretical and methodological support for the research of related diseases.
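The aggregation step described above can be sketched as follows. This is only one plausible reading, not the published FAGCN: the abstract does not say how the correlation weights and the common-neighbor similarity are combined, so the additive mix controlled by `alpha`, the binarisation `threshold`, the row normalisation, and the function names are all assumptions.

```python
import numpy as np

def aggregation_weights(signals, threshold=0.3, alpha=0.5):
    """Build a row-normalised weight matrix from node signals of shape (n_nodes, n_obs).

    Assumed recipe: absolute Pearson correlation plus a scaled count of
    common neighbours on the thresholded correlation graph.
    """
    corr = np.abs(np.corrcoef(signals))        # (n_nodes, n_nodes), rows are nodes
    np.fill_diagonal(corr, 0.0)
    adj = (corr > threshold).astype(float)     # binary graph used only for neighbour counts
    common = adj @ adj                         # common[i, j] = number of shared neighbours
    np.fill_diagonal(common, 0.0)
    if common.max() > 0:
        common = common / common.max()         # scale counts to [0, 1]
    weights = corr + alpha * common + np.eye(len(corr))   # identity adds self-loops
    return weights / weights.sum(axis=1, keepdims=True)

def aggregate(features, weights, theta):
    """One propagation step: weighted neighbour average, linear map, then ReLU."""
    return np.maximum(weights @ features @ theta, 0.0)
```

For example, with `signals` of shape `(6, 50)`, node features of shape `(6, 4)`, and a hypothetical parameter matrix `theta` of shape `(4, 3)`, `aggregate` returns updated node features of shape `(6, 3)`.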
Affiliation(s)
- Xia-An Bi
  - Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, and the College of Information Science and Engineering in Hunan Normal University, P.R. China
- Wenyan Zhou
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Sheng Luo
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Yuhua Mao
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Xi Hu
  - College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Bin Zeng
  - Hunan Youdao Information Technology Co., Ltd, P.R. China
- Luyun Xu
  - College of Business in Hunan Normal University, P.R. China
|
24
|
Zhang J, Wang H, Zhao Y, Guo L, Du L. Identification of multimodal brain imaging association via a parameter decomposition based sparse multi-view canonical correlation analysis method. BMC Bioinformatics 2022; 23:128. [PMID: 35413798 PMCID: PMC9006414 DOI: 10.1186/s12859-022-04669-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/04/2022] [Accepted: 04/04/2022] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND With the development of noninvasive imaging technology, collecting different imaging measurements of the same brain has become increasingly easy. These multimodal imaging data carry complementary information about the same brain, with specific and shared information intertwined. Within these multimodal data, it is essential to separate the specific information from the shared information, since doing so helps to comprehensively characterize brain diseases. Because most existing methods are not suited to this task, in this paper we propose a parameter decomposition based sparse multi-view canonical correlation analysis (PDSMCCA) method. PDSMCCA can identify both modality-shared and -specific information in multimodal data, leading to an in-depth understanding of the complex pathology of brain disease. RESULTS Compared with the SMCCA method, our method obtains higher correlation coefficients and better canonical weights on both synthetic data and real neuroimaging data. This indicates that, coupled with modality-shared and -specific feature selection, PDSMCCA improves multi-view association identification and shows meaningful feature selection capability with desirable interpretation. CONCLUSIONS The novel PDSMCCA confirms that parameter decomposition is a suitable strategy for identifying both modality-shared and -specific imaging features. The multimodal association and the diverse information of multimodal imaging data enable us to better understand brain diseases such as Alzheimer's disease.
Affiliation(s)
- Jin Zhang
  - School of Automation, Northwestern Polytechnical University, Xi'an, China
- Huiai Wang
  - School of Automation, Northwestern Polytechnical University, Xi'an, China
- Ying Zhao
  - School of Automation, Northwestern Polytechnical University, Xi'an, China
- Lei Guo
  - School of Automation, Northwestern Polytechnical University, Xi'an, China
- Lei Du
  - School of Automation, Northwestern Polytechnical University, Xi'an, China
|
25
|
Bi XA, Xing Z, Zhou W, Li L, Xu L. Pathogeny Detection for Mild Cognitive Impairment via Weighted Evolutionary Random Forest with Brain Imaging and Genetic Data. IEEE J Biomed Health Inform 2022; 26:3068-3079. [PMID: 35157601 DOI: 10.1109/jbhi.2022.3151084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Abstract
Medical imaging technology and gene sequencing technology have long been widely used to analyze the pathogenesis and make precise diagnoses of mild cognitive impairment (MCI). However, few studies fuse radiomics data with genomics data to make full use of the complementarity between different omics to detect pathogenic factors of MCI. This paper performs multimodal fusion analysis based on functional magnetic resonance imaging (fMRI) data and single nucleotide polymorphism (SNP) data of MCI patients. Specifically, the fusion features of samples are first constructed by applying correlation analysis methods to the sequence information of regions of interest (ROIs) and digitalized gene sequences. Then, by introducing a weighted evolution strategy into ensemble learning, a novel weighted evolutionary random forest (WERF) model is built to eliminate the inefficient features. With the help of WERF, an overall multimodal data analysis framework is established to effectively identify MCI patients and extract pathogenic factors. The superior performance of the framework is verified on data from MCI patients in the ADNI database, in comparison with some existing popular methods. Our study has great potential to be an effective tool for detecting the pathogenic factors of MCI.
|
26
|
Wang W, Kong W, Wang S, Wei K. Detecting Biomarkers of Alzheimer's Disease Based on Multi-constrained Uncertainty-Aware Adaptive Sparse Multi-view Canonical Correlation Analysis. J Mol Neurosci 2022; 72:841-865. [PMID: 35080765 DOI: 10.1007/s12031-021-01963-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Accepted: 12/29/2021] [Indexed: 12/01/2022]
Abstract
Imaging genetics mainly explores the pathogenesis of Alzheimer's disease (AD) by studying the relationship between genetic data (such as SNPs, gene expression data, and DNA methylation) and imaging data (such as structural MRI (sMRI), fMRI, and PET). Most existing research on brain imaging genomics uses two-way or three-way bi-multivariate methods to explore the correlation between genes and brain imaging. However, many of these methods are still affected by gradient domination or cannot take into account the effect of feature redundancy on the results, so neither the canonical correlation coefficient nor the running speed improves significantly. To solve these problems, this paper proposes a multi-constrained uncertainty-aware adaptive sparse multi-view canonical correlation analysis method (MC-unAdaSMCCA) to explore associations among SNPs, gene expression data, and sMRI. Building on traditional unAdaSMCCA, orthogonal constraints are imposed on the weights of the three data features through linear programming, which reduces the redundancy of feature weights to improve the correlation between the data and reduces the complexity of the algorithm to significantly speed up the program. Three adaptive sparse multi-view canonical correlation analysis methods are used as benchmarks on both real neuroimaging data and synthetic data. Compared with these three methods, the proposed method obtains better or comparable canonical correlation coefficients and canonical weights. Moreover, the experimental results show that MC-unAdaSMCCA can not only identify biomarkers related to AD and mild cognitive impairment (MCI), but also has a strong ability to resist noise and process high-dimensional data. Therefore, the proposed method provides a reliable approach for multi-modal imaging genetics research.
Affiliation(s)
- Wenbo Wang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
- Wei Kong
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
- Shuaiqun Wang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
- Kai Wei
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave., Shanghai, 201306, People's Republic of China
|
27
|
Bao J, Wen Z, Kim M, Saykin AJ, Thompson PM, Zhao Y, Shen L. Identifying imaging genetic associations via regional morphometricity estimation. Pac Symp Biocomput 2022; 27:97-108. [PMID: 34890140 PMCID: PMC8730533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Brain imaging genetics is an emerging research field aiming to reveal the genetic basis of brain traits captured by imaging data. Inspired by heritability analysis, the concept of morphometricity was recently introduced to assess trait association with whole brain morphology. In this study, we extend the concept of morphometricity from its original definition at the whole brain level to a more focal level based on a region of interest (ROI). We propose a novel framework to identify the SNP-ROI association via regional morphometricity estimation of each studied single nucleotide polymorphism (SNP). We perform an empirical study on the structural MRI and genotyping data from a landmark Alzheimer's disease (AD) biobank and obtain promising results. Our findings indicate that the AD-related SNPs have higher overall regional morphometricity estimates than the SNPs not yet related to AD. This observation suggests that the variance of AD SNPs can be explained more by regional morphometric features than that of non-AD SNPs, supporting the value of imaging traits as targets in studying AD genetics. Also, we identified 11 ROIs in which the AD/non-AD status of SNPs and the significance of the corresponding morphometricity estimates show strong dependency. Supplementary motor area (SMA) and dorsolateral prefrontal cortex (DPC) are enriched by these ROIs. Our results also demonstrate that using all the detailed voxel-level measures within the ROI to incorporate morphometric information outperforms using only a single average ROI measure, and thus provides improved power to detect imaging genetic associations.
Affiliation(s)
- Jingxuan Bao
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Zixuan Wen
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Mansu Kim
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Andrew J. Saykin
  - Indiana Alzheimer Disease Center, Department of Radiology and Imaging Sciences Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Paul M. Thompson
  - Imaging Genetics Center, Stevens Institute for Neuroimaging and Informatics University of Southern California School of Medicine, Marina del Rey, CA 90292, USA
- Yize Zhao
  - Department of Biostatistics Yale University School of Public Health, New Haven, CT 06511, USA
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
|
28
|
Bao J, Wen Z, Kim M, Zhao X, Lee BN, Jung SH, Davatzikos C, Saykin AJ, Thompson PM, Kim D, Zhao Y, Shen L. Identifying highly heritable brain amyloid phenotypes through mining Alzheimer's imaging and sequencing biobank data. Pac Symp Biocomput 2022; 27:109-120. [PMID: 34890141 PMCID: PMC8730532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Brain imaging genetics, an emerging and rapidly growing research field, studies the relationship between genetic variations and brain imaging quantitative traits (QTs) to gain new insights into the phenotypic characteristics and genetic mechanisms of the brain. Heritability is an important measurement to quantify the proportion of the observed variance in an imaging QT that is explained by genetic factors, and can often be used to prioritize brain QTs for subsequent imaging genetic association studies. Most existing studies define regional imaging QTs using predefined brain parcellation schemes such as the automated anatomical labeling (AAL) atlas. However, the power to dissect genetic underpinnings under QTs defined in such an unsupervised fashion could be negatively affected by heterogeneity within the regions in the partition. To bridge this gap, we propose a novel method to define highly heritable brain regions. Based on voxelwise heritability estimates, we extract brain regions containing spatially connected voxels with high heritability. We perform an empirical study on the amyloid imaging and whole genome sequencing data from a landmark Alzheimer's disease biobank and demonstrate that the regions defined by our method have much higher estimated heritabilities than the regions defined by the AAL atlas. Our proposed method refines the imaging endophenotype constructions in light of their genetic dissection, and yields more powerful imaging QTs for subsequent detection of genetic risk factors along with better interpretability.
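The region-extraction idea (grouping spatially connected voxels whose heritability is high) can be illustrated with a simple connected-component search. The sketch below assumes a 3D array of voxelwise heritability estimates, a fixed cut-off, and 6-connectivity; the abstract does not state the authors' threshold, connectivity, or minimum region size, so `threshold`, `min_size`, and the function name are hypothetical.

```python
from collections import deque
import numpy as np

# Face-sharing (6-connected) neighbours in a 3D voxel grid.
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def heritable_regions(h2_map, threshold=0.5, min_size=2):
    """Return connected groups of voxels whose heritability is >= threshold.

    Each region is a sorted list of (x, y, z) index tuples; regions smaller
    than min_size voxels are discarded.
    """
    mask = np.asarray(h2_map) >= threshold
    visited = np.zeros(mask.shape, dtype=bool)
    regions = []
    for seed in zip(*np.nonzero(mask)):
        if visited[seed]:
            continue
        visited[seed] = True
        queue, region = deque([seed]), []
        while queue:                           # breadth-first flood fill from the seed
            voxel = queue.popleft()
            region.append(tuple(int(c) for c in voxel))
            for d in OFFSETS:
                nb = tuple(c + o for c, o in zip(voxel, d))
                inside = all(0 <= c < s for c, s in zip(nb, mask.shape))
                if inside and mask[nb] and not visited[nb]:
                    visited[nb] = True
                    queue.append(nb)
        if len(region) >= min_size:
            regions.append(sorted(region))
    return regions
```

Each returned region could then be summarised (for example, by averaging the imaging measure over its voxels) to form a candidate QT; that summarisation step is likewise an assumption rather than something stated in the abstract.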
Affiliation(s)
- Jingxuan Bao
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Zixuan Wen
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Mansu Kim
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Xiwen Zhao
  - Department of Biostatistics, Yale University School of Public Health, New Haven, CT 06511, USA
- Brian N. Lee
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Sang-Hyuk Jung
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Christos Davatzikos
  - Center for Biomedical Image Computing and Analytics, Department of Radiology University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Andrew J. Saykin
  - Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Paul M. Thompson
  - Imaging Genetics Center, Stevens Institute for Neuroimaging and Informatics University of Southern California School of Medicine, Marina del Rey, CA 90292, USA
- Dokyoon Kim
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Yize Zhao
  - Department of Biostatistics, Yale University School of Public Health, New Haven, CT 06511, USA
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
|
29
|
Wang S, Wu X, Wei K, Kong W. An Improved Fusion Paired Group Lasso Structured Sparse Canonical Correlation Analysis Based on Brain Imaging Genetics to Identify Biomarkers of Alzheimer’s Disease. Front Aging Neurosci 2022; 13:817520. [PMID: 35069181 PMCID: PMC8770861 DOI: 10.3389/fnagi.2021.817520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/18/2021] [Accepted: 12/14/2021] [Indexed: 01/01/2023] Open
Abstract
Brain imaging genetics can reveal the complicated relationship between genetic factors and the structure or function of the human brain. It has therefore become an important research topic and attracted increasing attention from scholars. The structured sparse canonical correlation analysis (SCCA) model has been widely used to identify the association between brain image data and genetic data in imaging genetics. To investigate the intricate genetic basis of brain imaging phenotypes, many SCCA methods incorporating different structures of interest have appeared. For example, some models use the group lasso penalty, and some use the fused lasso or the graph/network guided fused lasso for feature selection. However, prior knowledge may not be completely available, and the group lasso methods have limited capabilities in practical applications. The graph/network guided approaches can use sample correlation to define constraints, thereby overcoming this problem. Unfortunately, they also have limitations: the graph/network guided methods are sensitive to the sign of the sample correlation of the data, which affects the stability of the model. To improve the efficiency and stability of SCCA, a sparse canonical correlation analysis model with GraphNet regularization (FGLGNSCCA) is proposed in this manuscript. Based on the FGLSCCA model, the GraphNet regularization penalty is imposed in our study and an optimization algorithm is presented to optimize the model. Structural magnetic resonance imaging (sMRI) and gene expression data are used in this study to find the genotype and characteristics of brain regions associated with Alzheimer’s disease (AD). Experimental results show that the new FGLGNSCCA model is superior or equivalent to traditional methods on both synthetic and real neuroimaging genetics data. Compared with other multivariate methods, it selects essential features more effectively, identifies significant canonical correlation coefficients, and captures more significant canonical weight patterns, which demonstrates its excellent ability to find biologically important imaging genetic relations.
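For orientation, a GraphNet-style penalty is commonly written as an L1 term plus a graph-Laplacian quadratic term, lam1 * ||w||_1 + lam2 * w^T L w, where L = D - A and w^T L w equals the sum over edges of a_ij * (w_i - w_j)^2. The abstract does not give the exact form used in FGLGNSCCA, so the sketch below assumes this common form; `lam1`, `lam2`, and the function names are illustrative.

```python
import numpy as np

def graph_laplacian(adjacency):
    """Unnormalised Laplacian L = D - A of a symmetric, non-negative adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    return np.diag(adjacency.sum(axis=1)) - adjacency

def graphnet_penalty(w, adjacency, lam1, lam2):
    """Assumed GraphNet form: lam1 * ||w||_1 + lam2 * w^T L w."""
    w = np.asarray(w, dtype=float)
    sparsity = np.abs(w).sum()                        # drives individual weights to zero
    smoothness = w @ graph_laplacian(adjacency) @ w   # penalises differences across edges
    return lam1 * sparsity + lam2 * smoothness
```

As a worked check on a three-node chain with weights (1, 1, 3): the L1 term is 5 and the Laplacian term is (1-1)^2 + (1-3)^2 = 4, so with lam1 = 0.1 and lam2 = 0.5 the penalty is 0.5 + 2.0 = 2.5. Because the quadratic term depends only on squared differences weighted by non-negative edge weights, it does not depend on the sign of a sample correlation, which is the stability concern raised above.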
|
30
|
Associating brain imaging phenotypes and genetic in Alzheimer's disease via JSCCA approach with autocorrelation constraints. Med Biol Eng Comput 2021; 60:95-108. [PMID: 34714488 DOI: 10.1007/s11517-021-02439-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2020] [Accepted: 09/02/2021] [Indexed: 10/20/2022]
Abstract
Imaging genetics research can explore the potential correlation between imaging and genomics. Most association analysis methods cannot effectively use the prior knowledge of the original data. We therefore add the prior knowledge of each original dataset to mine more effective biomarkers. The study of imaging genetics based on sparse canonical correlation analysis (SCCA) helps to mine the potential biomarkers of neurological diseases. To improve the performance and interpretability of SCCA, we propose a penalty method based on the autocorrelation matrix for discovering the possible biological mechanism between single nucleotide polymorphism (SNP) variations and brain region changes in Alzheimer's disease (AD). The added penalty allows the proposed algorithm to analyze the correlation between features of different modalities. The proposed algorithm obtains more biologically interpretable ROIs and SNPs that are significantly related to AD, and it has better anti-noise performance. Compared with other algorithms (JCB-SCCA, JSNMNMF), the proposed algorithm maintains a stronger correlation with the ground truth even when the noise is larger. We then feed the regions of interest (ROIs) selected by the three algorithms into an SVM classifier, where the proposed algorithm achieves higher classification accuracy. We also apply ridge regression to the SNPs selected by the three algorithms and four AD risk ROIs, where the proposed algorithm has a smaller root mean square error (RMSE). These results show that the proposed algorithm performs well in association recognition and feature selection. Furthermore, it selects important features more stably, improving the clinical diagnosis of new potential biomarkers.
|
31
|
Identifying Biomarkers of Alzheimer's Disease via a Novel Structured Sparse Canonical Correlation Analysis Approach. J Mol Neurosci 2021; 72:323-335. [PMID: 34570360 DOI: 10.1007/s12031-021-01915-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/28/2021] [Accepted: 09/09/2021] [Indexed: 02/05/2023]
Abstract
Using correlation analysis to study the potential connection between brain genetics and imaging has become an effective method to understand neurodegenerative diseases. Sparse canonical correlation analysis (SCCA) makes it possible to study high-dimensional genetic information. The traditional SCCA methods can only process single-modal genetic and image data, which to some extent weakens the close connection of the brain's biological network. In some recently proposed multimodal SCCA methods, due to the limitations of the penalty terms, the pre-processed data needs to be further filtered to make the dimensions uniform, which may destroy the potential association of data within the same modality. In this research, in order to combine data across modalities while ensuring that the chain relationship or graph network relationship within the same modality is not destroyed, the original generalized fused lasso penalty was replaced with the fused pairwise group lasso (FGL) and the graph-guided pairwise group lasso (GGL) based on the method of joint sparse canonical correlation analysis (JSCCA). We used prior knowledge to construct a supervised bivariate learning model and used linear regression to select imaging quantitative traits (QTs) that are strongly correlated with the Mini-Mental State Examination (MMSE) scores. Compared with FGL-SCCA, the model we constructed obtained a higher gene-ROI correlation coefficient and identified more significant biomarkers, providing a theoretical basis for further understanding the complex pathology of neurodegenerative diseases.
|
32
|
Integration of Imaging Genomics Data for the Study of Alzheimer's Disease Using Joint-Connectivity-Based Sparse Nonnegative Matrix Factorization. J Mol Neurosci 2021; 72:255-272. [PMID: 34410569 DOI: 10.1007/s12031-021-01888-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/01/2021] [Accepted: 07/06/2021] [Indexed: 10/20/2022]
Abstract
Imaging genetics reveals the connection between microscopic genetics and macroscopic imaging, enabling the identification of disease biomarkers. In this work, we make full use of prior knowledge that has significant reference value for investigating the correlation between the brain and genetics to explore more biologically meaningful biomarkers. We propose joint-connectivity-based sparse nonnegative matrix factorization (JCB-SNMF). The algorithm simultaneously projects structural magnetic resonance imaging (sMRI), single-nucleotide polymorphism (SNP), and gene expression data onto a common feature space, where heterogeneous variables with large coefficients in the same projection direction form a common module. In addition, the connectivity information for each region of the brain and genetic data are added as prior knowledge to identify regions of interest (ROIs), SNPs, and gene-related risks related to Alzheimer's disease (AD) patients. GraphNet regularization increases the anti-noise performance of the algorithm and the biological interpretability of the results. The simulation results show that compared with other NMF-based algorithms (JNMF, JSNMNMF), JCB-SNMF has better anti-noise performance and can identify and predict biomarkers closely related to AD from significant modules. By constructing a protein-protein interaction (PPI) network, we identified SF3B1, RPS20, and RBM14 as potential biomarkers of AD. We also found some significant SNP-ROI and gene-ROI pairs. Among them, two SNPs (rs4472239 and rs11918049) and three genes (KLHL8, ZC3H11A, and OSGEPL1) may have effects on the gray matter volume of multiple brain regions. This model provides a new way to further integrate multimodal imaging genetic data to identify complex disease association patterns.
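The shared-projection idea can be illustrated with plain joint NMF, in which several nonnegative data matrices with the same samples share one basis matrix W and each modality gets its own coefficient matrix. The sketch below uses standard multiplicative updates for the Frobenius loss and omits the sparsity, connectivity priors, and GraphNet terms that distinguish JCB-SNMF; the function name, `rank`, and the small `eps` guard are assumptions.

```python
import numpy as np

def joint_nmf(mats, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor each X_i (n_samples x p_i) as W @ H_i with one shared nonnegative W.

    Returns W, the list of H_i, and the total squared reconstruction error
    recorded after every iteration.
    """
    rng = np.random.default_rng(seed)
    n = mats[0].shape[0]
    W = rng.random((n, rank)) + eps
    Hs = [rng.random((rank, X.shape[1])) + eps for X in mats]
    history = []
    for _ in range(n_iter):
        # Shared basis update pools the evidence from every modality.
        num = sum(X @ H.T for X, H in zip(mats, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W *= num / den
        # Modality-specific coefficient updates.
        Hs = [H * (W.T @ X) / (W.T @ W @ H + eps) for X, H in zip(mats, Hs)]
        history.append(sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(mats, Hs)))
    return W, Hs, history
```

Under this sketch, a "module" for component k could be read off by selecting, in each H_i, the variables with the largest coefficients in row k; the specific selection rule is not given in the abstract.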
|
33
|
Bi XA, Zhou W, Li L, Xing Z. Detecting Risk Gene and Pathogenic Brain Region in EMCI Using a Novel GERF Algorithm Based on Brain Imaging and Genetic Data. IEEE J Biomed Health Inform 2021; 25:3019-3028. [PMID: 33750717 DOI: 10.1109/jbhi.2021.3067798] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
Fusion analysis of disease-related multi-modal data is becoming increasingly important for illuminating the pathogenesis of complex brain diseases. However, owing to the small sample size and high dimensionality of multi-modal data, current machine learning methods do not achieve fully accurate and reliable fusion feature selection. In this paper, we propose a genetic-evolutionary random forest (GERF) algorithm to discover the risk genes and disease-related brain regions of early mild cognitive impairment (EMCI) based on genetic data and resting-state functional magnetic resonance imaging (rs-fMRI) data. A classical correlation analysis method is used to explore the association between brain regions and genes, and fusion features are constructed. The genetic-evolutionary idea is introduced to enhance classification performance and to extract the optimal features effectively. The proposed GERF algorithm is evaluated on the public Alzheimer's Disease Neuroimaging Initiative (ADNI) database, and the results show that the algorithm achieves satisfactory classification accuracy in small-sample learning. Moreover, we compare the GERF algorithm with other methods to demonstrate its superiority. Furthermore, we propose an overall framework for detecting pathogenic factors, which can be applied accurately and efficiently to the multi-modal data analysis of EMCI and can be extended to other diseases. This work provides a novel insight for the early diagnosis and clinicopathologic analysis of EMCI, which may help clinicians control further deterioration of the disease and target transcranial magnetic stimulation more accurately.
|
34
|
Du L, Zhang J, Liu F, Wang H, Guo L, Han J, Alzheimer's Disease Neuroimaging Initiative. Identifying associations among genomic, proteomic and imaging biomarkers via adaptive sparse multi-view canonical correlation analysis. Med Image Anal 2021; 70:102003. [PMID: 33735757 DOI: 10.1016/j.media.2021.102003] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/29/2020] [Revised: 02/10/2021] [Accepted: 02/15/2021] [Indexed: 12/13/2022]
Abstract
To uncover the genetic underpinnings of brain disorders, brain imaging genomics usually jointly analyzes genetic variations and imaging measurements. Meanwhile, other biomarkers such as proteomic expressions can also carry valuable complementary information. Therefore, it is necessary yet challenging to investigate the underlying relationships among genetic variations, proteomic expressions, and neuroimaging measurements, which may yield new insights into the pathogenesis of brain disorders. Given multiple types of biomarkers, using sparse multi-view canonical correlation analysis (SMCCA) and its variants to identify the multi-way associations is straightforward. However, due to the gradient domination issue caused by the naive fusion of multiple SCCA objectives, SMCCA is suboptimal. In this paper, we propose two adaptive SMCCA (AdaSMCCA) methods, i.e., the robustness-aware AdaSMCCA and the uncertainty-aware AdaSMCCA, to analyze the complicated associations among genetic, proteomic, and neuroimaging biomarkers. We also impose a data-driven feature grouping penalty on the genetic data with the aim of uncovering the joint inheritance of neighboring genetic variations. An efficient optimization algorithm, which is guaranteed to converge, is provided. Using two state-of-the-art SMCCA methods as benchmarks, we evaluated the robustness-aware AdaSMCCA and the uncertainty-aware AdaSMCCA on both synthetic data and real neuroimaging, proteomics, and genetic data. Both proposed methods obtained higher associations and cleaner canonical weight profiles than the comparison methods, indicating their promising capability for association identification and feature selection. In addition, the subsequent analysis showed that the identified biomarkers were related to Alzheimer's disease, demonstrating the power of our methods in identifying multi-way bi-multivariate associations among multiple heterogeneous biomarkers.
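For orientation, the "naive fusion of multiple SCCA objectives" mentioned above can be written in a commonly used form of the SMCCA problem. The notation here is an assumption made for illustration and is not taken from the paper: $X_k \in \mathbb{R}^{n \times p_k}$ is the $k$-th data view, $u_k$ its canonical weight vector, $\Omega_k$ a sparsity-inducing penalty, and $c_k$ its budget.

```latex
% Conventional SMCCA: all pairwise SCCA objectives are summed with equal weight.
\begin{aligned}
\max_{u_1,\dots,u_K}\quad & \sum_{1 \le i < j \le K} u_i^{\top} X_i^{\top} X_j u_j \\
\text{s.t.}\quad & \lVert X_k u_k \rVert_2^2 \le 1, \qquad \Omega_k(u_k) \le c_k, \qquad k = 1,\dots,K .
\end{aligned}

% A generic reweighted variant with hypothetical pairwise weights \omega_{ij} \ge 0:
\max_{u_1,\dots,u_K}\ \sum_{1 \le i < j \le K} \omega_{ij}\, u_i^{\top} X_i^{\top} X_j u_j
\quad \text{under the same constraints.}
```

Because the pairwise terms in the first problem are summed with equal weight, the pair with the largest magnitude can dominate the gradient with respect to each $u_k$. The second line only indicates where adaptive weights would enter; how the robustness-aware and uncertainty-aware AdaSMCCA variants actually set them is not described in the abstract and is not reproduced here.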
Affiliation(s)
- Lei Du
- School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Jin Zhang
- School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Fang Liu
- School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Huiai Wang
- School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Lei Guo
- School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Junwei Han
- School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
|
35
|
Du L, Liu F, Liu K, Yao X, Risacher SL, Han J, Saykin AJ, Shen L. Associating Multi-Modal Brain Imaging Phenotypes and Genetic Risk Factors via a Dirty Multi-Task Learning Method. IEEE Trans Med Imaging 2020; 39:3416-3428. [PMID: 32746095 PMCID: PMC7705646 DOI: 10.1109/tmi.2020.2995510] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 05/03/2023]
Abstract
Brain imaging genetics, which integrates genetic variations and brain structures or functions to study the genetic basis of brain disorders, is becoming increasingly important in brain science. The multi-modal imaging data collected by different technologies, each measuring the same brain in a different way, might carry complementary information. Unfortunately, we do not know the extent to which the phenotypic variance is shared among multiple imaging modalities, which in turn might trace back to the complex genetic mechanism. In this paper, we propose a novel dirty multi-task sparse canonical correlation analysis (SCCA) to study imaging genetic problems involving multi-modal brain imaging quantitative traits (QTs). The proposed method takes advantage of multi-task learning and parameter decomposition. It can identify not only the imaging QTs and genetic loci shared across multiple modalities, but also the modality-specific imaging QTs and genetic loci, exhibiting a flexible capability for identifying complex multi-SNP-multi-QT associations. Compared with the state-of-the-art multi-view SCCA and multi-task SCCA, the proposed method shows better or comparable canonical correlation coefficients and canonical weights on both synthetic and real neuroimaging genetic data. In addition, the identified modality-consistent biomarkers, as well as the modality-specific biomarkers, provide meaningful and interesting information, demonstrating that the dirty multi-task SCCA could be a powerful alternative method in multi-modal brain imaging genetics.
Affiliation(s)
- Lei Du
- School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Fang Liu
- School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Kefei Liu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Xiaohui Yao
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Shannon L. Risacher
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Junwei Han
- School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Andrew J. Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
|
36
|
Bi XA, Hu X, Xie Y, Wu H. A novel CERNNE approach for predicting Parkinson's Disease-associated genes and brain regions based on multimodal imaging genetics data. Med Image Anal 2020; 67:101830. [PMID: 33096519 DOI: 10.1016/j.media.2020.101830] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 04/26/2020] [Revised: 07/24/2020] [Accepted: 09/01/2020] [Indexed: 12/13/2022]
Abstract
The detection of Parkinson's disease (PD) and the analysis of its pathogenic factors have practical significance for diagnosis and treatment. However, traditional research paradigms are commonly based on a single type of neuroimaging data and therefore tend to overlook the complementarity within multimodal imaging genetics data. Existing studies also pay little attention to a comprehensive framework for patient detection and pathogenic factor analysis in PD. Based on functional magnetic resonance imaging (fMRI) data and single nucleotide polymorphism (SNP) data, a novel multimodal data analysis model for brain disease is proposed in this paper. First, according to the complementarity between the two types of data, a classical correlation analysis method is used to construct the fusion features of subjects. Second, based on artificial neural networks, a fusion feature analysis tool named clustering evolutionary random neural network ensemble (CERNNE) is designed. This method integrates multiple randomly constructed neural networks and uses a clustering evolution strategy to optimize the ensemble learner by adaptive selective integration, selecting the discriminative features for PD analysis and ensuring the generalization performance of the ensemble model. Combined with the data fusion scheme, CERNNE is used to form a multi-task analysis framework that recognizes PD patients and predicts PD-associated brain regions and genes. In the multimodal data experiment, the proposed framework shows better classification performance and a better ability to predict pathogenic factors, which provides a new perspective for the diagnosis of PD.
Affiliation(s)
- Xia-An Bi
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha 410081, China; College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
- Xi Hu
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha 410081, China; College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
- Yiming Xie
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha 410081, China; College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
- Hao Wu
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha 410081, China; College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China
|
37
|
Liu J, Sheng Y, Lan W, Guo R, Wang Y, Wang J. Improved ASD classification using dynamic functional connectivity and multi-task feature selection. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2020.07.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022]
|
38
|
Bi XA, Wu H, Xie Y, Zhang L, Luo X, Fu Y. The exploration of Parkinson's disease: a multi-modal data analysis of resting functional magnetic resonance imaging and gene data. Brain Imaging Behav 2020; 15:1986-1996. [PMID: 32990896 DOI: 10.1007/s11682-020-00392-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Accepted: 08/31/2020] [Indexed: 02/02/2023]
Abstract
Parkinson's disease (PD) is the most common chronic degenerative neurological movement disorder and a major threat to the health of the elderly. At present, research on PD is mainly based on single-modal data analysis, while fusion analysis of multi-modal data may provide more meaningful information for understanding the pathogenesis of PD. In this paper, 104 samples with resting functional magnetic resonance imaging (rfMRI) and gene data from the Parkinson's Progression Markers Initiative (PPMI) and Alzheimer's Disease Neuroimaging Initiative (ADNI) databases are used to predict pathological brain areas and risk genes related to PD. In the experiment, Pearson correlation analysis is used to fuse the gene and brain-area data into multi-modal sample features, and the clustering evolution random forest (CERF) method is applied to detect the discriminative genes and brain areas. The experimental results indicate that, compared with several existing advanced methods, the CERF method further improves the discrimination between PD patients and healthy controls, with a significant effect. More importantly, we find some interesting associations between brain areas and genes in PD patients. Based on these associations, we note that PD-related brain areas include the angular gyrus, thalamus, posterior cingulate gyrus and paracentral lobule, and that the risk genes mainly include C6orf10, HLA-DPB1 and HLA-DOA. These discoveries contribute significantly to the early prevention and clinical treatment of PD.
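The Pearson-correlation fusion step described above can be sketched as a cross-correlation matrix between brain-area features and gene features for one subject. This is a minimal sketch under assumptions the abstract does not confirm: it assumes each brain area and each gene has already been encoded as a numeric sequence of equal length `T`, and the function name `pearson_fusion_features` is hypothetical.

```python
import numpy as np

def pearson_fusion_features(roi_series, gene_series):
    """Pearson correlation between every brain-area series and every gene series.

    roi_series:  array of shape (n_rois, T)
    gene_series: array of shape (n_genes, T)
    Returns an (n_rois, n_genes) matrix of correlation coefficients.
    Rows with zero variance would produce NaN and should be filtered beforehand.
    """
    # Center each row, then scale it to unit length so that a dot product
    # between two rows equals their Pearson correlation.
    R = roi_series - roi_series.mean(axis=1, keepdims=True)
    G = gene_series - gene_series.mean(axis=1, keepdims=True)
    R = R / np.linalg.norm(R, axis=1, keepdims=True)
    G = G / np.linalg.norm(G, axis=1, keepdims=True)
    return R @ G.T
```

Flattening the returned matrix gives one feature vector per subject, which a downstream classifier such as a random forest could then consume; how the CERF method itself builds and selects these features is not specified in the abstract.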
Affiliation(s)
- Xia-An Bi
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, People's Republic of China; College of Information Science and Engineering, Hunan Normal University, Changsha, People's Republic of China
- Hao Wu
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, People's Republic of China; College of Information Science and Engineering, Hunan Normal University, Changsha, People's Republic of China
- Yiming Xie
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, People's Republic of China; College of Information Science and Engineering, Hunan Normal University, Changsha, People's Republic of China
- Lixia Zhang
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, People's Republic of China; College of Information Science and Engineering, Hunan Normal University, Changsha, People's Republic of China
- Xun Luo
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, People's Republic of China; College of Information Science and Engineering, Hunan Normal University, Changsha, People's Republic of China
- Yu Fu
- Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, People's Republic of China; College of Information Science and Engineering, Hunan Normal University, Changsha, People's Republic of China
|
39
|
Du L, Liu F, Liu K, Yao X, Risacher SL, Han J, Guo L, Saykin AJ, Shen L. Identifying diagnosis-specific genotype-phenotype associations via joint multitask sparse canonical correlation analysis and classification. Bioinformatics 2020; 36:i371-i379. [PMID: 32657360 PMCID: PMC7355274 DOI: 10.1093/bioinformatics/btaa434] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 01/24/2023] Open
Abstract
MOTIVATION Brain imaging genetics studies the complex associations between genotypic data such as single nucleotide polymorphisms (SNPs) and imaging quantitative traits (QTs). Neurodegenerative disorders usually exhibit diversity and heterogeneity, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype-phenotype associations. However, most existing SCCA methods are unsupervised and therefore unable to identify diagnosis-specific genotype-phenotype associations. RESULTS In this article, we propose a new joint multitask learning method, named MT-SCCALR, which combines the merits of both SCCA and logistic regression. MT-SCCALR learns the genotype-phenotype associations of multiple tasks jointly, with each task focusing on identifying one diagnosis-specific genotype-phenotype pattern. Meanwhile, MT-SCCALR can not only select relevant SNPs and imaging QTs for each diagnostic group alone, but also allows the selection of those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT-SCCALR yields better or similar canonical correlation coefficients and classification performance. In addition, it yields much more discriminative canonical weight patterns of interest than its competitors. This demonstrates the power and capability of MT-SCCALR in identifying diagnostically heterogeneous genotype-phenotype patterns, which would be helpful for understanding the pathophysiology of brain disorders. AVAILABILITY AND IMPLEMENTATION The software is publicly available at https://github.com/dulei323/MTSCCALR. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
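As background for the unsupervised SCCA baseline mentioned above, a generic L1-penalized sparse CCA can be sketched with alternating soft-thresholded updates on the cross-covariance matrix. This is a minimal sketch and not MT-SCCALR: it has no multitask structure and no logistic regression term, it treats the within-view covariances as identity (a common simplification), and the names `sparse_cca`, `lam_u`, and `lam_v` are hypothetical choices for this example.

```python
import numpy as np

def soft_threshold(a, lam):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def _unit(a):
    """Scale to unit length; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(a)
    return a / norm if norm > 0 else a

def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Generic sparse CCA for SNP matrix X (n x p) and QT matrix Y (n x q).

    Returns canonical weights u (p,), v (q,) and the correlation of X@u and Y@v.
    If the penalties are too large, every weight is thresholded to zero and
    the returned correlation is undefined (NaN).
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]  # cross-covariance between the two views
    # Initialize v with the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = _unit(soft_threshold(C @ v, lam_u))
        v = _unit(soft_threshold(C.T @ u, lam_v))
    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr
```

Nonzero entries of `u` and `v` mark the selected SNPs and imaging QTs; a supervised, diagnosis-specific variant such as the one described in the abstract would add label information on top of this basic scheme.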
Affiliation(s)
- Lei Du
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Fang Liu
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Kefei Liu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Xiaohui Yao
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Shannon L Risacher
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Junwei Han
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Lei Guo
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Andrew J Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
|