1
Song P, Li X, Yuan X, Pang L, Song X, Wang Y. Identifying frequency-dependent imaging genetic associations via hypergraph-structured multi-task sparse canonical correlation analysis. Comput Biol Med 2024; 171:108051. [PMID: 38335819 DOI: 10.1016/j.compbiomed.2024.108051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 01/03/2024] [Accepted: 01/26/2024] [Indexed: 02/12/2024]
Abstract
Identifying complex associations between genetic variations and imaging phenotypes is a challenging task in brain imaging genetics research. A previous study has shown that neuronal oscillations within distinct frequency bands are derived from frequency-dependent genetic modulation. It is therefore meaningful to explore frequency-dependent imaging genetic associations, which may give important insights into the pathogenesis of brain disorders. In this work, the hypergraph-structured multi-task sparse canonical correlation analysis (HS-MTSCCA) was developed to explore the associations between multi-frequency imaging phenotypes and single-nucleotide polymorphisms (SNPs). Specifically, we first created a hypergraph for the imaging phenotypes of each frequency and for the SNPs. Then, a new hypergraph-structured constraint was proposed to learn high-order relationships among features in each hypergraph, which can introduce biologically meaningful information into the model. The frequency-shared and frequency-specific imaging phenotypes and SNPs could be identified using the multi-task learning framework. We also proposed an optimization strategy to solve the model and demonstrated its convergence. The proposed method was evaluated on four simulation datasets and a real schizophrenia dataset. The experimental results on synthetic data showed that HS-MTSCCA outperforms the other competing methods according to canonical correlation coefficients, canonical weights, and cosine similarity. The results on real data showed that HS-MTSCCA could obtain superior canonical coefficients and canonical weights. Furthermore, the identified frequency-shared and frequency-specific biomarkers could provide more interesting and meaningful information, demonstrating that HS-MTSCCA is a powerful method for brain imaging genetics.
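The abstract does not give the form of the hypergraph-structured constraint. As a rough illustration of how high-order feature relationships can be encoded, the sketch below builds a k-nearest-neighbour hypergraph over features and the widely used normalized hypergraph Laplacian, L = I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2). The correlation-based neighbour rule, the unit hyperedge weights, and the function names are assumptions made for illustration, not the authors' construction.

```python
import numpy as np

def knn_hypergraph_incidence(X, k=3):
    """Incidence matrix H (features x hyperedges).

    Assumption: one hyperedge per feature, containing that feature and
    its k most strongly correlated features (absolute Pearson correlation).
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    p = corr.shape[0]
    H = np.zeros((p, p))
    for j in range(p):
        order = np.argsort(-corr[j])
        neighbours = [i for i in order if i != j][:k]
        H[neighbours, j] = 1.0
        H[j, j] = 1.0  # the centre feature always belongs to its own hyperedge
    return H

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian I - Dv^-1/2 H W De^-1 H^T Dv^-1/2."""
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)
    d_v = H @ w          # vertex degrees (weighted count of incident hyperedges)
    d_e = H.sum(axis=0)  # hyperedge degrees (number of member vertices)
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    theta = dv_inv_sqrt @ H @ np.diag(w / d_e) @ H.T @ dv_inv_sqrt
    return np.eye(n_vertices) - theta

def hypergraph_penalty(u, L):
    """Quadratic smoothness penalty u^T L u on a canonical weight vector u."""
    return float(u @ L @ u)
```

A penalty of this kind is small when features that share hyperedges receive similar (degree-normalized) weights, which is one common way to inject group-level structure into a sparse CCA objective.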
Affiliation(s)
- Peilun Song
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, 450001, Henan, China
- Xue Li
  - Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
- Xiuxia Yuan
  - Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
- Lijuan Pang
  - Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
- Xueqin Song
  - Department of Psychiatry, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450001, Henan, China; Biological Psychiatry International Joint Laboratory of Henan/Zhengzhou University, Zhengzhou, 450001, Henan, China
- Yaping Wang
  - School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, 450001, Henan, China
2
Wang M, Shao W, Huang S, Zhang D. Hypergraph-regularized multimodal learning by graph diffusion for imaging genetics based Alzheimer's Disease diagnosis. Med Image Anal 2023; 89:102883. [PMID: 37467641 DOI: 10.1016/j.media.2023.102883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Revised: 04/06/2023] [Accepted: 06/28/2023] [Indexed: 07/21/2023]
Abstract
Recent studies show that multi-modal data fusion techniques combining information from diverse sources are helpful for diagnosing and predicting complex brain disorders. However, most existing diagnosis methods simply employ a feature combination strategy for multiple imaging and genetic data, ignoring the imaging phenotypes associated with the risk gene information. To this end, we present a hypergraph-regularized multimodal learning by graph diffusion (HMGD) method for joint association learning and outcome prediction. Specifically, we first present a graph diffusion method for enhancing similarity measures among subjects derived from multi-modality phenotypes, which fully uses multiple input similarity graphs and integrates them into a unified graph with valuable geometric structures among different imaging phenotypes. Then, we employ the unified graph to represent the high-order similarity relationships among subjects, and enforce a hypergraph-regularized term to incorporate both inter- and cross-modality information for selecting the imaging phenotypes associated with the risk single nucleotide polymorphism (SNP). Finally, a multi-kernel support vector machine (MK-SVM) is adopted to fuse the phenotypic features selected from different modalities for the final diagnosis and prediction. The proposed approach is evaluated on brain imaging genetic data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) datasets. The results show that the proposed approach outperforms several competing algorithms, yields strong associations, and discovers significant, consistent, and robust ROIs across different imaging phenotypes associated with the genetic risk biomarkers to guide disease interpretation and prediction.
Affiliation(s)
- Meiling Wang
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China; Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
- Wei Shao
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China; Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
- Shuo Huang
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China; Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
- Daoqiang Zhang
  - College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China; Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing 211106, China
3
Du L, Zhang J, Zhao Y, Shang M, Guo L, Han J. inMTSCCA: An Integrated Multi-task Sparse Canonical Correlation Analysis for Multi-omic Brain Imaging Genetics. Genomics Proteomics Bioinformatics 2023; 21:396-413. [PMID: 37442417 PMCID: PMC10634656 DOI: 10.1016/j.gpb.2023.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2021] [Revised: 01/29/2023] [Accepted: 03/14/2023] [Indexed: 07/15/2023]
Abstract
Identifying genetic risk factors for Alzheimer's disease (AD) is an important research topic. To date, different endophenotypes, such as imaging-derived endophenotypes and proteomic expression-derived endophenotypes, have shown great value in uncovering risk genes compared to case-control studies. Biologically, a co-varying pattern of different omics-derived endophenotypes could result from a shared genetic basis. However, existing methods mainly focus on the effect of endophenotypes alone; the effect of cross-endophenotype (CEP) associations remains largely unexploited. In this study, we used both endophenotypes and their CEP associations of multi-omic data to identify genetic risk factors, and proposed two integrated multi-task sparse canonical correlation analysis (inMTSCCA) methods, i.e., pairwise endophenotype correlation-guided MTSCCA (pcMTSCCA) and high-order endophenotype correlation-guided MTSCCA (hocMTSCCA). pcMTSCCA employed pairwise correlations between magnetic resonance imaging (MRI)-derived, plasma-derived, and cerebrospinal fluid (CSF)-derived endophenotypes as an additional penalty. hocMTSCCA used high-order correlations among these multi-omic data for regularization. To identify genetic risk factors at individual and group levels, as well as altered endophenotypic markers, we introduced sparsity-inducing penalties for both models. We compared pcMTSCCA and hocMTSCCA with three related methods on both simulation and real (consisting of neuroimaging data, proteomic analytes, and genetic data) datasets. The results showed that our methods obtained better or comparable canonical correlation coefficients (CCCs) and better feature subsets than benchmarks. Most importantly, the identified genetic loci and heterogeneous endophenotypic markers showed high relevance. Therefore, jointly using multi-omic endophenotypes and their CEP associations is a promising way to reveal genetic risk factors.
The source code and manual of inMTSCCA are available at https://ngdc.cncb.ac.cn/biocode/tools/BT007330.
Affiliation(s)
- Lei Du
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Jin Zhang
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Ying Zhao
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Muheng Shang
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Lei Guo
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
- Junwei Han
  - Department of Intelligent Science and Technology, School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
4
Zhang X, Hao Y, Zhang J, Ji Y, Zou S, Zhao S, Xie S, Du L. A multi-task SCCA method for brain imaging genetics and its application in neurodegenerative diseases. Comput Methods Programs Biomed 2023; 232:107450. [PMID: 36905750 DOI: 10.1016/j.cmpb.2023.107450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 02/24/2023] [Accepted: 02/24/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND AND OBJECTIVES: In brain imaging genetics, multi-task sparse canonical correlation analysis (MTSCCA) is effective for studying the bi-multivariate associations between genetic variations such as single nucleotide polymorphisms (SNPs) and multi-modal imaging quantitative traits (QTs). However, most existing MTSCCA methods are neither supervised nor capable of distinguishing the shared patterns of multi-modal imaging QTs from the specific patterns.
METHODS: A new diagnosis-guided MTSCCA (DDG-MTSCCA) with parameter decomposition and a graph-guided pairwise group lasso penalty was proposed. Specifically, the multi-task modeling paradigm enables us to comprehensively identify risk genetic loci by jointly incorporating multi-modal imaging QTs. A regression sub-task was introduced to guide the selection of diagnosis-related imaging QTs. To reveal the diverse genetic mechanisms, parameter decomposition and different constraints were utilized to facilitate the identification of modality-consistent and modality-specific genotypic variations. In addition, a network constraint was added to identify meaningful brain networks. The proposed method was applied to synthetic data and two real neuroimaging data sets, from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Parkinson's Progression Marker Initiative (PPMI) databases respectively.
RESULTS: Compared with the competing methods, the proposed method exhibited higher or comparable canonical correlation coefficients (CCCs) and better feature selection results. In particular, in the simulation study, DDG-MTSCCA showed the best anti-noise ability and achieved the highest average hit rate, about 25% higher than MTSCCA. On the real data of Alzheimer's disease (AD) and Parkinson's disease (PD), our method obtained the highest average testing CCCs, about 40% to 50% higher than MTSCCA. In particular, our method could select more comprehensive feature subsets, and the top five SNPs and imaging QTs were all disease-related. The ablation experimental results also demonstrated the significance of each component in the model, i.e., the diagnosis guidance, parameter decomposition, and network constraint.
CONCLUSIONS: These results on simulated data and the ADNI and PPMI cohorts suggested the effectiveness and generalizability of our method in identifying meaningful disease-related markers. DDG-MTSCCA could be a powerful tool in brain imaging genetics, worthy of in-depth study.
Affiliation(s)
- Xin Zhang
  - Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Yipeng Hao
  - Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Jin Zhang
  - School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Yanuo Ji
  - Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Shihong Zou
  - Institute of Medical Research, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Shijie Zhao
  - School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Songyun Xie
  - School of Electronics and Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Lei Du
  - School of Automation, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
5
Kim M, Wu R, Yao X, Saykin AJ, Moore JH, Shen L. Identifying genetic markers enriched by brain imaging endophenotypes in Alzheimer's disease. BMC Med Genomics 2022; 15:168. [PMID: 35915443 PMCID: PMC9344647 DOI: 10.1186/s12920-022-01323-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/25/2022] [Accepted: 07/26/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Alzheimer's disease (AD) is a complex neurodegenerative disorder and the most common type of dementia. AD is characterized by a decline of cognitive function and brain atrophy, and is highly heritable, with estimated heritability ranging from 60 to 80%. The most straightforward and widely used strategy to identify the genetic basis of AD is to perform a genome-wide association study (GWAS) of the case-control diagnostic status. These GWAS have identified over 50 AD-related susceptibility loci. Recently, imaging genetics has emerged as a new field where brain imaging measures are studied as quantitative traits to detect genetic factors. Given that many imaging genetics studies did not involve the diagnostic outcome in the analysis, the identified imaging or genetic markers may not be related or specific to the disease outcome.
RESULTS: We propose a novel method to identify disease-related genetic variants enriched by imaging endophenotypes, which are the imaging traits associated with both genetic factors and disease status. Our analysis consists of three steps: (1) map the effects of a genetic variant (e.g., single nucleotide polymorphism or SNP) onto imaging traits across the brain using a linear regression model, (2) map the effects of a diagnosis phenotype onto imaging traits across the brain using a linear regression model, and (3) detect SNP-diagnosis association by correlating the SNP effects with the diagnostic effects on the brain-wide imaging traits. We demonstrate the promise of our approach by applying it to the Alzheimer's Disease Neuroimaging Initiative database. Among 54 AD-related susceptibility loci reported in prior large-scale AD GWAS, our approach identifies 41 from a much smaller study cohort, while the standard association approaches identify only two. The proposed imaging endophenotype enriched approach can thus reveal promising AD genetic variants undetectable using the traditional method.
CONCLUSION: We have proposed a novel method to identify AD genetic variants enriched by brain-wide imaging endophenotypes. This approach can not only boost detection power, but also reveal interesting biological pathways from genetic determinants to intermediate brain traits and to phenotypic AD outcomes.
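The three steps listed in this abstract can be sketched in a few lines. The version below is a deliberately simplified illustration: it uses single-predictor least squares with an intercept, omits covariates and any significance testing, and the function names and the choice of Pearson correlation in step 3 are assumptions rather than details confirmed by the abstract.

```python
import numpy as np

def ols_slopes(x, Y):
    """Least-squares slope of every column of Y regressed on x (with intercept)."""
    xc = x - x.mean()
    Yc = Y - Y.mean(axis=0)
    return xc @ Yc / (xc @ xc)

def snp_diagnosis_score(snp, diagnosis, imaging):
    """Correlate brain-wide SNP effects with brain-wide diagnosis effects.

    snp       : (n,) allele counts for one SNP
    diagnosis : (n,) diagnostic status (e.g. 0/1)
    imaging   : (n, q) imaging traits, one column per region or voxel
    """
    beta_snp = ols_slopes(snp, imaging)        # step 1: SNP effect map
    beta_dx = ols_slopes(diagnosis, imaging)   # step 2: diagnosis effect map
    return np.corrcoef(beta_snp, beta_dx)[0, 1]  # step 3: map-to-map correlation
```

A SNP whose effect map resembles the diagnosis effect map gets a score near +1 or -1, while a SNP acting on unrelated brain regions scores near zero; in practice each score would still need a permutation or parametric test before being called significant.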
Affiliation(s)
- Mansu Kim
  - Department of Artificial Intelligence, Catholic University of Korea, Bucheon, Republic of Korea
- Ruiming Wu
  - School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, USA
- Xiaohui Yao
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, USA
- Andrew J. Saykin
  - Indiana Alzheimer Disease Center and Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, USA
- Jason H. Moore
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, West Hollywood, USA
- Li Shen
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, USA
- for the Alzheimer's Disease Neuroimaging Initiative
  - Department of Artificial Intelligence, Catholic University of Korea, Bucheon, Republic of Korea
  - School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, USA
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, USA
  - Indiana Alzheimer Disease Center and Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, USA
  - Department of Computational Biomedicine, Cedars Sinai Medical Center, West Hollywood, USA
6
Kim M, Min EJ, Liu K, Yan J, Saykin AJ, Moore JH, Long Q, Shen L. Multi-task learning based structured sparse canonical correlation analysis for brain imaging genetics. Med Image Anal 2022; 76:102297. [PMID: 34871929 PMCID: PMC8792314 DOI: 10.1016/j.media.2021.102297] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/15/2021] [Revised: 09/08/2021] [Accepted: 10/29/2021] [Indexed: 02/03/2023]
Abstract
The advances in technologies for acquiring brain imaging and high-throughput genetic data allow the researcher to access a large amount of multi-modal data. Although the sparse canonical correlation analysis is a powerful bi-multivariate association analysis technique for feature selection, we are still facing major challenges in integrating multi-modal imaging genetic data and yielding biologically meaningful interpretation of imaging genetic findings. In this study, we propose a novel multi-task learning based structured sparse canonical correlation analysis (MTS2CCA) to deliver interpretable results and improve integration in imaging genetics studies. We perform comparative studies with state-of-the-art competing methods on both simulation and real imaging genetic data. On the simulation data, our proposed model has achieved the best performance in terms of canonical correlation coefficients, estimation accuracy, and feature selection accuracy. On the real imaging genetic data, our proposed model has revealed promising features of single-nucleotide polymorphisms and brain regions related to sleep. The identified features can be used to improve clinical score prediction using promising imaging genetic biomarkers. An interesting future direction is to apply our model to additional neurological or psychiatric cohorts such as patients with Alzheimer's or Parkinson's disease to demonstrate the generalizability of our method.
Affiliation(s)
- Mansu Kim
  - Department of Artificial Intelligence, Catholic University of Korea, Bucheon, Republic of Korea
- Eun Jeong Min
  - College of Medicine, Catholic University of Korea, Seoul, Republic of Korea
- Kefei Liu
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, PA, USA
- Jingwen Yan
  - School of Informatics and Computing, Indiana University, IN, USA
- Jason H Moore
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, PA, USA
- Qi Long
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, PA, USA
- Li Shen
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, PA, USA
7
Wang L, Kong W, Wang S. Detecting genetic associations with brain imaging phenotypes in Alzheimer's disease via a novel structured KCCA approach. J Bioinform Comput Biol 2021; 19:2150012. [PMID: 33950804 DOI: 10.1142/s0219720021500128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Neuroimaging genetics has become an important research topic since it can reveal complex associations between genetic variants (i.e., single nucleotide polymorphisms (SNPs)) and the structures or functions of the human brain. However, sparse representation methods are difficult to apply directly in the kernel feature space induced by existing kernel mappings, which makes it difficult to extend most existing sparse canonical correlation analysis (SCCA) methods to that space. To bridge this gap, we adopt a novel alternating projected gradient approach, the gradient KCCA (gradKCCA) model, to explore the intrinsic associations between genetic markers and imaging quantitative traits (QTs) of interest. Specifically, this model solves kernel canonical correlation analysis (KCCA) with an additional constraint that projection directions have pre-images in the original data space; a sparsity-inducing variant of the model is achieved by controlling the ℓ1-norm of the pre-images of the projection directions. We evaluate this model on the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort to discover the relationships between SNPs from the Alzheimer's disease (AD) risk gene APOE and imaging QTs extracted from structural magnetic resonance imaging (MRI) scans. Our results show that the algorithm not only outperforms the traditional KCCA method in terms of root mean square error (RMSE) and correlation coefficient (CC) but also identifies meaningful and relevant SNP biomarkers (e.g., rs157594 and rs405697), which are positively related to the right postcentral and right supramarginal brain regions in this study. The empirical results indicate its promising capability in revealing biologically meaningful neuroimaging genetics associations and improving the disease-related mechanistic understanding of AD.
Affiliation(s)
- Lei Wang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai 201306, P. R. China
- Wei Kong
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai 201306, P. R. China
- Shuaiqun Wang
  - College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai 201306, P. R. China
8
Du L, Liu K, Yao X, Risacher SL, Han J, Saykin AJ, Guo L, Shen L. Detecting genetic associations with brain imaging phenotypes in Alzheimer's disease via a novel structured SCCA approach. Med Image Anal 2020; 61:101656. [PMID: 32062154 DOI: 10.1016/j.media.2020.101656] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 11/23/2018] [Revised: 11/27/2019] [Accepted: 01/22/2020] [Indexed: 01/15/2023]
Abstract
Brain imaging genetics has become an important research topic since it can reveal complex associations between genetic factors and the structures or functions of the human brain. Sparse canonical correlation analysis (SCCA) is a popular bi-multivariate association identification method. To mine the complex genetic basis of brain imaging phenotypes, many SCCA methods have arisen that use a variety of norms to incorporate different structures of interest. They often use the group lasso penalty, the fused lasso penalty, or the graph/network guided fused lasso penalty. However, the group lasso methods have limited capability because prior knowledge is often incomplete or unavailable in real applications. The fused lasso and graph/network guided methods are sensitive to the sign of the sample correlation, which may be incorrectly estimated. In this paper, we introduce two new penalties to improve the fused lasso and the graph/network guided lasso penalties in structured sparse learning. We impose both penalties on the SCCA model and propose an optimization algorithm to solve it. The proposed SCCA method has a strong upper bound of grouping effects for both positively and negatively highly correlated variables. We show that, on both synthetic and real neuroimaging genetics data, the proposed SCCA method performs better than or comparably to the conventional methods using fused lasso or graph/network guided fused lasso. In particular, the proposed method identifies higher canonical correlation coefficients and captures clearer canonical weight patterns, demonstrating its promising capability in revealing biologically meaningful imaging genetic associations.
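For readers unfamiliar with the baseline these structured penalties build on, here is a minimal sketch of plain L1-penalized SCCA solved by alternating soft-thresholding. It is not the structured method proposed in this abstract: it assumes identity within-set covariance (a common simplification), uses only a lasso penalty, and the function names and default penalty levels are illustrative assumptions.

```python
import numpy as np

def soft_threshold(a, lam):
    """Elementwise soft-thresholding, the proximal operator of the L1 norm."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def _unit(a):
    """Scale to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = np.linalg.norm(a)
    return a / norm if norm > 0 else a

def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100, tol=1e-6):
    """One pair of sparse canonical weights (u, v) and the resulting correlation.

    X : (n, p) e.g. SNP matrix;  Y : (n, q) e.g. imaging QT matrix.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]  # cross-correlation matrix (p x q)
    v = np.linalg.svd(C, full_matrices=False)[2][0]  # leading right singular vector
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        u_new = _unit(soft_threshold(C @ v, lam_u))
        v_new = _unit(soft_threshold(C.T @ u_new, lam_v))
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break
    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr
```

If the penalties are set so high that every weight is thresholded to zero, the returned correlation is undefined (NaN), so penalty levels are normally tuned by cross-validation. Structured variants replace or augment the soft-thresholding step with penalties that tie correlated features together.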
9
Du L, Liu K, Yao X, Risacher SL, Guo L, Saykin AJ, Shen L. DIAGNOSIS STATUS GUIDED BRAIN IMAGING GENETICS VIA INTEGRATED REGRESSION AND SPARSE CANONICAL CORRELATION ANALYSIS. Proc IEEE Int Symp Biomed Imaging 2019; 2019:356-359. [PMID: 31844486 DOI: 10.1109/isbi.2019.8759489] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
Abstract
Brain imaging genetics uses imaging quantitative traits (QTs) as intermediate endophenotypes to identify the genetic basis of brain structure, function, and abnormality. Regression and canonical correlation analysis (CCA) coupled with sparsity regularization are widely used in imaging genetics. Regression only selects relevant features among the predictors. SCCA overcomes this but is unsupervised and thus cannot make use of the diagnosis information. We propose a novel method that integrates regression and SCCA to construct a supervised sparse bi-multivariate learning model. The regression part provides guidance for imaging QT selection, and the SCCA part focuses on selecting relevant genetic markers and imaging QTs. We propose an efficient algorithm based on the alternative search method. Our method obtains better feature selection results than both regression and SCCA on both synthetic and real neuroimaging data. This demonstrates that our method is a promising bi-multivariate tool for brain imaging genetics.
Affiliation(s)
- Lei Du
  - School of Automation, Northwestern Polytechnical University, Xi'an, China
- Kefei Liu
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Xiaohui Yao
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Lei Guo
  - School of Automation, Northwestern Polytechnical University, Xi'an, China
- Andrew J Saykin
  - Indiana University School of Medicine, Indianapolis, IN, USA
- Li Shen
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
10
Klein M, Onnink M, van Donkelaar M, Wolfers T, Harich B, Shi Y, Dammers J, Arias-Vásquez A, Hoogman M, Franke B. Brain imaging genetics in ADHD and beyond - Mapping pathways from gene to disorder at different levels of complexity. Neurosci Biobehav Rev 2017; 80:115-155. [PMID: 28159610 PMCID: PMC6947924 DOI: 10.1016/j.neubiorev.2017.01.013] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 08/15/2016] [Revised: 12/08/2016] [Accepted: 01/09/2017] [Indexed: 01/03/2023]
Abstract
Attention-deficit/hyperactivity disorder (ADHD) is a common and often persistent neurodevelopmental disorder. Beyond gene-finding, neurobiological parameters, such as brain structure, connectivity, and function, have been used to link genetic variation to ADHD symptomatology. We performed a systematic review of brain imaging genetics studies involving 62 ADHD candidate genes in childhood and adult ADHD cohorts. Fifty-one eligible research articles described studies of 13 ADHD candidate genes. Almost exclusively, single genetic variants were studied, mostly focussing on dopamine-related genes. While promising results have been reported, imaging genetics studies are thus far hampered by methodological differences in study design and analysis methodology, as well as limited sample sizes. Beyond reviewing imaging genetics studies, we also discuss the need for complementary approaches at multiple levels of biological complexity and emphasize the importance of combining and integrating findings across levels for a better understanding of biological pathways from gene to disease. These may include multi-modal imaging genetics studies, bioinformatic analyses, and functional analyses of cell and animal models.
Collapse
Affiliation(s)
- Marieke Klein
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
| | - Marten Onnink
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Marjolein van Donkelaar
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Thomas Wolfers
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Benjamin Harich
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Yan Shi
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Janneke Dammers
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands; Department of Psychiatry, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Alejandro Arias-Vásquez
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands; Department of Psychiatry, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands; Department of Cognitive Neuroscience, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Martine Hoogman
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
- Barbara Franke
- Department of Human Genetics, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands; Department of Psychiatry, Radboud university medical center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands.
11
Abstract
BACKGROUND: Recently, structured sparse canonical correlation analysis (SCCA) has received increased attention in brain imaging genetics studies. It can identify bi-multivariate imaging genetic associations as well as select relevant features with desired structure information. These SCCA methods either use the fused lasso regularizer to induce smoothness between ordered features, or use the signed pairwise difference, which depends on the estimated sign of the sample correlation. In addition, several other structured SCCA models use the group lasso or graph fused lasso to encourage group structure, but they require the structure/group information to be provided in advance, and it is sometimes not available. RESULTS: We propose a new structured SCCA model, which employs the graph OSCAR (GOSCAR) regularizer to encourage highly correlated features to have similar or equal canonical weights. Our GOSCAR-based SCCA has two advantages: 1) It does not require the sign of the sample correlation to be pre-defined, and thus could reduce the estimation bias. 2) It could pull highly correlated features together regardless of whether they are positively or negatively correlated. We evaluate our method using both synthetic data and real data. Using the 191 ROI measurements of amyloid imaging data, and 58 genetic markers within the APOE gene, our method identifies a strong association between APOE SNP rs429358 and the amyloid burden measure in the frontal region. In addition, the estimated canonical weights present a clear pattern, which is preferable for further investigation. CONCLUSIONS: Our proposed method shows better or comparable performance on the synthetic data in terms of the estimated correlations and canonical loadings. It has successfully identified an important association between an Alzheimer's disease risk SNP rs429358 and the amyloid burden measure in the frontal region.
Affiliation(s)
- Lei Du
- School of Medicine, Indiana University, Indianapolis, USA
- Heng Huang
- Computer Science & Engineering, University of Texas at Arlington, Arlington, USA
- Jingwen Yan
- School of Medicine, Indiana University, Indianapolis, USA
- Sungeun Kim
- School of Medicine, Indiana University, Indianapolis, USA
- Jason Moore
- School of Medicine, University of Pennsylvania, Philadelphia, USA
- Andrew Saykin
- School of Medicine, Indiana University, Indianapolis, USA
- Li Shen
- School of Medicine, Indiana University, Indianapolis, USA
- for the Alzheimer’s Disease Neuroimaging Initiative
- School of Medicine, Indiana University, Indianapolis, USA
- Computer Science & Engineering, University of Texas at Arlington, Arlington, USA
- Terre Haute, USA
- School of Medicine, University of Pennsylvania, Philadelphia, USA
12
Brucato N, Guadalupe T, Franke B, Fisher SE, Francks C. A schizophrenia-associated HLA locus affects thalamus volume and asymmetry. Brain Behav Immun 2015; 46:311-8. [PMID: 25728236 DOI: 10.1016/j.bbi.2015.02.021] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 11/14/2014] [Revised: 01/20/2015] [Accepted: 02/07/2015] [Indexed: 02/02/2023]
Abstract
Genes of the Major Histocompatibility Complex (MHC) have recently been shown to have neuronal functions in the thalamus and hippocampus. Common genetic variants in the Human Leukocyte Antigens (HLA) region, the human homologue of the MHC locus, are associated with small effects on susceptibility to schizophrenia, while volumetric changes of the thalamus and hippocampus have also been linked to schizophrenia. We therefore investigated whether common variants of the HLA would affect volumetric variation of the thalamus and hippocampus. We analysed thalamus and hippocampus volumes, as measured using structural magnetic resonance imaging, in 1,265 healthy participants. These participants had also been genotyped using genome-wide single nucleotide polymorphism (SNP) arrays. We imputed genotypes for single nucleotide polymorphisms at high density across the HLA locus, as well as HLA allotypes and HLA amino acids, by use of a reference population dataset that was specifically targeted to the HLA region. We detected a significant association of the SNP rs17194174 with thalamus volume (nominal P=0.0000017, corrected P=0.0039), as well as additional SNPs within the same region of linkage disequilibrium. This effect was largely lateralized to the left thalamus and is localized within a genomic region previously associated with schizophrenia. The associated SNPs are also clustered within a potential regulatory element, and a region of linkage disequilibrium that spans genes expressed in the thalamus, including HLA-A. Our data indicate that genetic variation within the HLA region influences the volume and asymmetry of the human thalamus. The molecular mechanisms underlying this association may relate to HLA influences on susceptibility to schizophrenia.
Affiliation(s)
- Nicolas Brucato
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Leiden University Centre for Linguistics, Leiden, The Netherlands.
- Tulio Guadalupe
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; International Max Planck Research School for Language Sciences, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Barbara Franke
- Department of Human Genetics, Radboud University Nijmegen Medical Center, Nijmegen, The Netherlands; Department of Psychiatry, Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen Medical Center, Nijmegen, The Netherlands; Department of Cognitive Neuroscience, Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen Medical Center, Nijmegen, The Netherlands
- Simon E Fisher
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behavior, Radboud University Nijmegen, The Netherlands
- Clyde Francks
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands; Donders Institute for Brain, Cognition and Behavior, Radboud University Nijmegen, The Netherlands