1
Wu Q, Zhang Y, Huang X, Ma T, Hong LE, Kochunov P, Chen S. A multivariate to multivariate approach for voxel-wise genome-wide association analysis. Stat Med 2024. [PMID: 38922949 DOI: 10.1002/sim.10101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 03/02/2024] [Accepted: 04/24/2024] [Indexed: 06/28/2024]
Abstract
The joint analysis of imaging-genetics data facilitates the systematic investigation of genetic effects on brain structures and functions with spatial specificity. We focus on voxel-wise genome-wide association analysis, which may involve trillions of single nucleotide polymorphism (SNP)-voxel pairs. We attempt to identify underlying organized association patterns of SNP-voxel pairs and understand the polygenic and pleiotropic networks on brain imaging traits. We propose a bi-clique graph structure (i.e., a set of SNPs highly correlated with a cluster of voxels) for the systematic association pattern. Next, we develop computational strategies to detect latent SNP-voxel bi-cliques and an inference model for statistical testing. We further provide theoretical results to guarantee the accuracy of our computational algorithms and statistical inference. We validate our method by extensive simulation studies, and then apply it to the whole-genome genetic and voxel-level white matter integrity data collected from 1052 participants of the Human Connectome Project. The results demonstrate multiple genetic loci influencing white matter integrity measures in the splenium and genu of the corpus callosum.
Affiliation(s)
- Qiong Wu
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Yuan Zhang
  - Department of Statistics, Ohio State University, Columbus, Ohio, USA
- Xiaoqi Huang
  - Department of Mathematics, Louisiana State University, Baton Rouge, Louisiana, USA
- Tianzhou Ma
  - Department of Epidemiology and Biostatistics, School of Public Health, University of Maryland, College Park, Maryland, USA
  - Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, USA
- L Elliot Hong
  - Faillace Department of Psychiatry and Behavioral Sciences at McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Peter Kochunov
  - Faillace Department of Psychiatry and Behavioral Sciences at McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Shuo Chen
  - Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, USA
  - Faillace Department of Psychiatry and Behavioral Sciences at McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, Texas, USA
  - Division of Biostatistics and Bioinformatics, Department of Epidemiology and Public Health, University of Maryland, Baltimore, Maryland, USA
  - The University of Maryland Institute for Health Computing, University of Maryland, North Bethesda, USA
2
Cheek CL, Lindner P, Grigorenko EL. Statistical and Machine Learning Analysis in Brain-Imaging Genetics: A Review of Methods. Behav Genet 2024; 54:233-251. [PMID: 38336922 DOI: 10.1007/s10519-024-10177-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Accepted: 01/24/2024] [Indexed: 02/12/2024]
Abstract
Brain-imaging-genetic analysis is an emerging field of research that aims at aggregating data from neuroimaging modalities, which characterize brain structure or function, and genetic data, which capture the structure and function of the genome, to explain or predict normal (or abnormal) brain performance. Brain-imaging-genetic studies offer great potential for understanding complex brain-related diseases/disorders of genetic etiology. Still, a combined brain-wide genome-wide analysis is difficult to perform as typical datasets fuse multiple modalities, each with high dimensionality, unique correlational landscapes, and often low statistical signal-to-noise ratios. In this review, we outline the progress in brain-imaging-genetic methodologies starting from early massive univariate to current deep learning approaches, highlighting each approach's strengths and weaknesses and situating it within the field's development. We conclude by discussing selected remaining challenges and prospects for the field.
Affiliation(s)
- Connor L Cheek
  - Texas Institute for Evaluation, Measurement, and Statistics, University of Houston, Houston, TX, USA
  - Department of Physics, University of Houston, Houston, TX, USA
- Peggy Lindner
  - Texas Institute for Evaluation, Measurement, and Statistics, University of Houston, Houston, TX, USA
  - Department of Information Science Technology, University of Houston, Houston, TX, USA
- Elena L Grigorenko
  - Texas Institute for Evaluation, Measurement, and Statistics, University of Houston, Houston, TX, USA
  - Department of Psychology, University of Houston, Houston, TX, USA
  - Baylor College of Medicine, Houston, TX, USA
  - Sirius University of Science and Technology, Sochi, Russia
3
Pan W, Shan Y, Li C, Huang S, Li T, Li Y, Zhu H. FPLS-DC: functional partial least squares through distance covariance for imaging genetics. Bioinformatics 2024; 40:btae173. [PMID: 38552322 PMCID: PMC11034987 DOI: 10.1093/bioinformatics/btae173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 02/28/2024] [Accepted: 03/27/2024] [Indexed: 04/24/2024] Open
Abstract
MOTIVATION: Imaging genetics integrates imaging and genetic techniques to examine how genetic variations influence the function and structure of organs like the brain or heart, providing insights into their impact on behavior and disease phenotypes. Organ-wide imaging endophenotypes have increasingly been used to identify potential genes associated with complex disorders. However, analyzing organ-wide imaging data alongside genetic data presents two significant challenges: high dimensionality and complex relationships. To address these challenges, we propose a novel, nonlinear inference framework designed to partially mitigate these issues. RESULTS: We propose a functional partial least squares through distance covariance (FPLS-DC) framework for efficient genome-wide analyses of imaging phenotypes. It consists of two components. The first component utilizes the FPLS-derived basis functions to reduce image dimensionality while screening genetic markers. The second component maximizes the distance correlation between genetic markers and projected imaging data, which is a linear combination of the FPLS basis functions, using a simulated annealing algorithm. In addition, we propose an iterative FPLS-DC method based on the FPLS-DC framework, which effectively overcomes the influence of inter-gene correlation on inference analysis. We efficiently approximate the null distribution of test statistics using a gamma approximation. Compared to existing methods, FPLS-DC offers computational and statistical efficiency for handling large-scale imaging genetics. In real-world applications, our method successfully detected genetic variants associated with the hippocampus, demonstrating its value as a statistical toolbox for imaging genetic studies. AVAILABILITY AND IMPLEMENTATION: The FPLS-DC method we propose opens up new research avenues and offers valuable insights for analyzing functional and high-dimensional data. In addition, it serves as a useful tool for scientific analysis in practical applications within the field of imaging genetics research. The R package FPLS-DC is available on GitHub: https://github.com/BIG-S2/FPLSDC.
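The abstract above rests on distance correlation, a dependence measure that can pick up nonlinear association. As a minimal sketch of that statistic alone (not the authors' R package, and without the FPLS basis or the simulated annealing search), the sample distance correlation can be computed from double-centered pairwise distance matrices. The function names and the toy SNP/trait data below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def double_centered_distances(x):
    """Pairwise Euclidean distances between samples, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a vector as n samples of one feature
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation between x and y (same number of samples)."""
    a = double_centered_distances(x)
    b = double_centered_distances(y)
    dcov2 = (a * b).mean()                       # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:                             # one variable is constant
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))

# Toy data (assumption): a genotype coded 0/1/2 with a symmetric, nonlinear effect.
rng = np.random.default_rng(0)
snp = rng.integers(0, 3, size=200)
trait = (snp - 1.0) ** 2 + 0.1 * rng.standard_normal(200)
print("distance correlation:", distance_correlation(snp, trait))
print("Pearson correlation: ", np.corrcoef(snp, trait)[0, 1])
```

The toy trait depends on the genotype through a symmetric function, the kind of relationship a linear correlation can miss and a distance-based measure is designed to detect.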
Affiliation(s)
- Wenliang Pan
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Yue Shan
  - Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Chuang Li
  - Department of Statistical Science, School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
- Shuai Huang
  - Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Tengfei Li
  - Departments of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Yun Li
  - Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
- Hongtu Zhu
  - Departments of Biostatistics, Statistics, Genetics, and Computer Science and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
  - Departments of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599, USA
4
Jin Z, Kang J, Yu T. Bayesian nonparametric method for genetic dissection of brain activation region. Front Neurosci 2023; 17:1235321. [PMID: 37920300 PMCID: PMC10618557 DOI: 10.3389/fnins.2023.1235321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 09/26/2023] [Indexed: 11/04/2023] Open
Abstract
Biological evidence indicates that brain atrophy can be involved at the onset of neuropathological pathways of Alzheimer's disease. However, there is a lack of formal statistical methods to perform genetic dissection of brain activation phenotypes such as shape and intensity. To this end, we propose a Bayesian hierarchical model which consists of two levels of hierarchy. At level 1, we develop a Bayesian nonparametric level set (BNLS) model for studying the brain activation region shape. At level 2, we construct a regression model to select genetic variants that are strongly associated with the brain activation intensity, where a spike-and-slab prior and a Gaussian prior are chosen for feature selection. We develop efficient posterior computation algorithms based on the Markov chain Monte Carlo (MCMC) method. We demonstrate the advantages of the proposed method via extensive simulation studies and analyses of imaging genetics data in the Alzheimer's Disease Neuroimaging Initiative (ADNI) study.
Affiliation(s)
- Zhuxuan Jin
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
- Jian Kang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
- Tianwei Yu
  - School of Data Science, Chinese University of Hong Kong - Shenzhen, Shenzhen, China
  - Guangdong Provincial Key Laboratory of Big Data Computing, Shenzhen, China
5
Khalilullah KMI, Agcaoglu O, Sui J, Adali T, Duda M, Calhoun VD. Multimodal fusion of multiple rest fMRI networks and MRI gray matter via parallel multilink joint ICA reveals highly significant function/structure coupling in Alzheimer's disease. Hum Brain Mapp 2023; 44:5167-5179. [PMID: 37605825 PMCID: PMC10502647 DOI: 10.1002/hbm.26456] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/09/2023] [Revised: 07/11/2023] [Accepted: 08/01/2023] [Indexed: 08/23/2023] Open
Abstract
In this article, we focus on estimating the joint relationship between structural magnetic resonance imaging (sMRI) gray matter (GM) and multiple functional MRI (fMRI) intrinsic connectivity networks (ICNs). To achieve this, we propose a multilink joint independent component analysis (ml-jICA) method using the same core algorithm as jICA. To relax the jICA assumption, we propose another extension called parallel multilink jICA (pml-jICA) that allows for a more balanced weight distribution over ml-jICA/jICA. We assume a shared mixing matrix for both the sMRI and fMRI modalities, while allowing for different mixing matrices linking the sMRI data to the different ICNs. We introduce the model and then apply this approach to study the differences in resting fMRI and sMRI data from patients with Alzheimer's disease (AD) versus controls. The results of the pml-jICA yield significant differences with large effect sizes that include regions in overlapping portions of the default mode network, as well as the hippocampus and thalamus. Importantly, we identify two joint components with partially overlapping regions which show opposite effects for AD versus controls, but could be separated because they are linked to distinct functional and structural patterns. This highlights the unique strength of our approach, and of multimodal fusion approaches generally, in revealing potential biomarkers of brain disorders that would likely be missed by a unimodal approach. These results represent the first work linking multiple fMRI ICNs to GM components within a multimodal data fusion model and challenge the typical view that brain structure is more sensitive to AD than fMRI.
Affiliation(s)
- K. M. Ibrahim Khalilullah
  - Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, Georgia, USA
- Oktay Agcaoglu
  - Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, Georgia, USA
- Jing Sui
  - Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, Georgia, USA
  - State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Tülay Adali
  - Department of Electrical and Computer Engineering, University of Maryland, Baltimore, Maryland, USA
- Marlena Duda
  - Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, Georgia, USA
- Vince D. Calhoun
  - Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, Georgia, USA
6
Beaulac C, Wu S, Gibson E, Miranda MF, Cao J, Rocha L, Beg MF, Nathoo FS. Neuroimaging feature extraction using a neural network classifier for imaging genetics. BMC Bioinformatics 2023; 24:271. [PMID: 37391692 DOI: 10.1186/s12859-023-05394-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Accepted: 06/21/2023] [Indexed: 07/02/2023] Open
Abstract
BACKGROUND: Dealing with the high dimension of both neuroimaging data and genetic data is a difficult problem in the association of genetic data to neuroimaging. In this article, we tackle the latter problem with an eye toward developing solutions that are relevant for disease prediction. Supported by a vast literature on the predictive power of neural networks, our proposed solution uses neural networks to extract from neuroimaging data features that are relevant for predicting Alzheimer's Disease (AD) for subsequent relation to genetics. The neuroimaging-genetic pipeline we propose is comprised of image processing, neuroimaging feature extraction and genetic association steps. We present a neural network classifier for extracting neuroimaging features that are related to the disease. The proposed method is data-driven and requires no expert advice or a priori selection of regions of interest. We further propose a multivariate regression with priors specified in the Bayesian framework that allows for group sparsity at multiple levels including SNPs and genes. RESULTS: We find the features extracted with our proposed method are better predictors of AD than features used previously in the literature, suggesting that single nucleotide polymorphisms (SNPs) related to the features extracted by our proposed method are also more relevant for AD. Our neuroimaging-genetic pipeline led to the identification of some overlapping and, more importantly, some different SNPs when compared to those identified with previously used features. CONCLUSIONS: The pipeline we propose combines machine learning and statistical methods to benefit from the strong predictive performance of blackbox models to extract relevant features while preserving the interpretation provided by Bayesian models for genetic association. Finally, we argue in favour of using automatic feature extraction, such as the method we propose, in addition to ROI or voxelwise analysis to find potentially novel disease-relevant SNPs that may not be detected when using ROIs or voxels alone.
Affiliation(s)
- Cédric Beaulac
  - School of Engineering Science, Simon Fraser University, Burnaby, Canada
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Sidi Wu
  - Department of Statistics and Actuarial Sciences, Simon Fraser University, Burnaby, Canada
- Erin Gibson
  - School of Engineering Science, Simon Fraser University, Burnaby, Canada
- Michelle F Miranda
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Jiguo Cao
  - Department of Statistics and Actuarial Sciences, Simon Fraser University, Burnaby, Canada
- Leno Rocha
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
- Mirza Faisal Beg
  - School of Engineering Science, Simon Fraser University, Burnaby, Canada
- Farouk S Nathoo
  - Department of Mathematics and Statistics, University of Victoria, Victoria, Canada
7
Khalilullah KMI, Agcaoglu O, Sui J, Adali T, Duda M, Calhoun VD. Multimodal fusion of multiple rest fMRI networks and MRI gray matter via multilink joint ICA reveals highly significant function/structure coupling in Alzheimer's disease. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.28.530458. [PMID: 36909478 PMCID: PMC10002680 DOI: 10.1101/2023.02.28.530458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2023]
Abstract
In this paper we focus on estimating the joint relationship between structural MRI (sMRI) gray matter (GM) and multiple functional MRI (fMRI) intrinsic connectivity networks (ICNs) using a novel approach called multi-link joint independent component analysis (ml-jICA). The proposed model offers several improvements over the existing joint independent component analysis (jICA) model. We assume a shared mixing matrix for both the sMRI and fMRI modalities, while allowing for different mixing matrices linking the sMRI data to the different ICNs. We introduce the model and then apply this approach to study the differences in resting fMRI and sMRI data from patients with Alzheimer's disease (AD) versus controls. The results yield significant differences with large effect sizes that include regions in overlapping portions of the default mode network, as well as the hippocampus and thalamus. Importantly, we identify two joint components with partially overlapping regions which show opposite effects for Alzheimer's disease versus controls, but could be separated because they are linked to distinct functional and structural patterns. This highlights the unique strength of our approach, and of multimodal fusion approaches generally, in revealing potential biomarkers of brain disorders that would likely be missed by a unimodal approach. These results represent the first work linking multiple fMRI ICNs to gray matter components within a multimodal data fusion model and challenge the typical view that brain structure is more sensitive to AD than fMRI.
8
Martí-Juan G, Lorenzi M, Piella G. MC-RVAE: Multi-channel recurrent variational autoencoder for multimodal Alzheimer's disease progression modelling. Neuroimage 2023; 268:119892. [PMID: 36682509 DOI: 10.1016/j.neuroimage.2023.119892] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/28/2021] [Revised: 12/15/2022] [Accepted: 01/18/2023] [Indexed: 01/21/2023] Open
Abstract
The progression of neurodegenerative diseases, such as Alzheimer's Disease, is the result of complex mechanisms interacting across multiple spatial and temporal scales. Understanding and predicting the longitudinal course of the disease requires harnessing the variability across different data modalities and time, which is extremely challenging. In this paper, we propose a model based on recurrent variational autoencoders that is able to capture cross-channel interactions between different modalities and model temporal information. These are achieved thanks to its multi-channel architecture and its shared latent variational space, parametrized with a recurrent neural network. We evaluate our model on both synthetic and real longitudinal datasets, the latter including imaging and non-imaging data, with N=897 subjects. Results show that our multi-channel recurrent variational autoencoder outperforms a set of baselines (KNN, random forest, and group factor analysis) for the task of reconstructing missing modalities, reducing the mean absolute error by 5% (w.r.t. the best baseline) for both subcortical volumes and cortical thickness. Our model is robust to missing features within each modality and is able to generate realistic synthetic imaging biomarker trajectories from cognitive scores.
Affiliation(s)
- Gerard Martí-Juan
  - BCN MedTech, Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Barcelona, Spain
- Marco Lorenzi
  - Université Côte d'Azur, Inria Sophia Antipolis, Epione Research Project, France
- Gemma Piella
  - BCN MedTech, Departament de Tecnologies de la Informació i les Comunicacions, Universitat Pompeu Fabra, Barcelona, Spain
9
Moon SW. Neuroimaging Genetics and Network Analysis in Alzheimer's Disease. Curr Alzheimer Res 2023; 20:526-538. [PMID: 37957920 DOI: 10.2174/0115672050265188231107072215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 07/22/2023] [Accepted: 08/13/2023] [Indexed: 11/15/2023]
Abstract
The genetics of brain imaging phenotypes serves as a crucial link between two distinct scientific fields, neuroimaging and genetics, which meet in neuroimaging genetics (NG). The articles included here provide solid proof that this NG link has considerable synergy. There is a suitable collection of articles that offer a wide range of viewpoints on how genetic variations affect brain structure and function. They serve as illustrations of several study approaches used in contemporary genetics and neuroscience. Genome-wide association studies and candidate-gene association are two examples of genetic techniques. Cortical gray matter structural/volumetric measures from magnetic resonance imaging (MRI) are sources of information on brain phenotypes. Together, they show how various scientific disciplines have benefited from significant technological advances, such as the single-nucleotide polymorphism array in genetics and the development of increasingly higher-resolution MRI. Moreover, we discuss NG's contribution to expanding our knowledge about the heterogeneity within Alzheimer's disease as well as the benefits of different network analyses.
Affiliation(s)
- Seok Woo Moon
  - Department of Psychiatry, Institute of Medical Science, Konkuk University School of Medicine, Chungju, Republic of Korea
10
Detection of Association Features Based on Gene Eigenvalues and MRI Imaging Using Genetic Weighted Random Forest. Genes (Basel) 2022; 13:genes13122344. [PMID: 36553611 PMCID: PMC9777775 DOI: 10.3390/genes13122344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 12/07/2022] [Accepted: 12/07/2022] [Indexed: 12/14/2022] Open
Abstract
In studies of Alzheimer's disease (AD), jointly analyzing imaging data and genetic data provides an effective method to explore the potential biomarkers of AD. Subjects can be grouped into healthy controls (HC), early mild cognitive impairment (EMCI), late mild cognitive impairment (LMCI) and AD. Identifying the important biomarkers of AD progression and analyzing them provides valuable insights into the mechanism of AD. In this paper, we present a novel data fusion method and a genetic weighted random forest method to mine important features. Specifically, we amplify the difference among AD, LMCI, EMCI and HC by introducing eigenvalues calculated from the gene p-value matrix for feature fusion. Furthermore, we construct the genetic weighted random forest using the resulting fused features. Genetic evolution is used to increase the diversity among decision trees, and each generated decision tree is assigned a weight. After training, the genetic weighted random forest is analyzed further to detect the significant fused features. The validation experiments highlight the performance and generalization of our proposed model. We analyze the biological significance of the results and identify some significant genes (CSMD1, CDH13, PTPRD, MACROD2 and WWOX). Furthermore, the calcium signaling pathway, arrhythmogenic right ventricular cardiomyopathy and the glutamatergic synapse pathway were identified. The investigational findings demonstrate that our proposed model presents an accurate and efficient approach to identifying significant biomarkers in AD.
11
Mihalik A, Chapman J, Adams RA, Winter NR, Ferreira FS, Shawe-Taylor J, Mourão-Miranda J. Canonical Correlation Analysis and Partial Least Squares for Identifying Brain-Behavior Associations: A Tutorial and a Comparative Study. BIOLOGICAL PSYCHIATRY. COGNITIVE NEUROSCIENCE AND NEUROIMAGING 2022; 7:1055-1067. [PMID: 35952973 DOI: 10.1016/j.bpsc.2022.07.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/17/2021] [Revised: 06/30/2022] [Accepted: 07/22/2022] [Indexed: 06/15/2023]
Abstract
Canonical correlation analysis (CCA) and partial least squares (PLS) are powerful multivariate methods for capturing associations across 2 modalities of data (e.g., brain and behavior). However, when the sample size is similar to or smaller than the number of variables in the data, standard CCA and PLS models may overfit, i.e., find spurious associations that generalize poorly to new data. Dimensionality reduction and regularized extensions of CCA and PLS have been proposed to address this problem, yet most studies using these approaches have some limitations. This work gives a theoretical and practical introduction into the most common CCA/PLS models and their regularized variants. We examine the limitations of standard CCA and PLS when the sample size is similar to or smaller than the number of variables. We discuss how dimensionality reduction and regularization techniques address this problem and explain their main advantages and disadvantages. We highlight crucial aspects of the CCA/PLS analysis framework, including optimizing the hyperparameters of the model and testing the identified associations for statistical significance. We apply the described CCA/PLS models to simulated data and real data from the Human Connectome Project and Alzheimer's Disease Neuroimaging Initiative (both of n > 500). We use both low- and high-dimensionality versions of these data (i.e., ratios between sample size and variables in the range of ∼1-10 and ∼0.1-0.01, respectively) to demonstrate the impact of data dimensionality on the models. Finally, we summarize the key lessons of the tutorial.
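The overfitting problem described in this abstract can be illustrated with a small sketch. The code below is not the tutorial's own toolkit; it is a minimal ridge-regularized CCA written for this example, applied to pure-noise data with more variables than samples. The function names, the single shared `reg` parameter, and the data sizes are assumptions chosen for illustration.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(w)) @ V.T

def ridge_cca_first_pair(X, Y, reg=0.1):
    """First pair of canonical weights with a ridge penalty `reg` on both blocks."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Leading singular vectors of the whitened cross-covariance give the weights.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky, full_matrices=False)
    return Kx @ U[:, 0], Ky @ Vt[0], s[0]

def projected_corr(X, Y, wx, wy):
    """Pearson correlation between the two projected scores."""
    return np.corrcoef(X @ wx, Y @ wy)[0, 1]

# Pure noise: 40 samples, 100 "brain" variables, 5 "behavior" variables.
rng = np.random.default_rng(0)
X_train, Y_train = rng.standard_normal((40, 100)), rng.standard_normal((40, 5))
X_test, Y_test = rng.standard_normal((40, 100)), rng.standard_normal((40, 5))
for reg in (1e-6, 10.0):
    wx, wy, _ = ridge_cca_first_pair(X_train, Y_train, reg=reg)
    print("reg:", reg,
          "in-sample:", projected_corr(X_train, Y_train, wx, wy),
          "out-of-sample:", projected_corr(X_test, Y_test, wx, wy))
```

With almost no regularization and more variables than samples, the in-sample correlation is close to 1 even though the two blocks are independent, which is why held-out evaluation and hyperparameter tuning matter for these models.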
Affiliation(s)
- Agoston Mihalik
  - Centre for Medical Image Computing, Department of Computer Science, University College London, London, United Kingdom
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
- James Chapman
  - Centre for Medical Image Computing, Department of Computer Science, University College London, London, United Kingdom
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Rick A Adams
  - Centre for Medical Image Computing, Department of Computer Science, University College London, London, United Kingdom
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
  - Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Nils R Winter
  - Institute of Translational Psychiatry, University of Münster, Münster, Germany
- Fabio S Ferreira
  - Centre for Medical Image Computing, Department of Computer Science, University College London, London, United Kingdom
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- John Shawe-Taylor
  - Department of Computer Science, University College London, London, United Kingdom
- Janaina Mourão-Miranda
  - Centre for Medical Image Computing, Department of Computer Science, University College London, London, United Kingdom
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
12
Qian J, Tanigawa Y, Li R, Tibshirani R, Rivas MA, Hastie T. LARGE-SCALE MULTIVARIATE SPARSE REGRESSION WITH APPLICATIONS TO UK BIOBANK. Ann Appl Stat 2022; 16:1891-1918. [PMID: 36091495 PMCID: PMC9454085 DOI: 10.1214/21-aoas1575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
In high-dimensional regression problems, often a relatively small subset of the features are relevant for predicting the outcome, and methods that impose sparsity on the solution are popular. When multiple correlated outcomes are available (multitask), reduced rank regression is an effective way to borrow strength and capture latent structures that underlie the data. Our proposal is motivated by the UK Biobank population-based cohort study, where we are faced with large-scale, ultrahigh-dimensional features, and have access to a large number of outcomes (phenotypes): lifestyle measures, biomarkers, and disease outcomes. We are hence led to fit sparse reduced-rank regression models, using computational strategies that allow us to scale to problems of this size. We use a scheme that alternates between solving the sparse regression problem and solving the reduced rank decomposition. For the sparse regression component we propose a scalable iterative algorithm based on adaptive screening that leverages the sparsity assumption and enables us to focus on solving much smaller subproblems. The full solution is reconstructed and tested via an optimality condition to make sure it is a valid solution for the original problem. We further extend the method to cope with practical issues, such as the inclusion of confounding variables and imputation of missing values among the phenotypes. Experiments on both synthetic data and the UK Biobank data demonstrate the effectiveness of the method and the algorithm. We present the multiSnpnet package, available at http://github.com/junyangq/multiSnpnet, which works on top of PLINK2 files and which we anticipate to be a valuable tool for generating polygenic risk scores from human genetic studies.
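To make the reduced-rank idea in this abstract concrete, here is a minimal sketch of classical, unpenalized reduced-rank regression: fit ordinary least squares, then project the coefficients onto the leading right singular vectors of the fitted values. It deliberately omits the sparsity penalty, the alternating scheme, and the screening strategy described above, and it does not use the multiSnpnet package. The function name and the simulated dimensions are assumptions for illustration.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Rank-constrained least-squares coefficients (identity error weighting)."""
    # Ordinary least-squares fit of every outcome on all features.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Leading right singular vectors of the fitted values span the outcome subspace.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the OLS coefficients onto that subspace.
    return B_ols @ V_r @ V_r.T

# Simulated data (assumption): 10 features, 6 correlated outcomes, true rank 2.
rng = np.random.default_rng(0)
n, p, q, r = 200, 10, 6, 2
B_true = rng.standard_normal((p, r)) @ rng.standard_normal((r, q))
X = rng.standard_normal((n, p))
Y = X @ B_true + 0.5 * rng.standard_normal((n, q))
B_hat = reduced_rank_regression(X, Y, rank=r)
print("rank of estimate:", np.linalg.matrix_rank(B_hat))
print("max abs error:", np.abs(B_hat - B_true).max())
```

The rank constraint forces all outcomes to share a small number of latent predictors, which is the "borrowing strength" the abstract refers to; the sparse variant additionally restricts which features enter those latent predictors.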
Affiliation(s)
| | | | - Ruilin Li
- Institute for Computational and Mathematical Engineering, Stanford University
| | | | - Manuel A Rivas
- Department of Biomedical Data Science, Stanford University
|
13
|
Li J. High-dimensional dynamic systems identification with additional constraints. Commun Stat-Theor M 2022. [DOI: 10.1080/03610926.2020.1836219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Junlin Li
- School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, Hubei, P. R. China
|
14
|
Zhang J, Sun WW, Li L. Generalized Connectivity Matrix Response Regression with Applications in Brain Connectivity Studies. J Comput Graph Stat 2022; 32:252-262. [PMID: 36970553 PMCID: PMC10035565 DOI: 10.1080/10618600.2022.2074434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 04/23/2022] [Indexed: 10/18/2022]
Abstract
Multiple-subject network data are fast emerging in recent years, where a separate connectivity matrix is measured over a common set of nodes for each individual subject, along with subject covariates information. In this article, we propose a new generalized matrix response regression model, where the observed network is treated as a matrix-valued response and the subject covariates as predictors. The new model characterizes the population-level connectivity pattern through a low-rank intercept matrix, and the effect of subject covariates through a sparse slope tensor. We develop an efficient alternating gradient descent algorithm for parameter estimation, and establish the non-asymptotic error bound for the actual estimator from the algorithm, which quantifies the interplay between the computational and statistical errors. We further show the strong consistency for graph community recovery, as well as the edge selection consistency. We demonstrate the efficacy of our method through simulations and two brain connectivity studies.
Affiliation(s)
- Jingfei Zhang
- Department of Management Science, Miami Herbert Business School, University of Miami, Miami, FL, 33146
| | - Will Wei Sun
- Krannert School of Management, Purdue University, West Lafayette, IN, 47906
| | - Lexin Li
- Department of Biostatistics and Epidemiology, School of Public Health, University of California at Berkeley, Berkeley, CA, 94720
|
15
|
Feature Fusion and Detection in Alzheimer’s Disease Using a Novel Genetic Multi-Kernel SVM Based on MRI Imaging and Gene Data. Genes (Basel) 2022; 13:837. [PMID: 35627222 PMCID: PMC9140721 DOI: 10.3390/genes13050837] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/09/2022] [Revised: 05/03/2022] [Accepted: 05/05/2022] [Indexed: 01/27/2023] Open
Abstract
Voxel-based morphometry provides an opportunity to study Alzheimer’s disease (AD) at a subtle level. Therefore, identifying the important brain voxels that can classify AD, early mild cognitive impairment (EMCI) and healthy control (HC) and studying the role of these voxels in AD will be crucial to improve our understanding of the neurobiological mechanism of AD. Combining magnetic resonance imaging (MRI) and gene information, we proposed a novel feature construction method and a novel genetic multi-kernel support vector machine (SVM) method to mine important features for AD detection. Specifically, to amplify the differences among AD, EMCI and HC groups, we used the eigenvalues of the top 24 Single Nucleotide Polymorphisms (SNPs) in a p-value matrix of 24 genes associated with AD for feature construction. Furthermore, a genetic multi-kernel SVM was established with the resulting features. The genetic algorithm was used to detect the optimal weights of 3 kernels and the multi-kernel SVM was used after training to explore the significant features. By analyzing the significance of the features, we identified some brain regions affected by AD, such as the right superior frontal gyrus, right inferior temporal gyrus and right superior temporal gyrus. The findings proved the good performance and generalization of the proposed model. Particularly, significant susceptibility genes associated with AD were identified, such as CSMD1, RBFOX1, PTPRD, CDH13 and WWOX. Some significant pathways were further explored, such as the calcium signaling pathway (corrected p-value = 1.35 × 10⁻⁶) and cell adhesion molecules (corrected p-value = 5.44 × 10⁻⁴). The findings offer new candidate abnormal brain features and demonstrate the contribution of these features to AD.
|
16
|
Xin Y, Sheng J, Miao M, Wang L, Yang Z, Huang H. A review of imaging genetics in Alzheimer's disease. J Clin Neurosci 2022; 100:155-163. [PMID: 35487021 DOI: 10.1016/j.jocn.2022.04.017] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/01/2022] [Revised: 03/01/2022] [Accepted: 04/15/2022] [Indexed: 01/18/2023]
Abstract
Determining the association between genetic variation and phenotype is a key step to study the mechanism of Alzheimer's disease (AD), laying the foundation for studying drug therapies and biomarkers. AD is the most common type of dementia in the aged population. At present, three early-onset AD genes (APP, PSEN1, PSEN2) and one late-onset AD susceptibility gene apolipoprotein E (APOE) have been determined. However, the pathogenesis of AD remains unknown. Imaging genetics, an emerging interdisciplinary field, is able to reveal the complex mechanisms from the genetic level to human cognition and mental disorders via macroscopic intermediates. This paper reviews methods of establishing genotype-phenotype to explore correlations, including sparse canonical correlation analysis, sparse reduced rank regression, sparse partial least squares and so on. We found that most research work did poorly in supervised learning and exploring the nonlinear relationship between SNP-QT.
Affiliation(s)
- Yu Xin
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
| | - Jinhua Sheng
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China.
| | - Miao Miao
- Beijing Hospital, Beijing 100730, China; National Center of Gerontology, Beijing 100730, China; Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing 100730, China
| | - Luyun Wang
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China; Hangzhou Vocational & Technical College, Hangzhou, Zhejiang 310018, China
| | - Ze Yang
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
| | - He Huang
- College of Computer Science, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
|
17
|
Predictive classification of Alzheimer’s disease using brain imaging and genetic data. Sci Rep 2022; 12:2405. [PMID: 35165327 PMCID: PMC8844076 DOI: 10.1038/s41598-022-06444-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 04/29/2021] [Accepted: 01/24/2022] [Indexed: 02/06/2023] Open
Abstract
For now, Alzheimer’s disease (AD) is incurable. But if it can be diagnosed early, the correct treatment can be used to delay the disease. Most of the existing research methods use single or multi-modal imaging features for prediction; relatively few studies combine brain imaging with genetic features for disease diagnosis. In order to accurately identify AD, healthy control (HC) and the two stages of mild cognitive impairment (MCI: early MCI, late MCI) combined with brain imaging and genetic characteristics, we proposed an integrated Fisher score and multi-modal multi-task feature selection research method. We first learned genetic features with the Fisher score to perform dimensionality reduction, in order to solve the problem of the large difference between the feature scales of genetic and brain imaging data. Then we learned the potential related features of brain imaging and genetic data, and multiplied the selected features with the learned weight coefficients. Through the feature selection program, five imaging and five genetic features were selected to achieve an average classification accuracy of 98% for HC and AD, 82% for HC and EMCI, 86% for HC and LMCI, 80% for EMCI and LMCI, 88% for EMCI and AD, and 72% for LMCI and AD. Compared with only using imaging features, the classification accuracy has been improved to a certain extent, and a set of interrelated features of brain imaging phenotypes and genetic factors were selected.
|
18
|
Won JH, Youn J, Park H. Enhanced neuroimaging genetics using multi-view non-negative matrix factorization with sparsity and prior knowledge. Med Image Anal 2022; 77:102378. [DOI: 10.1016/j.media.2022.102378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Revised: 10/29/2021] [Accepted: 01/26/2022] [Indexed: 11/28/2022]
|
19
|
Vilor-Tejedor N, Garrido-Martín D, Rodriguez-Fernandez B, Lamballais S, Guigó R, Gispert JD. Multivariate Analysis and Modelling of multiple Brain endOphenotypes: Let's MAMBO! Comput Struct Biotechnol J 2021; 19:5800-5810. [PMID: 34765095 PMCID: PMC8567328 DOI: 10.1016/j.csbj.2021.10.019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/22/2021] [Revised: 10/08/2021] [Accepted: 10/12/2021] [Indexed: 12/01/2022] Open
Abstract
Imaging genetic studies aim to test how genetic information influences brain structure and function by combining neuroimaging-based brain features and genetic data from the same individual. Most studies focus on individual correlation and association tests between genetic variants and a single measurement of the brain. Despite the great success of univariate approaches, given the capacity of neuroimaging methods to provide a multiplicity of cerebral phenotypes, the development and application of multivariate methods become crucial. In this article, we review novel methods and strategies focused on the analysis of multiple phenotypes and genetic data. We also discuss relevant aspects of multi-trait modelling in the context of neuroimaging data.
Affiliation(s)
- Natalia Vilor-Tejedor
- Barcelonaβeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Department of Clinical Genetics, Erasmus Medical Center, Rotterdam, Netherlands
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Diego Garrido-Martín
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
| | | | - Sander Lamballais
- Department of Clinical Genetics, Erasmus Medical Center, Rotterdam, Netherlands
| | - Roderic Guigó
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
| | - Juan Domingo Gispert
- Barcelonaβeta Brain Research Center (BBRC), Pasqual Maragall Foundation, Barcelona, Spain
- Universitat Pompeu Fabra, Barcelona, Spain
- IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
- Centro de Investigación Biomédica en Red Bioingeniería, Biomateriales y Nanomedicina, Madrid, Spain
|
20
|
Sheng J, Wang L, Cheng H, Zhang Q, Zhou R, Shi Y. Strategies for multivariate analyses of imaging genetics study in Alzheimer's disease. Neurosci Lett 2021; 762:136147. [PMID: 34332030 DOI: 10.1016/j.neulet.2021.136147] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/29/2020] [Revised: 03/27/2021] [Accepted: 07/26/2021] [Indexed: 11/16/2022]
Abstract
Alzheimer's disease (AD) is an incurable neurodegenerative disease primarily affecting the elderly population. Early diagnosis of AD is critical for the management of this disease. Imaging genetics examines the influence of genetic variants (i.e., single nucleotide polymorphisms (SNPs)) on brain structure and function, and many novel approaches of imaging genetics are proposed for studying AD. We review and synthesize the Alzheimer's Disease Neuroimaging Initiative (ADNI) genetic associations with quantitative disease endophenotypes including structural and functional neuroimaging, diffusion tensor imaging (DTI), positron emission tomography (PET), and fluid biomarker assays. In this review, we survey recent publications using neuroimaging and genetic data of AD, with a focus on methods capturing multivariate effects accommodating the large number of variables from both imaging data and genetic data. We review methods focused on bridging the imaging and genetic data by establishing genotype-phenotype association, including sparse canonical correlation analysis, parallel independent component analysis, sparse reduced rank regression, sparse partial least squares, genome-wide association study, and so on. The broad availability and wide scope of ADNI genetic and phenotypic data has advanced our understanding of the genetic basis of AD and has nominated novel targets for future pharmaceutical therapy and biomarker development.
Affiliation(s)
- Jinhua Sheng
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China.
| | - Luyun Wang
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China; College of Information Engineering, Hangzhou Vocational & Technical College, Hangzhou, Zhejiang 310018, China
| | - Hu Cheng
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN 47405, USA
| | | | - Rougang Zhou
- Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China; School of Mechanical Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Mstar Technologies Inc., Hangzhou, Zhejiang 310018, China
| | - Yuchen Shi
- School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; Key Laboratory of Intelligent Image Analysis for Sensory and Cognitive Health, Ministry of Industry and Information Technology of China, Hangzhou, Zhejiang 310018, China
|
21
|
Wu J, Dong Q, Gui J, Zhang J, Su Y, Chen K, Thompson PM, Caselli RJ, Reiman EM, Ye J, Wang Y. Predicting Brain Amyloid Using Multivariate Morphometry Statistics, Sparse Coding, and Correntropy: Validation in 1,101 Individuals From the ADNI and OASIS Databases. Front Neurosci 2021; 15:669595. [PMID: 34421510 PMCID: PMC8377280 DOI: 10.3389/fnins.2021.669595] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/19/2021] [Accepted: 07/15/2021] [Indexed: 01/04/2023] Open
Abstract
Biomarker assisted preclinical/early detection and intervention in Alzheimer’s disease (AD) may be the key to therapeutic breakthroughs. One of the presymptomatic hallmarks of AD is the accumulation of beta-amyloid (Aβ) plaques in the human brain. However, current methods to detect Aβ pathology are either invasive (lumbar puncture) or quite costly and not widely available (amyloid PET). Our prior studies show that magnetic resonance imaging (MRI)-based hippocampal multivariate morphometry statistics (MMS) are an effective neurodegenerative biomarker for preclinical AD. Here we attempt to use MRI-MMS to make inferences regarding brain Aβ burden at the individual subject level. As MMS data has a larger dimension than the sample size, we propose a sparse coding algorithm, Patch Analysis-based Surface Correntropy-induced Sparse-coding and Max-Pooling (PASCS-MP), to generate a low-dimensional representation of hippocampal morphometry for each individual subject. Then we apply these individual representations and a binary random forest classifier to predict brain Aβ positivity for each person. We test our method in two independent cohorts, 841 subjects from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and 260 subjects from the Open Access Series of Imaging Studies (OASIS). Experimental results suggest that our proposed PASCS-MP method and MMS can discriminate Aβ positivity in people with mild cognitive impairment (MCI) [Accuracy (ACC) = 0.89 (ADNI)] and in cognitively unimpaired (CU) individuals [ACC = 0.79 (ADNI) and ACC = 0.81 (OASIS)]. These results compare favorably relative to measures derived from traditional algorithms, including hippocampal volume and surface area, shape measures based on spherical harmonics (SPHARM) and our prior Patch Analysis-based Surface Sparse-coding and Max-Pooling (PASS-MP) methods.
Affiliation(s)
- Jianfeng Wu
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, United States
| | - Qunxi Dong
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, United States; Institute of Engineering Medicine, Beijing Institute of Technology, Beijing, China
| | - Jie Gui
- School of Cyber Science and Engineering, Southeast University, Nanjing, China
| | - Jie Zhang
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, United States
| | - Yi Su
- Banner Alzheimer's Institute, Phoenix, AZ, United States
| | - Kewei Chen
- Banner Alzheimer's Institute, Phoenix, AZ, United States
| | - Paul M Thompson
- Imaging Genetics Center, Stevens Neuroimaging and Informatics Institute, University of Southern California, Marina del Rey, CA, United States
| | - Richard J Caselli
- Department of Neurology, Mayo Clinic Arizona, Scottsdale, AZ, United States
| | - Eric M Reiman
- Banner Alzheimer's Institute, Phoenix, AZ, United States
| | - Jieping Ye
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, United States
| | - Yalin Wang
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, United States
|
22
|
Wen C, Ba H, Pan W, Huang M. Co-sparse reduced-rank regression for association analysis between imaging phenotypes and genetic variants. Bioinformatics 2021; 36:5214-5222. [PMID: 32683450 DOI: 10.1093/bioinformatics/btaa650] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/01/2019] [Revised: 05/22/2020] [Accepted: 07/14/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: The association analysis between genetic variants and imaging phenotypes must be carried out to understand the inherited neuropsychiatric disorders via imaging genetic studies. Given the high dimensionality in imaging and genetic data, traditional methods based on massive univariate regression entail large computational cost and disregard many-to-many correlations between phenotypes and genetic variants. Several multivariate imaging genetic methods have been proposed to alleviate the above problems. However, most of these methods are based on the l1 penalty, which might cause the over-selection of variables and thus mislead scientists in analyzing data from the field of neuroimaging genetics. RESULTS: To address these challenges in both statistics and computation, we propose a novel co-sparse reduced-rank regression model that identifies complex correlations in a dimensional reduction manner. We developed an iterative algorithm based on a group primal dual-active set formulation to detect simultaneously important genetic variants and imaging phenotypes efficiently and precisely via non-convex penalty. The simulation studies showed that our method achieved accurate and stable performance in parameter estimation and variable selection. In real application, the proposed approach successfully detected several novel Alzheimer's disease-related genetic variants and regions of interest, which indicate that our method may be a valuable statistical toolbox for imaging genetic studies. AVAILABILITY AND IMPLEMENTATION: The R package csrrr and the code for experiments in this article are available on GitHub: https://github.com/hailongba/csrrr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Canhong Wen
- International Institute of Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
| | - Hailong Ba
- International Institute of Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
| | - Wenliang Pan
- Department of Statistical Science, School of Mathematics, Sun Yat-Sen University, Guangzhou 510275, China
| | - Meiyan Huang
- School of Biomedical Engineering, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China
|
23
|
Nayor M, Shen L, Hunninghake GM, Kochunov P, Barr RG, Bluemke DA, Broeckel U, Caravan P, Cheng S, de Vries PS, Hoffmann U, Kolossváry M, Li H, Luo J, McNally EM, Thanassoulis G, Arnett DK, Vasan RS. Progress and Research Priorities in Imaging Genomics for Heart and Lung Disease: Summary of an NHLBI Workshop. Circ Cardiovasc Imaging 2021; 14:e012943. [PMID: 34387095 PMCID: PMC8486340 DOI: 10.1161/circimaging.121.012943] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/15/2022]
Abstract
Imaging genomics is a rapidly evolving field that combines state-of-the-art bioimaging with genomic information to resolve phenotypic heterogeneity associated with genomic variation, improve risk prediction, discover prevention approaches, and enable precision diagnosis and treatment. Contemporary bioimaging methods provide exceptional resolution generating discrete and quantitative high-dimensional phenotypes for genomics investigation. Despite substantial progress in combining high-dimensional bioimaging and genomic data, methods for imaging genomics are evolving. Recognizing the potential impact of imaging genomics on the study of heart and lung disease, the National Heart, Lung, and Blood Institute convened a workshop to review cutting-edge approaches and methodologies in imaging genomics studies, and to establish research priorities for future investigation. This report summarizes the presentations and discussions at the workshop. In particular, we highlight the need for increased availability of imaging genomics data in diverse populations, dedicated focus on less common conditions, and centralization of efforts around specific disease areas.
Affiliation(s)
- Matthew Nayor
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA
| | - Li Shen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
| | - Gary M. Hunninghake
- Division of Pulmonary and Critical Care Medicine, Harvard Medical School, Brigham and Women’s Hospital, Boston, MA
| | - Peter Kochunov
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD
| | - R. Graham Barr
- Department of Medicine and Department of Epidemiology, Mailman School of Public Health, Columbia University Irving Medical Center, New York, NY
| | - David A. Bluemke
- Department of Radiology, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI
| | - Ulrich Broeckel
- Section of Genomic Pediatrics, Department of Pediatrics, Medicine and Physiology, Children’s Research Institute and Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, WI
| | - Peter Caravan
- Institute for Innovation in Imaging, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA
| | - Susan Cheng
- Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA
| | - Paul S. de Vries
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
| | - Udo Hoffmann
- Department of Radiology, Harvard Medical School, Massachusetts General Hospital, Boston, Massachusetts
| | - Márton Kolossváry
- Department of Radiology, Harvard Medical School, Massachusetts General Hospital, Boston, Massachusetts
| | - Huiqing Li
- Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, MD
| | - James Luo
- Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, MD
| | - Elizabeth M. McNally
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - George Thanassoulis
- Preventive and Genomic Cardiology, McGill University Health Center and Research Institute, Montreal, Quebec, Canada
| | - Donna K. Arnett
- College of Public Health, University of Kentucky, Lexington, KY
| | - Ramachandran S. Vasan
- Sections of Preventive Medicine and Epidemiology, and Cardiology, Department of Medicine, Department of Epidemiology, Boston University Schools of Medicine and Public Health, and Center for Computing and Data Sciences, Boston University, Boston, MA
|
24
|
Zhou J, Sun WW, Zhang J, Li L. Partially Observed Dynamic Tensor Response Regression. J Am Stat Assoc 2021; 118:424-439. [PMID: 37333062 PMCID: PMC10274377 DOI: 10.1080/01621459.2021.1938082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/28/2020] [Revised: 05/22/2021] [Accepted: 05/25/2021] [Indexed: 10/21/2022]
Abstract
In modern data science, dynamic tensor data prevail in numerous applications. An important task is to characterize the relationship between dynamic tensor datasets and external covariates. However, the tensor data are often only partially observed, rendering many existing methods inapplicable. In this article, we develop a regression model with a partially observed dynamic tensor as the response and external covariates as the predictor. We introduce the low-rankness, sparsity, and fusion structures on the regression coefficient tensor, and consider a loss function projected over the observed entries. We develop an efficient nonconvex alternating updating algorithm, and derive the finite-sample error bound of the actual estimator from each step of our optimization algorithm. Unobserved entries in the tensor response have imposed serious challenges. As a result, our proposal differs considerably in terms of estimation algorithm, regularity conditions, as well as theoretical properties, compared to the existing tensor completion or tensor response regression solutions. We illustrate the efficacy of our proposed method using simulations and two real applications, including a neuroimaging dementia study and a digital advertising study.
Affiliation(s)
- Jie Zhou
- Department of Management Science, University of Miami Herbert Business School, Miami, FL
| | - Will Wei Sun
- Krannert School of Management, Purdue University, West Lafayette, IN
| | - Jingfei Zhang
- Department of Management Science, University of Miami Herbert Business School, Miami, FL
| | - Lexin Li
- Division of Biostatistics, University of California, Berkeley, Berkeley, CA
|
25
|
Wang M, Shao W, Hao X, Shen L, Zhang D. Identify Consistent Cross-Modality Imaging Genetic Patterns via Discriminant Sparse Canonical Correlation Analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:1549-1561. [PMID: 31581090 DOI: 10.1109/tcbb.2019.2944825] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]
Abstract
Sparse canonical correlation analysis (SCCA) is a bi-multivariate technique used in imaging genetics to identify complex multi-SNP-multi-QT associations. However, the traditional SCCA algorithm has been designed to seek a linear correlation between the SNP genotype and brain imaging phenotype, ignoring the discriminant similarity information between within-class subjects in brain imaging genetics association analysis. In addition, multi-modality brain imaging phenotypes are extracted from different perspectives and imaging markers from the same region consistently showing up in multimodalities may provide more insights for the mechanistic understanding of diseases. In this paper, a novel multi-modality discriminant SCCA algorithm (MD-SCCA) is proposed to overcome these limitations as well as to improve learning results by incorporating valuable discriminant similarity information into the SCCA algorithm. Specifically, we first extract the discriminant similarity information between within-class subjects by the sparse representation. Second, the discriminant similarity information is enforced within SCCA to construct a discriminant SCCA algorithm (D-SCCA). At last, the MD-SCCA algorithm is adopted to fully explore the relationships among different modalities of different subjects. In experiments, both synthetic dataset and real data from the Alzheimer's Disease Neuroimaging Initiative database are used to test the performance of our algorithm. The empirical results have demonstrated that the proposed algorithm not only produces improved cross-validation performances but also identifies consistent cross-modality imaging genetic biomarkers.
|
26
|
Wang M, Shao W, Hao X, Zhang D. Identify Complex Imaging Genetic Patterns via Fusion Self-Expressive Network Analysis. IEEE Transactions on Medical Imaging 2021; 40:1673-1686. [PMID: 33661732 DOI: 10.1109/tmi.2021.3063785] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
In brain imaging genetic studies, it is challenging to estimate the association between quantitative traits (QTs) extracted from neuroimaging data and genetic markers such as single-nucleotide polymorphisms (SNPs). Most existing association studies are based on extensions of sparse canonical correlation analysis (SCCA) for the identification of complex bi-multivariate associations, which can take specific structure and group information into consideration. However, they often take the original data as input without considering its underlying complex multi-subspace structure, which degrades the performance of the subsequent integrative analysis. Accordingly, in this paper, the self-expressive property is exploited to reconstruct the original data before the association analysis, which can describe the similarity structure well. Specifically, we first apply the within-class similarity information to construct self-expressive networks by sparse representation. Then, we use the fusion method to iteratively fuse the self-expressive networks from multi-modality brain phenotypes into one network. Finally, we calculate the imaging genetic association based on the fused self-expressive network. We conduct experiments on both single-modality and multi-modality phenotype data. The experimental results validate that our method can not only better estimate the potential association between genetic markers and quantitative traits but also identify consistent multi-modality imaging genetic biomarkers to guide the interpretation of Alzheimer's disease.
|
27
|
Zhang J, Dong Q, Shi J, Li Q, Stonnington CM, Gutman BA, Chen K, Reiman EM, Caselli RJ, Thompson PM, Ye J, Wang Y. Predicting future cognitive decline with hyperbolic stochastic coding. Med Image Anal 2021; 70:102009. [PMID: 33711742 PMCID: PMC8049149 DOI: 10.1016/j.media.2021.102009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/27/2019] [Revised: 08/10/2020] [Accepted: 02/16/2021] [Indexed: 01/18/2023]
Abstract
Hyperbolic geometry has been successfully applied in modeling brain cortical and subcortical surfaces with general topological structures. However, such approaches, similar to other surface-based brain morphology analysis methods, usually generate high-dimensional features. This limits their statistical power in cognitive decline prediction research, especially in datasets with limited subject numbers. To address this limitation, we propose a novel framework termed hyperbolic stochastic coding (HSC). First, we compute diffeomorphic maps between general topological surfaces by mapping them to a canonical hyperbolic parameter space with consistent boundary conditions and extract critical shape features. Second, in the hyperbolic parameter space, we introduce a farthest point sampling with breadth-first search method to obtain ring-shaped patches. Third, stochastic coordinate coding and max-pooling algorithms are adopted for feature dimension reduction. We further validate the proposed system by comparing its classification accuracy with that of some other methods on two brain imaging datasets for Alzheimer's disease (AD) progression studies. Our preliminary experimental results show that our algorithm achieves superior results on various classification tasks. Our work may enrich surface-based brain imaging research tools and potentially result in a diagnostic and prognostic indicator useful in individualized treatment strategies.
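The patch-selection step mentioned in the abstract builds on farthest point sampling. A minimal sketch of the plain greedy version is given below, assuming Euclidean points rather than the hyperbolic parameter space and omitting the breadth-first search used to grow ring-shaped patches; the function name and the toy points are hypothetical and are not taken from the paper.

```python
import math

def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices, each farthest from the points chosen so far.

    Assumption: plain Euclidean distance; the paper works in a hyperbolic space.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    chosen = [start]
    # Distance from every point to its nearest already-chosen point.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
    return chosen

# Hypothetical toy surface samples (2-D for readability).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (5.0, 5.0)]
centers = farthest_point_sampling(points, 3)
```

Each selected index could then serve as a patch center; how the patches are grown around the centers is specific to the paper and is not reproduced here.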
Affiliation(s)
- Jie Zhang
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, 85287 USA
- Qunxi Dong
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, 85287 USA
- Jie Shi
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, 85287 USA
- Qingyang Li
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, 85287 USA
- Boris A Gutman
- Armour College of Engineering, Illinois Institute of Technology, Chicago, IL, USA
- Kewei Chen
- Banner Alzheimer's Institute, Phoenix, AZ, USA
- Paul M Thompson
- Imaging Genetics Center, Institute for Neuroimaging and Informatics, University of Southern California, Los Angeles, CA, USA
- Jieping Ye
- Department of Computational Medicine and Bioinformatics & Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
- Yalin Wang
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, 85287 USA.
|
28
|
Li J, Liu W, Li H, Chen F, Luo H, Bao P, Li Y, Jiang H, Gao Y, Liang H, Fang S. Genome-wide variant-based study of genetic effects with the largest neuroanatomic coverage. BMC Bioinformatics 2021; 22:223. [PMID: 33931008 PMCID: PMC8086096 DOI: 10.1186/s12859-021-04145-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2020] [Accepted: 04/21/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Brain image genetics provides enormous opportunities for examining the effects of genetic variations on the brain. Many studies have shown that the structure, function, and abnormality (e.g., those related to Alzheimer's disease) of the brain are heritable. However, which genetic variations contribute to these phenotypic changes is not completely clear. Advances in neuroimaging and genetics have led us to obtain detailed brain anatomy and genome-wide information. These data offer us new opportunities to identify genetic variations such as single nucleotide polymorphisms (SNPs) that affect brain structure. In this paper, we perform a genome-wide variant-based study, and aim to identify top SNPs or SNP sets which have genetic effects with the largest neuroanatomic coverage at both voxel and region-of-interest (ROI) levels. Based on the voxelwise genome-wide association study (GWAS) results, we used exhaustive search to find the top SNPs or SNP sets that have the largest voxel-based or ROI-based neuroanatomic coverage. For SNP sets with >2 SNPs, we proposed an efficient genetic algorithm to identify top SNP sets that can cover all ROIs or a specific ROI. RESULTS: We identified an ensemble of top SNPs, SNP-pairs and SNP-sets, whose effects have the largest neuroanatomic coverage. Experimental results on real imaging genetics data show that the proposed genetic algorithm is superior to the exhaustive search in terms of computational time for identifying top SNP-sets. CONCLUSIONS: We proposed and applied an informatics strategy to identify top SNPs, SNP-pairs and SNP-sets that have genetic effects with the largest neuroanatomic coverage. The proposed genetic algorithm offers an efficient solution to accomplish the task, especially for identifying top SNP-sets.
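The exhaustive search over small SNP sets described above can be illustrated with a short sketch. It assumes each SNP has already been mapped to the set of voxel indices it is significantly associated with in a voxelwise GWAS; the `coverage` dictionary, the SNP labels, and the function name are hypothetical, and the paper's genetic algorithm for larger sets is not reproduced.

```python
from itertools import combinations

def top_snp_sets(coverage, set_size, top_n=1):
    """Rank every SNP set of the given size by how many distinct voxels it covers."""
    scored = []
    for snps in combinations(sorted(coverage), set_size):
        covered = set().union(*(coverage[s] for s in snps))
        scored.append((len(covered), snps))
    # Largest coverage first; ties broken alphabetically for reproducibility.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_n]

# Hypothetical SNP -> significant-voxel mapping.
coverage = {
    "snp_a": {1, 2, 3},
    "snp_b": {3, 4},
    "snp_c": {5, 6, 7, 8},
    "snp_d": {1, 2},
}
best = top_snp_sets(coverage, set_size=2)
```

The number of candidate sets grows combinatorially with `set_size`, which is consistent with the abstract's motivation for replacing exhaustive search with a heuristic for sets of more than two SNPs.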
Affiliation(s)
- Jin Li
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Wenjie Liu
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Huang Li
- Computer and Information Science, IUPUI, 723 W Michigan St, Indianapolis, IN 46202 USA
- Feng Chen
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Haoran Luo
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Peihua Bao
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Yanzhao Li
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Hailong Jiang
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Yue Gao
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Hong Liang
- College of Automation, Harbin Engineering University, NO. 145 Nantong Street, Nangang District, Harbin, 150001 China
- Shiaofen Fang
- Computer and Information Science, IUPUI, 723 W Michigan St, Indianapolis, IN 46202 USA
|
29
|
Song Y, Ge S, Cao J, Wang L, Nathoo FS. A Bayesian spatial model for imaging genetics. Biometrics 2021; 78:742-753. [PMID: 33765325 DOI: 10.1111/biom.13460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2019] [Revised: 02/08/2021] [Accepted: 02/24/2021] [Indexed: 11/29/2022]
Abstract
We develop a Bayesian bivariate spatial model for multivariate regression analysis applicable to studies examining the influence of genetic variation on brain structure. Our model is motivated by an imaging genetics study of the Alzheimer's Disease Neuroimaging Initiative (ADNI), where the objective is to examine the association between images of volumetric and cortical thickness values summarizing the structure of the brain as measured by magnetic resonance imaging (MRI) and a set of 486 single nucleotide polymorphisms (SNPs) from 33 Alzheimer's disease (AD) candidate genes obtained from 632 subjects. A bivariate spatial process model is developed to accommodate the correlation structures typically seen in structural brain imaging data. First, we allow for spatial correlation on a graph structure in the imaging phenotypes obtained from a neighborhood matrix for measures on the same hemisphere of the brain. Second, we allow for correlation in the same measures obtained from different hemispheres (left/right) of the brain. We develop a mean-field variational Bayes algorithm and a Gibbs sampling algorithm to fit the model. We also incorporate Bayesian false discovery rate (FDR) procedures to select SNPs. We implement the methodology in a new release of the R package bgsmtr. We show that the new spatial model demonstrates superior performance over a standard model in our application. Data used in the preparation of this article were obtained from the ADNI database (https://adni.loni.usc.edu).
Affiliation(s)
- Yin Song
- Department of Mathematics and Statistics, University of Victoria, British Columbia, Canada
- Shufei Ge
- Institute of Mathematical Sciences, ShanghaiTech University, Shanghai, China
- Jiguo Cao
- Statistics and Actuarial Science, Simon Fraser University, British Columbia, Canada
- Liangliang Wang
- Statistics and Actuarial Science, Simon Fraser University, British Columbia, Canada
- Farouk S Nathoo
- Department of Mathematics and Statistics, University of Victoria, British Columbia, Canada
|
30
|
Hwang H, Cho G, Jin MJ, Ryoo JH, Choi Y, Lee SH. A knowledge-based multivariate statistical method for examining gene-brain-behavioral/cognitive relationships: Imaging genetics generalized structured component analysis. PLoS One 2021; 16:e0247592. [PMID: 33690643 PMCID: PMC7946325 DOI: 10.1371/journal.pone.0247592] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/13/2020] [Accepted: 02/10/2021] [Indexed: 12/30/2022] Open
Abstract
With advances in neuroimaging and genetics, imaging genetics is a naturally emerging field that combines genetic and neuroimaging data with behavioral or cognitive outcomes to examine genetic influence on altered brain functions associated with behavioral or cognitive variation. We propose a statistical approach, termed imaging genetics generalized structured component analysis (IG-GSCA), which allows researchers to investigate such gene-brain-behavior/cognitive associations, taking into account well-documented biological characteristics (e.g., genetic pathways, gene-environment interactions, etc.) and methodological complexities (e.g., multicollinearity) in imaging genetic studies. We begin by describing the conceptual and technical underpinnings of IG-GSCA. We then apply the approach to investigate how nine depression-related genes and their interactions with an environmental variable (experience of potentially traumatic events) influence the thickness variations of 53 brain regions, which in turn affect depression severity in a sample of Korean participants. Our analysis shows that a dopamine receptor gene and an interaction between a serotonin transporter gene and the environmental variable have statistically significant effects on a few brain regions' variations that have statistically significant negative impacts on depression severity. These relationships are largely supported by previous studies. We also conduct a simulation study to check whether IG-GSCA can recover parameters as expected in a similar situation.
Affiliation(s)
- Heungsun Hwang
- Department of Psychology, McGill University, Montreal, Quebec, Canada
- Gyeongcheol Cho
- Department of Psychology, McGill University, Montreal, Quebec, Canada
- Min Jin Jin
- Institute of Liberal Education, Kongju National University, Gongju, Korea
- Ji Hoon Ryoo
- Department of Education, Yonsei University, Seoul, Korea
- Younyoung Choi
- Department of Counseling Psychology, Hanyang Cyber University, Seoul, Korea
- Seung Hwan Lee
- Department of Psychiatry, Inje University Ilsan-Paik Hospital and Inje University, Goyang, Korea
|
31
|
Wen C, Yang Y, Xiao Q, Huang M, Pan W. Genome-wide association studies of brain imaging data via weighted distance correlation. Bioinformatics 2021; 36:4942-4950. [PMID: 32619001 DOI: 10.1093/bioinformatics/btaa612] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/22/2019] [Revised: 06/17/2020] [Accepted: 06/26/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Imaging genetics is mainly used to reveal the pathogenesis of neuropsychiatric risk genes and to understand the relationship between human brain structure, function and individual differences. Increasingly, brain-wide imaging phenotypes in voxels are available to test the association with genetic markers. A challenge in analyzing such data is their high dimensionality and complex relationships. RESULTS: To tackle this challenge, we introduce a weighted distance correlation (wdCor) that can assess the association between genetic markers and voxel-based imaging data. Importantly, the wdCor test takes the voxel-based data as a whole multivariate phenotype, which preserves the spatial continuity and might enhance the power. In addition, an adaptive permutation procedure is introduced to determine the P-values of the wdCor test and to alleviate the computational burden in GWAS. In extensive simulation studies, wdCor achieves much better performance than the original distance correlation. We also successfully apply wdCor to conduct a large-scale analysis on data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). AVAILABILITY AND IMPLEMENTATION: Our wdCor method provides new research directions and ideas for multivariate analysis of high-dimensional data; it can also be used as a tool for scientific analysis in imaging genetics research in practical applications. The R package wdcor and the code for reproducing all results in this article are available on GitHub: https://github.com/yangyuhui0129/wdcor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
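For orientation, the baseline that wdCor is compared against, the original (unweighted) sample distance correlation, can be sketched in a few lines. This is the standard double-centering estimator, not the weighted statistic proposed in the paper; the function names and the toy genotype and voxel data are hypothetical.

```python
import math

def _centered_distances(rows):
    """Pairwise Euclidean distance matrix with row, column and grand means removed."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]
    grand = sum(row_mean) / n
    # d is symmetric, so column means equal row means.
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Unweighted sample distance correlation between two multivariate samples."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)

    def mean_prod(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / n ** 2

    dcov2, dvar_x, dvar_y = mean_prod(a, b), mean_prod(a, a), mean_prod(b, b)
    if dvar_x * dvar_y == 0:
        return 0.0  # one sample is constant, so no dependence is measurable
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))

# Hypothetical data: one SNP (minor-allele counts) and a two-voxel phenotype per subject.
genotype = [(0.0,), (1.0,), (2.0,), (1.0,), (0.0,)]
voxels = [(g[0], 2.0 * g[0]) for g in genotype]
dcor = distance_correlation(genotype, voxels)
```

Because the toy phenotype is an exact linear function of the genotype, the statistic here is 1 up to floating-point error. In practice a permutation test would be needed to obtain a P-value, as the abstract notes.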
Affiliation(s)
- Canhong Wen
- Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
- Yuhui Yang
- Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
- Quan Xiao
- Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei 230026, China
- Meiyan Huang
- Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
- Wenliang Pan
- Department of Statistical Science, School of Mathematics, Sun Yat-Sen University, Guangzhou 510275, China
|
32
|
Fu Y, Zhang J, Li Y, Shi J, Zou Y, Guo H, Li Y, Yao Z, Wang Y, Hu B. A novel pipeline leveraging surface-based features of small subcortical structures to classify individuals with autism spectrum disorder. Prog Neuropsychopharmacol Biol Psychiatry 2021; 104:109989. [PMID: 32512131 PMCID: PMC9632410 DOI: 10.1016/j.pnpbp.2020.109989] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/10/2020] [Revised: 05/19/2020] [Accepted: 05/30/2020] [Indexed: 10/24/2022]
Abstract
Autism spectrum disorder (ASD) is accompanied by widespread impairment in social-emotional functioning. Classification of ASD using sensitive morphological features derived from structural magnetic resonance imaging (MRI) of the brain may help us to better understand ASD-related mechanisms and improve related automatic diagnosis. Previous studies using T1 MRI scans in the large, heterogeneous ABIDE dataset with typical development (TD) controls reported poor classification accuracies (around 60%). This may be because they only considered surface-based morphometry (SBM) as scalar estimates (such as cortical thickness and surface area) and ignored the neighboring intrinsic geometry information among features. In recent years, shape-related SBM has achieved great success in discovering the disease burden and progression of other brain diseases. However, when focusing on local geometry information, its high dimensionality requires careful treatment in its application to machine learning. To address the above challenges, we propose a novel pipeline for ASD classification, which mainly includes the generation of surface-based features, patch-based surface sparse coding and dictionary learning, max-pooling and ensemble classifiers based on adaptive optimizers. The proposed pipeline may leverage the sensitivity of brain surface morphometry statistics and the efficiency of sparse coding and max-pooling. By introducing only the surface features of the bilateral hippocampus derived from 364 male subjects with ASD and 381 age-matched TD males, this pipeline outperformed five recent MRI-based ASD classification studies with >80% accuracy in discriminating individuals with ASD from TD controls. Our results suggest that shape-related SBM features may further boost the classification performance of MRI between ASD and TD.
Affiliation(s)
- Yu Fu
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- Jie Zhang
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
- Yuan Li
- School of Information Science and Engineering, Shandong Normal University, Jinan, Shandong Province, China
- Jie Shi
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- Ying Zou
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- Hanning Guo
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- Yongchao Li
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China
- Zhijun Yao
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China.
- Yalin Wang
- School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA.
- Bin Hu
- School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China; Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; Beijing Institute for Brain Disorders, Capital Medical University, Beijing, China.
|
33
|
Lee HJ, Kwon H, Kim JI, Lee JY, Lee JY, Bang S, Lee JM. The cingulum in very preterm infants relates to language and social-emotional impairment at 2 years of term-equivalent age. NeuroImage: Clinical 2020; 29:102528. [PMID: 33338967 PMCID: PMC7750449 DOI: 10.1016/j.nicl.2020.102528] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/26/2020] [Revised: 10/15/2020] [Accepted: 12/04/2020] [Indexed: 01/25/2023]
Abstract
Maturation of specific WM tracts in preterm individuals differs from that of term controls. The elastic net logistic regression model was used to identify altered white matter tracts in the preterm brain. The alteration of the cingulum in preterm infants at near-term age correlates with neurodevelopmental scores at 18–22 months of age.
Background: Relative to full-term infants, very preterm infants exhibit disrupted white matter (WM) maturation and problems related to development, including motor, cognitive, social-emotional, and receptive and expressive language processing. Objective: The present study aimed to determine whether regional abnormalities in the WM microstructure of very preterm infants, as defined relative to those of full-term infants at a near-term age, are associated with neurodevelopmental outcomes at the age of 18–22 months. Methods: We prospectively enrolled 89 very preterm infants (birth weight < 1500 g) and 43 normal full-term control infants born between 2016 and 2018. All infants underwent a structural brain magnetic resonance imaging scan at near-term age. The diffusion tensor imaging (DTI) metrics of the whole-brain WM tracts were extracted based on the neonatal probabilistic WM pathway. The elastic net logistic regression model was used to identify altered WM tracts in the preterm brain. We evaluated the associations between the altered WM microstructure at near-term age and motor, cognitive, social-emotional, and receptive and expressive language developments at 18–22 months of age, as measured using the Bayley Scales of Infant Development, Third Edition. Results: We found that the elastic net logistic regression model could classify preterm and full-term neonates with an accuracy of 87.9% (corrected p < 0.008) using the DTI metrics in the pathway of interest with a 10% threshold level. The fractional anisotropy (FA) values of the body and splenium of the corpus callosum, middle cerebellar peduncle, left and right uncinate fasciculi, and right portion of the pathway between the premotor and primary motor cortices (premotor-PMC), as well as the mean axial diffusivity (AD) values of the left cingulum, were identified as contributive features for classification.
Increased adjusted AD values in the left cingulum pathway were significantly correlated with language scores after false discovery rate (FDR) correction (r = 0.217, p = 0.043). The expressive language and social-emotional composite scores showed a significant positive correlation with the AD values in the left cingulum pathway (r = 0.226 [p = 0.036] and r = 0.31 [p = 0.003], respectively) after FDR correction. Conclusion: Our approach suggests that the cingulum pathways of very preterm infants differ from those of full-term infants and significantly contribute to the prediction of the subsequent development of the language and social-emotional domains. This finding could improve our understanding of how specific neural substrates influence neurodevelopment at later ages, and individual risk prediction, thus helping to inform early intervention strategies that address developmental delay.
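The abstract reports correlations "after FDR correction" without naming the procedure. A minimal sketch of one common choice, the Benjamini-Hochberg step-up adjustment, is shown below as an assumption; the raw p-values are hypothetical and are not the study's data.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values from four correlation tests.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

A test would then be declared significant at level q when its adjusted p-value is at most q.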
Affiliation(s)
- Hyun Ju Lee
- Department of Pediatrics, Hanyang University College of Medicine, Seoul, South Korea; Division of Neonatology and Developmental Medicine, Seoul Hanyang University Hospital, Seoul, South Korea
- Hyeokjin Kwon
- Department of Electronic Engineering, Hanyang University, Seoul, South Korea
- Johanna Inhyang Kim
- Department of Psychiatry, Hanyang University, Seoul, South Korea; Division of Neonatology and Developmental Medicine, Seoul Hanyang University Hospital, Seoul, South Korea
- Joo Young Lee
- Department of Pediatrics, Hanyang University College of Medicine, Seoul, South Korea
- Ji Young Lee
- Department of Radiology, Hanyang University College of Medicine, Seoul, South Korea
- SungKyu Bang
- Department of Electronic Engineering, Hanyang University, Seoul, South Korea
- Jong-Min Lee
- Department of Biomedical Engineering, Hanyang University, Seoul, South Korea.
|
34
|
|
35
|
Wang M, Huang TZ, Fang J, Calhoun VD, Wang YP. Integration of Imaging (epi)Genomics Data for the Study of Schizophrenia Using Group Sparse Joint Nonnegative Matrix Factorization. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1671-1681. [PMID: 30762565 PMCID: PMC7781159 DOI: 10.1109/tcbb.2019.2899568] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 05/06/2023]
Abstract
Schizophrenia (SZ) is a complex disease. Single nucleotide polymorphisms (SNPs), brain activity measured by functional magnetic resonance imaging (fMRI) and DNA methylation are all important biomarkers that can be used for the study of SZ. To our knowledge, there has been little effort to combine these three data types. In this study, we propose a group sparse joint nonnegative matrix factorization (GSJNMF) model to integrate SNP, fMRI, and DNA methylation data for the identification of multi-dimensional modules associated with SZ, which can be used to study regulatory mechanisms underlying SZ at multiple levels. The proposed GSJNMF model projects multiple types of data onto a common feature space, in which heterogeneous variables with large coefficients on the same projected bases are used to identify multi-dimensional modules. We also incorporate group structure information available from each dataset. The genomic factors in such modules have significant correlations or functional associations with several brain activities. Finally, we applied the method to the analysis of real data collected from the Mind Clinical Imaging Consortium (MCIC) for the study of SZ and identified significant biomarkers. These biomarkers were further used to discover genes and corresponding brain regions, which were confirmed to be significantly associated with SZ.
Affiliation(s)
- Min Wang
- School of Mathematical Sciences/Research Center for Image and Vision Computing, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, China
- School of Information Technology, Jiangxi University of Finance and Economics, Nanchang, Jiangxi, 330013, China
- Ting-Zhu Huang
- School of Mathematical Sciences/Research Center for Image and Vision Computing, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, China
- Jian Fang
- Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA
- Vince D. Calhoun
- The Mind Research Network, University of New Mexico, NM 87131, USA
- Yu-Ping Wang
- Department of Biomedical Engineering, Tulane University, New Orleans, LA 70118, USA
- Corresponding author.
|
36
|
Kaczkurkin AN, Moore TM, Sotiras A, Xia CH, Shinohara RT, Satterthwaite TD. Approaches to Defining Common and Dissociable Neurobiological Deficits Associated With Psychopathology in Youth. Biol Psychiatry 2020; 88:51-62. [PMID: 32087950 PMCID: PMC7305976 DOI: 10.1016/j.biopsych.2019.12.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/08/2019] [Revised: 11/07/2019] [Accepted: 12/11/2019] [Indexed: 01/31/2023]
Abstract
Psychiatric disorders show high rates of comorbidity and nonspecificity of presenting clinical symptoms, while demonstrating substantial heterogeneity within diagnostic categories. Notably, many of these psychiatric disorders first manifest in youth. We review progress and next steps in efforts to parse heterogeneity in psychiatric symptoms in youths by identifying abnormalities within neural circuits. To address this fundamental challenge in psychiatry, a number of methods have been proposed. We provide an overview of these methods, broadly organized into dimensional versus categorical approaches and single-view versus multiview approaches. Dimensional approaches including factor analysis and canonical correlation analysis aim to capture dimensional associations between psychopathology and brain measures across a continuous spectrum from health to disease. In contrast, categorical approaches, such as clustering and community detection, aim to identify subtypes of individuals within a class of symptoms or brain features. We highlight several studies that apply these methods to samples of youths and discuss issues to consider when using these approaches. Finally, we end by highlighting avenues for future research.
Affiliation(s)
- Tyler M Moore
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Aristeidis Sotiras
- Department of Radiology, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Institute for Informatics, Washington University School of Medicine in St. Louis, St. Louis, Missouri
- Cedric Huchuan Xia
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania
- Russell T Shinohara
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania
- Theodore D Satterthwaite
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.
|
37
|
Du L, Liu F, Liu K, Yao X, Risacher SL, Han J, Guo L, Saykin AJ, Shen L. Identifying diagnosis-specific genotype-phenotype associations via joint multitask sparse canonical correlation analysis and classification. Bioinformatics 2020; 36:i371-i379. [PMID: 32657360 PMCID: PMC7355274 DOI: 10.1093/bioinformatics/btaa434] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/24/2023] Open
Abstract
MOTIVATION Brain imaging genetics studies the complex associations between genotypic data such as single nucleotide polymorphisms (SNPs) and imaging quantitative traits (QTs). Neurodegenerative disorders are usually diverse and heterogeneous, so different diagnostic groups might carry distinct imaging QTs, SNPs and interactions between them. Sparse canonical correlation analysis (SCCA) is widely used to identify bi-multivariate genotype-phenotype associations. However, most existing SCCA methods are unsupervised and therefore cannot identify diagnosis-specific genotype-phenotype associations. RESULTS In this article, we propose a new joint multitask learning method, named MT-SCCALR, which combines the merits of SCCA and logistic regression. MT-SCCALR learns genotype-phenotype associations of multiple tasks jointly, with each task focusing on identifying one diagnosis-specific genotype-phenotype pattern. Meanwhile, MT-SCCALR can not only select relevant SNPs and imaging QTs for each diagnostic group alone, but also allows the selection of those shared by multiple diagnostic groups. We derive an efficient optimization algorithm whose convergence to a local optimum is guaranteed. Compared with two state-of-the-art methods, MT-SCCALR yields better or similar canonical correlation coefficients and classification performance. In addition, it produces more discriminative canonical weight patterns than its competitors. This demonstrates the power and capability of MT-SCCALR in identifying diagnostically heterogeneous genotype-phenotype patterns, which would be helpful for understanding the pathophysiology of brain disorders. AVAILABILITY AND IMPLEMENTATION The software is publicly available at https://github.com/dulei323/MTSCCALR. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lei Du
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Fang Liu
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Kefei Liu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Xiaohui Yao
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
- Shannon L Risacher
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Junwei Han
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Lei Guo
- Department of intelligent science and technology, School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
- Andrew J Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA

38
Shi WJ, Zhuang Y, Russell PH, Hobbs BD, Parker MM, Castaldi PJ, Rudra P, Vestal B, Hersh CP, Saba LM, Kechris K. Unsupervised discovery of phenotype-specific multi-omics networks. Bioinformatics 2020; 35:4336-4343. [PMID: 30957844 DOI: 10.1093/bioinformatics/btz226] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/12/2018] [Revised: 02/01/2019] [Accepted: 04/05/2019] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION Complex diseases often involve a wide spectrum of phenotypic traits. Better understanding of the biological mechanisms relevant to each trait promotes understanding of the etiology of the disease and the potential for targeted and effective treatment plans. There have been many efforts towards omics data integration and network reconstruction, but limited work has examined the incorporation of relevant (quantitative) phenotypic traits. RESULTS We propose a novel technique, sparse multiple canonical correlation network analysis (SmCCNet), for integrating multiple omics data types along with a quantitative phenotype of interest, and for constructing multi-omics networks that are specific to the phenotype. As a case study, we focus on miRNA-mRNA networks. Through simulations, we demonstrate that SmCCNet has better overall prediction performance compared to popular gene expression network construction and integration approaches under realistic settings. Applying SmCCNet to studies on chronic obstructive pulmonary disease (COPD) and breast cancer, we found enrichment of known relevant pathways (e.g. the Cadherin pathway for COPD and the interferon-gamma signaling pathway for breast cancer) as well as less known omics features that may be important to the diseases. Although those applications focus on miRNA-mRNA co-expression networks, SmCCNet is applicable to a variety of omics and other data types. It can also be easily generalized to incorporate multiple quantitative phenotypes simultaneously. The versatility of SmCCNet suggests great potential of the approach in many areas. AVAILABILITY AND IMPLEMENTATION The SmCCNet algorithm is written in R, and is freely available on the web at https://cran.r-project.org/web/packages/SmCCNet/index.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- W Jenny Shi
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Yonghua Zhuang
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Pamela H Russell
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Brian D Hobbs
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Margaret M Parker
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Pratyaydipta Rudra
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Statistics, Oklahoma State University, Stillwater, OK
- Brian Vestal
- Center for Genes, Environment & Health, National Jewish Health, Denver, CO, USA
- Craig P Hersh
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Laura M Saba
- Department of Pharmaceutical Sciences, University of Colorado, Aurora, CO, USA
- Katerina Kechris
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA

39
Kong D, An B, Zhang J, Zhu H. L2RM: Low-rank Linear Regression Models for High-dimensional Matrix Responses. J Am Stat Assoc 2020; 115:403-424. [PMID: 33408427 PMCID: PMC7781207 DOI: 10.1080/01621459.2018.1555092] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/25/2017] [Revised: 11/11/2018] [Accepted: 11/26/2018] [Indexed: 10/27/2022]
Abstract
The aim of this paper is to develop a low-rank linear regression model (L2RM) to correlate a high-dimensional response matrix with a high dimensional vector of covariates when coefficient matrices have low-rank structures. We propose a fast and efficient screening procedure based on the spectral norm of each coefficient matrix in order to deal with the case when the number of covariates is extremely large. We develop an efficient estimation procedure based on the trace norm regularization, which explicitly imposes the low rank structure of coefficient matrices. When both the dimension of response matrix and that of covariate vector diverge at the exponential order of the sample size, we investigate the sure independence screening property under some mild conditions. We also systematically investigate some theoretical properties of our estimation procedure including estimation consistency, rank consistency and non-asymptotic error bound under some mild conditions. We further establish a theoretical guarantee for the overall solution of our two-step screening and estimation procedure. We examine the finite-sample performance of our screening and estimation methods using simulations and a large-scale imaging genetic dataset collected by the Philadelphia Neurodevelopmental Cohort (PNC) study.
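The trace norm (nuclear norm) regularization mentioned in this abstract is commonly handled through singular value soft-thresholding, which is the proximal operator of the nuclear norm. The following is a minimal generic sketch of that operator, not the L2RM implementation; the function name `svt`, the threshold `tau`, and the simulated data are illustrative assumptions.

```python
import numpy as np

def svt(B, tau):
    """Soft-threshold the singular values of B by tau.

    This is the proximal operator of tau * (nuclear norm); it shrinks
    small singular values to exactly zero, which lowers the rank.
    """
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return (U * s_shrunk) @ Vt

# Hypothetical example: a rank-2 coefficient matrix observed with noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
noisy = low_rank + 0.1 * rng.standard_normal((20, 15))

denoised = svt(noisy, tau=1.5)
rank_noisy = np.linalg.matrix_rank(noisy, tol=1e-8)
rank_denoised = np.linalg.matrix_rank(denoised, tol=1e-8)
```

In a full estimation procedure this operator would typically be applied inside an iterative scheme such as proximal gradient descent; the choice of `tau` controls how aggressively the rank is reduced.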
Affiliation(s)
- Dehan Kong
- Department of Statistical Sciences, University of Toronto
- Baiguo An
- School of Statistics, Capital University of Economics and Business
- Jingwen Zhang
- Department of Biostatistics, University of North Carolina at Chapel Hill
- Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill

40
Detecting genetic associations with brain imaging phenotypes in Alzheimer's disease via a novel structured SCCA approach. Med Image Anal 2020; 61:101656. [PMID: 32062154 DOI: 10.1016/j.media.2020.101656] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 11/23/2018] [Revised: 11/27/2019] [Accepted: 01/22/2020] [Indexed: 01/15/2023]
Abstract
Brain imaging genetics has become an important research topic since it can reveal complex associations between genetic factors and the structures or functions of the human brain. Sparse canonical correlation analysis (SCCA) is a popular bi-multivariate association identification method. To mine the complex genetic basis of brain imaging phenotypes, many SCCA methods have been developed that use a variety of norms to incorporate different structures of interest. They often use the group lasso penalty, the fused lasso penalty, or the graph/network guided fused lasso penalty. However, the group lasso methods have limited capability because prior knowledge is incomplete or unavailable in real applications. The fused lasso and graph/network guided methods are sensitive to the sign of the sample correlation, which may be incorrectly estimated. In this paper, we introduce two new penalties to improve the fused lasso and the graph/network guided lasso penalties in structured sparse learning. We impose both penalties on the SCCA model and propose an optimization algorithm to solve it. The proposed SCCA method has a strong upper bound of grouping effects for both positively and negatively highly correlated variables. We show that, on both synthetic and real neuroimaging genetics data, the proposed SCCA method performs better than or comparably to the conventional methods using fused lasso or graph/network guided fused lasso. In particular, the proposed method identifies higher canonical correlation coefficients and captures clearer canonical weight patterns, demonstrating its promising capability in revealing biologically meaningful imaging genetic associations.
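For orientation, the basic SCCA idea referred to in this abstract can be sketched with plain lasso-style soft-thresholding applied in alternation to the two canonical weight vectors. This is a simplified, generic variant that treats the within-set covariances as identity; it does not implement the structured penalties proposed in the paper. The function names, penalty values, and simulated data are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, lam):
    """Elementwise lasso shrinkage."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def normalize(v):
    """Scale to unit length; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def sparse_cca(X, Y, lam_u=0.1, lam_v=0.1, n_iter=100):
    """Alternating soft-thresholded updates on the cross-correlation matrix."""
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]
    # Start from the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    for _ in range(n_iter):
        u = normalize(soft_threshold(C @ v, lam_u))
        v = normalize(soft_threshold(C.T @ u, lam_v))
    return u, v

# Hypothetical data: 3 "SNP" columns and 2 "imaging" columns share a latent factor.
rng = np.random.default_rng(0)
n = 1000
z = rng.standard_normal(n)
X = rng.standard_normal((n, 30))
Y = rng.standard_normal((n, 20))
X[:, :3] += 2.0 * z[:, None]
Y[:, :2] += 2.0 * z[:, None]

u, v = sparse_cca(X, Y, lam_u=0.3, lam_v=0.3)
snp_support = np.flatnonzero(u)
imaging_support = np.flatnonzero(v)
```

The nonzero entries of `u` and `v` indicate which features are selected on each side; structured variants replace the plain soft-thresholding step with penalties that also encourage correlated features to be selected together.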

41
Shen L, Thompson PM. Brain Imaging Genomics: Integrated Analysis and Machine Learning. PROCEEDINGS OF THE IEEE. INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS 2020; 108:125-162. [PMID: 31902950 PMCID: PMC6941751 DOI: 10.1109/jproc.2019.2947272] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Indexed: 06/01/2023]
Abstract
Brain imaging genomics is an emerging data science field, where integrated analysis of brain imaging and genomics data, often combined with other biomarker, clinical and environmental data, is performed to gain new insights into the phenotypic, genetic and molecular characteristics of the brain as well as their impact on normal and disordered brain function and behavior. It has enormous potential to contribute significantly to biomedical discoveries in brain science. Given the increasingly important role of statistical and machine learning in biomedicine and rapidly growing literature in brain imaging genomics, we provide an up-to-date and comprehensive review of statistical and machine learning methods for brain imaging genomics, as well as a practical discussion on method selection for various biomedical applications.
Affiliation(s)
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
- Paul M Thompson
- Imaging Genetics Center, Mark & Mary Stevens Institute for Neuroimaging & Informatics, Keck School of Medicine, University of Southern California, Los Angeles, CA 90232, USA

42
Leviyang S, Strawn N, Griva I. Regulation of interferon stimulated gene expression levels at homeostasis. Cytokine 2019; 126:154870. [PMID: 31629105 DOI: 10.1016/j.cyto.2019.154870] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/23/2019] [Revised: 09/27/2019] [Accepted: 09/28/2019] [Indexed: 01/12/2023]
Abstract
Interferon stimulated genes (ISGs), a collection of genes important in the early innate immune response, are upregulated in response to stimulation by extracellular type I interferons. The regulation of ISGs has been extensively studied in cells exposed to significant interferon stimulation, but less is known about ISG regulation in homeostatic regimes in which extracellular interferon levels are low. Using a collection of pre-existing, publicly available microarray datasets, we investigated ISG regulation at homeostasis in CD4, pulmonary epithelial, fibroblast and macrophage cells. We used a linear regression model to predict ISG expression levels from regulator expression levels. Our results suggest significant regulation of ISG expression at homeostasis, both through the ISGF3 molecule and through IRF7 and IRF8 associated pathways. We find that roughly 50% of ISGs have expression levels significantly correlated with ISGF3 expression levels at homeostasis, supporting previous results suggesting that homeostatic IFN levels have broad functional consequences. We find that ISG expression levels varied in their correlation with ISGF3, with epithelial and macrophage cells showing more correlation than CD4 and fibroblast cells. Our analysis provides a novel approach for decomposing and quantifying ISG regulation.
Affiliation(s)
- Sivan Leviyang
- Department of Mathematics and Statistics, Georgetown University, District of Columbia 20057, USA
- Nate Strawn
- Department of Mathematics and Statistics, Georgetown University, District of Columbia 20057, USA
- Igor Griva
- Department of Mathematical Sciences, George Mason University, Fairfax, VA 22030, USA

43
Zhu X, Shen D. Robust and Discriminative Brain Genome Association Study. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2019; 11767:456-464. [PMID: 34296224 PMCID: PMC8294458 DOI: 10.1007/978-3-030-32251-9_50] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Brain Genome Association (BGA) study, which investigates the associations between brain structure/function (characterized by neuroimaging phenotypes) and genetic variations (characterized by Single Nucleotide Polymorphisms (SNPs)), is important in pathological analysis of neurological disease. However, the current BGA studies are limited as they did not explicitly consider the disease labels, source importance, and sample importance in their formulations. We address these issues by proposing a robust and discriminative BGA formulation. Specifically, we learn two transformation matrices for mapping two heterogeneous data sources (i.e., neuroimaging data and genetic data) into a common space, so that the samples from the same subject (but different sources) are close to each other, and also the samples with different labels are separable. In addition, we add a sparsity constraint on the transformation matrices to enable feature selection on both data sources. Furthermore, both sample importance and source importance are also considered in the formulation via adaptive parameter-free sample and source weightings. We have conducted various experiments, using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, to test how well the neuroimaging phenotypes and SNPs can represent each other in the common space.
Affiliation(s)
- Xiaofeng Zhu
- University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Dinggang Shen
- University of North Carolina at Chapel Hill, Chapel Hill, NC, USA

44
Wang X, Chen H, Yan J, Nho K, Risacher SL, Saykin AJ, Shen L, Huang H. Quantitative trait loci identification for brain endophenotypes via new additive model with random networks. Bioinformatics 2019; 34:i866-i874. [PMID: 30423101 DOI: 10.1093/bioinformatics/bty557] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022] Open
Abstract
Motivation The identification of quantitative trait loci (QTL) is critical to the study of causal relationships between genetic variations and disease abnormalities. We focus on identifying the QTLs associated with the brain endophenotypes in imaging genomics studies of Alzheimer's Disease (AD). Existing research mainly models the association between single nucleotide polymorphisms (SNPs) and the brain endophenotypes with linear methods, which may introduce high bias due to the simplicity of the models. Since the influence of QTLs on brain endophenotypes is quite complex, it is desirable to design appropriate non-linear models to investigate the associations between genotypes and endophenotypes. Results In this paper, we propose a new additive model to learn the non-linear associations between SNPs and brain endophenotypes in Alzheimer's disease. Our model can be flexibly employed to explain the non-linear influence of QTLs, and is thus more adaptive to the complex distribution of high-throughput biological data. Meanwhile, as an important computational learning theory contribution, we provide the generalization error analysis for the proposed approach. Unlike most previous theoretical analyses, which assume independent and identically distributed samples, our error bound is based on m-dependent observations, which is more appropriate for high-throughput and noisy biological data. Experiments on the data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort demonstrate the promising performance of our approach for identifying biologically meaningful SNPs. Availability and implementation An executable is available at https://github.com/littleq1991/additive_FNNRW.
Affiliation(s)
- Xiaoqian Wang
- Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Hong Chen
- Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Jingwen Yan
- Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Kwangsik Nho
- Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Shannon L Risacher
- Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Andrew J Saykin
- Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Heng Huang
- Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA

45
Du L, Liu K, Zhu L, Yao X, Risacher SL, Guo L, Saykin AJ, Shen L. Identifying progressive imaging genetic patterns via multi-task sparse canonical correlation analysis: a longitudinal study of the ADNI cohort. Bioinformatics 2019; 35:i474-i483. [PMID: 31510645 PMCID: PMC6613037 DOI: 10.1093/bioinformatics/btz320] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION Identifying the genetic basis of brain structure, function and disorder by using imaging quantitative traits (QTs) as endophenotypes is an important task in brain science. Brain QTs often change over time as the disorder progresses, so understanding how genetic factors influence these progressive changes is of great importance. Most existing imaging genetics methods analyze only the baseline neuroimaging data, and thus omit the longitudinal imaging data across multiple time points, which contain important disease progression information. RESULTS We propose a novel temporal imaging genetic model which performs multi-task sparse canonical correlation analysis (T-MTSCCA). Our model uses longitudinal neuroimaging data to uncover how single nucleotide polymorphisms (SNPs) affect brain QTs over time. By incorporating the relationships within the longitudinal imaging data and within the SNPs, T-MTSCCA can identify a trajectory of progressive imaging genetic patterns over time. We propose an efficient algorithm to solve the problem and show its convergence. We evaluate T-MTSCCA on 408 subjects from the Alzheimer's Disease Neuroimaging Initiative database with longitudinal magnetic resonance imaging data and genetic data available. The experimental results show that T-MTSCCA performs better than or comparably to the state-of-the-art methods. In particular, T-MTSCCA identifies higher canonical correlation coefficients and captures clearer canonical weight patterns. This suggests that T-MTSCCA identifies time-consistent and time-dependent SNPs and imaging QTs, which further helps in understanding the genetic basis of brain QT changes over time during disease progression. AVAILABILITY AND IMPLEMENTATION The software and simulation data are publicly available at https://github.com/dulei323/TMTSCCA.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lei Du
- School of Automation, Northwestern Polytechnical University, Xi’an, China
- Kefei Liu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Lei Zhu
- School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China
- Xiaohui Yao
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Shannon L Risacher
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Lei Guo
- School of Automation, Northwestern Polytechnical University, Xi’an, China
- Andrew J Saykin
- Department of Radiology and Imaging Sciences, Indiana University School of Medicine, Indianapolis, IN, USA
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA

46
Chen J, Liu J, Calhoun VD. The Translational Potential of Neuroimaging Genomic Analyses To Diagnosis And Treatment In The Mental Disorders. PROCEEDINGS OF THE IEEE. INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS 2019; 107:912-927. [PMID: 32051642 PMCID: PMC7015534 DOI: 10.1109/jproc.2019.2913145] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/15/2023]
Abstract
Imaging genomics focuses on characterizing genomic influence on the variation of neurobiological traits, holding promise for illuminating the pathogenesis, reforming the diagnostic system, and precision medicine of mental disorders. This paper aims to provide an overall picture of the current status of neuroimaging-genomic analyses in mental disorders, and how we can increase their translational potential into clinical practice. The review is organized around three perspectives. (a) Towards reliability, generalizability and interpretability, where we summarize the multivariate models and discuss the considerations and trade-offs of using these methods and how reliable findings may be reached, to serve as ground for further delineation. (b) Towards improved diagnosis, where we outline the advantages and challenges of constructing a dimensional transdiagnostic model and how imaging genomic analyses map into this framework to aid in deconstructing heterogeneity and achieving an optimal stratification of patients that better inform treatment planning. (c) Towards improved treatment. Here we highlight recent efforts and progress in elucidating the functional annotations that bridge between genomic risk and neurobiological abnormalities, in detecting genomic predisposition and prodromal neurodevelopmental changes, as well as in identifying imaging genomic biomarkers for predicting treatment response. Providing an overview of the challenges and promises, this review hopefully motivates imaging genomic studies with multivariate, dimensional and transdiagnostic designs for generalizable and interpretable findings that facilitate development of personalized treatment.
Affiliation(s)
- Jiayu Chen
- The Mind Research Network, Albuquerque, NM 87106 USA
- Jingyu Liu
- The Mind Research Network, Albuquerque, NM 87106 USA, and also with the Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131 USA
- Vince D Calhoun
- The Mind Research Network, Albuquerque, NM 87106 USA, and also with the Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM 87131 USA

47
Zhu X, Suk HI, Shen D. Group sparse reduced rank regression for neuroimaging genetic study. WORLD WIDE WEB 2019; 22:673-688. [PMID: 31607788 PMCID: PMC6788769 DOI: 10.1007/s11280-018-0637-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2018] [Revised: 07/19/2018] [Accepted: 09/07/2018] [Indexed: 06/10/2023]
Abstract
Neuroimaging genetic studies usually need to deal with the high dimensionality of both brain imaging data and genetic data, which often leads to the curse of dimensionality. In this paper, we propose a group sparse reduced rank regression model that takes into account the relations among both the phenotypes and the genotypes in neuroimaging genetic studies. Specifically, we design a group sparsity constraint as well as a reduced rank constraint to simultaneously conduct subspace learning and feature selection. The group sparsity constraint conducts feature selection to identify genotypes highly related to neuroimaging data, while the reduced rank constraint considers the relations among neuroimaging data to conduct subspace learning in the feature selection model. Furthermore, an alternating optimization algorithm is proposed to solve the resulting objective function and is proved to achieve fast convergence. Experimental results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset showed that the proposed method is superior to the alternative methods under comparison in predicting the phenotype data from the genotype data.
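The reduced rank constraint described in this abstract can be illustrated with classical reduced rank regression: fit ordinary least squares, then project the fit onto its leading response directions. The sketch below is that textbook estimator with identity weighting; it omits the group sparsity penalty and is not the authors' algorithm. The function name and the simulated data are illustrative assumptions.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced rank regression with identity weighting.

    Fits OLS, then projects the coefficients onto the top `rank`
    right singular vectors of the fitted values.
    """
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    return B_ols @ V_r @ V_r.T

# Hypothetical data: 10 genotype features, 8 imaging responses, true rank 2.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
B_true = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 8))
Y = X @ B_true + 0.1 * rng.standard_normal((100, 8))

B_hat = reduced_rank_regression(X, Y, rank=2)
rank_hat = np.linalg.matrix_rank(B_hat, tol=1e-8)
rel_error = np.linalg.norm(B_hat - B_true) / np.linalg.norm(B_true)
```

A group sparse variant would add a row-wise penalty on the coefficient matrix so that entire genotype features are dropped; that step usually requires an iterative solver rather than this closed form.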
Affiliation(s)
- Xiaofeng Zhu
- Guangxi Key Lab of Multi-source Information Mining and Security, Guangxi Normal University, Guilin 541004, Guangxi, People’s Republic of China
- Institute of Natural and Mathematical Sciences, Massey University, Auckland 0745, New Zealand
- BRIC Center of the University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Heung-Il Suk
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
- Dinggang Shen
- Department of Brain and Cognitive Engineering, Korea University, Seoul, Korea
- BRIC Center of the University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA

48
Nathoo FS, Kong L, Zhu H. A Review of Statistical Methods in Imaging Genetics. CAN J STAT 2019; 47:108-131. [PMID: 31274952 PMCID: PMC6605768 DOI: 10.1002/cjs.11487] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/24/2017] [Accepted: 10/08/2018] [Indexed: 12/24/2022]
Abstract
With the rapid growth of modern technology, many biomedical studies are being conducted to collect massive datasets with volumes of multi-modality imaging, genetic, neurocognitive, and clinical information from increasingly large cohorts. Simultaneously extracting and integrating rich and diverse heterogeneous information in neuroimaging and/or genomics from these big datasets could transform our understanding of how genetic variants impact brain structure and function, cognitive function, and brain-related disease risk across the lifespan. Such understanding is critical for diagnosis, prevention, and treatment of numerous complex brain-related disorders (e.g., schizophrenia and Alzheimer's disease). However, the development of analytical methods for the joint analysis of both high-dimensional imaging phenotypes and high-dimensional genetic data, a big data squared (BD2) problem, presents major computational and theoretical challenges for existing analytical methods. Besides the high-dimensional nature of BD2, various neuroimaging measures often exhibit strong spatial smoothness and dependence and genetic markers may have a natural dependence structure arising from linkage disequilibrium. We review some recent developments of various statistical techniques for imaging genetics, including massive univariate and voxel-wise approaches, reduced rank regression, mixture models, and group sparse multi-task regression. By doing so, we hope that this review may encourage others in the statistical community to enter into this new and exciting field of research.
Affiliation(s)
- Farouk S Nathoo
- Department of Mathematics and Statistics, University of Victoria
- Linglong Kong
- Department of Mathematical and Statistical Sciences, University of Alberta
- Hongtu Zhu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center

49
Zhou T, Thung KH, Liu M, Shen D. Brain-Wide Genome-Wide Association Study for Alzheimer's Disease via Joint Projection Learning and Sparse Regression Model. IEEE Trans Biomed Eng 2019; 66:165-175. [PMID: 29993426 PMCID: PMC6342004 DOI: 10.1109/tbme.2018.2824725] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 02/06/2023]
Abstract
A brain-wide and genome-wide association (BW-GWA) study is presented in this paper to identify the associations between brain imaging phenotypes (i.e., regional volumetric measures) and genetic variants [i.e., single nucleotide polymorphisms (SNPs)] in Alzheimer's disease (AD). The main challenges of this study include data heterogeneity, complex phenotype-genotype associations, high-dimensional data (e.g., thousands of SNPs), and the existence of phenotype outliers. Previous BW-GWA studies, while addressing some of these challenges, did not consider the diagnostic label information in their formulations, thus limiting their clinical applicability. To address these issues, we present a novel joint projection and sparse regression model to discover the associations between the phenotypes and genotypes. Specifically, to alleviate the negative influence of data heterogeneity, we first map the genotypes into an intermediate imaging-phenotype-like space. Then, to better reveal the complex phenotype-genotype associations, we project both the mapped genotypes and the original imaging phenotypes into a diagnostic-label-guided joint feature space, where the intraclass projected points are constrained to be close to each other. In addition, we use ℓ2,1-norm minimization on both the regression loss function and the transformation coefficient matrices, to reduce the effect of phenotype outliers and also to encourage sparse feature selection of both the genotypes and phenotypes. We evaluate our method using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, and the results show that our proposed method outperforms several state-of-the-art methods in terms of the average root-mean-square error of genome-to-phenotype predictions. In addition, the associated SNPs and brain regions identified in this study have also been reported in previous AD-related studies, supporting the effectiveness and potential of our proposed method for studying AD pathogenesis.
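As a brief illustration of the ℓ2,1-norm mentioned in the abstract above, the following minimal sketch computes the norm of a coefficient matrix (the sum of the Euclidean norms of its rows) and applies row-wise soft-thresholding, which is the proximal operator of that norm and the mechanism by which whole rows (features) are zeroed out. It is not the authors' model or optimization procedure; the function names `l21_norm` and `prox_l21`, the toy matrix, and the threshold value are assumptions made for illustration only.

```python
import numpy as np

def l21_norm(W):
    """l2,1-norm: sum over rows of each row's Euclidean norm."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def prox_l21(W, tau):
    """Row-wise soft-thresholding, the proximal operator of tau * ||W||_{2,1}.

    Rows whose Euclidean norm is at most tau are set exactly to zero;
    the remaining rows are shrunk toward zero by tau in norm.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(row_norms, 1e-12))
    return W * scale

# Toy coefficient matrix (hypothetical): rows are features, columns are tasks.
W = np.array([[3.0, 4.0],
              [0.3, 0.4],
              [0.0, 0.0]])

norm_value = l21_norm(W)       # row norms are 5.0, 0.5 and 0.0
shrunk = prox_l21(W, tau=1.0)  # the weak second row is removed entirely
```

Because the penalty acts on whole rows, a feature is either kept for all outputs or discarded for all of them, which is why this norm is commonly used for joint feature selection.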
Affiliation(s)
- Tao Zhou
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
| | - Kim-Han Thung
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
| | - Mingxia Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599 USA
| | - Dinggang Shen
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina, Chapel Hill, NC 27599 USA, and also with the Department of Brain and Cognitive Engineering, Korea University, Seoul 02841, Republic of Korea
| |
|
50
|
Leppäaho E, Renvall H, Salmela E, Kere J, Salmelin R, Kaski S. Discovering heritable modes of MEG spectral power. Hum Brain Mapp 2019; 40:1391-1402. [PMID: 30600573 PMCID: PMC6590382 DOI: 10.1002/hbm.24454] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/11/2018] [Revised: 09/27/2018] [Accepted: 10/19/2018] [Indexed: 12/14/2022] Open
Abstract
Brain structure and many brain functions are known to be genetically controlled, but direct links between neuroimaging measures and their underlying cellular-level determinants remain largely undiscovered. Here, we adopt a novel computational method for examining potential similarities in high-dimensional brain imaging data between siblings. We examine oscillatory brain activity measured with magnetoencephalography (MEG) in 201 healthy siblings and apply Bayesian reduced-rank regression to extract a low-dimensional representation of familial features in the participants' spectral power structure. Our results show that the structure of the overall spectral power at 1-90 Hz is a highly conspicuous feature that not only relates siblings to each other but also has very high consistency within participants' own data, irrespective of the exact experimental state of the participant. The analysis is extended by seeking genetic associations for low-dimensional descriptions of the oscillatory brain activity. The observed variability in the MEG spectral power structure was associated with SDK1 (sidekick cell adhesion molecule 1) and suggestively with several other genes that function, for example, in brain development. The current results highlight the potential of sophisticated computational methods in combining molecular and neuroimaging levels for exploring brain functions, even for high-dimensional data limited to a few hundred participants.
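The low-dimensional representation described in the abstract above rests on reduced-rank regression. The sketch below shows the classical, non-Bayesian form of that technique with identity weighting: fit ordinary least squares, then project the coefficients onto the leading right singular vectors of the fitted values. It is a simplified stand-in and not the Bayesian model used in the paper; the function name `reduced_rank_regression`, the simulated data, and all dimensions are assumptions for illustration.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced-rank regression with identity weighting.

    Returns a (p x q) coefficient matrix of rank at most `rank` that
    minimizes ||Y - X B||_F subject to that rank constraint.
    """
    # Unconstrained least-squares solution.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Leading right singular vectors of the fitted values span the
    # response directions best explained by the predictors.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Project the OLS coefficients onto that low-dimensional subspace.
    return B_ols @ V_r @ V_r.T

# Simulated example (hypothetical sizes): 200 samples, 5 predictors,
# 8 responses, with a true coefficient matrix of rank 2.
rng = np.random.default_rng(0)
n, p, q, r = 200, 5, 8, 2
X = rng.standard_normal((n, p))
B_true = rng.standard_normal((p, r)) @ rng.standard_normal((r, q))
Y = X @ B_true + 0.1 * rng.standard_normal((n, q))

B_hat = reduced_rank_regression(X, Y, rank=r)
```

The rank constraint forces all responses to depend on the predictors through a small number of shared latent components, which is what makes the approach usable when the response dimension is large relative to the sample size.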
Affiliation(s)
- Eemeli Leppäaho
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
| | - Hanna Renvall
- Department of Neuroscience and Biomedical Engineering, Aalto University, Helsinki, Finland; Aalto NeuroImaging, Aalto University, Helsinki, Finland
| | - Elina Salmela
- Department of Biosciences, University of Helsinki, Helsinki, Finland
| | - Juha Kere
- Molecular Neurology Research Program, University of Helsinki, Folkhälsan Institute of Genetics, Helsinki, Finland; Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden; School of Basic and Medical Biosciences, King's College London, Guy's Hospital, London, United Kingdom
| | - Riitta Salmelin
- Department of Neuroscience and Biomedical Engineering, Aalto University, Helsinki, Finland; Aalto NeuroImaging, Aalto University, Helsinki, Finland
| | - Samuel Kaski
- Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Helsinki, Finland
| |
|