1
Shao M, Chen K, Zhang S, Tian M, Shen Y, Cao C, Gu N. Multiome-wide Association Studies: Novel Approaches for Understanding Diseases. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024; 22:qzae077. [PMID: 39471467 PMCID: PMC11630051 DOI: 10.1093/gpbjnl/qzae077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 10/06/2024] [Accepted: 10/23/2024] [Indexed: 11/01/2024]
Abstract
The rapid development of multiome (transcriptome, proteome, cistrome, imaging, and regulome)-wide association study methods has opened new avenues for biologists to understand the susceptibility genes underlying complex diseases. Thorough comparisons of these methods are essential for selecting the most appropriate tool for a given research objective. This review provides a detailed categorization and summary of the statistical models, use cases, and advantages of recent multiome-wide association studies. In addition, to illustrate gene-disease association studies based on transcriptome-wide association study (TWAS), we collected 478 disease entries across 22 categories from 235 manually reviewed publications. Our analysis reveals that mental disorders are the diseases most frequently studied by TWAS, indicating its potential to deepen our understanding of the genetic architecture of complex diseases. In summary, this review underscores the importance of multiome-wide association studies in elucidating complex diseases and highlights the significance of selecting the appropriate method for each study.
Affiliation(s)
- Mengting Shao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Kaiyang Chen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Shuting Zhang
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Min Tian
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Yan Shen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Ning Gu
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
  - Nanjing Key Laboratory for Cardiovascular Information and Health Engineering Medicine, Institute of Clinical Medicine, Nanjing Drum Tower Hospital, Medical School, Nanjing University, Nanjing 210093, China
2
He J, Li Q, Zhang Q. rvTWAS: identifying gene-trait association using sequences by utilizing transcriptome-directed feature selection. Genetics 2024; 226:iyad204. [PMID: 38001381 DOI: 10.1093/genetics/iyad204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 11/14/2023] [Accepted: 11/16/2023] [Indexed: 11/26/2023] Open
Abstract
Toward the identification of the genetic basis of complex traits, transcriptome-wide association study (TWAS) has been successful in integrating transcriptome data. However, TWAS is only applicable to common variants, excluding rare variants in exome or whole-genome sequences. This is partly because of the inherent limitation of TWAS protocols that rely on predicting gene expression. Our previous research revealed an insight into TWAS: its 2 steps, building and applying the expression prediction models, are essentially genetic feature selection and aggregation, neither of which has to involve prediction. With TWAS disentangled in this way, the inability of rare variants to predict expression traits is no longer an obstacle. Herein, we developed "rare variant TWAS," or rvTWAS, which first uses a Bayesian model to conduct expression-directed feature selection and then uses a kernel machine to carry out feature aggregation, forming a model that leverages expression for association mapping including rare variants. We demonstrated the performance of rvTWAS by thorough simulations and real data analysis in 3 psychiatric disorders, namely schizophrenia, bipolar disorder, and autism spectrum disorder. We confirmed that rvTWAS outperforms existing TWAS protocols and revealed additional genes underlying psychiatric disorders. In particular, we formed a hypothetical mechanism in which zinc finger genes impact all 3 disorders through transcriptional regulation. rvTWAS will open a door for sequence-based association mapping that integrates gene expression.
Affiliation(s)
- Jingni He
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Qing Li
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Qingrun Zhang
  - Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
  - Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
  - Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
  - Arnie Charbonneau Cancer Institute, University of Calgary, Calgary T2N 1N4, Canada
3
He J, Antonyan L, Zhu H, Ardila K, Li Q, Enoma D, Zhang W, Liu A, Chekouo T, Cao B, MacDonald ME, Arnold PD, Long Q. A statistical method for image-mediated association studies discovers genes and pathways associated with four brain disorders. Am J Hum Genet 2024; 111:48-69. [PMID: 38118447 PMCID: PMC10806749 DOI: 10.1016/j.ajhg.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 11/04/2023] [Accepted: 11/16/2023] [Indexed: 12/22/2023] Open
Abstract
Brain imaging and genomics are critical tools enabling characterization of the genetic basis of brain disorders. However, imaging large cohorts is expensive and may be unavailable for legacy datasets used for genome-wide association studies (GWASs). Using an integrated feature selection/aggregation model, we developed an image-mediated association study (IMAS), which utilizes borrowed imaging/genomics data to conduct association mapping in legacy GWAS cohorts. By leveraging the UK Biobank image-derived phenotypes (IDPs), the IMAS discovered genetic bases underlying four neuropsychiatric disorders and verified them by analyzing annotations, pathways, and expression quantitative trait loci (eQTLs). A cerebellar-mediated mechanism was identified to be common to the four disorders. Simulations show that, if the goal is identifying genetic risk, our IMAS is more powerful than a hypothetical protocol in which the imaging results were available in the GWAS dataset. This implies the feasibility of reanalyzing legacy GWAS datasets without conducting additional imaging, yielding cost savings for integrated analysis of genetics and imaging.
Affiliation(s)
- Jingni He
  - Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Lilit Antonyan
  - Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Harold Zhu
  - Department of Biological Sciences, Faculty of Science, University of Calgary, Calgary, AB, Canada
- Karen Ardila
  - Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
- Qing Li
  - Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- David Enoma
  - Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Andy Liu
  - Sir Winston Churchill High School, Calgary, AB, Canada; College of Letters and Science, University of California, Los Angeles, Los Angeles, CA, USA
- Thierry Chekouo
  - Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada; Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Bo Cao
  - Department of Psychiatry, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
- M Ethan MacDonald
  - The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Paul D Arnold
  - Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- Quan Long
  - Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada.
4
Cruciani F, Aparo A, Brusini L, Combi C, Storti SF, Giugno R, Menegaz G, Boscolo Galazzo I. Identifying the joint signature of brain atrophy and gene variant scores in Alzheimer's Disease. J Biomed Inform 2024; 149:104569. [PMID: 38104851 DOI: 10.1016/j.jbi.2023.104569] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/22/2023] [Revised: 11/20/2023] [Accepted: 12/07/2023] [Indexed: 12/19/2023]
Abstract
The joint modeling of genetic data and brain imaging information allows for determining the pathophysiological pathways of neurodegenerative diseases such as Alzheimer's disease (AD). This task has typically been approached using mass-univariate methods that rely on a complete set of Single Nucleotide Polymorphisms (SNPs) to assess their association with selected image-derived phenotypes (IDPs). However, such methods are prone to multiple comparisons bias and, most importantly, fail to account for potential cross-feature interactions, resulting in insufficient detection of significant associations. Approaches that overcome these limitations while reducing the number of traits aim to convey genetic information at the gene level and to capture the integrated genetic effects of a set of genetic variants, rather than looking at each SNP individually. Their associations with brain IDPs are still largely unexplored in the current literature, though they can uncover new potential genetic determinants for brain modulations in the AD continuum. In this work, we explored an explainable multivariate model to analyze the genetic basis of the grey matter modulations, relying on the AD Neuroimaging Initiative (ADNI) phase 3 dataset. Cortical thicknesses and subcortical volumes derived from T1-weighted Magnetic Resonance were considered to describe the imaging phenotypes. At the same time, the genetic counterpart was represented by gene variant scores extracted by the Sequence Kernel Association Test (SKAT) filtering model. Moreover, transcriptomic analysis was carried out to assess the expression of the resulting genes in the main brain structures as a form of validation. Results highlighted meaningful genotype-phenotype interactions as defined by three latent components showing a significant difference in the projection scores between patients and controls.
Among the significant associations, the model highlighted EPHX1 and BCAS1 gene variant scores involved in neurodegenerative and myelination processes, hence relevant for AD. In particular, the first was associated with decreased subcortical volumes and the second with decreased temporal lobe thickness. Notably, BCAS1 is particularly expressed in the dentate gyrus. Overall, the proposed approach allowed capturing genotype-phenotype interactions in a restricted study cohort that was confirmed by transcriptomic analysis, offering insights into the underlying mechanisms of neurodegeneration in AD in line with previous findings and suggesting new potential disease biomarkers.
Affiliation(s)
- Federica Cruciani
  - Department of Engineering for Innovation Medicine, University of Verona, Verona, Italy.
- Antonino Aparo
  - Department of Computer Science, University of Verona, Verona, Italy
- Lorenza Brusini
  - Department of Engineering for Innovation Medicine, University of Verona, Verona, Italy
- Carlo Combi
  - Department of Computer Science, University of Verona, Verona, Italy
- Silvia F Storti
  - Department of Engineering for Innovation Medicine, University of Verona, Verona, Italy
- Rosalba Giugno
  - Department of Computer Science, University of Verona, Verona, Italy
- Gloria Menegaz
  - Department of Engineering for Innovation Medicine, University of Verona, Verona, Italy
5
Monte AA, Vest A, Reisz JA, Berninzoni D, Hart C, Dylla L, D'Alessandro A, Heard KJ, Wood C, Pattee J. A Multi-Omic Mosaic Model of Acetaminophen Induced Alanine Aminotransferase Elevation. J Med Toxicol 2023; 19:255-261. [PMID: 37231244 PMCID: PMC10212224 DOI: 10.1007/s13181-023-00951-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Revised: 05/13/2023] [Accepted: 05/18/2023] [Indexed: 05/27/2023] Open
Abstract
BACKGROUND Acetaminophen (APAP) is the most common cause of liver injury after alcohol in US patients. Predicting liver injury and subsequent hepatic regeneration in patients taking therapeutic doses of APAP may be possible using new 'omic methods such as metabolomics and genomics. Multi'omic techniques increase our ability to find new mechanisms of injury and regeneration. METHODS We used metabolomic and genomic data from a randomized controlled trial of patients administered 4 g of APAP per day for 14 days or longer with blood samples obtained at 0 (baseline), 4, 7, 10, 13 and 16 days. We used the highest ALT as the clinical outcome to be predicted in our integrated analysis. We used penalized regression to model the relationship between genetic variants and day 0 metabolite level, and then performed a metabolite-wide colocalization scan to associate the genetically regulated component of metabolite expression with ALT elevation. Genome-wide association study (GWAS) analyses were conducted for ALT elevation and metabolite level using linear regression, with age, sex, and the first five principal components included as covariates. Colocalization was tested via a weighted sum test. RESULTS Out of the 164 metabolites modeled, 120 met the criteria for predictive accuracy and were retained for genetic analyses. After genomic examination, eight metabolites were found to be under genetic control and predictive of ALT elevation due to therapeutic acetaminophen. The metabolites were: 3-oxalomalate, allantoate, diphosphate, L-carnitine, L-proline, maltose, and ornithine. These genes are important in the tricarboxylic acid cycle (TCA), urea breakdown pathway, glutathione production, mitochondrial energy production, and maltose metabolism. CONCLUSIONS This multi'omic approach can be used to integrate metabolomic and genomic data, allowing identification of genes that control downstream metabolites.
These findings confirm prior work that has identified mitochondrial energy production as critical to APAP-induced liver injury and confirm our prior work that demonstrated the importance of the urea cycle in therapeutic APAP liver injury.
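The METHODS above mention testing colocalization with a weighted sum test, but the abstract does not give the exact statistic. As a rough sketch only, one common summary-statistic form combines per-variant GWAS z-scores with prediction weights and a variant correlation (LD) matrix R as Z = w'z / sqrt(w'Rw). The function name and all numeric values below are hypothetical and are not taken from the paper.

```python
import math


def weighted_sum_z(weights, z_scores, ld):
    """Combine per-variant z-scores into a single metabolite-level z-score.

    weights  : prediction weights, e.g. from a penalized regression (hypothetical)
    z_scores : per-variant GWAS z-scores for the outcome (hypothetical)
    ld       : variant-by-variant correlation matrix as nested lists (hypothetical)
    """
    n = len(weights)
    if len(z_scores) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, z_scores, and ld must have matching dimensions")
    # Numerator: weighted sum of the per-variant z-scores (w'z).
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    # Denominator: variance of that sum under the null, w'Rw.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w'Rw must be positive")
    return numerator / math.sqrt(variance)


# Two variants with equal weights and equal z-scores.
independent = weighted_sum_z([1.0, 1.0], [2.0, 2.0], [[1.0, 0.0], [0.0, 1.0]])
perfect_ld = weighted_sum_z([1.0, 1.0], [2.0, 2.0], [[1.0, 1.0], [1.0, 1.0]])
print(round(independent, 3), round(perfect_ld, 3))
```

In this toy input, two independent variants reinforce each other, while two perfectly correlated variants carry no more evidence than one, which is why the LD term belongs in the denominator.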
Affiliation(s)
- Andrew A Monte
  - Department of Emergency Medicine, University of Colorado School of Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA.
  - Center for Bioinformatics & Personalized Medicine, University of Colorado School of Medicine, Aurora, CO, USA.
  - Skaggs School of Pharmacy, University of Colorado, Aurora, CO, USA.
  - Denver Health and Hospital Authority, Rocky Mountain Poison & Drug Center, Denver, CO, USA.
- Alexis Vest
  - Department of Emergency Medicine, University of Colorado School of Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA
- Julie A Reisz
  - Metabolomics Core, Department of Biochemistry and Molecular Genetics, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO, USA
- Danielle Berninzoni
  - Department of Emergency Medicine, University of Colorado School of Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA
- Claire Hart
  - Department of Emergency Medicine, University of Colorado School of Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA
- Layne Dylla
  - Department of Emergency Medicine, University of Colorado School of Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA
  - Center for Bioinformatics & Personalized Medicine, University of Colorado School of Medicine, Aurora, CO, USA
- Angelo D'Alessandro
  - Metabolomics Core, Department of Biochemistry and Molecular Genetics, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO, USA
- Kennon J Heard
  - Department of Emergency Medicine, University of Colorado School of Medicine, Leprino Building, 7th Floor Campus Box B-215, 12401 E. 17th Avenue, Aurora, CO, 80045, USA
  - Denver Health and Hospital Authority, Rocky Mountain Poison & Drug Center, Denver, CO, USA
- Cheyret Wood
  - Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO, USA
- Jack Pattee
  - Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO, USA
6
Kang M, Ang TFA, Devine SA, Sherva R, Mukherjee S, Trittschuh EH, Gibbons LE, Scollard P, Lee M, Choi SE, Klinedinst B, Nakano C, Dumitrescu LC, Durant A, Hohman TJ, Cuccaro ML, Saykin AJ, Kukull WA, Bennett DA, Wang LS, Mayeux RP, Haines JL, Pericak-Vance MA, Schellenberg GD, Crane PK, Au R, Lunetta KL, Mez JB, Farrer LA. A genome-wide search for pleiotropy in more than 100,000 harmonized longitudinal cognitive domain scores. Mol Neurodegener 2023; 18:40. [PMID: 37349795 PMCID: PMC10286470 DOI: 10.1186/s13024-023-00633-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 06/06/2023] [Indexed: 06/24/2023] Open
Abstract
BACKGROUND More than 75 common variant loci account for only a portion of the heritability for Alzheimer's disease (AD). A more complete understanding of the genetic basis of AD can be deduced by exploring associations with AD-related endophenotypes. METHODS We conducted genome-wide scans for cognitive domain performance using harmonized and co-calibrated scores derived by confirmatory factor analyses for executive function, language, and memory. We analyzed 103,796 longitudinal observations from 23,066 members of community-based (FHS, ACT, and ROSMAP) and clinic-based (ADRCs and ADNI) cohorts using generalized linear mixed models including terms for SNP, age, SNP × age interaction, sex, education, and five ancestry principal components. Significance was determined based on a joint test of the SNP's main effect and interaction with age. Results across datasets were combined using inverse-variance meta-analysis. Genome-wide tests of pleiotropy for each domain pair as the outcome were performed using PLACO software. RESULTS Individual domain and pleiotropy analyses revealed genome-wide significant (GWS) associations with five established loci for AD and AD-related disorders (BIN1, CR1, GRN, MS4A6A, and APOE) and eight novel loci. ULK2 was associated with executive function in the community-based cohorts (rs157405, P = 2.19 × 10⁻⁹). GWS associations for language were identified with CDK14 in the clinic-based cohorts (rs705353, P = 1.73 × 10⁻⁸) and LINC02712 in the total sample (rs145012974, P = 3.66 × 10⁻⁸). GRN (rs5848, P = 4.21 × 10⁻⁸) and PURG (rs117523305, P = 1.73 × 10⁻⁸) were associated with memory in the total and community-based cohorts, respectively. GWS pleiotropy was observed for language and memory with LOC107984373 (rs73005629, P = 3.12 × 10⁻⁸) in the clinic-based cohorts, and with NCALD (rs56162098, P = 1.23 × 10⁻⁹) and PTPRD (rs145989094, P = 8.34 × 10⁻⁹) in the community-based cohorts.
GWS pleiotropy was also found for executive function and memory with OSGIN1 (rs12447050, P = 4.09 × 10⁻⁸) and PTPRD (rs145989094, P = 3.85 × 10⁻⁸) in the community-based cohorts. Functional studies have previously linked AD to ULK2, NCALD, and PTPRD. CONCLUSION Our results provide some insight into biological pathways underlying processes leading to domain-specific cognitive impairment and AD, as well as a conduit toward a syndrome-specific precision medicine approach to AD. Increasing the number of participants with harmonized cognitive domain scores will enhance the discovery of additional genetic factors of cognitive decline leading to AD and related dementias.
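The METHODS above state that results across datasets were combined using inverse-variance meta-analysis. A minimal sketch of that idea follows, assuming a fixed-effect model (the abstract does not say whether fixed or random effects were used). Each cohort contributes an effect estimate weighted by the inverse of its squared standard error. The function name and the example effect sizes are hypothetical.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis of per-cohort estimates.

    betas : per-cohort effect estimates for one SNP (hypothetical values)
    ses   : matching standard errors, all strictly positive
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Two hypothetical cohorts with equal precision.
beta, se, z, p = inverse_variance_meta([0.1, 0.3], [0.1, 0.1])
print(beta, se, z, p)
```

With equal standard errors the pooled estimate is the simple average of the two effects, and the pooled standard error shrinks by a factor of sqrt(2) relative to either cohort alone.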
Affiliation(s)
- Moonil Kang
  - Department of Medicine (Biomedical Genetics), Boston University Chobanian & Avedisian School of Medicine, 72 East Concord Street E200, Boston, MA 02118 USA
- Ting Fang Alvin Ang
  - Department of Anatomy and Neurobiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Slone Epidemiology Center, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
- Sherral A. Devine
  - Department of Anatomy and Neurobiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
- Richard Sherva
  - Department of Medicine (Biomedical Genetics), Boston University Chobanian & Avedisian School of Medicine, 72 East Concord Street E200, Boston, MA 02118 USA
- Shubhabrata Mukherjee
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Emily H. Trittschuh
  - Geriatric Research, Education, and Clinical Center, Veterans Affairs Puget Sound Health Care System, Seattle, WA USA
  - Department of Psychiatry and Behavioral Sciences, University of Washington School of Medicine, Seattle, WA USA
- Laura E. Gibbons
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Phoebe Scollard
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Michael Lee
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Seo-Eun Choi
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Brandon Klinedinst
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Connie Nakano
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Logan C. Dumitrescu
  - Vanderbilt Memory & Alzheimer’s Center, Vanderbilt University Medical Center, Nashville, TN USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN USA
- Alaina Durant
  - Vanderbilt Memory & Alzheimer’s Center, Vanderbilt University Medical Center, Nashville, TN USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN USA
- Timothy J. Hohman
  - Vanderbilt Memory & Alzheimer’s Center, Vanderbilt University Medical Center, Nashville, TN USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN USA
- Michael L. Cuccaro
  - John P. Hussman Institute for Human Genomics, Miller School of Medicine, Miami, FL USA
- Andrew J. Saykin
  - Indiana Alzheimer’s Disease Research Center, Indiana University School of Medicine, Indianapolis, IN USA
  - Department of Radiology and Imaging Services, Indiana University School of Medicine, Indianapolis, IN USA
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN USA
- Walter A. Kukull
  - Department of Epidemiology, University of Washington, Seattle, WA USA
- David A. Bennett
  - Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL USA
- Li-San Wang
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA USA
- Richard P. Mayeux
  - Department of Neurology, Columbia University School of Medicine, New York, NY USA
- Jonathan L. Haines
  - Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH USA
- Gerard D. Schellenberg
  - Department of Pathology and Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA USA
- Paul K. Crane
  - Department of Medicine, University of Washington School of Medicine, Seattle, WA USA
- Rhoda Au
  - Department of Anatomy and Neurobiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Slone Epidemiology Center, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Boston University Alzheimer’s Disease Research Center, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA USA
- Kathryn L. Lunetta
  - Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA USA
- Jesse B. Mez
  - Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Boston University Alzheimer’s Disease Research Center, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
- Lindsay A. Farrer
  - Department of Medicine (Biomedical Genetics), Boston University Chobanian & Avedisian School of Medicine, 72 East Concord Street E200, Boston, MA 02118 USA
  - Framingham Heart Study, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Boston University Alzheimer’s Disease Research Center, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Department of Epidemiology, Boston University School of Public Health, Boston, MA USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA USA
  - Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
  - Department of Ophthalmology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA USA
7
Wang T, Chen X, Zhang J, Feng Q, Huang M. Deep multimodality-disentangled association analysis network for imaging genetics in neurodegenerative diseases. Med Image Anal 2023; 88:102842. [PMID: 37247468 DOI: 10.1016/j.media.2023.102842] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/02/2022] [Revised: 03/01/2023] [Accepted: 05/15/2023] [Indexed: 05/31/2023]
Abstract
Imaging genetics is a crucial tool for exploring potentially disease-related biomarkers, particularly for neurodegenerative diseases (NDs). With the development of imaging technology, association analysis between multimodal imaging data and genetic data is attracting growing attention in imaging genetics studies. However, traditional methods fuse the multimodal data first and then correlate them with genetic data, which leads to an incomplete exploration of their common and complementary information. In addition, inaccurate formulation of the complex relationships between imaging and genetic data and information loss caused by missing multimodal data are still open problems in imaging genetics studies. Therefore, in this study, a deep multimodality-disentangled association analysis network (DMAAN) is proposed to solve the aforementioned issues and detect the disease-related biomarkers of NDs simultaneously. First, the imaging data are nonlinearly projected into a latent space to obtain imaging representations. The imaging representations are further disentangled into common and specific parts by using a multimodal-disentangled module. Second, the genetic data are encoded to obtain genetic representations, which are then nonlinearly mapped to the common and specific imaging representations to build nonlinear associations between imaging and genetic data through an association analysis module. Moreover, modality mask vectors are synchronously synthesized to integrate the genetic and imaging data, which aids the subsequent disease diagnosis. Finally, the proposed method achieves reasonable diagnosis performance via a disease diagnosis module and utilizes the label information to detect the disease-related modality-shared and modality-specific biomarkers. Furthermore, the genetic representation can be used to impute the missing multimodal data with our learning strategy.
Two publicly available datasets with different NDs are used to demonstrate the effectiveness of the proposed DMAAN. The experimental results show that the proposed DMAAN can identify the disease-related biomarkers, which suggests that it may provide new insights into the pathological mechanism and early diagnosis of NDs. The code is publicly available at https://github.com/Meiyan88/DMAAN.
Affiliation(s)
- Tao Wang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
| | - Xiumei Chen
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
| | - Jiawei Zhang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
| | - Qianjin Feng
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
| | - Meiyan Huang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
8
Bourquard T, Lee K, Al-Ramahi I, Pham M, Shapiro D, Lagisetty Y, Soleimani S, Mota S, Wilhelm K, Samieinasab M, Kim YW, Huh E, Asmussen J, Katsonis P, Botas J, Lichtarge O. Functional variants identify sex-specific genes and pathways in Alzheimer's Disease. Nat Commun 2023; 14:2765. [PMID: 37179358 PMCID: PMC10183026 DOI: 10.1038/s41467-023-38374-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/27/2022] [Accepted: 04/28/2023] [Indexed: 05/15/2023]
Abstract
The incidence of Alzheimer's Disease in females is almost double that in males. To search for sex-specific gene associations, we build a machine learning approach focused on functionally impactful coding variants. This method can detect differences between sequenced cases and controls in small cohorts. In the Alzheimer's Disease Sequencing Project with mixed sexes, this approach identified genes enriched for immune response pathways. After sex-separation, genes become specifically enriched for stress-response pathways in males and cell-cycle pathways in females. These genes improve disease risk prediction in silico and modulate Drosophila neurodegeneration in vivo. Thus, a general approach for machine learning on functionally impactful variants can uncover sex-specific candidates towards diagnostic biomarkers and therapeutic targets.
Affiliation(s)
- Thomas Bourquard
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Kwanghyuk Lee
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Ismael Al-Ramahi
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, 77030, USA
- Center for Alzheimer's and Neurodegenerative Diseases, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Minh Pham
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Dillon Shapiro
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Yashwanth Lagisetty
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Department of Biology and Pharmacology, UTHealth McGovern Medical School, Houston, TX, 77030, USA
| | - Shirin Soleimani
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Samantha Mota
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Kevin Wilhelm
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Maryam Samieinasab
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Young Won Kim
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Eunna Huh
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Jennifer Asmussen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Juan Botas
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX, 77030, USA
- Center for Alzheimer's and Neurodegenerative Diseases, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.
- Center for Alzheimer's and Neurodegenerative Diseases, Baylor College of Medicine, Houston, TX, 77030, USA.
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX, 77030, USA.
9
Xue H, Shen X, Pan W. Causal Inference in Transcriptome-Wide Association Studies with Invalid Instruments and GWAS Summary Data. J Am Stat Assoc 2023; 118:1525-1537. [PMID: 37808547 PMCID: PMC10557939 DOI: 10.1080/01621459.2023.2183127] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/22/2021] [Accepted: 02/14/2023] [Indexed: 02/24/2023]
Abstract
Transcriptome-wide association studies (TWAS) have recently emerged as a popular tool to discover (putative) causal genes by integrating an outcome GWAS dataset with another gene expression/transcriptome GWAS (called eQTL) dataset. In our motivating and target application, we'd like to identify causal genes for low-density lipoprotein cholesterol (LDL), which is crucial for developing new treatments for hyperlipidemia and cardiovascular diseases. The statistical principle underlying TWAS is (two-sample) two-stage least squares (2SLS) using multiple correlated SNPs as instrumental variables (IVs); it is closely related to typical (two-sample) Mendelian randomization (MR) using independent SNPs as IVs, which is expected to be impractical and lower-powered for TWAS (and some other) applications. However, often some of the SNPs used may not be valid IVs, e.g. due to the widespread pleiotropy of their direct effects on the outcome not mediated through the gene of interest, leading to false conclusions by TWAS (or MR). Building on recent advances in sparse regression, we propose a robust and efficient inferential method to account for both hidden confounding and some invalid IVs via two-stage constrained maximum likelihood (2ScML), an extension of 2SLS. We first develop the proposed method with individual-level data, then extend it both theoretically and computationally to GWAS summary data for the most popular two-sample TWAS design, to which almost all existing robust IV regression methods are however not applicable. We show that the proposed method achieves asymptotically valid statistical inference on causal effects, demonstrating its wider applicability and superior finite-sample performance over the standard 2SLS/TWAS (and MR). We apply the methods to identify putative causal genes for LDL by integrating large-scale lipid GWAS summary data with eQTL data.
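The two-sample 2SLS principle named in this abstract can be illustrated with a minimal sketch. It is not the authors' 2ScML method: it assumes individual-level data, independent SNPs, and instruments that are all valid, and every name and simulation setting below (`simulate`, `two_sample_2sls`, `run_demo`, the SNP weights, the sample size) is a hypothetical choice made for illustration.

```python
import numpy as np


def simulate(n, snp_weights, effect, rng):
    """Simulate genotypes, expression, and an outcome that share a hidden confounder."""
    G = rng.binomial(2, 0.3, size=(n, len(snp_weights))).astype(float)
    U = rng.normal(size=n)                      # hidden confounder
    expr = G @ snp_weights + U + rng.normal(size=n)
    y = effect * expr + U + rng.normal(size=n)
    return G, expr, y


def ols_slope(x, y):
    """Slope from regressing y on x with an intercept."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


def two_sample_2sls(G_ref, expr_ref, G_gwas, y_gwas):
    """Stage 1 in a reference panel, stage 2 in a separate outcome sample."""
    # Stage 1: learn SNP-to-expression weights (with intercept) in the reference panel.
    X_ref = np.column_stack([np.ones(len(G_ref)), G_ref])
    w, *_ = np.linalg.lstsq(X_ref, expr_ref, rcond=None)
    # Impute genetically predicted expression in the outcome sample.
    expr_hat = np.column_stack([np.ones(len(G_gwas)), G_gwas]) @ w
    # Stage 2: regress the outcome on the imputed expression.
    return ols_slope(expr_hat, y_gwas)


def run_demo(seed=0, n=20000, effect=0.5):
    rng = np.random.default_rng(seed)
    weights = np.full(10, 0.5)                  # assumed SNP effects on expression
    G_ref, expr_ref, _ = simulate(n, weights, effect, rng)
    G_gwas, expr_gwas, y_gwas = simulate(n, weights, effect, rng)
    tsls = two_sample_2sls(G_ref, expr_ref, G_gwas, y_gwas)
    # Naive regression on measured expression, shown only to expose the confounding;
    # a real two-sample design would not observe expression in the GWAS sample.
    naive = ols_slope(expr_gwas, y_gwas)
    return tsls, naive


if __name__ == "__main__":
    print(run_demo())
```

In this simulation the 2SLS estimate should land near the assumed effect of 0.5, while the naive regression is pulled upward by the confounder. Invalid instruments, the case the abstract targets, would need to be added to the simulation separately.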
Affiliation(s)
- Haoran Xue
- School of Statistics, University of Minnesota, Minneapolis, Minnesota 55455
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55455
| | - Xiaotong Shen
- School of Statistics, University of Minnesota, Minneapolis, Minnesota 55455
| | - Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota 55455
10
Sensi SL, Russo M, Tiraboschi P. Biomarkers of diagnosis, prognosis, pathogenesis, response to therapy: Convergence or divergence? Lessons from Alzheimer's disease and synucleinopathies. HANDBOOK OF CLINICAL NEUROLOGY 2023; 192:187-218. [PMID: 36796942 DOI: 10.1016/b978-0-323-85538-9.00015-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
Alzheimer's disease (AD) is the most common disorder associated with cognitive impairment. Recent observations emphasize the pathogenic role of multiple factors inside and outside the central nervous system, supporting the notion that AD is a syndrome of many etiologies rather than a "heterogeneous" but ultimately unifying disease entity. Moreover, the defining pathology of amyloid and tau coexists with many others, such as α-synuclein and TDP-43, as a rule rather than an exception. Thus, an effort to shift our AD paradigm as an amyloidopathy must be reconsidered. Along with amyloid accumulation in its insoluble state, β-amyloid is becoming depleted in its soluble, normal states, as a result of biological, toxic, and infectious triggers, requiring a shift from convergence to divergence in our approach to neurodegeneration. These aspects are reflected in vivo by biomarkers, which have become increasingly strategic in dementia. Similarly, synucleinopathies are primarily characterized by abnormal deposition of misfolded α-synuclein in neurons and glial cells and, in the process, depleting the levels of the normal, soluble α-synuclein that the brain needs for many physiological functions. The soluble to insoluble conversion also affects other normal brain proteins, such as TDP-43 and tau, accumulating in their insoluble states in both AD and dementia with Lewy bodies (DLB). The two diseases have been distinguished by the differential burden and distribution of insoluble proteins, with neocortical phosphorylated tau deposition more typical of AD and neocortical α-synuclein deposition peculiar to DLB. We propose a reappraisal of the diagnostic approach to cognitive impairment from convergence (based on clinicopathologic criteria) to divergence (based on what differs across individuals affected) as a necessary step for the launch of precision medicine.
Affiliation(s)
- Stefano L Sensi
- Department of Neuroscience, Imaging, and Clinical Sciences, "G. d'Annunzio" University of Chieti-Pescara, Chieti, Italy; Molecular Neurology Unit, Center for Advanced Studies and Technology-CAST and ITAB Institute for Advanced Biotechnology, "G. d'Annunzio" University of Chieti-Pescara, Chieti, Italy.
| | - Mirella Russo
- Department of Neuroscience, Imaging, and Clinical Sciences, "G. d'Annunzio" University of Chieti-Pescara, Chieti, Italy; Molecular Neurology Unit, Center for Advanced Studies and Technology-CAST and ITAB Institute for Advanced Biotechnology, "G. d'Annunzio" University of Chieti-Pescara, Chieti, Italy
| | - Pietro Tiraboschi
- Division of Neurology V-Neuropathology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
11
Szabo CA, Salinas FS. Neuroimaging in the Epileptic Baboon. Front Vet Sci 2022; 9:908801. [PMID: 35909685 PMCID: PMC9330034 DOI: 10.3389/fvets.2022.908801] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/31/2022] [Accepted: 06/23/2022] [Indexed: 11/13/2022]
Abstract
Characterization of the baboon model of genetic generalized epilepsy (GGE) is driven both electroclinically and by successful adoption of neuroimaging platforms, such as magnetic resonance imaging (MRI) and positron emission tomography (PET). Based upon its phylogenetic proximity and similar brain anatomy to humans, the epileptic baboon provides an excellent translational model. Its relatively large brain size compared to smaller nonhuman primates or rodents, a gyrencephalic structure compared to the lissencephalic organization of rodent brains, and the availability of a large pedigreed colony allow exploration of neuroimaging markers of diseases. Similar to human idiopathic generalized epilepsy (IGE), structural imaging in the baboon is usually normal in individual subjects, but gray matter volume/concentration (GMV/GMC) changes are reported by statistical parametric mapping (SPM) analyses. Functional neuroimaging has been effective for mapping the photoepileptic responses, the epileptic network, altered functional connectivity of physiological networks, and the effects of anti-seizure therapies. This review will provide insights into our current understanding of the baboon model of GGE through functional and structural imaging.
Affiliation(s)
- C. Akos Szabo
- Department of Neurology, University of Texas Health San Antonio, San Antonio, TX, United States
- *Correspondence: C. Akos Szabo
| | - Felipe S. Salinas
- Research Imaging Institute, University of Texas Health San Antonio, San Antonio, TX, United States
- Department of Radiology, University of Texas Health San Antonio, San Antonio, TX, United States
12
Wang G, Wu W, Xu Y, Yang Z, Xiao B, Long L. Imaging Genetics in Epilepsy: Current Knowledge and New Perspectives. Front Mol Neurosci 2022; 15:891621. [PMID: 35706428 PMCID: PMC9189397 DOI: 10.3389/fnmol.2022.891621] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/08/2022] [Accepted: 05/06/2022] [Indexed: 12/11/2022]
Abstract
Epilepsy is a neurological network disease with genetics playing a much greater role than was previously appreciated. Unfortunately, the relationship between genetic basis and imaging phenotype is by no means simple. Imaging genetics integrates multidimensional datasets within a unified framework, providing a unique opportunity to pursue a global vision for epilepsy. This review delineates the current knowledge of underlying genetic mechanisms for brain networks in different epilepsy syndromes, particularly from a neural developmental perspective. Further, endophenotypes and their potential value are discussed. Finally, we highlight current challenges and provide perspectives for the future development of imaging genetics in epilepsy.
Affiliation(s)
- Ge Wang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Clinical Research Center for Epileptic Disease of Hunan Province, Central South University, Changsha, China
| | - Wenyue Wu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Department of Neurology, The Second Affiliated Hospital of Nanchang University, Jiangxi, China
| | - Yuchen Xu
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Department of Neurology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, China
| | - Zhuanyi Yang
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Clinical Research Center for Epileptic Disease of Hunan Province, Central South University, Changsha, China
- Department of Neurosurgery, Xiangya Hospital, Central South University, Changsha, China
| | - Bo Xiao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Clinical Research Center for Epileptic Disease of Hunan Province, Central South University, Changsha, China
| | - Lili Long
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, China
- Clinical Research Center for Epileptic Disease of Hunan Province, Central South University, Changsha, China
- *Correspondence: Lili Long
13
Silva TC, Young JI, Martin ER, Chen XS, Wang L. MethReg: estimating the regulatory potential of DNA methylation in gene transcription. Nucleic Acids Res 2022; 50:e51. [PMID: 35100398 PMCID: PMC9122535 DOI: 10.1093/nar/gkac030] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/03/2021] [Revised: 12/17/2021] [Accepted: 01/11/2022] [Indexed: 01/02/2023]
Abstract
Epigenome-wide association studies often detect many differentially methylated sites, and many are located in distal regulatory regions. To further prioritize these significant sites, there is a critical need to better understand the functional impact of CpG methylation. Recent studies demonstrated that CpG methylation-dependent transcriptional regulation is a widespread phenomenon. Here, we present MethReg, an R/Bioconductor package that analyzes matched DNA methylation and gene expression data, along with external transcription factor (TF) binding information, to evaluate, prioritize and annotate CpG sites with high regulatory potential. At these CpG sites, TF-target gene associations are often only present in a subset of samples with high (or low) methylation levels, so they can be missed by analyses that use all samples. Using colorectal cancer and Alzheimer's disease datasets, we show MethReg significantly enhances our understanding of the regulatory roles of DNA methylation in complex diseases.
Affiliation(s)
- Tiago C Silva
- Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Juan I Young
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Eden R Martin
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - X Steven Chen
- Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
| | - Lily Wang
- Department of Public Health Sciences, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL 33136, USA
14
Bae YE, Wu L, Wu C. InTACT: An adaptive and powerful framework for joint-tissue transcriptome-wide association studies. Genet Epidemiol 2021; 45:848-859. [PMID: 34255882 PMCID: PMC8604767 DOI: 10.1002/gepi.22425] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/14/2021] [Revised: 06/22/2021] [Accepted: 06/24/2021] [Indexed: 11/05/2022]
Abstract
Transcriptome-wide association studies (TWAS) that integrate transcriptomic reference data and genome-wide association studies (GWAS) have successfully enhanced the discovery of candidate genes for many complex traits. However, existing methods may suffer from substantial power loss because they fail to effectively consider that expression of many genes tends to be consistent across tissues. Here we propose a computationally efficient testing method, referred to as Integrative Test for Associations via Cauchy Transformation (InTACT), that effectively combines information across multiple tissues and thus improves the power of identifying associated genes. Through simulation studies, we show that InTACT maintains high power while properly controlling Type 1 error rates. We applied InTACT to the largest GWAS of Alzheimer's disease (AD) to date and identified 227 genome-wide significant genes, of which 130 were not identified by benchmark methods, TWAS and MultiXcan. Importantly, InTACT identified five novel loci for AD. We implemented InTACT in publicly available software, "InTACT."
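As a rough illustration of what combining evidence "via Cauchy transformation" can look like, the sketch below implements the generic Cauchy combination of p-values: each p-value is mapped to a standard Cauchy variate, the variates are averaged with weights, and the average is mapped back to a p-value. The abstract does not specify which component tests InTACT combines or how it weights them, so the function name `cauchy_combine`, the equal default weights, and the example p-values are assumptions, not the published software.

```python
import math


def cauchy_combine(pvalues, weights=None):
    """Combine p-values through a weighted average of Cauchy-transformed values."""
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(pvalues)          # equal weights by default (assumption)
    if len(weights) != len(pvalues):
        raise ValueError("weights and p-values must have the same length")
    total = sum(weights)
    # tan((0.5 - p) * pi) maps a uniform p-value to a standard Cauchy variate.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, pvalues)) / total
    # Upper tail probability of the standard Cauchy distribution at `stat`.
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # Hypothetical per-tissue p-values for one gene.
    print(cauchy_combine([0.001, 0.8, 0.9]))
```

A design point worth noting is that the combined p-value is dominated by the smallest inputs, so one strongly associated tissue is not diluted by many null tissues, and the formula needs no estimate of the correlation between tests.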
Affiliation(s)
- Ye Eun Bae
- Department of Statistics, Florida State University
| | - Lang Wu
- Cancer Epidemiology Division, Population Sciences in the Pacific Program, University of Hawaii Cancer Center, University of Hawaii at Manoa
| | - Chong Wu
- Department of Statistics, Florida State University
15
Cao C, Kossinna P, Kwok D, Li Q, He J, Su L, Guo X, Zhang Q, Long Q. Disentangling genetic feature selection and aggregation in transcriptome-wide association studies. Genetics 2021; 220:6444993. [PMID: 34849857 DOI: 10.1093/genetics/iyab216] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/01/2021] [Accepted: 11/04/2021] [Indexed: 12/14/2022]
Abstract
The success of transcriptome-wide association studies (TWAS) has led to substantial research towards improving the predictive accuracy of its core component of Genetically Regulated eXpression (GReX). GReX links expression information with genotype and phenotype by playing two roles simultaneously: it acts as both the outcome of the genotype-based predictive models (for predicting expressions) and the linear combination of genotypes (as the predicted expressions) for association tests. From the perspective of machine learning (considering SNPs as features), these are actually two separable steps, feature selection and feature aggregation, which can be conducted independently. In this work, we show that the single approach of GReX limits the adaptability of TWAS methodology and practice. By conducting simulations and real data analysis, we demonstrate that disentangled protocols adopting straightforward approaches for feature selection (e.g., simple marker test) and aggregation (e.g., kernel machines) outperform the standard TWAS protocols that rely on GReX. Our development provides more powerful novel tools for conducting TWAS. More importantly, our characterization of the exact nature of TWAS suggests that, instead of questionably binding two distinct steps into the same statistical form (GReX), methodological research focusing on optimal combinations of feature selection and aggregation approaches will bring higher power to TWAS protocols.
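The separation of feature selection from feature aggregation described in this abstract can be sketched as two independent functions. This is a simplified illustration under stated assumptions, not the authors' protocol: selection uses marginal SNP-expression correlation as a stand-in for a "simple marker test", aggregation uses a linear-kernel score statistic with a permutation null as a stand-in for "kernel machines", and all names and simulation settings (`select_snps`, `kernel_score_pvalue`, `run_demo`, five causal SNPs) are hypothetical.

```python
import numpy as np


def select_snps(G, expr, top_k):
    """Feature selection: rank SNPs by absolute marginal correlation with expression."""
    Gc = G - G.mean(axis=0)
    ec = expr - expr.mean()
    corr = (Gc.T @ ec) / (np.linalg.norm(Gc, axis=0) * np.linalg.norm(ec))
    return np.argsort(-np.abs(corr))[:top_k]


def kernel_score_pvalue(G_sel, y, n_perm, rng):
    """Feature aggregation: linear-kernel score Q = r' K r, K = G G', permutation null."""
    Gc = G_sel - G_sel.mean(axis=0)
    r = y - y.mean()

    def q(res):
        s = Gc.T @ res                          # per-SNP score contributions
        return float(s @ s)                     # equals res' (Gc Gc') res

    observed = q(r)
    null = np.array([q(rng.permutation(r)) for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (n_perm + 1)


def run_demo(seed=0, n_ref=500, n_gwas=1000, n_snps=50, n_perm=499):
    rng = np.random.default_rng(seed)
    w = np.zeros(n_snps)
    w[:5] = 0.5                                 # assumed: first five SNPs drive expression
    # Reference panel with genotypes and expression: used only for selection.
    G_ref = rng.binomial(2, 0.3, size=(n_ref, n_snps)).astype(float)
    expr_ref = G_ref @ w + rng.normal(size=n_ref)
    # GWAS sample with genotypes and phenotype: used only for aggregation.
    G_gwas = rng.binomial(2, 0.3, size=(n_gwas, n_snps)).astype(float)
    expr_gwas = G_gwas @ w + rng.normal(size=n_gwas)
    y = 0.5 * expr_gwas + rng.normal(size=n_gwas)
    selected = select_snps(G_ref, expr_ref, top_k=5)
    pvalue = kernel_score_pvalue(G_gwas[:, selected], y, n_perm, rng)
    return selected, pvalue


if __name__ == "__main__":
    print(run_demo())
```

Because the two steps share nothing but the list of selected SNP indices, either function can be swapped for another selector or aggregator without touching the other, which is the flexibility the abstract argues for.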
Affiliation(s)
- Chen Cao
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Pathum Kossinna
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Devin Kwok
- Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Qing Li
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Jingni He
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada
| | - Liya Su
- Department of Pathology, Anatomy and Cell Biology, Thomas Jefferson University, Philadelphia, PA 19107, USA
| | - Xingyi Guo
- Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN 37203, USA
| | - Qingrun Zhang
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Quan Long
- Department of Biochemistry & Molecular Biology, Alberta Children's Hospital Research Institute, University of Calgary, Calgary, AB T2N 4N1, Canada; Department of Mathematics & Statistics, University of Calgary, Calgary, AB T2N 1N4, Canada; Department of Medical Genetics, University of Calgary, Calgary, AB T2N 4N1, Canada; Hotchkiss Brain Institute, O'Brien Institute for Public Health, University of Calgary, Calgary, AB T2N 4N1, Canada
16
Pursuit of precision medicine: Systems biology approaches in Alzheimer's disease mouse models. Neurobiol Dis 2021; 161:105558. [PMID: 34767943 PMCID: PMC10112395 DOI: 10.1016/j.nbd.2021.105558] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/31/2021] [Revised: 11/05/2021] [Accepted: 11/08/2021] [Indexed: 12/12/2022]
Abstract
Alzheimer's disease (AD) is a complex disease that is mediated by numerous factors and manifests in various forms. A systems biology approach to studying AD involves analyses of various body systems, biological scales, environmental elements, and clinical outcomes to understand the genotype to phenotype relationship that potentially drives AD development. Currently, there are many research investigations probing how modifiable and nonmodifiable factors impact AD symptom presentation. This review specifically focuses on how imaging modalities can be integrated into systems biology approaches using model mouse populations to link brain level functional and structural changes to disease onset and progression. Combining imaging and omics data promotes the classification of AD into subtypes and paves the way for precision medicine solutions to prevent and treat AD.
17
Li X, Lin Y, Meng X, Qiu Y, Hu B. An L0 Regularization Method for Imaging Genetics and Whole Genome Association Analysis on Alzheimer's Disease. IEEE J Biomed Health Inform 2021; 25:3677-3684. [PMID: 34181562 DOI: 10.1109/jbhi.2021.3093027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
Although neuroimaging measures build a bridge between genetic variants and disease phenotypes, assessment of how single nucleotide variants change brain structure and clinically influence the progression of Alzheimer's disease remains largely preliminary. Each variant has only a very weak correlation signal with neuroimaging measures or Alzheimer's disease phenotypes. Therefore, traditional sparse regression-based imaging genetics approaches are confronted with unresolvable features, relatively high regression error, or inapplicability to high-dimensional data. Adopting an L0 regularization method, we significantly improve the regression accuracy of imaging genetics compared with the group-sparse multitask regression method. From further analysis of the simulation results, we conclude that multiple-regression-task models may be unsuitable for imaging genetics. In addition, we carried out a whole genome association analysis between genetic variants (about 388 million loci) and phenotypes (cognitively normal, mild cognitive impairment, and Alzheimer's disease) using the L0 regularization method. After annotating the effects of all variants with the Ensembl Variant Effect Predictor (VEP), our method locates 33 missense variants that can explain 40% of the phenotype variance. Then, we mapped each missense variant to the nearest gene and carried out pathway enrichment analysis. The Notch signaling pathway and Apoptosis pathway have been reported to be related to the formation of Alzheimer's disease.
18
Wu C, Bradley J, Li Y, Wu L, Deng HW. A gene-level methylome-wide association analysis identifies novel Alzheimer's disease genes. Bioinformatics 2021; 37:1933-1940. [PMID: 33523132 PMCID: PMC8337007 DOI: 10.1093/bioinformatics/btab045] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/13/2020] [Revised: 12/31/2020] [Accepted: 01/20/2021] [Indexed: 12/12/2022]
Abstract
MOTIVATION: Transcriptome-wide association studies (TWAS) have successfully facilitated the discovery of novel genetic risk loci for many complex traits, including late-onset Alzheimer's disease (AD). However, most existing TWAS methods rely only on gene expression and ignore epigenetic modification (i.e., DNA methylation) and functional regulatory information (i.e., enhancer-promoter interactions), both of which contribute significantly to the genetic basis of AD. RESULTS: We develop a novel gene-level association testing method that integrates genetically regulated DNA methylation and enhancer-target gene pairs with genome-wide association study (GWAS) summary results. Through simulations, we show that our approach, referred to as the CMO (cross methylome omnibus) test, yielded well-controlled type I error rates and achieved much higher statistical power than competing methods under a wide range of scenarios. Furthermore, compared with TWAS, CMO identified an average of 124% more associations when analyzing several brain imaging-related GWAS results. By analyzing the largest AD GWAS to date, comprising 71,880 cases and 383,378 controls, CMO identified six novel loci for AD that were missed by competing methods. AVAILABILITY: Software: https://github.com/ChongWuLab/CMO. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Chong Wu
- Department of Statistics, Florida State University
| | - Yanming Li
- Department of Biostatistics & Data Science, University of Kansas Medical Center
| | - Lang Wu
- Population Sciences in the Pacific Program, University of Hawaii Cancer center
| | - Hong-Wen Deng
- Tulane Center for Biomedical Informatics and Genomics, Deming Department of Medicine, Tulane University School of Medicine
19
Nayor M, Shen L, Hunninghake GM, Kochunov P, Barr RG, Bluemke DA, Broeckel U, Caravan P, Cheng S, de Vries PS, Hoffmann U, Kolossváry M, Li H, Luo J, McNally EM, Thanassoulis G, Arnett DK, Vasan RS. Progress and Research Priorities in Imaging Genomics for Heart and Lung Disease: Summary of an NHLBI Workshop. Circ Cardiovasc Imaging 2021; 14:e012943. [PMID: 34387095 PMCID: PMC8486340 DOI: 10.1161/circimaging.121.012943] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022]
Abstract
Imaging genomics is a rapidly evolving field that combines state-of-the-art bioimaging with genomic information to resolve phenotypic heterogeneity associated with genomic variation, improve risk prediction, discover prevention approaches, and enable precision diagnosis and treatment. Contemporary bioimaging methods provide exceptional resolution generating discrete and quantitative high-dimensional phenotypes for genomics investigation. Despite substantial progress in combining high-dimensional bioimaging and genomic data, methods for imaging genomics are evolving. Recognizing the potential impact of imaging genomics on the study of heart and lung disease, the National Heart, Lung, and Blood Institute convened a workshop to review cutting-edge approaches and methodologies in imaging genomics studies, and to establish research priorities for future investigation. This report summarizes the presentations and discussions at the workshop. In particular, we highlight the need for increased availability of imaging genomics data in diverse populations, dedicated focus on less common conditions, and centralization of efforts around specific disease areas.
Affiliation(s)
- Matthew Nayor
- Cardiology Division, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA
| | - Li Shen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
| | - Gary M. Hunninghake
- Division of Pulmonary and Critical Care Medicine, Harvard Medical School, Brigham and Women’s Hospital, Boston, MA
| | - Peter Kochunov
- Maryland Psychiatric Research Center, Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD
| | - R. Graham Barr
- Department of Medicine and Department of Epidemiology, Mailman School of Public Health, Columbia University Irving Medical Center, New York, NY
| | - David A. Bluemke
- Department of Radiology, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI
| | - Ulrich Broeckel
- Section of Genomic Pediatrics, Department of Pediatrics, Medicine and Physiology, Children’s Research Institute and Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, WI
| | - Peter Caravan
- Institute for Innovation in Imaging, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA
| | - Susan Cheng
- Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA
| | - Paul S. de Vries
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
| | - Udo Hoffmann
- Department of Radiology, Harvard Medical School, Massachusetts General Hospital, Boston, Massachusetts
| | - Márton Kolossváry
- Department of Radiology, Harvard Medical School, Massachusetts General Hospital, Boston, Massachusetts
| | - Huiqing Li
- Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, MD
| | - James Luo
- Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, MD
| | - Elizabeth M. McNally
- Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL
| | - George Thanassoulis
- Preventive and Genomic Cardiology, McGill University Health Center and Research Institute, Montreal, Quebec, Canada
| | - Donna K. Arnett
- College of Public Health, University of Kentucky, Lexington, KY
| | - Ramachandran S. Vasan
- Sections of Preventive Medicine and Epidemiology, and Cardiology, Department of Medicine, Department of Epidemiology, Boston University Schools of Medicine and Public Health, and Center for Computing and Data Sciences, Boston University, Boston, MA
|
20
|
Hampel H, Nisticò R, Seyfried NT, Levey AI, Modeste E, Lemercier P, Baldacci F, Toschi N, Garaci F, Perry G, Emanuele E, Valenzuela PL, Lucia A, Urbani A, Sancesario GM, Mapstone M, Corbo M, Vergallo A, Lista S. Omics sciences for systems biology in Alzheimer's disease: State-of-the-art of the evidence. Ageing Res Rev 2021; 69:101346. [PMID: 33915266 DOI: 10.1016/j.arr.2021.101346] [Citation(s) in RCA: 70] [Impact Index Per Article: 17.5] [Received: 09/09/2020] [Revised: 04/06/2021] [Accepted: 04/22/2021] [Indexed: 12/12/2022]
Abstract
Alzheimer's disease (AD) is characterized by non-linear, genetic-driven pathophysiological dynamics with high heterogeneity in biological alterations and disease spatial-temporal progression. Human in-vivo and post-mortem studies point out a failure of multi-level biological networks underlying AD pathophysiology, including proteostasis (amyloid-β and tau), synaptic homeostasis, inflammatory and immune responses, lipid and energy metabolism, oxidative stress. Therefore, a holistic, systems-level approach is needed to fully capture AD multi-faceted pathophysiology. Omics sciences - genomics, epigenomics, transcriptomics, proteomics, metabolomics, lipidomics - embedded in the systems biology (SB) theoretical and computational framework can generate explainable readouts describing the entire biological continuum of a disease. Such path in Neurology is encouraged by the promising results of omics sciences and SB approaches in Oncology, where stage-driven pathway-based therapies have been developed in line with the precision medicine paradigm. Multi-omics data integrated in SB network approaches will help detect and chart AD upstream pathomechanistic alterations and downstream molecular effects occurring in preclinical stages. Finally, integrating omics and neuroimaging data - i.e., neuroimaging-omics - will identify multi-dimensional biological signatures essential to track the clinical-biological trajectories, at the subpopulation or even individual level.
|
21
|
Huang M, Lai H, Yu Y, Chen X, Wang T, Feng Q. Deep-gated recurrent unit and diet network-based genome-wide association analysis for detecting the biomarkers of Alzheimer's disease. Med Image Anal 2021; 73:102189. [PMID: 34343841 DOI: 10.1016/j.media.2021.102189] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/28/2021] [Revised: 05/30/2021] [Accepted: 07/16/2021] [Indexed: 01/01/2023]
Abstract
Genome-wide association analysis (GWAS) is a commonly used method to detect the potential biomarkers of Alzheimer's disease (AD). Most existing GWAS methods entail a high computational cost, disregard correlations among imaging data and correlations among genetic data, and ignore various associations between longitudinal imaging and genetic data. A novel GWAS method was proposed to identify potential AD biomarkers and address these problems. A network based on a gated recurrent unit was applied without imputing incomplete longitudinal imaging data to integrate the longitudinal data of variable lengths and extract an image representation. In this study, a modified diet network that can considerably reduce the number of parameters in the genetic network was proposed to perform GWAS between image representation and genetic data. Genetic representation can be extracted in this way. A link between genetic representation and AD was established to detect potential AD biomarkers. The proposed method was tested on a set of simulated data and a real AD dataset. Results of the simulated data showed that the proposed method can accurately detect relevant biomarkers. Moreover, the results of real AD dataset showed that the proposed method can detect some new risk-related genes of AD. Based on previous reports, no research has incorporated a deep-learning model into a GWAS framework to investigate the potential information on super-high-dimensional genetic data and longitudinal imaging data and create a link between imaging genetics and AD for detecting potential AD biomarkers. Therefore, the proposed method may provide new insights into the underlying pathological mechanism of AD.
Affiliation(s)
- Meiyan Huang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
| | - Haoran Lai
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Yuwei Yu
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Xiumei Chen
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Tao Wang
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China.
| | - Qianjin Feng
- School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou 510515, China.
|
22
|
Cao C, Kwok D, Edie S, Li Q, Ding B, Kossinna P, Campbell S, Wu J, Greenberg M, Long Q. kTWAS: integrating kernel machine with transcriptome-wide association studies improves statistical power and reveals novel genes. Brief Bioinform 2021; 22:5985285. [PMID: 33200776 DOI: 10.1093/bib/bbaa270] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/01/2020] [Revised: 09/17/2020] [Accepted: 09/18/2020] [Indexed: 12/31/2022]
Abstract
The power of genotype-phenotype association mapping studies increases greatly when contributions from multiple variants in a focal region are meaningfully aggregated. Currently, there are two popular categories of variant aggregation methods. Transcriptome-wide association studies (TWAS) represent a set of emerging methods that select variants based on their effect on gene expressions, providing pretrained linear combinations of variants for downstream association mapping. In contrast, kernel methods such as the sequence kernel association test (SKAT) model genotypic and phenotypic variance using various kernel functions that capture genetic similarity between subjects, allowing nonlinear effects to be included. From the perspective of machine learning, these two methods cover two complementary aspects of feature engineering: feature selection/pruning and feature aggregation. Thus far, no thorough comparison has been made between these categories, and no methods exist that incorporate the advantages of both TWAS- and kernel-based methods. In this work, we developed a novel method called kernel-based TWAS (kTWAS) that applies TWAS-like feature selection to a SKAT-like kernel association test, combining the strengths of both approaches. Through extensive simulations, we demonstrate that kTWAS has higher power than TWAS and multiple SKAT-based protocols, and we identify novel disease-associated genes in Wellcome Trust Case Control Consortium genotyping array data and MSSNG (Autism) sequence data. The source code for kTWAS and our simulations is available in our GitHub repository (https://github.com/theLongLab/kTWAS).
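As a rough illustration of the idea in this abstract, the sketch below computes a SKAT-style score statistic with a weighted linear kernel, where the per-variant weights stand in for effects learned by a TWAS-like expression model, and obtains a p-value by permutation. The function names, the intercept-only null model, and the permutation p-value are assumptions made for illustration only; the published kTWAS code is in the repository named in the abstract and may differ in every detail.

```python
import random


def weighted_kernel_stat(genotypes, weights, phenotype):
    """SKAT-style statistic Q = r^T K r with linear kernel K = G W W G^T.

    genotypes: list of per-subject allele-count lists (n subjects x m variants)
    weights:   per-variant weights (hypothetical expression-model effects)
    phenotype: per-subject trait values
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    residuals = [y - mean_y for y in phenotype]  # null model: intercept only
    # With W diagonal, Q reduces to sum_j (w_j * sum_i g_ij * r_i)^2.
    q = 0.0
    for j, w in enumerate(weights):
        score_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += (w * score_j) ** 2
    return q


def permutation_pvalue(genotypes, weights, phenotype, n_perm=999, seed=0):
    """Empirical p-value: share of phenotype shuffles with Q >= observed Q."""
    rng = random.Random(seed)
    observed = weighted_kernel_stat(genotypes, weights, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if weighted_kernel_stat(genotypes, weights, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Setting a weight to zero removes that variant from the test, which mirrors the feature-selection role the abstract attributes to TWAS; analytic p-values, as typically used with SKAT, are left out here to keep the sketch dependency-free.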
Affiliation(s)
- Chen Cao
- Department of Biochemistry & Molecular Biology, University of Calgary
| | - Devin Kwok
- Department of Mathematics & Statistics, University of Calgary
| | | | - Qing Li
- Department of Biochemistry & Molecular Biology, University of Calgary
| | - Bowei Ding
- Department of Mathematics & Statistics, University of Calgary
| | - Pathum Kossinna
- Department of Biochemistry & Molecular Biology, University of Calgary
| | | | - Jingjing Wu
- Department of Mathematics & Statistics, University of Calgary
| | | | - Quan Long
- Departments of Biochemistry & Molecular Biology, Medical Genetics and Mathematics & Statistics
|
23
|
Knutson KA, Pan W. Integrating brain imaging endophenotypes with GWAS for Alzheimer's disease. QUANTITATIVE BIOLOGY 2021; 9:185-200. [PMID: 35399757 PMCID: PMC8993183 DOI: 10.1007/s40484-020-0202-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/27/2019] [Revised: 02/11/2020] [Accepted: 02/28/2020] [Indexed: 01/09/2023]
Abstract
Background: Genome-wide association studies (GWAS) have identified many genetic variants associated with increased risk of Alzheimer's disease (AD). These susceptibility loci may affect AD indirectly through a combination of physiological brain changes. Many of these neuropathologic features are detectable via magnetic resonance imaging (MRI). Methods: In this study, we examine the effects of such brain imaging derived phenotypes (IDPs) with genetic etiology on AD, using and comparing the following methods: two-sample Mendelian randomization (2SMR), generalized summary statistics based Mendelian randomization (GSMR), transcriptome-wide association studies (TWAS) and the adaptive sum of powered score (aSPU) test. These methods do not require individual-level genotypic and phenotypic data but instead can rely only on an external reference panel and GWAS summary statistics. Results: Using publicly available GWAS datasets from the International Genomics of Alzheimer's Project (IGAP) and UK Biobank's (UKBB) brain imaging initiatives, we identify 35 IDPs possibly associated with AD, many of which have well established or biologically plausible links to the characteristic cognitive impairments of this neurodegenerative disease. Conclusions: Our results highlight the increased power for detecting genetic associations achieved by multiple correlated SNP-based methods, i.e., aSPU, GSMR and TWAS, over MR methods based on independent SNPs (as instrumental variables).
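To make concrete how a TWAS-type test can run on summary data alone, the sketch below uses the summary-statistic form widely used in the TWAS literature: the weighted sum of per-SNP GWAS z-scores, scaled by the variance implied by the reference-panel LD matrix. The function name and the toy inputs are hypothetical; this is a generic illustration and not the authors' pipeline.

```python
import math


def twas_z(weights, gwas_z, ld):
    """TWAS-style z-score from summary data: Z = w^T z / sqrt(w^T R w).

    weights: per-SNP prediction weights for the intermediate trait
             (e.g. an imaging phenotype), hypothetical values here
    gwas_z:  per-SNP z-scores from the disease GWAS
    ld:      SNP-by-SNP correlation (LD) matrix from a reference panel
    """
    m = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(m) for j in range(m)
    )
    if variance <= 0:
        raise ValueError("w^T R w must be positive; check weights and LD matrix")
    return numerator / math.sqrt(variance)
```

With independent SNPs (an identity LD matrix) and a single nonzero weight, the result reduces to that SNP's own GWAS z-score; correlated SNPs shrink the combined statistic through the off-diagonal LD terms.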
Affiliation(s)
| | - Wei Pan
- Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA
|
24
|
He Z, Liu L, Wang C, Le Guen Y, Lee J, Gogarten S, Lu F, Montgomery S, Tang H, Silverman EK, Cho MH, Greicius M, Ionita-Laza I. Identification of putative causal loci in whole-genome sequencing data via knockoff statistics. Nat Commun 2021; 12:3152. [PMID: 34035245 PMCID: PMC8149672 DOI: 10.1038/s41467-021-22889-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/26/2020] [Accepted: 03/26/2021] [Indexed: 02/04/2023]
Abstract
The analysis of whole-genome sequencing studies is challenging due to the large number of rare variants in noncoding regions and the lack of natural units for testing. We propose a statistical method to detect and localize rare and common risk variants in whole-genome sequencing studies based on a recently developed knockoff framework. It can (1) prioritize causal variants over associations due to linkage disequilibrium thereby improving interpretability; (2) help distinguish the signal due to rare variants from shadow effects of significant common variants nearby; (3) integrate multiple knockoffs for improved power, stability, and reproducibility; and (4) flexibly incorporate state-of-the-art and future association tests to achieve the benefits proposed here. In applications to whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) and COPDGene samples from NHLBI Trans-Omics for Precision Medicine (TOPMed) Program we show that our method compared with conventional association tests can lead to substantially more discoveries.
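For readers unfamiliar with the knockoff framework this abstract builds on, the sketch below shows the basic single-knockoff selection step (the "knockoff+" threshold): each feature's importance is contrasted with that of its synthetic knockoff copy, and a data-dependent threshold bounds the estimated false discovery proportion. The function name and inputs are hypothetical, and the paper's method additionally integrates multiple knockoffs, which this sketch does not attempt.

```python
def knockoff_select(original_stats, knockoff_stats, fdr=0.1):
    """Return indices selected by the knockoff+ filter at target level `fdr`.

    original_stats: importance statistic for each real feature
    knockoff_stats: importance statistic for each feature's knockoff copy
    """
    # Contrast statistic: large positive values favour the real feature.
    w = [t - tk for t, tk in zip(original_stats, knockoff_stats)]
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        # Negative contrasts estimate how many positives are false.
        false_estimate = 1 + sum(1 for x in w if x <= -t)
        discoveries = max(1, sum(1 for x in w if x >= t))
        if false_estimate / discoveries <= fdr:
            return [j for j, x in enumerate(w) if x >= t]
    return []  # no threshold meets the target level
```

Because of the "+1" in the numerator, at least 1/fdr features must clear the threshold before anything is selected, so small toy inputs often return an empty list.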
Affiliation(s)
- Zihuai He
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA.
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA, USA.
| | - Linxi Liu
- Department of Statistics, Columbia University, New York, NY, USA
| | - Chen Wang
- Department of Biostatistics, Columbia University, New York, NY, USA
| | - Yann Le Guen
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA
| | - Justin Lee
- Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA, USA
| | | | - Fred Lu
- Department of Statistics, Stanford University, Stanford, CA, USA
| | - Stephen Montgomery
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Pathology, Stanford University, Stanford, CA, USA
| | - Hua Tang
- Department of Statistics, Stanford University, Stanford, CA, USA
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Edwin K Silverman
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Michael H Cho
- Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine Division, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Michael Greicius
- Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA, USA
|
25
|
Li Y, Yu C, Zhao Y, Yao W, Aseltine RH, Chen K. Pursuing sources of heterogeneity in modeling clustered population. Biometrics 2021; 78:716-729. [PMID: 33527347 DOI: 10.1111/biom.13434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2020] [Revised: 10/22/2020] [Accepted: 01/13/2021] [Indexed: 11/28/2022]
Abstract
Researchers often have to deal with heterogeneous populations with mixed regression relationships, increasingly so in the era of data explosion. In such problems, when there are many candidate predictors, it is of interest not only to identify the predictors that are associated with the outcome, but also to distinguish the true sources of heterogeneity, that is, to identify the predictors that have different effects among the clusters and thus are the true contributors to the formation of the clusters. We clarify the concept of a source of heterogeneity in a way that accounts for potential scale differences between the clusters, and propose a regularized finite mixture effects regression to achieve heterogeneity pursuit and feature selection simultaneously. We develop an efficient algorithm and show that our approach can achieve both estimation and selection consistency. Simulation studies further demonstrate the effectiveness of our method under various practical scenarios. Three applications are presented, namely, an imaging genetics study for linking genetic factors and brain neuroimaging traits in Alzheimer's disease, a public health study for exploring the association between suicide risk among adolescents and their school district characteristics, and a sport analytics study for understanding how the salary levels of baseball players are associated with their performance and contractual status.
Affiliation(s)
- Yan Li
- Department of Statistics, University of Connecticut, Storrs, Connecticut
| | - Chun Yu
- School of Statistics, Jiangxi University of Finance and Economics, Nanchang, China
| | - Yize Zhao
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut
| | - Weixin Yao
- Department of Statistics, University of California, Riverside, California
| | - Robert H Aseltine
- Center for Population Health, University of Connecticut Health Center, Farmington, Connecticut
| | - Kun Chen
- Department of Statistics, University of Connecticut, Storrs, Connecticut; Center for Population Health, University of Connecticut Health Center, Farmington, Connecticut
|
26
|
Cao C, Ding B, Li Q, Kwok D, Wu J, Long Q. Power analysis of transcriptome-wide association study: Implications for practical protocol choice. PLoS Genet 2021; 17:e1009405. [PMID: 33635859 PMCID: PMC7946362 DOI: 10.1371/journal.pgen.1009405] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 08/11/2020] [Revised: 03/10/2021] [Accepted: 02/06/2021] [Indexed: 12/12/2022]
Abstract
The transcriptome-wide association study (TWAS) has emerged as one of several promising techniques for integrating multi-scale 'omics' data into traditional genome-wide association studies (GWAS). Unlike GWAS, which associates phenotypic variance directly with genetic variants, TWAS uses a reference dataset to train a predictive model for gene expressions, which allows it to associate phenotype with variants through the mediating effect of expressions. Although effective, this core innovation of TWAS is poorly understood, since the predictive accuracy of the genotype-expression model is generally low and further bounded by expression heritability. This raises the question: to what degree does the accuracy of the expression model affect the power of TWAS? Furthermore, would replacing predictions with actual, experimentally determined expressions improve power? To answer these questions, we compared the power of GWAS, TWAS, and a hypothetical protocol utilizing real expression data. We derived non-centrality parameters (NCPs) for linear mixed models (LMMs) to enable closed-form calculations of statistical power that do not rely on specific protocol implementations. We examined two representative scenarios: causality (genotype contributes to phenotype through expression) and pleiotropy (genotype contributes directly to both phenotype and expression), and also tested the effects of various properties including expression heritability. Our analysis reveals two main outcomes: (1) Under pleiotropy, the use of predicted expressions in TWAS is superior to actual expressions. This explains why TWAS can function with weak expression models, and shows that TWAS remains relevant even when real expressions are available. (2) GWAS outperforms TWAS when expression heritability is below a threshold of 0.04 under causality, or 0.06 under pleiotropy. 
Analysis of existing publications suggests that TWAS has been misapplied in place of GWAS, in situations where expression heritability is low.
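The closed-form power calculation mentioned in this abstract rests on a standard relation: once a test's non-centrality parameter (NCP) is known, power follows directly from the non-central chi-square distribution. The sketch below shows that last step for a 1-degree-of-freedom test, using the fact that such a statistic is the square of a normal variable with mean sqrt(NCP). It does not reproduce the paper's NCP derivations for linear mixed models; the function name and the default significance level are illustrative assumptions.

```python
from statistics import NormalDist


def power_from_ncp(ncp, alpha=5e-8):
    """Power of a two-sided 1-df chi-square test with non-centrality `ncp`.

    If X ~ chi-square(1, ncp), then X = Z^2 with Z ~ Normal(sqrt(ncp), 1),
    so P(X > c) = P(Z > sqrt(c)) + P(Z < -sqrt(c)).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # sqrt of the chi-square cutoff
    shift = ncp ** 0.5
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail
```

With `ncp = 0` the function returns the significance level itself, which is a quick sanity check; comparing two protocols then amounts to comparing their NCPs at the same sample size.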
Affiliation(s)
- Chen Cao
- Department of Biochemistry & Molecular Biology, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
| | - Bowei Ding
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
| | - Qing Li
- Department of Biochemistry & Molecular Biology, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
| | - Devin Kwok
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
| | - Jingjing Wu
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
| | - Quan Long
- Department of Biochemistry & Molecular Biology, Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
- Department of Mathematics & Statistics, University of Calgary, Calgary, Canada
- Department of Medical Genetics, University of Calgary, Calgary, Canada
- Hotchkiss Brain Institute, O’Brien Institute for Public Health, University of Calgary, Calgary, Canada
|
27
|
Panyard DJ, Kim KM, Darst BF, Deming YK, Zhong X, Wu Y, Kang H, Carlsson CM, Johnson SC, Asthana S, Engelman CD, Lu Q. Cerebrospinal fluid metabolomics identifies 19 brain-related phenotype associations. Commun Biol 2021; 4:63. [PMID: 33437055 PMCID: PMC7803963 DOI: 10.1038/s42003-020-01583-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/30/2020] [Accepted: 12/09/2020] [Indexed: 02/07/2023]
Abstract
The study of metabolomics and disease has enabled the discovery of new risk factors, diagnostic markers, and drug targets. For neurological and psychiatric phenotypes, the cerebrospinal fluid (CSF) is of particular importance. However, the CSF metabolome is difficult to study on a large scale due to the relative complexity of the procedure needed to collect the fluid. Here, we present a metabolome-wide association study (MWAS), which uses genetic and metabolomic data to impute metabolites into large samples with genome-wide association summary statistics. We conduct a metabolome-wide, genome-wide association analysis with 338 CSF metabolites, identifying 16 genotype-metabolite associations (metabolite quantitative trait loci, or mQTLs). We then build prediction models for all available CSF metabolites and test for associations with 27 neurological and psychiatric phenotypes, identifying 19 significant CSF metabolite-phenotype associations. Our results demonstrate the feasibility of MWAS to study omic data in scarce sample types.
Grants
- R01 AG037639 NIA NIH HHS
- UL1 TR000427 NCATS NIH HHS
- T15 LM007359 NLM NIH HHS
- T32 LM012413 NLM NIH HHS
- RF1 AG027161 NIA NIH HHS
- T32 AG000213 NIA NIH HHS
- P2C HD047873 NICHD NIH HHS
- UL1 TR002373 NCATS NIH HHS
- P30 AG062715 NIA NIH HHS
- P50 AG033514 NIA NIH HHS
- R01 AG027161 NIA NIH HHS
- R01 AG054047 NIA NIH HHS
- P30 AG017266 NIA NIH HHS
- R21 AG067092 NIA NIH HHS
- U.S. Department of Health & Human Services | NIH | National Institute on Aging (U.S. National Institute on Aging)
- U.S. Department of Health & Human Services | NIH | Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)
- U.S. Department of Health & Human Services | NIH | U.S. National Library of Medicine (NLM)
- NSF | Directorate for Mathematical & Physical Sciences | Division of Mathematical Sciences (DMS)
- U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences (NCATS)
- This research is supported by National Institutes of Health (NIH) grants R01AG27161 (Wisconsin Registry for Alzheimer Prevention: Biomarkers of Preclinical AD), R01AG054047 (Genomic and Metabolomic Data Integration in a Longitudinal Cohort at Risk for Alzheimer’s Disease), R21AG067092 (Identifying Metabolomic Risk Factors in Plasma and Cerebrospinal Fluid for Alzheimer’s Disease), R01AG037639 (White Matter Degeneration: Biomarkers in Preclinical Alzheimer’s Disease), P30AG017266 (Center for Demography of Health and Aging), and P50AG033514 and P30AG062715 (Wisconsin Alzheimer’s Disease Research Center Grant), the Helen Bader Foundation, Northwestern Mutual Foundation, Extendicare Foundation, State of Wisconsin, the Clinical and Translational Science Award (CTSA) program through the NIH National Center for Advancing Translational Sciences (NCATS) grant UL1TR000427, and the University of Wisconsin-Madison Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation. This research was supported in part by the Intramural Research Program of the National Institute on Aging. Computational resources were supported by a core grant to the Center for Demography and Ecology at the University of Wisconsin-Madison (P2CHD047873). Author DJP was supported by an NLM training grant to the Bio-Data Science Training Program (T32LM012413). Author BFD was supported by an NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM 5T15LM007359). Author YKD was supported by a training grant from the National Institute on Aging (T32AG000213). Author HK was supported by National Science Foundation (NSF) grant DMS-1811414 (Theory and Methods for Inferring Causal Effects with Mendelian Randomization).
Affiliation(s)
- Daniel J Panyard
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 Walnut Street, 707 WARF Building, Madison, WI, 53726, USA
| | - Kyeong Mo Kim
- Department of Biotechnology, Yonsei University, 50 Yonsei-ro Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Burcu F Darst
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, 1450 Biggy Street, Los Angeles, CA, 90033, USA
| | - Yuetiva K Deming
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 Walnut Street, 707 WARF Building, Madison, WI, 53726, USA
- Wisconsin Alzheimer's Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI, 53792, USA
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI, 53705, USA
| | - Xiaoyuan Zhong
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, WARF Room 201, 610 Walnut Street, Madison, WI, 53726, USA
| | - Yuchang Wu
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, WARF Room 201, 610 Walnut Street, Madison, WI, 53726, USA
| | - Hyunseung Kang
- Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, WI, 53706, USA
| | - Cynthia M Carlsson
- Wisconsin Alzheimer's Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI, 53792, USA
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI, 53705, USA
- William S. Middleton Memorial Veterans Hospital, 2500 Overlook Terrace, Madison, WI, 53705, USA
| | - Sterling C Johnson
- Wisconsin Alzheimer's Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI, 53792, USA
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI, 53705, USA
- William S. Middleton Memorial Veterans Hospital, 2500 Overlook Terrace, Madison, WI, 53705, USA
| | - Sanjay Asthana
- Wisconsin Alzheimer's Disease Research Center, University of Wisconsin-Madison, 600 Highland Avenue, J5/1 Mezzanine, Madison, WI, 53792, USA
- Department of Medicine, University of Wisconsin-Madison, 1685 Highland Avenue, 5158 Medical Foundation Centennial Building, Madison, WI, 53705, USA
- William S. Middleton Memorial Veterans Hospital, 2500 Overlook Terrace, Madison, WI, 53705, USA
| | - Corinne D Engelman
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 Walnut Street, 707 WARF Building, Madison, WI, 53726, USA
| | - Qiongshi Lu
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, WARF Room 201, 610 Walnut Street, Madison, WI, 53726, USA.
- Department of Statistics, University of Wisconsin-Madison, 1300 University Avenue, Madison, WI, 53706, USA.
|
28
|
Xie Y, Shan N, Zhao H, Hou L. Transcriptome wide association studies: general framework and methods. QUANTITATIVE BIOLOGY 2021. [DOI: 10.15302/j-qb-020-0228] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
|
29
|
Knutson KA, Deng Y, Pan W. Implicating causal brain imaging endophenotypes in Alzheimer's disease using multivariable IWAS and GWAS summary data. Neuroimage 2020; 223:117347. [PMID: 32898681 PMCID: PMC7778364 DOI: 10.1016/j.neuroimage.2020.117347] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/04/2020] [Revised: 08/24/2020] [Accepted: 08/28/2020] [Indexed: 02/06/2023]
Abstract
Recent evidence suggests the existence of many undiscovered heritable brain phenotypes involved in Alzheimer's Disease (AD) pathogenesis. This finding necessitates methods for the discovery of causal brain changes in AD that integrate Magnetic Resonance Imaging measures and genotypic data. However, existing approaches for causal inference in this setting, such as the univariate Imaging Wide Association Study (UV-IWAS), suffer from inconsistent effect estimation and inflated Type I errors in the presence of genetic pleiotropy, the phenomenon in which a variant affects multiple causal intermediate risk phenotypes. In this study, we implement a multivariate extension to the IWAS model, namely MV-IWAS, to consistently estimate and test for the causal effects of multiple brain imaging endophenotypes from the Alzheimer's Disease Neuroimaging Initiative (ADNI) in the presence of pleiotropic and possibly correlated SNPs. We further extend MV-IWAS to incorporate variant-specific direct effects on AD, analogous to the existing Egger regression Mendelian Randomization approach, which allows for testing of remaining pleiotropy after adjusting for multiple intermediate pathways. We propose a convenient approach for implementing MV-IWAS that solely relies on publicly available GWAS summary data and a reference panel. Through simulations with either individual-level or summary data, we demonstrate the well controlled Type I errors and superior power of MV-IWAS over UV-IWAS in the presence of pleiotropic SNPs. We apply the summary statistic based tests to 1578 heritable imaging derived phenotypes (IDPs) from the UK Biobank. MV-IWAS detected numerous IDPs as possible false positives by UV-IWAS while uncovering many additional causal neuroimaging phenotypes in AD which are strongly supported by the existing literature.
Affiliation(s)
- Katherine A Knutson
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota United States
| | - Yangqing Deng
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota United States
| | - Wei Pan
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota United States.
|
30
|
Zhang Y, Hao Y, Li L, Xia K, Wu G. A Novel Computational Proxy for Characterizing Cognitive Reserve in Alzheimer's Disease. J Alzheimers Dis 2020; 78:1217-1228. [PMID: 33252088 DOI: 10.3233/jad-201011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/26/2022]
Abstract
BACKGROUND Although the abnormal depositions of amyloid plaques and neurofibrillary tangles are the hallmark of Alzheimer's disease (AD), converging evidence shows that the individual's neurodegeneration trajectory is regulated by the brain's capability to maintain normal cognition. OBJECTIVE The concept of cognitive reserve has been introduced into the field of neuroscience, acting as a moderating factor for explaining the paradoxical relationship between the burden of AD pathology and the clinical outcome. It is of high demand to quantify the degree of conceptual cognitive reserve on an individual basis. METHODS We propose a novel statistical model to quantify an individual's cognitive reserve against neuropathological burdens, where the predictors include demographic data (such as age and gender), socioeconomic factors (such as education and occupation), cerebrospinal fluid biomarkers, and AD-related polygenetic risk score. We conceptualize cognitive reserve as a joint product of AD pathology and socioeconomic factors where their interaction manifests a significant role in counteracting the progression of AD in our statistical model. RESULTS We apply our statistical models to re-investigate the moderated neurodegeneration trajectory by considering cognitive reserve, where we have discovered that 1) high education individuals have significantly higher reserve against the neuropathology than the low education group; however, 2) the cognitive decline in the high education group is significantly faster than low education individuals after the level of pathological burden increases beyond the tipping point. CONCLUSION We propose a computational proxy of cognitive reserve that can be used in clinical routine to assess the progression of AD.
Affiliation(s)
- Ying Zhang
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yajing Hao
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Lang Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Kai Xia
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Guorong Wu
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.,Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
31
|
Xue H, Pan W. Inferring causal direction between two traits in the presence of horizontal pleiotropy with GWAS summary data. PLoS Genet 2020; 16:e1009105. [PMID: 33137120 PMCID: PMC7660933 DOI: 10.1371/journal.pgen.1009105] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 05/18/2020] [Revised: 11/12/2020] [Accepted: 09/08/2020] [Indexed: 01/14/2023]
Abstract
Orienting the causal relationship between pairs of traits is a fundamental task in scientific research with significant implications in practice, such as in prioritizing molecular targets and modifiable risk factors for developing therapeutic and interventional strategies for complex diseases. A recent method, called Steiger's method, using a single SNP as an instrument variable (IV) in the framework of Mendelian randomization (MR), has since been widely applied. We report the following new contributions. First, we propose a single SNP-based alternative, overcoming a severe limitation of Steiger's method in simply assuming, instead of inferring, the existence of a causal relationship. We also clarify a condition necessary for the validity of the methods in the presence of hidden confounding. Second, to improve statistical power, we propose combining the results from multiple, and possibly correlated, SNPs as multiple instruments. Third, we develop three goodness-of-fit tests to check modeling assumptions, including those required for valid IVs. Fourth, by relaxing one of the three IV assumptions in MR, we propose several methods, including an Egger regression-like approach and its multivariable version (analogous to multivariable MR), to account for horizontal pleiotropy of the SNPs/IVs, which is often unavoidable in practice. All our methods can simultaneously infer both the existence and (if so) the direction of a causal relationship, largely expanding their applicability over that of Steiger's method. Although we focus on uni-directional causal relationships, we also briefly discuss an extension to bi-directional relationships. Through extensive simulations and an application to infer the causal directions between low density lipoprotein (LDL) cholesterol, or high density lipoprotein (HDL) cholesterol, and coronary artery disease (CAD), we demonstrate the superior performance and advantage of our proposed methods over Steiger's method and bi-directional MR. 
In particular, after accounting for horizontal pleiotropy, our method confirmed the well known causal direction from LDL to CAD, while other methods, including bi-directional MR, might fail.
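The single-SNP orientation idea that the abstract attributes to Steiger's method can be illustrated with a simplified sketch. This is not the authors' proposed method: it assumes the SNP-exposure and SNP-outcome correlations come from independent samples and compares them with Fisher's z-transform, and, as the abstract points out for Steiger's method, it presumes a causal relationship exists rather than inferring one. The function name `orient_by_correlation` and its arguments are hypothetical.

```python
import math

def orient_by_correlation(r_gx, n_x, r_gy, n_y):
    """Compare |cor(SNP, X)| with |cor(SNP, Y)| from two independent samples.

    Returns (direction, z, p). A larger SNP-X correlation is read as X -> Y.
    Simplified illustration only; it assumes a causal effect exists.
    """
    # Fisher z-transform of the absolute correlations
    z_x = math.atanh(abs(r_gx))
    z_y = math.atanh(abs(r_gy))
    # Standard error of the difference for independent samples
    se = math.sqrt(1.0 / (n_x - 3) + 1.0 / (n_y - 3))
    z = (z_x - z_y) / se
    # Two-sided p-value under a standard normal reference distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    direction = "X->Y" if z > 0 else "Y->X"
    return direction, z, p

direction, z, p = orient_by_correlation(r_gx=0.3, n_x=10_000, r_gy=0.1, n_y=10_000)
print(direction, round(z, 2), p)
```

Swapping the two correlations flips the inferred direction, which is why a separate test for the existence of any causal effect is needed before the direction is interpreted.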
Affiliation(s)
- Haoran Xue
- School of Statistics, University of Minnesota, Minneapolis, Minnesota, United States of America
| | - Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
|
32
|
Neuner SM, Tcw J, Goate AM. Genetic architecture of Alzheimer's disease. Neurobiol Dis 2020; 143:104976. [PMID: 32565066 PMCID: PMC7409822 DOI: 10.1016/j.nbd.2020.104976] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 01/31/2020] [Revised: 05/30/2020] [Accepted: 06/13/2020] [Indexed: 02/06/2023]
Abstract
Advances in genetic and genomic technologies over the last thirty years have greatly enhanced our knowledge concerning the genetic architecture of Alzheimer's disease (AD). Several genes including APP, PSEN1, PSEN2, and APOE have been shown to exhibit large effects on disease susceptibility, with the remaining risk loci having much smaller effects on AD risk. Notably, common genetic variants impacting AD are not randomly distributed across the genome. Instead, these variants are enriched within regulatory elements active in human myeloid cells, and to a lesser extent liver cells, implicating these cell and tissue types as critical to disease etiology. Integrative approaches are emerging as highly effective for identifying the specific target genes through which AD risk variants act and will likely yield important insights related to potential therapeutic targets in the coming years. In the future, additional consideration of sex- and ethnicity-specific contributions to risk as well as the contribution of complex gene-gene and gene-environment interactions will likely be necessary to further improve our understanding of AD genetic architecture.
Affiliation(s)
- Sarah M Neuner
- Nash Department of Neuroscience, Ronald M. Loeb Center for Alzheimer's Disease, Icahn School of Medicine at Mount Sinai, New York, USA
| | - Julia Tcw
- Nash Department of Neuroscience, Ronald M. Loeb Center for Alzheimer's Disease, Icahn School of Medicine at Mount Sinai, New York, USA
| | - Alison M Goate
- Nash Department of Neuroscience, Ronald M. Loeb Center for Alzheimer's Disease, Icahn School of Medicine at Mount Sinai, New York, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, USA.
|
33
|
Soleimani Zakeri NS, Pashazadeh S, MotieGhader H. Gene biomarker discovery at different stages of Alzheimer using gene co-expression network approach. Sci Rep 2020; 10:12210. [PMID: 32699331 PMCID: PMC7376049 DOI: 10.1038/s41598-020-69249-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 12/09/2019] [Accepted: 07/08/2020] [Indexed: 12/24/2022]
Abstract
Alzheimer's disease (AD) is a chronic neurodegenerative disorder. It is the most common type of dementia, remains incurable, and irreversibly destroys brain cells. In this study, a systems biology approach was adopted to discover novel micro-RNA and gene-based biomarkers for the diagnosis of Alzheimer's disease. The gene expression data from three AD stages (Normal, Mild Cognitive Impairment, and Alzheimer) were used to reconstruct co-expression networks. After preprocessing and normalization, Weighted Gene Co-Expression Network Analysis (WGCNA) was used on a total of 329 samples, including 145 samples of the Alzheimer stage, 80 samples of the Mild Cognitive Impairment (MCI) stage, and 104 samples of the Normal stage. Next, three gene-miRNA bipartite networks were reconstructed by comparing the changes in module groups. Then, functional enrichment analyses of the extracted genes of the three bipartite networks and of the miRNAs were done, respectively. Finally, a detailed analysis of the authentic studies was performed to discuss the obtained biomarkers. The results proposed novel genes, including MBOAT1, ARMC7, RABL2B, HNRNPUL1, LAMTOR1, PLAGL2, CREBRF, LCOR, and MRI1, and novel miRNAs, comprising miR-615-3p, miR-4722-5p, miR-4768-3p, miR-1827, miR-940, and miR-30b-3p, which were related to AD. These biomarkers were proposed to be related to AD for the first time and should be examined in future clinical studies.
Affiliation(s)
| | - Saeid Pashazadeh
- Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran.
| | - Habib MotieGhader
- Department of Computer Engineering, Gowgan Educational Center, Tabriz Branch, Islamic Azad University, Tabriz, Iran
|
34
|
Xue H, Pan W. Some statistical consideration in transcriptome-wide association studies. Genet Epidemiol 2020; 44:221-232. [PMID: 31821608 PMCID: PMC7064426 DOI: 10.1002/gepi.22274] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/29/2019] [Revised: 10/01/2019] [Accepted: 11/25/2019] [Indexed: 11/08/2022]
Abstract
The methodology of transcriptome-wide association studies (TWAS) has become popular in integrating a reference expression quantitative trait locus (eQTL) data set with an independent main GWAS data set to identify (putatively) causal genes, shedding mechanistic insight into biological pathways from genetic variants to a GWAS trait mediated by gene expression. Statistically, TWAS is a (two-sample) 2-stage least squares (2SLS) method in the framework of instrumental variables analysis for causal inference: in Stage 1 it uses the reference eQTL data to impute a gene's expression for the main GWAS data, then in Stage 2 it tests for association between the imputed gene expression and the GWAS trait; if an association is detected in Stage 2, a (putatively) causal relationship between the gene and the GWAS trait is claimed. If a nonlinear model or a generalized linear model (GLM) is fitted in Stage 2 (e.g., for a binary GWAS trait), it is known that using only imputed gene expression, as in standard TWAS, in general does not lead to a consistent (i.e., asymptotically unbiased) estimate for the causal effect; accordingly, a variation of 2SLS, called two-stage residual inclusion (2SRI), has been proposed to yield better estimates (e.g., being consistent under suitable conditions). Our main goal is to investigate whether it is necessary or even better to apply 2SRI, instead of the standard 2SLS. In addition, due to the use of imputed gene expression (i.e., with measurement errors), it is known that in general some correction to the standard error estimate of the causal effect estimate has to be applied, while in the standard TWAS no correction is applied. Is this an issue? We also compare one-sample 2SLS with two-sample 2SLS (i.e., the standard TWAS). We used the Alzheimer's Disease Neuroimaging Initiative (ADNI) data and simulated data mimicking the ADNI data to address the above questions.
At the end, we conclude that, in practice with the large sample sizes and small effect sizes of genetic variants, the standard TWAS performs well and is recommended.
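The two-stage procedure described in this abstract can be sketched on simulated data. The following is a minimal illustration, not the authors' code: it assumes a single cis-SNP as the instrument, a continuous trait, and made-up effect sizes (`TRUE_EQTL`, `TRUE_CAUSAL`), and it uses only the Python standard library. All names are hypothetical.

```python
import random

def ols_slope(x, y):
    """Slope b of the least-squares fit y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def simulate_genotypes(rng, n, maf=0.3):
    """Allele counts (0, 1, or 2) for one biallelic SNP."""
    return [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]

rng = random.Random(0)
TRUE_EQTL, TRUE_CAUSAL = 0.5, 0.3  # assumed effect sizes for the simulation

# Reference eQTL panel: genotype and expression both observed
g_ref = simulate_genotypes(rng, 5000)
expr_ref = [TRUE_EQTL * g + rng.gauss(0, 1) for g in g_ref]

# Independent GWAS sample: genotype and trait observed, expression is not
g_gwas = simulate_genotypes(rng, 20000)
expr_hidden = [TRUE_EQTL * g + rng.gauss(0, 1) for g in g_gwas]
trait = [TRUE_CAUSAL * e + rng.gauss(0, 1) for e in expr_hidden]

# Stage 1: estimate the SNP weight in the reference panel
w_hat = ols_slope(g_ref, expr_ref)

# Stage 2: impute expression in the GWAS sample and regress the trait on it
# (the Stage 1 intercept is omitted because it does not change the slope)
expr_imputed = [w_hat * g for g in g_gwas]
beta_hat = ols_slope(expr_imputed, trait)

print(f"stage 1 weight: {w_hat:.3f}, stage 2 effect estimate: {beta_hat:.3f}")
```

With a single SNP, the Stage 2 slope equals the SNP-trait slope divided by the Stage 1 weight, so uncertainty in the imputation weight carries directly into the effect estimate. This is the measurement-error issue the abstract raises about uncorrected standard errors.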
Affiliation(s)
- Haoran Xue
- School of Statistics, University of Minnesota, Minneapolis, Minnesota
| | - Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota
|
35
|
Shen L, Thompson PM. Brain Imaging Genomics: Integrated Analysis and Machine Learning. PROCEEDINGS OF THE IEEE. INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS 2020; 108:125-162. [PMID: 31902950 PMCID: PMC6941751 DOI: 10.1109/jproc.2019.2947272] [Citation(s) in RCA: 88] [Impact Index Per Article: 17.6] [Indexed: 06/01/2023]
Abstract
Brain imaging genomics is an emerging data science field, where integrated analysis of brain imaging and genomics data, often combined with other biomarker, clinical and environmental data, is performed to gain new insights into the phenotypic, genetic and molecular characteristics of the brain as well as their impact on normal and disordered brain function and behavior. It has enormous potential to contribute significantly to biomedical discoveries in brain science. Given the increasingly important role of statistical and machine learning in biomedicine and rapidly growing literature in brain imaging genomics, we provide an up-to-date and comprehensive review of statistical and machine learning methods for brain imaging genomics, as well as a practical discussion on method selection for various biomedical applications.
Affiliation(s)
- Li Shen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, PA 19104, USA
| | - Paul M Thompson
- Imaging Genetics Center, Mark & Mary Stevens Institute for Neuroimaging & Informatics, Keck School of Medicine, University of Southern California, Los Angeles, CA 90232, USA
|
36
|
Huang M, Yu Y, Yang W, Feng Q. Incorporating spatial-anatomical similarity into the VGWAS framework for AD biomarker detection. Bioinformatics 2019; 35:5271-5280. [PMID: 31095298 PMCID: PMC6954655 DOI: 10.1093/bioinformatics/btz401] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/26/2018] [Revised: 04/03/2019] [Accepted: 05/07/2019] [Indexed: 11/13/2022]
Abstract
MOTIVATION The detection of potential biomarkers of Alzheimer's disease (AD) is crucial for its early prediction, diagnosis and treatment. Voxel-wise genome-wide association study (VGWAS) is a commonly used method in imaging genomics and usually applied to detect AD biomarkers in imaging and genetic data. However, existing VGWAS methods entail large computational cost and disregard spatial correlations within imaging data. A novel method is proposed to solve these issues. RESULTS We introduce a novel method to incorporate spatial correlations into a VGWAS framework for the detection of potential AD biomarkers. To consider the characteristics of AD, we first present a modification of a simple linear iterative clustering method for spatial grouping in an anatomically meaningful manner. Second, we propose a spatial-anatomical similarity matrix to incorporate correlations among voxels. Finally, we detect the potential AD biomarkers from imaging and genetic data by using a fast VGWAS method and test our method on 708 subjects obtained from an Alzheimer's Disease Neuroimaging Initiative dataset. Results show that our method can successfully detect some new risk genes and clusters of AD. The detected imaging and genetic biomarkers are used as predictors to classify AD/normal control subjects, and a high accuracy of AD/normal control classification is achieved. To the best of our knowledge, the association between imaging and genetic data has yet to be systematically investigated while building statistical models for classifying AD subjects to create a link between imaging genetics and AD. Therefore, our method may provide a new way to gain insights into the underlying pathological mechanism of AD. AVAILABILITY AND IMPLEMENTATION https://github.com/Meiyan88/SASM-VGWAS.
Affiliation(s)
- Meiyan Huang
- Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
| | - Yuwei Yu
- Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
| | - Wei Yang
- Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
| | - Qianjin Feng
- Guangdong Provincial Key Laboratory of Medical Image Processing, School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, China
|
37
|
Pattee J, Zhan X, Xiao G, Pan W. Integrating germline and somatic genetics to identify genes associated with lung cancer. Genet Epidemiol 2019; 44:233-247. [PMID: 31821614 DOI: 10.1002/gepi.22275] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/21/2019] [Revised: 10/31/2019] [Accepted: 11/25/2019] [Indexed: 12/22/2022]
Abstract
Genome-wide association studies (GWAS) have successfully identified many genetic variants associated with complex traits. However, GWAS experience power issues, resulting in the failure to detect certain associated variants. Additionally, GWAS are often unable to parse the biological mechanisms of driving associations. An existing gene-based association test framework, Transcriptome-Wide Association Studies (TWAS), leverages expression quantitative trait loci data to increase the power of association tests and illuminate the biological mechanisms by which genetic variants modulate complex traits. We extend the TWAS methodology to incorporate somatic information from tumors. By integrating germline and somatic data we are able to leverage information from the nuanced somatic landscape of tumors. Thus we can augment the power of TWAS-type tests to detect germline genetic variants associated with cancer phenotypes. We use somatic and germline data on lung adenocarcinomas from The Cancer Genome Atlas in conjunction with a meta-analyzed lung cancer GWAS to identify novel genes associated with lung cancer.
Affiliation(s)
- Jack Pattee
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota
| | - Xiaowei Zhan
- Quantitative Biomedical Research Center, Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Guanghua Xiao
- Quantitative Biomedical Research Center, Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, Texas
| | - Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota
|
38
|
Abstract
Radiogenomics, defined as the integrated analysis of radiologic imaging and genetic data, is a well-established tool shown to augment neuroimaging in the clinical diagnosis, prognostication, and scientific study of late-onset Alzheimer disease (LOAD). Early work using candidate single nucleotide polymorphisms (SNPs) identified genetic variation in APOE, BIN1, CLU, and CR1 as key modifiers of brain structure and function using magnetic resonance imaging (MRI). More recently, polygenic risk scores used in conjunction with MRI and positron emission tomography have shown great promise as a risk-stratification tool for clinical trials and care-management decisions. In addition, recent work using multimodal MRI and positron emission tomography as proxies of LOAD progression has identified novel risk variants that are enhancing our understanding of LOAD pathophysiology and progression. Herein, we highlight key studies and trends in the radiogenomics of LOAD over the past two decades and their implications for clinical practice and scientific research.
|
39
|
Wang ZT, Chen SD, Xu W, Chen KL, Wang HF, Tan CC, Cui M, Dong Q, Tan L, Yu JT. Genome-wide association study identifies CD1A associated with rate of increase in plasma neurofilament light in non-demented elders. Aging (Albany NY) 2019; 11:4521-4535. [PMID: 31295725 PMCID: PMC6660034 DOI: 10.18632/aging.102066] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/15/2019] [Accepted: 06/25/2019] [Indexed: 02/07/2023]
Abstract
As a marker of neuroaxonal injury, neurofilament light (NFL) in blood is robustly elevated in many neurodegenerative conditions. We aimed to discover single nucleotide polymorphisms (SNPs) associated with longitudinal changes in plasma NFL levels that affect the risk of developing neurodegenerative disease and clinical disease progression. A total of 545 eligible non-Hispanic white participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) with longitudinal plasma NFL data were included. Three SNPs (rs16840041, p = 4.50×10⁻⁸; rs2269714, p = 4.50×10⁻⁸; rs2269715, p = 4.83×10⁻⁸) in CD1A were in high linkage disequilibrium (LD) and significantly associated with the increase in plasma NFL levels. We demonstrate a promoting effect of rs16840041-A on clinical disease progression (p = 0.006). Moreover, the minor allele (A) of rs16840041 was significantly associated with accelerated decline in [18F] Fluorodeoxyglucose (FDG) (estimate -1.6% per year [95% CI -0.6 to -2.6], p = 0.0024). CD1A is a gene involved in longitudinal changes in plasma NFL levels and AD-related phenotypes among non-demented elders. Given the potential effects of these variants, CD1A should be further investigated as a gene of interest in neurodegenerative diseases and as a potential target for monitoring disease trajectories and treating disease.
Affiliation(s)
- Zuo-Teng Wang
- Department of Neurology, Qingdao Municipal Hospital, College of Medicine and Pharmaceutics, Ocean University of China, Qingdao, China
| | - Shi-Dong Chen
- Department of Neurology and Institute of Neurology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
| | - Wei Xu
- Department of Neurology, Qingdao Municipal Hospital, College of Medicine and Pharmaceutics, Ocean University of China, Qingdao, China
| | - Ke-Liang Chen
- Department of Neurology and Institute of Neurology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
| | - Hui-Fu Wang
- Department of Neurology, Qingdao Municipal Hospital, Qingdao University, Qingdao, China
| | - Chen-Chen Tan
- Department of Neurology, Qingdao Municipal Hospital, Qingdao University, Qingdao, China
| | - Mei Cui
- Department of Neurology and Institute of Neurology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
| | - Qiang Dong
- Department of Neurology and Institute of Neurology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
| | - Lan Tan
- Department of Neurology, Qingdao Municipal Hospital, College of Medicine and Pharmaceutics, Ocean University of China, Qingdao, China
- Department of Neurology, Qingdao Municipal Hospital, Qingdao University, Qingdao, China
| | - Jin-Tai Yu
- Department of Neurology and Institute of Neurology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
|
40
|
Wu C, Pan W. Integrating eQTL data with GWAS summary statistics in pathway-based analysis with application to schizophrenia. Genet Epidemiol 2018; 42:303-316. [PMID: 29411426 PMCID: PMC5851843 DOI: 10.1002/gepi.22110] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 09/19/2017] [Revised: 01/04/2018] [Accepted: 01/04/2018] [Indexed: 12/11/2022]
Abstract
Many genetic variants affect complex traits through gene expression, which can be exploited to boost statistical power and enhance interpretation in genome-wide association studies (GWASs) as demonstrated by the transcriptome-wide association study (TWAS) approach. Furthermore, due to polygenic inheritance, a complex trait is often affected by multiple genes with similar functions as annotated in gene pathways. Here, we extend TWAS from gene-based analysis to pathway-based analysis: we integrate public pathway collections, expression quantitative trait locus (eQTL) data and GWAS summary association statistics (or GWAS individual-level data) to identify gene pathways associated with complex traits. The basic idea is to weight the SNPs of the genes in a pathway based on their estimated cis-effects on gene expression, then adaptively test for association of the pathway with a GWAS trait by effectively aggregating possibly weak association signals across the genes in the pathway. The P values can be calculated analytically and thus fast. We applied our proposed test with the KEGG and GO pathways to two schizophrenia (SCZ) GWAS summary association data sets, denoted by SCZ1 and SCZ2 with about 20,000 and 150,000 subjects, respectively. Most of the significant pathways identified by analyzing the SCZ1 data were reproduced by the SCZ2 data. Importantly, we identified 15 novel pathways associated with SCZ, such as GABA receptor complex (GO:1902710), which could not be uncovered by the standard single SNP-based analysis or gene-based TWAS. The newly identified pathways may help us gain insights into the biological mechanism underlying SCZ. Our results showcase the power of incorporating gene expression information and gene functional annotations into pathway-based association testing for GWAS.
Affiliation(s)
- Chong Wu
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
| | - Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
|
41
|
A Powerful Framework for Integrating eQTL and GWAS Summary Data. Genetics 2017; 207:893-902. [PMID: 28893853 DOI: 10.1534/genetics.117.300270] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 08/02/2017] [Accepted: 09/05/2017] [Indexed: 01/26/2023]
Abstract
Two new gene-based association analysis methods, called PrediXcan and TWAS for GWAS individual-level and summary data, respectively, were recently proposed to integrate GWAS with eQTL data, alleviating two common problems in GWAS by boosting statistical power and facilitating biological interpretation of GWAS discoveries. Based on a novel reformulation of PrediXcan and TWAS, we propose a more powerful gene-based association test to integrate single set or multiple sets of eQTL data with GWAS individual-level data or summary statistics. The proposed test was applied to several GWAS datasets, including two lipid summary association datasets based on [Formula: see text] and [Formula: see text] samples, respectively, and uncovered more known or novel trait-associated genes, showcasing much improved performance of our proposed method. The software implementing the proposed method is freely available as an R package.
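For orientation, the summary-data TWAS statistic that this abstract builds on is commonly written as a weighted sum of per-SNP GWAS z-scores, Z = wᵀz / sqrt(wᵀRw), where w holds the eQTL-derived SNP weights and R is the LD (SNP correlation) matrix from a reference panel. The sketch below implements only that baseline weighted-sum form, not the more powerful test the paper proposes. The function name `twas_z` and the toy numbers are illustrative assumptions.

```python
import math

def twas_z(weights, gwas_z, ld):
    """Weighted-sum gene-level z-score from GWAS summary statistics.

    weights: eQTL-derived SNP weights w
    gwas_z:  per-SNP GWAS z-scores z
    ld:      SNP correlation (LD) matrix R as a list of rows
    """
    # Numerator: w' z
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Variance of the numerator under the null: w' R w
    variance = sum(
        weights[i] * ld[i][j] * weights[j]
        for i in range(len(weights))
        for j in range(len(weights))
    )
    return numerator / math.sqrt(variance)

# Toy example: two correlated cis-SNPs (values are made up)
w = [0.5, 0.2]
z = [3.0, 1.0]
R = [[1.0, 0.5],
     [0.5, 1.0]]
print(twas_z(w, z, R))
```

With a single SNP and a positive weight, the statistic reduces to that SNP's GWAS z-score, which is a quick sanity check on the normalization.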
|