1
Moulistanos A, Papasakellariou K, Kavakiotis I, Gkagkavouzis K, Karaiskou N, Antonopoulou E, Triantafyllidis A, Papakostas S. Genomic Signatures of Domestication in European Seabass (Dicentrarchus labrax L.) Reveal a Potential Role for Epigenetic Regulation in Adaptation to Captivity. Ecol Evol 2024; 14:e70512. [PMID: 39629177 PMCID: PMC11612516 DOI: 10.1002/ece3.70512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2024] [Revised: 10/17/2024] [Accepted: 10/18/2024] [Indexed: 12/07/2024]
Abstract
Genome scans provide a comprehensive method to explore genome-wide variation associated with traits under study. However, linking individual genes to broader functional groupings and pathways is often challenging, yet crucial for understanding the evolutionary mechanisms underlying these traits. This task is particularly relevant for multi-trait processes such as domestication, which are influenced by complex interactions between numerous genetic and non-genetic factors, including epigenetic regulation. As various traits within the broader spectrum of domestication are selected in concert over time, this process offers an opportunity to identify broader functional overlaps and understand the integrated genetic architecture underlying these traits. In this study, we analyzed approximately 600,000 SNPs from a Pool-Seq experiment comparing eight natural-origin and 12 farmed populations of European seabass in the Mediterranean Sea region. We implemented two genome scan approaches and focused on genomic regions supported by both methods, resulting in the identification of 96 candidate genes, including nine CpG islands, which highlight potential epigenetic influences. Many of these genes and CpG islands are in linkage groups previously associated with domestication-related traits. The most significantly overrepresented molecular function was "oxidoreductase activity". Furthermore, a dense network of interactions was identified, connecting 22 of the candidate genes. Within this network, the most significantly enriched pathways and central genes were involved in "chromatin organization", highlighting another potential epigenetic mechanism. Altogether, our findings underscore the utility of interactome-assisted pathway analysis in elucidating the genomic architecture of polygenic traits and suggest that epigenetic regulation may play a crucial role in the domestication of European seabass.
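The step of keeping only regions supported by both genome scans can be pictured as an interval intersection. The following is a minimal sketch under stated assumptions: the abstract does not name the two scan methods or their output format, so the `(linkage_group, start, end)` tuples, the half-open coordinate convention, the function name `overlap_windows`, and all coordinates are hypothetical.

```python
def overlap_windows(scan_a, scan_b):
    """Return the parts of candidate regions flagged by both scans.

    Each region is a (linkage_group, start, end) tuple with half-open
    coordinates [start, end). This layout is an assumption made for
    illustration, not the format used in the study.
    """
    shared = []
    for group_a, start_a, end_a in scan_a:
        for group_b, start_b, end_b in scan_b:
            if group_a != group_b:
                continue  # regions on different linkage groups cannot overlap
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # non-empty intersection
                shared.append((group_a, start, end))
    return sorted(shared)


# Hypothetical outlier regions from two scans (toy coordinates).
scan_a = [("LG1", 100, 500), ("LG2", 50, 80)]
scan_b = [("LG1", 400, 900), ("LG3", 10, 20)]
print(overlap_windows(scan_a, scan_b))
```

The nested loop is quadratic in the number of regions, which is adequate for a short candidate list; a sorted sweep would be the usual choice for genome-scale inputs.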
Affiliation(s)
- Aristotelis Moulistanos
- Department of Genetics, Development & Molecular Biology, School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI‐AUTH), Balkan Center, Thessaloniki, Greece
- Konstantinos Papasakellariou
- Department of Genetics, Development & Molecular Biology, School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Ioannis Kavakiotis
- Department of Science and Technology, International Hellenic University, Thessaloniki, Greece
- Konstantinos Gkagkavouzis
- Department of Genetics, Development & Molecular Biology, School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI‐AUTH), Balkan Center, Thessaloniki, Greece
- Nikoleta Karaiskou
- Department of Genetics, Development & Molecular Biology, School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI‐AUTH), Balkan Center, Thessaloniki, Greece
- Efthimia Antonopoulou
- Department of Zoology, School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Alexandros Triantafyllidis
- Department of Genetics, Development & Molecular Biology, School of Biology, Faculty of Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation (CIRI‐AUTH), Balkan Center, Thessaloniki, Greece
- Spiros Papakostas
- Department of Science and Technology, International Hellenic University, Thessaloniki, Greece
2
Aherrahrou R, Reinberger T, Hashmi S, Erdmann J. GWAS breakthroughs: mapping the journey from one locus to 393 significant coronary artery disease associations. Cardiovasc Res 2024; 120:1508-1530. [PMID: 39073758 DOI: 10.1093/cvr/cvae161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Revised: 03/20/2024] [Accepted: 06/12/2024] [Indexed: 07/30/2024]
Abstract
Coronary artery disease (CAD) poses a substantial threat to global health, leading to significant morbidity and mortality worldwide. It has a significant genetic component that has been studied through genome-wide association studies (GWAS) over the past 17 years. These studies have made progress with larger sample sizes, diverse ancestral backgrounds, and the discovery of multiple genomic regions related to CAD risk. In this review, we provide a comprehensive overview of CAD GWAS, including information about the genetic makeup of the disease and the importance of ethnic diversity in these studies. We also discuss challenges of identifying causal genes and variants within GWAS loci with a focus on non-coding regions. Additionally, we highlight tissues and cell types relevant to CAD, and discuss clinical implications of GWAS findings including polygenic risk scores, sex-specific differences in CAD genetics, ethnical aspects of personalized interventions, and GWAS guided drug development.
Affiliation(s)
- Rédouane Aherrahrou
- A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland
- Institute for Cardiogenetics, University of Lübeck, Marie-Curie-Str. Haus 67/BMF, 23562 Lübeck, Germany
- DZHK (German Centre for Cardiovascular Research), Institute for Cardiogenetics, Universität zu Lübeck, Partner Site Hamburg/Kiel/Lübeck, Germany
- University Heart Centre Lübeck, University Hospital Schleswig-Holstein, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Tobias Reinberger
- Institute for Cardiogenetics, University of Lübeck, Marie-Curie-Str. Haus 67/BMF, 23562 Lübeck, Germany
- DZHK (German Centre for Cardiovascular Research), Institute for Cardiogenetics, Universität zu Lübeck, Partner Site Hamburg/Kiel/Lübeck, Germany
- University Heart Centre Lübeck, University Hospital Schleswig-Holstein, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Satwat Hashmi
- Department of Biological and Biomedical Sciences, Aga Khan University, Stadium Road, 74800 Karachi, Pakistan
- Jeanette Erdmann
- Institute for Cardiogenetics, University of Lübeck, Marie-Curie-Str. Haus 67/BMF, 23562 Lübeck, Germany
- DZHK (German Centre for Cardiovascular Research), Institute for Cardiogenetics, Universität zu Lübeck, Partner Site Hamburg/Kiel/Lübeck, Germany
- University Heart Centre Lübeck, University Hospital Schleswig-Holstein, Ratzeburger Allee 160, 23562 Lübeck, Germany
3
Clarelli F, Corona A, Pääkkönen K, Sorosina M, Zollo A, Piehl F, Olsson T, Stridh P, Jagodic M, Hemmer B, Gasperi C, Harroud A, Shchetynsky K, Mingione A, Mascia E, Misra K, Giordano A, Mazzieri MLT, Priori A, Saarela J, Kockum I, Filippi M, Esposito F, Boneschi FGM. Pharmacogenomics of clinical response to Natalizumab in multiple sclerosis: a genome-wide multi-centric association study. J Neurol 2024; 271:7250-7263. [PMID: 39264442 PMCID: PMC11561017 DOI: 10.1007/s00415-024-12608-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 07/22/2024] [Accepted: 07/23/2024] [Indexed: 09/13/2024]
Abstract
BACKGROUND: Inter-individual differences in treatment response are marked in multiple sclerosis (MS). This is true for Natalizumab (NTZ), to which a subset of patients displays sub-optimal treatment response. We conducted a multi-centric genome-wide association study (GWAS), with additional pathway and network analysis, to identify genetic predictors of response to NTZ.
METHODS: MS patients from three different centers were included. Response to NTZ was dichotomized over a follow-up of 4 years, with relapse-free patients classified as responders (R) and all others as non-responders (NR). Association analysis on ~4.7 million imputed autosomal common single-nucleotide polymorphisms (SNPs) was performed by fitting logistic regression models, adjusted for baseline covariates, followed by meta-analysis at the SNP and gene level. Finally, these signals were projected onto the STRING interactome to identify modules and hub genes linked to response.
RESULTS: Overall, 1834 patients were included: 119 from Italy (R = 94, NR = 25), 81 from Germany (R = 61, NR = 20), and 1634 from Sweden (R = 1349, NR = 285). The top-associated variant was rs11132400T (p = 1.33 × 10⁻⁶, OR = 0.58), affecting expression of several genes in the locus, such as KLKB1. The interactome analysis implicated a module of 135 genes, with over-representation of terms such as canonical WNT signaling pathway (adjusted p = 7.08 × 10⁻⁶). Response-associated genes such as GRB2 and LRP6, already implicated in MS pathogenesis, were topologically prioritized within the module.
CONCLUSION: This GWAS, the largest pharmacogenomic study of response to NTZ, suggested that MS-implicated genes and the Wnt/β-catenin signaling pathway, an essential component of blood-brain barrier formation and maintenance, are related to treatment response.
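Combining per-center association results for one SNP can be illustrated with a fixed-effect, inverse-variance-weighted meta-analysis. This is a sketch under stated assumptions: the abstract does not say which meta-analysis model was used, so the inverse-variance scheme is a common default swapped in for illustration, and the per-center log odds ratios and standard errors below are invented toy numbers, not study results.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted pooling of per-center log odds ratios.

    betas: per-center effect estimates (log OR from logistic regression)
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-center estimates for a single SNP (toy values).
center_betas = [-0.50, -0.62, -0.54]
center_ses = [0.30, 0.35, 0.12]

beta, se, z, p = fixed_effect_meta(center_betas, center_ses)
print(f"pooled OR = {math.exp(beta):.2f}, z = {z:.2f}, p = {p:.2e}")
```

Because the weights are the inverse variances, the center with the smallest standard error dominates the pooled estimate, which is the expected behavior when one cohort is much larger than the others.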
Affiliation(s)
- Ferdinando Clarelli
- Laboratory of Human Genetics of Neurological Disorders, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Andrea Corona
- Laboratory of Precision Medicine of Neurological Diseases, Department of Health Science, University of Milan, Milan, Italy
- CRC "Aldo Ravelli" for Experimental Brain Therapeutics, Department of Health Science, University of Milan, Milan, Italy
- Kimmo Pääkkönen
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Melissa Sorosina
- Laboratory of Human Genetics of Neurological Disorders, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Alen Zollo
- Laboratory of Precision Medicine of Neurological Diseases, Department of Health Science, University of Milan, Milan, Italy
- CRC "Aldo Ravelli" for Experimental Brain Therapeutics, Department of Health Science, University of Milan, Milan, Italy
- Fredrik Piehl
- The Karolinska Neuroimmunology & Multiple Sclerosis Centre, Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Visionsgatan 18, 171 76, Stockholm, Sweden
- Tomas Olsson
- The Karolinska Neuroimmunology & Multiple Sclerosis Centre, Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Visionsgatan 18, 171 76, Stockholm, Sweden
- Pernilla Stridh
- The Karolinska Neuroimmunology & Multiple Sclerosis Centre, Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Visionsgatan 18, 171 76, Stockholm, Sweden
- Maja Jagodic
- The Karolinska Neuroimmunology & Multiple Sclerosis Centre, Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Visionsgatan 18, 171 76, Stockholm, Sweden
- Bernhard Hemmer
- Department of Neurology, School of Medicine, Technical University of Munich, Klinikum Rechts Der Isar, Ismaninger Str. 22, Munich, Germany
- Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
- Christiane Gasperi
- Department of Neurology, School of Medicine, Technical University of Munich, Klinikum Rechts Der Isar, Ismaninger Str. 22, Munich, Germany
- Adil Harroud
- Department of Neurology and Neurosurgery, McGill University, Montréal, QC, Canada
- Klementy Shchetynsky
- The Karolinska Neuroimmunology & Multiple Sclerosis Centre, Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Visionsgatan 18, 171 76, Stockholm, Sweden
- Alessandra Mingione
- Laboratory of Precision Medicine of Neurological Diseases, Department of Health Science, University of Milan, Milan, Italy
- CRC "Aldo Ravelli" for Experimental Brain Therapeutics, Department of Health Science, University of Milan, Milan, Italy
- Elisabetta Mascia
- Laboratory of Human Genetics of Neurological Disorders, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Kaalindi Misra
- Laboratory of Human Genetics of Neurological Disorders, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Antonino Giordano
- Laboratory of Human Genetics of Neurological Disorders, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Neurology Unit, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- Maria Laura Terzi Mazzieri
- Laboratory of Precision Medicine of Neurological Diseases, Department of Health Science, University of Milan, Milan, Italy
- CRC "Aldo Ravelli" for Experimental Brain Therapeutics, Department of Health Science, University of Milan, Milan, Italy
- Alberto Priori
- CRC "Aldo Ravelli" for Experimental Brain Therapeutics, Department of Health Science, University of Milan, Milan, Italy
- Clinical Neurology Unit, Azienda Socio-Sanitaria Territoriale Santi Paolo E Carlo and Department of Health Sciences, University of Milan, Milan, Italy
- Janna Saarela
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Ingrid Kockum
- The Karolinska Neuroimmunology & Multiple Sclerosis Centre, Department of Clinical Neuroscience, Karolinska Institutet, Center for Molecular Medicine, Karolinska University Hospital, Visionsgatan 18, 171 76, Stockholm, Sweden
- Massimo Filippi
- Neurology Unit, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- Neurorehabilitation Unit, IRCCS San Raffaele Scientific Institute, Via Olgettina 48, Milan, Italy
- Neurophysiology Service, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Neuroimaging Research Unit, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Vita-Salute San Raffaele University, Via Olgettina, 60, Milan, Italy
- Federica Esposito
- Laboratory of Human Genetics of Neurological Disorders, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, Milan, Italy
- Neurology Unit, IRCCS San Raffaele Scientific Institute, Via Olgettina 60, 20132, Milan, Italy
- Filippo Giovanni Martinelli Boneschi
- Laboratory of Precision Medicine of Neurological Diseases, Department of Health Science, University of Milan, Milan, Italy
- CRC "Aldo Ravelli" for Experimental Brain Therapeutics, Department of Health Science, University of Milan, Milan, Italy
- Clinical Neurology Unit, Azienda Socio-Sanitaria Territoriale Santi Paolo E Carlo and Department of Health Sciences, University of Milan, Milan, Italy
4
Koo HJ, Pan W. Are trait-associated genes clustered together in a gene network? Genet Epidemiol 2024. [PMID: 38472164 DOI: 10.1002/gepi.22557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 01/25/2024] [Accepted: 02/23/2024] [Indexed: 03/14/2024]
Abstract
Genome-wide association studies (GWAS) have provided an abundance of information about the genetic variants and their loci that are associated with complex traits and diseases. However, due to linkage disequilibrium (LD) and noncoding regions of loci, it remains a challenge to pinpoint the causal genes. Gene network-based approaches, paired with network diffusion methods, have been proposed to prioritize causal genes and to boost statistical power in GWAS based on the assumption that trait-associated genes are clustered in a gene network. Due to the difficulty in mapping trait-associated variants to genes in GWAS, this assumption has never been directly or rigorously tested empirically. On the other hand, whole exome sequencing (WES) data focus on the protein-coding regions, directly identifying trait-associated genes. In this study, we tested the assumption by leveraging the recently available exome-based association statistics from the UK Biobank WES data along with two types of networks. We found that almost all trait-associated genes were significantly more proximal to each other than randomly selected genes within both networks. These results support the assumption that trait-associated genes are clustered in gene networks, which can be further leveraged to boost the power of GWAS, for example by introducing less stringent p value thresholds.
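Whether a gene set is "more proximal than randomly selected genes" can be checked with a permutation test on network distances. The sketch below is one plausible way to set this up, not the authors' procedure: the abstract does not specify the proximity statistic, the two networks, or the sampling scheme, so the mean pairwise shortest-path distance, uniform random sampling of equally sized gene sets, the toy graph, and all function names are assumptions. The code also assumes a connected, unweighted graph.

```python
import random
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def bfs_distances(graph, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_pairwise_distance(graph, genes):
    """Average shortest-path distance over all pairs in the gene set."""
    dists = {g: bfs_distances(graph, g) for g in genes}
    pairs = list(combinations(genes, 2))
    return sum(dists[a][b] for a, b in pairs) / len(pairs)


def proximity_p_value(graph, trait_genes, n_perm=1000, seed=0):
    """One-sided permutation p-value: are trait genes closer than random sets?"""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(graph, trait_genes)
    nodes = sorted(graph)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(nodes, len(trait_genes))
        if mean_pairwise_distance(graph, sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Toy network: a triangle A-B-C attached to a chain C-D-E-F-G-H.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
             ("D", "E"), ("E", "F"), ("F", "G"), ("G", "H")]
toy_graph = build_graph(toy_edges)
observed, p_value = proximity_p_value(toy_graph, ["A", "B", "C"])
print(observed, p_value)
```

In practice the random sets are often matched to the trait genes on node degree, since highly connected genes are close to everything; uniform sampling is used here only to keep the sketch short.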
Affiliation(s)
- Hyun Jung Koo
- School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
- Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA
5
Leger BS, Meredith JJ, Ideker T, Sanchez-Roige S, Palmer AA. Rare and Common Variants Associated with Alcohol Consumption Identify a Conserved Molecular Network. bioRxiv 2024:2024.02.26.582195. [PMID: 38464225 PMCID: PMC10925118 DOI: 10.1101/2024.02.26.582195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Genome-wide association studies (GWAS) have identified hundreds of common variants associated with alcohol consumption. In contrast, rare variants have only begun to be studied for their role in alcohol consumption. No studies have examined whether common and rare variants implicate the same genes and molecular networks. To address this knowledge gap, we used publicly available alcohol consumption GWAS summary statistics (GSCAN, N=666,978) and whole exome sequencing data (Genebass, N=393,099) to identify a set of common and rare variants for alcohol consumption. Gene-based analyses of the two datasets implicated 294 (common variants) and 35 (rare variants) genes, including the ethanol-metabolizing genes ADH1B and ADH1C, which were identified by both analyses, and ANKRD12, GIGYF1, KIF21B, and STK31, which were identified only by rare variant analysis but have been associated with related psychiatric traits. We then used a network colocalization procedure to propagate the common and rare gene sets onto a shared molecular network, revealing significant overlap. The shared network identified gene families that function in alcohol metabolism, including ADH, ALDH, CYP, and UGT. Seventy-four of the genes in the network, including EXOC2, EPM2A, CACNB3, and CACNG4, were previously implicated in comorbid psychiatric or substance use disorders but had not previously been identified for alcohol-related behaviors. Differential gene expression analysis showed enrichment in the liver and several brain regions, supporting the role of network genes in alcohol consumption. Thus, genes implicated by common and rare variants identify shared functions relevant to alcohol consumption, which also underlie psychiatric traits and substance use disorders that are comorbid with alcohol use.
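Propagating two seed gene sets onto a shared network and looking for nodes that score highly under both can be sketched with a random walk with restart. This is a generic illustration, not the colocalization procedure used in the preprint: the restart probability, the use of the product of the two propagated scores as an overlap measure, the toy path graph, and the node names are all assumptions. The code assumes every node has at least one neighbor.

```python
def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def propagate(graph, seeds, restart=0.5, n_iter=50):
    """Random walk with restart from a seed set; returns a score per node.

    At every step a fraction `restart` of the probability mass returns to
    the seeds and the rest spreads evenly to each node's neighbors.
    """
    nodes = sorted(graph)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(n_iter):
        updated = {n: restart * start[n] for n in nodes}
        for node in nodes:
            share = (1.0 - restart) * scores[node] / len(graph[node])
            for neighbor in graph[node]:
                updated[neighbor] += share
        scores = updated
    return scores


# Toy path network; "A" stands in for a common-variant gene and "E" for a
# rare-variant gene (hypothetical labels, not genes from the study).
toy_graph = build_graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
common_scores = propagate(toy_graph, {"A"})
rare_scores = propagate(toy_graph, {"E"})

# Nodes reached from both seed sets get a non-negligible product score.
combined = {n: common_scores[n] * rare_scores[n] for n in toy_graph}
for node in sorted(combined, key=combined.get, reverse=True):
    print(node, round(combined[node], 4))
```

A real analysis would compare each node's score against scores obtained from random seed sets before calling the overlap significant; that null-model step is omitted here.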
Affiliation(s)
- Brittany S Leger
- Program in Biomedical Sciences, University of California San Diego, La Jolla, CA, USA
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- John J Meredith
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Trey Ideker
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Sandra Sanchez-Roige
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University, Nashville, TN, USA
- Abraham A Palmer
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92093, USA
6
Tsare EPG, Klapa MI, Moschonas NK. Protein-protein interaction network-based integration of GWAS and functional data for blood pressure regulation analysis. Hum Genomics 2024; 18:15. [PMID: 38326862 PMCID: PMC11465932 DOI: 10.1186/s40246-023-00565-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 11/12/2023] [Indexed: 02/09/2024]
Abstract
BACKGROUND: It is valuable to analyze the genome-wide association studies (GWAS) data for a complex disease phenotype in the context of the protein-protein interaction (PPI) network, as the related pathophysiology results from the function of interacting polyprotein pathways. The analysis may include the design and curation of a phenotype-specific GWAS meta-database incorporating genotypic and eQTL data linking to PPI and other biological datasets, and the development of systematic workflows for PPI network-based data integration toward protein and pathway prioritization. Here, we pursued this analysis for blood pressure (BP) regulation.
METHODS: The relational schema of the BP-GWAS meta-database, implemented in Microsoft SQL Server, enabled the combined storage of GWAS data and attributes mined from the GWAS Catalog and the literature, Ensembl-defined SNP-transcript associations, and GTEx eQTL data. The BP-protein interactome was reconstructed from the PICKLE PPI meta-database, extending the GWAS-deduced network with the shortest paths connecting all GWAS-proteins into one component. The shortest-path intermediates were considered as BP-related. For protein prioritization, we combined a new integrated GWAS-based scoring scheme with two network-based criteria: one considering the protein's role in the reconstructed-by-shortest-path (RbSP) interactome, and a novel one promoting the common neighbors of GWAS-prioritized proteins. Prioritized proteins were ranked by the number of satisfied criteria.
RESULTS: The meta-database includes 6687 variants linked with 1167 BP-associated protein-coding genes. The GWAS-deduced PPI network includes 1065 proteins, with 672 forming a connected component. The RbSP interactome contains 1443 additional, network-deduced proteins and indicated that essentially all BP-GWAS proteins are at most second neighbors. The prioritized BP-protein set was derived from the union of the proteins ranked most BP-significant by any of the GWAS-based or network-based criteria. It included 335 proteins, with ~2/3 deduced from the BP PPI network extension and 126 prioritized by at least two criteria. ESR1 was the only protein satisfying all three criteria, followed in the top-10 by INSR, PTN11, CDK6, CSK, NOS3, SH2B3, ATP2B1, FES and FINC, satisfying two. Pathway analysis of the RbSP interactome revealed numerous bioprocesses, which are indeed functionally supported as BP-associated, extending our understanding of BP regulation.
CONCLUSIONS: The implemented workflow could be used for other multifactorial diseases.
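Two steps described in the methods, collecting shortest-path intermediates between GWAS proteins and ranking proteins by the number of criteria they satisfy, can be sketched as follows. This is an illustration under stated assumptions rather than the published workflow: it takes a single breadth-first shortest path per protein pair (the study may consider all shortest paths), and the toy network, the protein labels `P1`, `P2`, `P3`, `X`, `Y`, and the criterion names are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def shortest_path(graph, source, target):
    """Return one shortest path as a list of nodes, or None if unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):  # sorted for reproducibility
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


def path_intermediates(graph, seeds):
    """Non-seed proteins lying on a shortest path between two seed proteins."""
    found = set()
    for a, b in combinations(sorted(seeds), 2):
        path = shortest_path(graph, a, b)
        if path:
            found.update(n for n in path[1:-1] if n not in seeds)
    return found


def rank_by_criteria(criteria):
    """Rank proteins by how many criteria list them (ties broken by name)."""
    counts = {}
    for members in criteria.values():
        for protein in members:
            counts[protein] = counts.get(protein, 0) + 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy PPI network: X bridges the GWAS proteins P1 and P2; Y is a dead end.
ppi = build_graph([("P1", "X"), ("X", "P2"), ("P2", "P3"), ("X", "Y")])
gwas_proteins = {"P1", "P2", "P3"}
bridges = path_intermediates(ppi, gwas_proteins)
print(bridges)

# Hypothetical membership of proteins in three prioritization criteria.
criteria = {
    "gwas_score": {"P1", "P2"},
    "shortest_path_role": {"X", "P2"},
    "common_neighbor": {"X"},
}
ranking = rank_by_criteria(criteria)
print(ranking)
```

Ranking by a simple count treats the criteria as equally informative, which matches the abstract's statement that prioritized proteins were ranked by the number of satisfied criteria.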
Affiliation(s)
- Evridiki-Pandora G Tsare
- Department of General Biology, School of Medicine, University of Patras, Patras, Greece
- Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering Sciences, Foundation for Research and Technology-Hellas (FORTH/ICE-HT), Patras, Greece
- Maria I Klapa
- Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering Sciences, Foundation for Research and Technology-Hellas (FORTH/ICE-HT), Patras, Greece
- Nicholas K Moschonas
- Department of General Biology, School of Medicine, University of Patras, Patras, Greece
- Metabolic Engineering and Systems Biology Laboratory, Institute of Chemical Engineering Sciences, Foundation for Research and Technology-Hellas (FORTH/ICE-HT), Patras, Greece
7
Lin S, Jia P. scGraph2Vec: a deep generative model for gene embedding augmented by graph neural network and single-cell omics data. Gigascience 2024; 13:giae108. [PMID: 39704704 DOI: 10.1093/gigascience/giae108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 10/18/2024] [Accepted: 11/26/2024] [Indexed: 12/21/2024]
Abstract
BACKGROUND: Exploring the cellular processes of genes from the aspects of biological networks is of great interest to understanding the properties of complex diseases and biological systems. Biological networks, such as protein-protein interaction networks and gene regulatory networks, provide insights into the molecular basis of cellular processes and often form functional clusters in different tissue and disease contexts.
RESULTS: We present scGraph2Vec, a deep learning framework for generating informative gene embeddings. scGraph2Vec extends the variational graph autoencoder framework and integrates single-cell datasets and gene-gene interaction networks. We demonstrate that the gene embeddings are biologically interpretable and enable the identification of gene clusters representing functional or tissue-specific cellular processes. By comparing similar tools, we showed that scGraph2Vec clearly distinguished different gene clusters and aggregated more biologically functional genes. scGraph2Vec can be widely applied in diverse biological contexts. We illustrated that the embeddings generated by scGraph2Vec can infer disease-associated genes from genome-wide association study data (e.g., COVID-19 and Alzheimer's disease), identify additional driver genes in lung adenocarcinoma, and reveal regulatory genes responsible for maintaining or transitioning melanoma cell states.
CONCLUSIONS: scGraph2Vec not only reconstructs tissue-specific gene networks but also obtains a latent representation of genes implying their biological functions.
Affiliation(s)
- Shiqi Lin
- National Genomics Data Center, China National Center for Bioinformation, Beijing 100101, China
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Peilin Jia
- National Genomics Data Center, China National Center for Bioinformation, Beijing 100101, China
- Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
8
Ang MY, Takeuchi F, Kato N. Deciphering the genetic landscape of obesity: a data-driven approach to identifying plausible causal genes and therapeutic targets. J Hum Genet 2023; 68:823-833. [PMID: 37620670 PMCID: PMC10678330 DOI: 10.1038/s10038-023-01189-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Revised: 08/08/2023] [Accepted: 08/15/2023] [Indexed: 08/26/2023]
Abstract
OBJECTIVES: Genome-wide association studies (GWAS) have successfully revealed numerous susceptibility loci for obesity. However, identifying the causal genes, pathways, and tissues/cell types responsible for these associations remains a challenge, and standardized analysis workflows are lacking. Additionally, due to limited treatment options for obesity, there is a need for the development of new pharmacological therapies. This study aimed to address these issues by step-wise use of knowledgebases for gene prioritization and by assessing the potential relevance of key obesity genes as therapeutic targets.
METHODS AND RESULTS: First, we generated a list of 28,787 obesity-associated SNPs from the publicly available GWAS dataset (approximately 800,000 individuals in the GIANT meta-analysis). Then, we prioritized 1372 genes with significant in silico evidence against genomic and transcriptomic data, including transcriptionally regulated genes in the brain from transcriptome-wide association studies. In further narrowing down the gene list, we selected key genes, which we found to be useful for the discovery of potential drug seeds, as demonstrated separately in lipid GWAS. We thus identified 74 key genes for obesity, which are highly interconnected and enriched in several biological processes that contribute to obesity, including energy expenditure and homeostasis. Of the 74 key genes, 37 had not been reported for the pathophysiology of obesity. Finally, by drug-gene interaction analysis, we detected 23 (of 74) key genes that are potential targets for 78 approved and marketed drugs.
CONCLUSIONS: Our results provide valuable insights into new treatment options for obesity through a data-driven approach that integrates multiple up-to-date knowledgebases.
Affiliation(s)
- Mia Yang Ang
- Department of Clinical Genome Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Department of Gene Diagnostics and Therapeutics, Medical Genomics Center, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan
- Fumihiko Takeuchi
- Department of Gene Diagnostics and Therapeutics, Medical Genomics Center, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan
- Norihiro Kato
- Department of Clinical Genome Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Department of Gene Diagnostics and Therapeutics, Medical Genomics Center, Research Institute, National Center for Global Health and Medicine, Tokyo, Japan
9
Fang Y, Wang D, Xiao L, Quan M, Qi W, Song F, Zhou J, Liu X, Qin S, Du Q, Liu Q, El-Kassaby YA, Zhang D. Allelic variation in transcription factor PtoWRKY68 contributes to drought tolerance in Populus. Plant Physiology 2023; 193:736-755. [PMID: 37247391 PMCID: PMC10469405 DOI: 10.1093/plphys/kiad315] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/08/2023] [Revised: 04/21/2023] [Accepted: 04/30/2023] [Indexed: 05/31/2023]
Abstract
Drought stress limits woody species productivity and influences tree distribution. However, dissecting the molecular mechanisms that underpin drought responses in forest trees can be challenging due to trait complexity. Here, using a panel of 300 Chinese white poplar (Populus tomentosa) accessions collected from different geographical climatic regions in China, we performed a genome-wide association study (GWAS) on seven drought-related traits and identified PtoWRKY68 as a candidate gene involved in the response to drought stress. A 12-bp insertion and/or deletion and three nonsynonymous variants in the PtoWRKY68 coding sequence categorized natural populations of P. tomentosa into two haplotype groups, PtoWRKY68hap1 and PtoWRKY68hap2. The allelic variation in these two PtoWRKY68 haplotypes conferred differential transcriptional regulatory activities and binding to the promoters of downstream abscisic acid (ABA) efflux and signaling genes. Overexpression of PtoWRKY68hap1 and PtoWRKY68hap2 in Arabidopsis (Arabidopsis thaliana) improved the drought tolerance of two transgenic lines and increased ABA content by 42.7% and 14.3% compared to wild-type plants, respectively. Notably, PtoWRKY68hap1 (associated with drought tolerance) is ubiquitous in accessions in water-deficient environments, whereas the drought-sensitive allele PtoWRKY68hap2 is widely distributed in well-watered regions, consistent with the trends in local precipitation, suggesting that these alleles correspond to geographical adaptation in Populus. Moreover, quantitative trait loci analysis and an electrophoretic mobility shift assay showed that SHORT VEGETATIVE PHASE (PtoSVP.3) positively regulates the expression of PtoWRKY68 under drought stress. We propose a drought tolerance regulatory module in which PtoWRKY68 modulates ABA signaling and accumulation, providing insight into the genetic basis of drought tolerance in trees. Our findings will facilitate molecular breeding to improve the drought tolerance of forest trees.
Affiliation(s)
- Yuanyuan Fang
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Dan Wang
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Liang Xiao
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Mingyang Quan
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Weina Qi
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Fangyuan Song
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Jiaxuan Zhou
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Xin Liu
  - Institute of Forestry and Pomology, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100093, People’s Republic of China
- Shitong Qin
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Qingzhang Du
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
- Qing Liu
  - The Institute of Agriculture and Food Research, CSIRO Agriculture and Food, Black Mountain, Canberra ACT 2601, Australia
- Yousry A El-Kassaby
  - Department of Forest and Conservation Sciences, Faculty of Forestry, Forest Sciences Centre, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Deqiang Zhang
  - National Engineering Research Center of Tree Breeding and Ecological Restoration, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
  - Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Ministry of Education, College of Biological Sciences and Technology, Beijing Forestry University, No. 35, Qinghua East Road, Beijing 100083, People’s Republic of China
10
Wright SN, Leger BS, Rosenthal SB, Liu SN, Jia T, Chitre AS, Polesskaya O, Holl K, Gao J, Cheng R, Garcia Martinez A, George A, Gileta AF, Han W, Netzley AH, King CP, Lamparelli A, Martin C, St Pierre CL, Wang T, Bimschleger H, Richards J, Ishiwari K, Chen H, Flagel SB, Meyer P, Robinson TE, Solberg Woods LC, Kreisberg JF, Ideker T, Palmer AA. Genome-wide association studies of human and rat BMI converge on synapse, epigenome, and hormone signaling networks. Cell Rep 2023; 42:112873. [PMID: 37527041 PMCID: PMC10546330 DOI: 10.1016/j.celrep.2023.112873] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/08/2022] [Revised: 07/05/2023] [Accepted: 07/11/2023] [Indexed: 08/03/2023] [Open Access]
Abstract
A vexing observation in genome-wide association studies (GWASs) is that parallel analyses in different species may not identify orthologous genes. Here, we demonstrate that cross-species translation of GWASs can be greatly improved by an analysis of co-localization within molecular networks. Using body mass index (BMI) as an example, we show that the genes associated with BMI in humans lack significant agreement with those identified in rats. However, the networks interconnecting these genes show substantial overlap, highlighting common mechanisms including synaptic signaling, epigenetic modification, and hormonal regulation. Genetic perturbations within these networks cause abnormal BMI phenotypes in mice, too, supporting their broad conservation across mammals. Other mechanisms appear species specific, including carbohydrate biosynthesis (humans) and glycerolipid metabolism (rodents). Finally, network co-localization also identifies cross-species convergence for height/body length. This study advances a general paradigm for determining whether and how phenotypes measured in model species recapitulate human biology.
Affiliation(s)
- Sarah N Wright
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA 92093, USA
- Brittany S Leger
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA; Program in Biomedical Sciences, University of California San Diego, La Jolla, CA 93093, USA
- Sara Brin Rosenthal
  - Center for Computational Biology & Bioinformatics, Department of Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Sophie N Liu
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Tongqiu Jia
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Apurva S Chitre
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA
- Oksana Polesskaya
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA
- Katie Holl
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Jianjun Gao
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA
- Riyan Cheng
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA
- Angel Garcia Martinez
  - Department of Pharmacology, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Anthony George
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY 14203, USA
- Alexander F Gileta
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA; Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
- Wenyan Han
  - Department of Pharmacology, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Alesa H Netzley
  - Department of Psychiatry, University of Michigan, Ann Arbor, MI 48109, USA
- Christopher P King
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY 14203, USA; Department of Psychology, University at Buffalo, Buffalo, NY 14260, USA
- Connor Martin
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY 14203, USA; Department of Psychology, University at Buffalo, Buffalo, NY 14260, USA
- Tengfei Wang
  - Department of Pharmacology, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Hannah Bimschleger
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA
- Jerry Richards
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY 14203, USA
- Keita Ishiwari
  - Clinical and Research Institute on Addictions, University at Buffalo, Buffalo, NY 14203, USA; Department of Pharmacology and Toxicology, University at Buffalo, Buffalo, NY 14203, USA
- Hao Chen
  - Department of Pharmacology, University of Tennessee Health Science Center, Memphis, TN 38163, USA
- Shelly B Flagel
  - Department of Psychiatry, University of Michigan, Ann Arbor, MI 48109, USA; Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Paul Meyer
  - Department of Psychology, University at Buffalo, Buffalo, NY 14260, USA
- Terry E Robinson
  - Department of Psychology, University of Michigan, Ann Arbor, MI 48109, USA
- Leah C Solberg Woods
  - Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA
- Jason F Kreisberg
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Trey Ideker
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Abraham A Palmer
  - Department of Psychiatry, University of California San Diego, La Jolla, CA 93093, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA 92093, USA
11
COVID-GWAB: A Web-Based Prediction of COVID-19 Host Genes via Network Boosting of Genome-Wide Association Data. Biomolecules 2022; 12:biom12101446. [PMID: 36291657 PMCID: PMC9599684 DOI: 10.3390/biom12101446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Revised: 10/01/2022] [Accepted: 10/02/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Host genetics affect both the susceptibility and response to viral infection. To search for host genes that contribute to COVID-19, the Host Genetics Initiative (HGI) was formed to investigate the genetic factors involved in COVID-19 via genome-wide association studies (GWAS). GWAS suffer from limited statistical power, and in general only a few genes pass the conventional significance thresholds. This statistical limitation may be overcome by boosting weak association signals through integrating independent functional information such as molecular interactions. Additionally, the boosted results can be evaluated against various independent data for further connections to COVID-19. We present COVID-GWAB, a web-based tool that boosts original GWAS signals from COVID-19 patients by taking in the signals of their interactome neighbors. COVID-GWAB takes summary statistics from the COVID-19 HGI or user input data and reprioritizes candidate host genes for COVID-19 using HumanNet, a co-functional human gene network. The current version of COVID-GWAB provides the pre-processed data of releases 5, 6, and 7 of the HGI. Additionally, COVID-GWAB provides web interfaces for a summary of augmented GWAS signals, prediction evaluations by appearance frequency in COVID-19 literature, single-cell transcriptome data, and associated pathways. The web server also enables browsing the candidate gene networks.
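The idea of boosting a gene's association signal with the signals of its network neighbors can be sketched on toy data. The snippet below is only an illustration under stated assumptions: the scoring rule (own score plus a weighted sum of neighbor scores), the `alpha` parameter, the gene names, and the edge weights are all hypothetical and are not taken from COVID-GWAB or HumanNet.

```python
from typing import Dict, List, Tuple


def boost_scores(
    gene_scores: Dict[str, float],
    edges: List[Tuple[str, str, float]],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Add to each gene's score a weighted share of its neighbors' scores.

    Hypothetical rule: boosted = own + alpha * sum(edge_weight * neighbor_score).
    Neighbor contributions always use the original (unboosted) scores.
    """
    boosted = dict(gene_scores)
    for a, b, weight in edges:  # edges are treated as undirected
        boosted[a] = boosted.get(a, 0.0) + alpha * weight * gene_scores.get(b, 0.0)
        boosted[b] = boosted.get(b, 0.0) + alpha * weight * gene_scores.get(a, 0.0)
    return boosted


# Toy association scores (for example, -log10 p-values); made-up values.
scores = {"GENE_A": 6.0, "GENE_B": 1.0, "GENE_C": 2.0}
# GENE_B is weakly associated on its own but linked to the strong GENE_A.
edges = [("GENE_A", "GENE_B", 1.0)]

boosted = boost_scores(scores, edges)
ranked = sorted(boosted, key=boosted.get, reverse=True)
print(ranked)
```

In this toy case GENE_B moves ahead of GENE_C after boosting, which is the kind of reprioritization the abstract describes; a real tool would use a calibrated scoring scheme rather than this ad hoc sum.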
12
Wilson JL, Gravina A, Grimes K. From random to predictive: a context-specific interaction framework improves selection of drug protein-protein interactions for unknown drug pathways. Integr Biol (Camb) 2022; 14:13-24. [PMID: 35293584 DOI: 10.1093/intbio/zyac002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/12/2021] [Revised: 02/01/2022] [Accepted: 02/03/2022] [Indexed: 12/20/2022]
Abstract
With high drug attrition, protein-protein interaction (PPI) network models are attractive as efficient methods for predicting drug outcomes by analyzing proteins downstream of drug targets. Unfortunately, these methods tend to overpredict associations and have low precision; prediction performance is often no better than random (AUROC ~0.5). Typically, PPI models identify ranked phenotypes associated with downstream proteins, yet methods differ in how they prioritize downstream proteins. Most methods apply global approaches for assessing all phenotypes. We hypothesized that a per-phenotype analysis could improve prediction performance. We compared two global approaches (statistical and distance-based) and our novel per-phenotype approach, 'context-specific interaction' (CSI) analysis, on severe side effect prediction. We used a novel dataset of adverse events (or designated medical events, DMEs) and discovered that CSI had a 50% improvement over global approaches (AUROC 0.77 compared to 0.51), and a 76-95% improvement in average precision (0.499 compared to 0.284, 0.256). Our results provide a quantitative rationale for considering downstream proteins on a per-phenotype basis when using PPI network methods to predict drug phenotypes.
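The percentages quoted in this abstract can be checked with a few lines of arithmetic, assuming they are relative improvements over the baseline value (the abstract does not state the formula, so that reading is an assumption). The function name below is illustrative.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Percent improvement of `new` over `baseline`, relative to the baseline."""
    return (new - baseline) / baseline * 100.0


# AUROC: CSI (0.77) versus the global approaches (0.51).
auroc_gain = relative_improvement(0.77, 0.51)

# Average precision: CSI (0.499) versus the two global baselines (0.284, 0.256).
ap_gain_low = relative_improvement(0.499, 0.284)
ap_gain_high = relative_improvement(0.499, 0.256)

print(f"AUROC gain: {auroc_gain:.1f}%")
print(f"Average precision gain: {ap_gain_low:.1f}% to {ap_gain_high:.1f}%")
```

Under that reading the figures come out at roughly 51% for AUROC and roughly 76% to 95% for average precision, consistent with the rounded values in the abstract.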
Affiliation(s)
- Jennifer L Wilson
  - Department of Bioengineering, University of California Los Angeles, Los Angeles, CA, USA
- Alessio Gravina
  - Department of Computer Science, University of Pisa, Pisa, Italy
- Kevin Grimes
  - Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
13
Zhang S, Yang X, Si S, Zhang J. The neurobiological basis of divergent thinking: Insight from gene co-expression network-based analysis. Neuroimage 2021; 245:118762. [PMID: 34838948 DOI: 10.1016/j.neuroimage.2021.118762] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/02/2021] [Revised: 10/25/2021] [Accepted: 11/23/2021] [Indexed: 11/30/2022] [Open Access]
Abstract
Although many efforts have been made to explore the genetic basis of divergent thinking (DT), there is still a gap in the understanding of how these findings relate to the neurobiology of DT. In a combined sample of 1,682 Chinese participants, by integrating GWAS with previously identified brain-specific gene co-expression network modules, this study explored for the first time the functional brain-specific gene co-expression networks underlying DT. The results showed that gene co-expression network modules in the anterior cingulate cortex, caudate, amygdala and substantia nigra were enriched with DT association signals. Further functional enrichment analysis showed that these DT-related gene co-expression network modules were enriched for key biological processes and cellular components related to myelination, suggesting that cortical and sub-cortical grey matter myelination may serve as an important neurobiological basis of DT. Although the underlying mechanisms need to be further refined, this exploratory study may provide new insight into the neurobiology of DT.
Affiliation(s)
- Shun Zhang
  - Department of Psychology, Shandong Normal University, No. 88 East Wenhua Road, Jinan 250014, China
- Xiaolei Yang
  - College of Life Science, Qilu Normal University, Jinan, China
- Si Si
  - Department of Psychology, Shandong Normal University, No. 88 East Wenhua Road, Jinan 250014, China
- Jinghuan Zhang
  - Department of Psychology, Shandong Normal University, No. 88 East Wenhua Road, Jinan 250014, China
14
Meta-analysis of genome-wide association studies and gene networks analysis for milk production traits in Holstein cows. Livest Sci 2021. [DOI: 10.1016/j.livsci.2021.104605] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022]
15
Jia P, Manuel AM, Fernandes BS, Dai Y, Zhao Z. Distinct effect of prenatal and postnatal brain expression across 20 brain disorders and anthropometric social traits: a systematic study of spatiotemporal modularity. Brief Bioinform 2021; 22:6291943. [PMID: 34086851 DOI: 10.1093/bib/bbab214] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/18/2021] [Revised: 04/30/2021] [Accepted: 05/15/2021] [Indexed: 02/06/2023] [Open Access]
Abstract
Different spatiotemporal abnormalities have been implicated in different neuropsychiatric disorders and anthropometric social traits, yet an investigation of temporal network modularity using brain tissue transcriptomics has been lacking. We developed a supervised network approach to investigate genome-wide association study (GWAS) results in spatial and temporal contexts and demonstrated it in 20 brain disorders and anthropometric social traits. BrainSpan transcriptome profiles were used to discover significant modules enriched with trait susceptibility genes in a developmental stage-stratified manner. We investigated whether, and in which developmental stages, GWAS-implicated genes are coordinately expressed in the brain transcriptome. We identified significant network modules for each disorder and trait at different developmental stages, providing a systematic view of network modularity at specific developmental stages for a myriad of brain disorders and traits. Specifically, we observed a strong pattern of fetal origin for most psychiatric disorders and traits [such as schizophrenia (SCZ), bipolar disorder, obsessive-compulsive disorder and neuroticism], whereas increased co-expression activities of genes were more strongly associated with neurological diseases [such as Alzheimer's disease (AD) and amyotrophic lateral sclerosis] and anthropometric traits (such as college completion, education and subjective well-being) in postnatal brains. Further analyses revealed enriched cell types and functional features that supported and corroborated prior knowledge in specific brain disorders, such as clathrin-mediated endocytosis in AD, myelin sheath in multiple sclerosis and regulation of synaptic plasticity in both college completion and education. Our study provides a landscape view of the spatiotemporal features in a myriad of brain-related disorders and traits.
Affiliation(s)
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Astrid M Manuel
  - Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Brisa S Fernandes
  - Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Yulin Dai
  - Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030, USA
16
Zhang P, Cobat A, Lee YS, Wu Y, Bayrak CS, Boccon-Gibod C, Matuozzo D, Lorenzo L, Jain A, Boucherit S, Vallée L, Stüve B, Chabrier S, Casanova JL, Abel L, Zhang SY, Itan Y. A computational approach for detecting physiological homogeneity in the midst of genetic heterogeneity. Am J Hum Genet 2021; 108:1012-1025. [PMID: 34015270 DOI: 10.1016/j.ajhg.2021.04.023] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/10/2021] [Accepted: 04/28/2021] [Indexed: 02/07/2023] [Open Access]
Abstract
The human genetic dissection of clinical phenotypes is complicated by genetic heterogeneity. Gene burden approaches that detect genetic signals in case-control studies are underpowered in genetically heterogeneous cohorts. We therefore developed a genome-wide computational method, network-based heterogeneity clustering (NHC), to detect physiological homogeneity in the midst of genetic heterogeneity. Simulation studies showed our method to be capable of systematically converging genes in biological proximity on the background biological interaction network, and capturing gene clusters harboring presumably deleterious variants, in an efficient and unbiased manner. We applied NHC to whole-exome sequencing data from a cohort of 122 individuals with herpes simplex encephalitis (HSE), including 13 individuals with previously published monogenic inborn errors of TLR3-dependent IFN-α/β immunity. The top gene cluster identified by our approach successfully detected and prioritized all causal variants of five TLR3 pathway genes in the 13 previously reported individuals. This approach also suggested candidate variants of three reported genes and four candidate genes from the same pathway in another ten previously unstudied individuals. TLR3 responsiveness was impaired in dermal fibroblasts from four of the five individuals tested, suggesting that the variants detected were causal for HSE. NHC is, therefore, an effective and unbiased approach for unraveling genetic heterogeneity by detecting physiological homogeneity.
Affiliation(s)
- Peng Zhang
  - St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
- Aurélie Cobat
  - Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France; University of Paris, Imagine Institute, Paris 75015, France
- Yoon-Seung Lee
  - St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
- Yiming Wu
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Cigdem Sevim Bayrak
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Clémentine Boccon-Gibod
  - St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA
- Daniela Matuozzo
  - Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France; University of Paris, Imagine Institute, Paris 75015, France
- Lazaro Lorenzo
  - Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France; University of Paris, Imagine Institute, Paris 75015, France
- Aayushee Jain
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Soraya Boucherit
  - Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France; University of Paris, Imagine Institute, Paris 75015, France
- Louis Vallée
  - Neuropediatric Department, Roger Salengro Hospital, Lille 59037, France
- Burkhard Stüve
  - Clinics of the City of Cologne gGmbH, Cologne 53323, Germany
- Stéphane Chabrier
  - CHU Saint-Étienne, French Centre for Pediatric Stroke, Saint-Étienne, France
- Jean-Laurent Casanova
  - St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA; Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France; University of Paris, Imagine Institute, Paris 75015, France; Howard Hughes Medical Institute, New York, NY 10065, USA
- Laurent Abel
  - St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA; Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France; University of Paris, Imagine Institute, Paris 75015, France
- Shen-Ying Zhang
  - St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA; Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, Paris 75015, France; University of Paris, Imagine Institute, Paris 75015, France
- Yuval Itan
  - St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA; The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
17
Yuan J, Chen F, Fan D, Jiang Q, Xue Z, Zhang J, Yu X, Li K, Qu J, Su J. EyeDiseases: an integrated resource for dedicating to genetic variants, gene expression and epigenetic factors of human eye diseases. NAR Genom Bioinform 2021; 3:lqab050. [PMID: 34085038 PMCID: PMC8168129 DOI: 10.1093/nargab/lqab050] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/20/2020] [Revised: 04/22/2021] [Accepted: 05/19/2021] [Indexed: 02/06/2023] [Open Access]
Abstract
Eye diseases are remarkably common and encompass a large and diverse range of morbidities that affect different components of the visual system and visual function. With advances in omics technologies for eye disorders, genome-scale datasets have rapidly accumulated in the fields of genetics and epigenetics. However, efficient collection and comprehensive analysis of these different kinds of omics data are lacking. Herein, we developed EyeDiseases (https://eyediseases.bio-data.cn/), the first database for multi-omics data integration and interpretation of human eye diseases. It contains 1344 disease-associated genes with genetic variation, 1774 transcription files of bulk cell expression and single-cell RNA-seq, and 105 epigenomics datasets across 185 kinds of human eye diseases. Using EyeDiseases, we investigated the potential tropism of SARS-CoV-2 in eye infection and found that the SARS-CoV-2 entry factors ACE2 and TMPRSS2 are highly correlated with cornea and keratoconus, suggesting that ocular surface cells are susceptible to infection by SARS-CoV-2. Additionally, integrated analysis of age-related macular degeneration (AMD) GWAS loci and co-expression data revealed 9 associated genes involved in the HIF-1 signaling pathway and the voltage-gated potassium channel complex. EyeDiseases provides a valuable resource for accelerating the discovery and validation of candidate loci and genes that contribute to the molecular diagnosis and therapeutic vulnerabilities of various eye diseases.
Affiliation(s)
- Jian Yuan
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
- Fukun Chen
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
- Dandan Fan
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
- Qi Jiang
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
- Zhengbo Xue
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
- Ji Zhang
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
- Xiangyi Yu
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
- Kai Li
  - Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325011, Zhejiang, China
- Jia Qu
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
- Jianzhong Su
  - School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, China
  - National Clinical Research Center for Ocular Disease, Wenzhou 325027, China
  - Institute of Biomedical Big Data, Wenzhou Medical University, Wenzhou 325027, China
18
Silberstein M, Nesbit N, Cai J, Lee PH. Pathway analysis for genome-wide genetic variation data: Analytic principles, latest developments, and new opportunities. J Genet Genomics 2021; 48:173-183. [PMID: 33896739 PMCID: PMC8286309 DOI: 10.1016/j.jgg.2021.01.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/04/2020] [Revised: 01/24/2021] [Accepted: 01/25/2021] [Indexed: 12/23/2022]
Abstract
Pathway analysis, also known as gene-set enrichment analysis, is a multilocus analytic strategy that integrates a priori, biological knowledge into the statistical analysis of high-throughput genetics data. Originally developed for the studies of gene expression data, it has become a powerful analytic procedure for in-depth mining of genome-wide genetic variation data. Astonishing discoveries were made in the past years, uncovering genes and biological mechanisms underlying common and complex disorders. However, as massive amounts of diverse functional genomics data accrue, there is a pressing need for newer generations of pathway analysis methods that can utilize multiple layers of high-throughput genomics data. In this review, we provide an intellectual foundation of this powerful analytic strategy, as well as an update of the state-of-the-art in recent method developments. The goal of this review is threefold: (1) introduce the motivation and basic steps of pathway analysis for genome-wide genetic variation data; (2) review the merits and the shortcomings of classic and newly emerging integrative pathway analysis tools; and (3) discuss remaining challenges and future directions for further method developments.
Affiliation(s)
- Micah Silberstein
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Nicholas Nesbit
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Jacquelyn Cai
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Phil H Lee
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Psychiatry, Harvard Medical School, Boston, MA 02115, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
|
19
|
Zhu H, Shang L, Zhou X. A Review of Statistical Methods for Identifying Trait-Relevant Tissues and Cell Types. Front Genet 2021; 11:587887. [PMID: 33584792 PMCID: PMC7874162 DOI: 10.3389/fgene.2020.587887] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/27/2020] [Accepted: 12/30/2020] [Indexed: 11/17/2022] Open
Abstract
Genome-wide association studies (GWASs) have identified and replicated many genetic variants that are associated with diseases and disease-related complex traits. However, the biological mechanisms underlying these identified associations remain largely elusive. Exploring the biological mechanisms underlying these associations requires identifying trait-relevant tissues and cell types, as genetic variants likely influence complex traits in a tissue- and cell type-specific manner. Recently, several statistical methods have been developed to integrate genomic data with GWASs for identifying trait-relevant tissues and cell types. These methods often rely on different genomic information and use different statistical models for trait-tissue relevance inference. Here, we present a comprehensive technical review to summarize ten existing methods for trait-tissue relevance inference. These methods make use of different genomic information that include functional annotation information, expression quantitative trait loci information, genetically regulated gene expression information, as well as gene co-expression network information. These methods also use different statistical models that range from linear mixed models to covariance network models. We hope that this review can serve as a useful reference both for methodologists who develop methods and for applied analysts who apply these methods for identifying trait relevant tissues and cell types.
Affiliation(s)
- Huanhuan Zhu
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
| | - Lulu Shang
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, United States
|
20
|
Biswas S, Pal S, Majumder PP, Bhattacharjee S. A framework for pathway knowledge driven prioritization in genome-wide association studies. Genet Epidemiol 2020; 44:841-853. [PMID: 32779262 PMCID: PMC7116354 DOI: 10.1002/gepi.22345] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/07/2020] [Revised: 06/18/2020] [Accepted: 07/10/2020] [Indexed: 12/27/2022]
Abstract
Many variants with low frequencies or with low to modest effects likely remain unidentified in genome-wide association studies (GWAS) because of stringent genome-wide thresholds for detection. To improve the power of detection, variant prioritization based on functional annotations and epigenetic landmarks has been used successfully. Here, we propose a novel method of prioritization in GWAS that exploits gene-level knowledge (e.g., annotations to pathways and ontologies) and show that it further improves power. Often, disease-associated variants are found near genes that are co-involved in specific biological pathways relevant to the disease process. Using this knowledge to conduct a prioritized scan increases the power to detect loci that map to genes clustered in a few specific pathways. We have developed a computationally scalable framework based on penalized logistic regression (termed GKnowMTest: Genomic Knowledge-guided Multiple Testing) to enable a prioritized pathway-guided GWAS scan with a very large number of gene-level annotations. We demonstrate that the proposed strategy improves overall power and maintains the Type 1 error globally. Our method works on genome-wide summary-level data and a user-specified list of pathways (e.g., those extracted from large pathway databases without reference to the biology of a specific disease). It automatically reweights the input p values by incorporating the pathway enrichments as "adaptively learned" from the data, using a cross-validation technique to avoid overfitting. We used whole-genome simulations and some publicly available GWAS data sets to illustrate the application of our method. The GKnowMTest framework has been implemented as a user-friendly open-source R package.
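The idea of reweighting p values described in this abstract can be illustrated with a generic weighted Bonferroni rule. This is a minimal sketch and not the GKnowMTest algorithm: the weights below are fixed by hand instead of being learned by penalized logistic regression with cross-validation, and the function name, p values, and weights are all hypothetical.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Return the indices of tests rejected by a weighted Bonferroni rule.

    Assumption: the weights are chosen without looking at the p-values.
    They are rescaled to average 1, which keeps the family-wise error
    rate at alpha while shifting power toward up-weighted tests.
    """
    m = len(p_values)
    if m == 0 or m != len(weights):
        raise ValueError("p_values and weights must be non-empty and equal in length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")
    mean_weight = sum(weights) / m
    scaled = [w / mean_weight for w in weights]
    # Test i is rejected when p_i / w_i <= alpha / m.
    return [i for i, (p, w) in enumerate(zip(p_values, scaled)) if p / w <= alpha / m]


# Hypothetical p-values for four variants; the second one is assumed to
# lie near a gene in a prioritized pathway and therefore gets more weight.
p_values = [0.004, 0.02, 0.03, 0.5]
unweighted = weighted_bonferroni(p_values, [1, 1, 1, 1])
prioritized = weighted_bonferroni(p_values, [1, 3, 1, 1])
print("unweighted rejections:", unweighted)
print("prioritized rejections:", prioritized)
```

With equal weights the rule reduces to the ordinary Bonferroni correction. Up-weighting one test lowers the effective threshold for the others, which is the power trade-off any prioritization scheme of this kind has to manage.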
Affiliation(s)
| | - Soumen Pal
- National Institute of Biomedical Genomics, Kalyani, India
|
21
|
Genetic Basis of Maize Resistance to Multiple Insect Pests: Integrated Genome-Wide Comparative Mapping and Candidate Gene Prioritization. Genes (Basel) 2020; 11:genes11060689. [PMID: 32599710 PMCID: PMC7349181 DOI: 10.3390/genes11060689] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/19/2020] [Revised: 05/30/2020] [Accepted: 06/01/2020] [Indexed: 01/01/2023] Open
Abstract
Several species of herbivores feed on maize in field and storage setups, making the development of multiple insect resistance a critical breeding target. In this study, an association mapping panel of 341 tropical maize lines was evaluated in three field environments for resistance to fall armyworm (FAW), whilst bulked grains were subjected to a maize weevil (MW) bioassay and genotyped with Diversity Array Technology's single nucleotide polymorphism (SNP) markers. A multi-locus genome-wide association study (GWAS) revealed 62 quantitative trait nucleotides (QTNs) associated with FAW and MW resistance traits on all 10 maize chromosomes, of which 47 and 31 were discovered at stringent Bonferroni genome-wide significance levels of 0.05 and 0.01, respectively, and located within or close to multiple insect resistance genomic regions (MIRGRs) concerning FAW, SB, and MW. Sixteen QTNs influenced multiple traits, of which six were associated with resistance to both FAW and MW, suggesting a pleiotropic genetic control. Functional prioritization of candidate genes (CGs) located within 10-30 kb of the QTNs revealed 64 putative GWAS-based CGs (GbCGs) showing evidence of involvement in plant defense mechanisms. Only one GbCG was associated with each of five of the six combined resistance QTNs, thus reinforcing the pleiotropy hypothesis. In addition, through in silico co-functional network inferences, an additional 107 network-based CGs (NbCGs), biologically connected to the 64 GbCGs and differentially expressed under biotic or abiotic stress, were revealed within MIRGRs. The provided multiple insect resistance physical map should contribute to the development of combined insect resistance in maize.
|
22
|
Åkerborg Ö, Spalinskas R, Pradhananga S, Anil A, Höjer P, Poujade FA, Folkersen L, Eriksson PP, Sahlén P. High-Resolution Regulatory Maps Connect Vascular Risk Variants to Disease-Related Pathways. CIRCULATION-GENOMIC AND PRECISION MEDICINE 2020; 12:e002353. [PMID: 30786239 PMCID: PMC8104016 DOI: 10.1161/circgen.118.002353] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/23/2022]
Abstract
The genetic variant landscape of coronary artery disease is dominated by noncoding variants, many of which occur within putative enhancers regulating the expression levels of relevant genes. It is crucial to assign the genetic variants to their correct genes, both to gain insights into perturbed functions and to better assess the risk of disease.
Affiliation(s)
- Örjan Åkerborg
- Science for Life Laboratory, Division of Gene Technology, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Solna, Sweden (Ö.Å., R.S., S.P., A.A., P.H., P.S.)
| | - Rapolas Spalinskas
- Science for Life Laboratory, Division of Gene Technology, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Solna, Sweden (Ö.Å., R.S., S.P., A.A., P.H., P.S.)
| | - Sailendra Pradhananga
- Science for Life Laboratory, Division of Gene Technology, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Solna, Sweden (Ö.Å., R.S., S.P., A.A., P.H., P.S.)
| | - Anandashankar Anil
- Science for Life Laboratory, Division of Gene Technology, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Solna, Sweden (Ö.Å., R.S., S.P., A.A., P.H., P.S.)
| | - Pontus Höjer
- Science for Life Laboratory, Division of Gene Technology, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Solna, Sweden (Ö.Å., R.S., S.P., A.A., P.H., P.S.)
| | - Flore-Anne Poujade
- Cardiovascular Medicine Unit, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden (F.-A.P., P.E.)
| | - Lasse Folkersen
- Department of Bioinformatics, Technical University of Denmark, Copenhagen, Denmark (L.F.)
| | - Professor Per Eriksson
- Cardiovascular Medicine Unit, Department of Medicine, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden (F.-A.P., P.E.)
| | - Pelin Sahlén
- Science for Life Laboratory, Division of Gene Technology, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, Solna, Sweden (Ö.Å., R.S., S.P., A.A., P.H., P.S.)
|
23
|
Wu Y, Li X, Liu J, Luo XJ, Yao YG. SZDB2.0: an updated comprehensive resource for schizophrenia research. Hum Genet 2020; 139:1285-1297. [DOI: 10.1007/s00439-020-02171-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/01/2020] [Accepted: 04/25/2020] [Indexed: 12/11/2022]
|
24
|
Zheng Q, Ma Y, Chen S, Che Q, Chen D. The Integrated Landscape of Biological Candidate Causal Genes in Coronary Artery Disease. Front Genet 2020; 11:320. [PMID: 32373157 PMCID: PMC7186505 DOI: 10.3389/fgene.2020.00320] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/07/2019] [Accepted: 03/18/2020] [Indexed: 12/27/2022] Open
Abstract
Background: Genome-wide association studies (GWASs) have identified more than 150 genetic loci that demonstrate robust association with coronary artery disease (CAD). In contrast to the success of GWAS, the translation from statistical signals to biological mechanism and exploration of causal genes for drug development remain difficult, owing to the complexity of gene regulatory and linkage disequilibrium patterns. We aim to prioritize the plausible causal genes for CAD at a genome-wide level. Methods: We integrated the latest GWAS summary statistics with other omics data from different layers and utilized eight different computational methods to predict CAD potential causal genes. The prioritized candidate genes were further characterized by pathway enrichment analysis, tissue-specific expression analysis, and pathway crosstalk analysis. Results: Our analysis identified 55 high-confidence causal genes for CAD, among which 15 genes (LPL, COL4A2, PLG, CDKN2B, COL4A1, FES, FLT1, FN1, IL6R, LPA, PCSK9, PSRC1, SMAD3, SWAP70, and VAMP8) ranked the highest priority because of consistent evidence from different data-driven approaches. GO analysis showed that these plausible causal genes were enriched in lipid metabolic and extracellular regions. Tissue-specific enrichment analysis revealed that these genes were significantly overexpressed in adipose and liver tissues. Further, KEGG and crosstalk analysis also revealed several key pathways involved in the pathogenesis of CAD. Conclusion: Our study delineated the landscape of CAD potential causal genes and highlighted several biological processes involved in CAD pathogenesis. Further studies and experimental validations of these genes may shed light on mechanistic insights into CAD development and provide potential drug targets for future therapeutics.
Affiliation(s)
- Qiwen Zheng
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
| | - Yujia Ma
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
| | - Si Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
| | - Qianzi Che
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
| | - Dafang Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing, China
|
25
|
Manuel AM, Dai Y, Freeman LA, Jia P, Zhao Z. Dense module searching for gene networks associated with multiple sclerosis. BMC Med Genomics 2020; 13:48. [PMID: 32241259 PMCID: PMC7118851 DOI: 10.1186/s12920-020-0674-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Multiple sclerosis (MS) is a complex disease in which the immune system attacks the central nervous system. The molecular mechanisms contributing to the etiology of MS remain poorly understood. Genome-wide association studies (GWAS) of MS have identified a small number of genetic loci significant at the genome level, but they are mainly non-coding variants. Network-assisted analysis may help better interpret the functional roles of the variants with association signals and potential translational medicine application. The Dense Module Searching of GWAS tool (dmGWAS version 2.4) developed in our team is applied to 2 MS GWAS datasets (GeneMSA and IMSGC GWAS) using the human protein interactome as the reference network. A dual evaluation strategy is used to generate results with reproducibility. RESULTS: Approximately 7500 significant network modules were identified for each independent GWAS dataset, and 20 significant modules were identified from the dual evaluation. The top modules included GRB2, HDAC1, JAK2, MAPK1, and STAT3 as central genes. Top module genes were enriched with functional terms such as "regulation of glial cell differentiation" (adjusted p-value = 2.58 × 10⁻³), "T-cell costimulation" (adjusted p-value = 2.11 × 10⁻⁶) and "virus receptor activity" (adjusted p-value = 1.67 × 10⁻³). Interestingly, top gene networks included several FDA-approved MS drug target genes: HDAC1, IL2RA, KEAP1, and RELA. CONCLUSIONS: Our dmGWAS network analyses highlighted several genes (GRB2, HDAC1, IL2RA, JAK2, KEAP1, MAPK1, RELA and STAT3) in top modules that are promising to interpret GWAS signals and link to MS drug targets. The genes enriched with glial cell differentiation are important for understanding neurodegenerative processes in MS and for remyelination therapy investigation. Importantly, our identified genetic signals enriched in T cell costimulation and viral receptor activity supported the viral infection onset hypothesis for MS.
Affiliation(s)
- Astrid M. Manuel
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
| | - Yulin Dai
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
| | - Leorah A. Freeman
- Department of Neurology, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
| | - Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, 7000 Fannin St. Suite 600, Houston, TX 77030 USA
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030 USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
|
26
|
Shang L, Smith JA, Zhou X. Leveraging gene co-expression patterns to infer trait-relevant tissues in genome-wide association studies. PLoS Genet 2020; 16:e1008734. [PMID: 32310941 PMCID: PMC7192514 DOI: 10.1371/journal.pgen.1008734] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 07/17/2019] [Revised: 04/30/2020] [Accepted: 03/24/2020] [Indexed: 12/11/2022] Open
Abstract
Genome-wide association studies (GWASs) have identified many SNPs associated with various common diseases. Understanding the biological functions of these identified SNP associations requires identifying disease/trait relevant tissues or cell types. Here, we develop a network method, CoCoNet, to facilitate the identification of trait-relevant tissues or cell types. Different from existing approaches, CoCoNet incorporates tissue-specific gene co-expression networks constructed from either bulk or single cell RNA sequencing (RNAseq) studies with GWAS data for trait-tissue inference. In particular, CoCoNet relies on a covariance regression network model to express gene-level effect measurements for the given GWAS trait as a function of the tissue-specific co-expression adjacency matrix. With a composite likelihood-based inference algorithm, CoCoNet is scalable to tens of thousands of genes. We validate the performance of CoCoNet through extensive simulations. We apply CoCoNet for an in-depth analysis of four neurological disorders and four autoimmune diseases, where we integrate the corresponding GWASs with bulk RNAseq data from 38 tissues and single cell RNAseq data from 10 cell types. In the real data applications, we show how CoCoNet can help identify specific glial cell types relevant for neurological disorders and identify disease-targeted colon tissues as relevant for autoimmune diseases.
Affiliation(s)
- Lulu Shang
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States of America
| | - Jennifer A. Smith
- Department of Epidemiology, University of Michigan, Ann Arbor, MI, United States of America
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, United States of America
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States of America
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, United States of America
|
27
|
Kliebenstein DJ. Using networks to identify and interpret natural variation. CURRENT OPINION IN PLANT BIOLOGY 2020; 54:122-126. [PMID: 32413801 DOI: 10.1016/j.pbi.2020.04.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/06/2019] [Accepted: 04/20/2020] [Indexed: 06/11/2023]
Abstract
Studies on natural variation and network biology inherently work to summarize vast amounts of information and data. The combination of these two areas of study, while creating datasets of immense complexity, is critical to their mutual progress. Networks are necessary as a way to reduce the dimensionality inherent in natural variation with 100s to 1000s of genotypes. Correspondingly, natural variation is essential for testing how networks may or may not be shared across individuals or species. Advances in this area of cross-fertilization include using networks directly as phenotypes and using networks to help in prioritizing candidate gene validation efforts. Interesting new observations on frequent presence-absence variation in gene content and adaptation are beginning to highlight the potential for natural variation in network presence-absence. This review attempts to delve into these new insights.
Affiliation(s)
- Daniel J Kliebenstein
- Department of Plant Sciences, University of California, Davis, One Shields Avenue, Davis, CA, 95616, USA; DynaMo Center of Excellence, University of Copenhagen, Thorvaldsensvej 40, DK-1871, Frederiksberg C, Denmark.
|
28
|
Leal LG, David A, Jarvelin MR, Sebert S, Männikkö M, Karhunen V, Seaby E, Hoggart C, Sternberg MJE. Identification of disease-associated loci using machine learning for genotype and network data integration. Bioinformatics 2019; 35:5182-5190. [PMID: 31070705 PMCID: PMC6954643 DOI: 10.1093/bioinformatics/btz310] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/22/2018] [Revised: 03/28/2019] [Accepted: 04/25/2019] [Indexed: 01/19/2023] Open
Abstract
MOTIVATION: Integration of different omics data could markedly help to identify biological signatures, understand the missing heritability of complex diseases and ultimately achieve personalized medicine. Standard regression models used in Genome-Wide Association Studies (GWAS) identify loci with a strong effect size, whereas GWAS meta-analyses are often needed to capture weak loci contributing to the missing heritability. Development of novel machine learning algorithms for merging genotype data with other omics data is highly needed as it could enhance the prioritization of weak loci. RESULTS: We developed cNMTF (corrected non-negative matrix tri-factorization), an integrative algorithm based on clustering techniques of biological data. This method assesses the inter-relatedness between genotypes, phenotypes, the damaging effect of the variants and gene networks in order to identify loci-trait associations. cNMTF was used to prioritize genes associated with lipid traits in two population cohorts. We replicated 129 genes reported in GWAS world-wide and provided evidence that supports 85% of our findings (226 out of 265 genes), including recent associations in literature (NLGN1), regulators of lipid metabolism (DAB1) and pleiotropic genes for lipid traits (CARM1). Moreover, cNMTF performed efficiently against strong population structures by accounting for the individuals' ancestry. As the method is flexible in the incorporation of diverse omics data sources, it can be easily adapted to the user's research needs. AVAILABILITY AND IMPLEMENTATION: An R package (cnmtf) is available at https://lgl15.github.io/cnmtf_web/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Luis G Leal
- Department of Life Sciences, Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London SW7 2AZ, UK
| | - Alessia David
- Department of Life Sciences, Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London SW7 2AZ, UK
| | - Marjo-Riita Jarvelin
- Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu FI-90014, Finland
- Biocenter Oulu, University of Oulu, Oulu 90220, Finland
- Unit of Primary Health Care, Oulu University Hospital, Oulu 90220, Finland
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London W2 1PG, UK
- Department of Life Sciences, College of Health and Life Sciences, Brunel University London, Middlesex UB8 3PH, UK
| | - Sylvain Sebert
- Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu FI-90014, Finland
- Biocenter Oulu, University of Oulu, Oulu 90220, Finland
| | - Minna Männikkö
- Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu FI-90014, Finland
| | - Ville Karhunen
- Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu FI-90014, Finland
- Biocenter Oulu, University of Oulu, Oulu 90220, Finland
- Unit of Primary Health Care, Oulu University Hospital, Oulu 90220, Finland
- Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London W2 1PG, UK
- Department of Life Sciences, College of Health and Life Sciences, Brunel University London, Middlesex UB8 3PH, UK
| | - Eleanor Seaby
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Clive Hoggart
- Department of Medicine, Imperial College London, London W2 1PG, UK
| | - Michael J E Sternberg
- Department of Life Sciences, Centre for Integrative Systems Biology and Bioinformatics, Imperial College London, London SW7 2AZ, UK
|
29
|
Picart-Armada S, Barrett SJ, Willé DR, Perera-Lluna A, Gutteridge A, Dessailly BH. Benchmarking network propagation methods for disease gene identification. PLoS Comput Biol 2019; 15:e1007276. [PMID: 31479437 PMCID: PMC6743778 DOI: 10.1371/journal.pcbi.1007276] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 01/24/2019] [Revised: 09/13/2019] [Accepted: 07/16/2019] [Indexed: 12/17/2022] Open
Abstract
In-silico identification of potential target genes for disease is an essential aspect of drug target discovery. Recent studies suggest that successful targets can be found by leveraging genetic, genomic and protein interaction information. Here, we systematically tested the ability of 12 varied algorithms, based on network propagation, to identify genes that have been targeted by any drug, on gene-disease data from 22 common non-cancerous diseases in OpenTargets. We considered two biological networks and six performance metrics, and compared two types of input gene-disease association scores. The impact of the design factors on performance was quantified through additive explanatory models. Standard cross-validation led to over-optimistic performance estimates due to the presence of protein complexes. In order to obtain realistic estimates, we introduced two novel protein complex-aware cross-validation schemes. When seeding biological networks with known drug targets, machine learning and diffusion-based methods found around 2-4 true targets within the top 20 suggestions. Seeding the networks with genes genetically associated with disease decreased performance below 1 true hit on average. The use of a larger network, although noisier, improved overall performance. We conclude that diffusion-based prioritisers and machine learning applied to diffusion-based features are suited for drug discovery in practice and improve over simpler neighbour-voting methods. We also demonstrate the large impact of choosing an adequate validation strategy and the definition of seed disease genes.
The use of biological network data has proven its effectiveness in many areas of computational biology. Networks consist of nodes, usually genes or proteins, and edges that connect pairs of nodes, representing information such as physical interactions, regulatory roles or co-occurrence. In order to find new candidate nodes for a given biological property, the so-called network propagation algorithms start from the set of known nodes with that property and leverage the connections from the biological network to make predictions. Here, we assess the performance of several network propagation algorithms to find sensible gene targets for 22 common non-cancerous diseases, i.e. those that have been found promising enough to start clinical trials with any compound. We focus on obtaining performance metrics that reflect a practical scenario in drug development where only a small set of genes can be assayed. We found that the presence of protein complexes biased the performance estimates, leading to over-optimistic conclusions, and introduced two novel strategies to address it. Our results support that network propagation is still a viable approach to find drug targets, but that special care needs to be taken with the validation strategy. Algorithms benefitted from the use of a larger (although noisier) network and of direct evidence data, rather than indirect genetic associations to disease.
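The general propagation idea described above (start from known seed nodes and spread scores along network edges) can be sketched with a random walk with restart. This is only an illustration under stated assumptions: it is one common diffusion scheme and is not claimed to be any of the 12 algorithms benchmarked in the paper, and the toy network, node names, and restart probability are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping each node to the set of its neighbours
               (undirected, so each edge appears in both directions).
    seeds:     non-empty collection of nodes known to have the property.
    """
    seeds = set(seeds)
    if not seeds or not seeds <= set(adjacency):
        raise ValueError("seeds must be a non-empty subset of the network nodes")
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # An isolated node keeps its own mass.
                updated[n] += (1.0 - restart) * scores[n]
                continue
            share = (1.0 - restart) * scores[n] / len(neighbours)
            for nb in neighbours:
                updated[nb] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical toy network: a chain A - B - C - D - E with A as the only seed.
network = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
scores = random_walk_with_restart(network, seeds={"A"})
ranked = sorted((n for n in network if n != "A"), key=lambda n: scores[n], reverse=True)
print(ranked)
```

Candidates are ranked by their steady-state score, so nodes closer to the seeds come first. In a realistic setting the choice of seeds and of the validation scheme matters as much as the propagation rule, which is the point the abstract stresses.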
Affiliation(s)
- Sergio Picart-Armada
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Networking Biomedical Research Centre in the subject area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Madrid, Spain
- Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Spain
| | | | | | - Alexandre Perera-Lluna
- B2SLab, Departament d’Enginyeria de Sistemes, Automàtica i Informàtica Industrial, Universitat Politècnica de Catalunya, CIBER-BBN, Barcelona, Spain
- Networking Biomedical Research Centre in the subject area of Bioengineering, Biomaterials and Nanomedicine (CIBER-BBN), Madrid, Spain
- Institut de Recerca Pediàtrica Hospital Sant Joan de Déu, Esplugues de Llobregat, Spain
| | - Alex Gutteridge
- Computational Biology and Statistics, GSK, Stevenage, United Kingdom
|
30
|
Marshall-Colón A, Kliebenstein DJ. Plant Networks as Traits and Hypotheses: Moving Beyond Description. Trends in Plant Science 2019; 24:840-852. [PMID: 31300195 DOI: 10.1016/j.tplants.2019.06.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 03/19/2019] [Revised: 05/31/2019] [Accepted: 06/04/2019] [Indexed: 05/04/2023]
Abstract
Biology relies on the central thesis that the genes in an organism encode molecular mechanisms that combine with stimuli and raw materials from the environment to create a final phenotypic expression representative of the genomic programming. While conceptually simple, the genotype-to-phenotype linkage in a eukaryotic organism relies on the interactions of thousands of genes and an environment with a potentially unknowable level of complexity. Modern biology has moved to the use of networks in systems biology to try to simplify this complexity to decode how an organism's genome works. Previously, biological networks were basic ways to organize, simplify, and analyze data. However, recent advances are allowing networks to move beyond description and become phenotypes or hypotheses in their own right. This review discusses these efforts, like mapping responses across biological scales, including relationships among cellular entities, and the direct use of networks as traits or hypotheses.
Affiliation(s)
- Amy Marshall-Colón
- Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Daniel J Kliebenstein
- Department of Plant Sciences, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA; DynaMo Center of Excellence, University of Copenhagen, Thorvaldsensvej 40, DK-1871 Frederiksberg C, Denmark.
|
31
|
Fine RS, Pers TH, Amariuta T, Raychaudhuri S, Hirschhorn JN. Benchmarker: An Unbiased, Association-Data-Driven Strategy to Evaluate Gene Prioritization Algorithms. Am J Hum Genet 2019; 104:1025-1039. [PMID: 31056107 PMCID: PMC6556976 DOI: 10.1016/j.ajhg.2019.03.027] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/14/2018] [Accepted: 03/28/2019] [Indexed: 01/17/2023] Open
Abstract
Genome-wide association studies (GWASs) are valuable for understanding human biology, but associated loci typically contain multiple associated variants and genes. Thus, algorithms that prioritize likely causal genes and variants for a given phenotype can provide biological interpretations of association data. However, a critical, currently missing capability is to objectively compare performance of such algorithms. Typical comparisons rely on "gold standard" genes harboring causal coding variants, but such gold standards may be biased and incomplete. To address this issue, we developed Benchmarker, an unbiased, data-driven benchmarking method that compares performance of similarity-based prioritization strategies to each other (and to random chance) by leave-one-chromosome-out cross-validation with stratified linkage disequilibrium (LD) score regression. We first applied Benchmarker to 20 well-powered GWASs and compared gene prioritization based on strategies employing three different data sources, including annotated gene sets and gene expression; genes prioritized based on gene sets had higher per-SNP heritability than those prioritized based on gene expression. Additionally, in a direct comparison of three methods, DEPICT and MAGMA outperformed NetWAS. We also evaluated combinations of methods; our results indicated that combining data sources and algorithms can help prioritize higher-quality genes for follow-up. Benchmarker provides an unbiased approach to evaluate any similarity-based method that provides genome-wide prioritization of genes, variants, or gene sets and can determine the best such method for any particular GWAS. Our method addresses an important unmet need for rigorous tool assessment and can assist in mapping genetic associations to causal function.
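The leave-one-chromosome-out idea can be sketched as a simple data split. The snippet below only shows how genes might be partitioned by chromosome; it does not implement stratified LD score regression or Benchmarker itself, and the gene labels, chromosome assignments and function name are hypothetical.

```python
from collections import defaultdict


def leave_one_chromosome_out(gene_to_chrom):
    """Yield (held_out_chromosome, training_genes, held_out_genes) folds."""
    by_chrom = defaultdict(list)
    for gene, chrom in gene_to_chrom.items():
        by_chrom[chrom].append(gene)
    for held_out in sorted(by_chrom):
        # Train on every gene that is NOT on the held-out chromosome.
        training = [
            gene
            for chrom in sorted(by_chrom)
            if chrom != held_out
            for gene in by_chrom[chrom]
        ]
        yield held_out, training, list(by_chrom[held_out])


# Hypothetical gene labels and chromosome assignments.
toy_genes = {"g1": "1", "g2": "1", "g3": "2", "g4": "2", "g5": "X"}
folds = list(leave_one_chromosome_out(toy_genes))
for held_out, training, testing in folds:
    print(f"hold out chr{held_out}: train on {training}, evaluate on {testing}")
```

Splitting by whole chromosome keeps nearby, correlated genes out of the training and evaluation sets at the same time, which is the reason this kind of split is often chosen for association data.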
Affiliation(s)
- Rebecca S Fine
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Division of Endocrinology and Center for Basic and Translational Obesity Research, Boston Children's Hospital, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Ph.D. Program in Biological and Biomedical Sciences, Graduate School of Arts and Sciences, Harvard University, Cambridge, MA 02138, USA
| | - Tune H Pers
- The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark; Department of Epidemiology Research, Statens Serum Institut, 2300 Copenhagen, Denmark
| | - Tiffany Amariuta
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA; Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Ph.D. Program in Bioinformatics and Integrative Genomics, Graduate School of Arts and Sciences, Harvard University, Cambridge, MA 02138, USA
| | - Soumya Raychaudhuri
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Division of Genetics, Brigham and Women's Hospital, Boston, MA 02115, USA; Division of Rheumatology, Immunology, and Allergy, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA; Arthritis Research UK Centre for Genetics and Genomics, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, The University of Manchester, Manchester M13 9PL, UK
| | - Joel N Hirschhorn
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA; Division of Endocrinology and Center for Basic and Translational Obesity Research, Boston Children's Hospital, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA.
|
32
|
Carlin DE, Fong SH, Qin Y, Jia T, Huang JK, Bao B, Zhang C, Ideker T. A Fast and Flexible Framework for Network-Assisted Genomic Association. iScience 2019; 16:155-161. [PMID: 31174177 PMCID: PMC6554232 DOI: 10.1016/j.isci.2019.05.025] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/02/2019] [Revised: 04/09/2019] [Accepted: 05/11/2019] [Indexed: 02/06/2023] Open
Abstract
We present an accessible, fast, and customizable network propagation system for pathway boosting and interpretation of genome-wide association studies. This system, NAGA (Network Assisted Genomic Association), taps the NDEx biological network resource to gain access to thousands of protein networks and select those most relevant and performative for a specific association study. The method works efficiently, completing genome-wide analysis in under 5 minutes on a modern laptop computer. We show that NAGA recovers many known disease genes from analysis of schizophrenia genetic data, and it substantially boosts associations with previously unappreciated genes such as amyloid beta precursor. On this and seven other gene-disease association tasks, NAGA outperforms conventional approaches in recovery of known disease genes and replicability of results. Protein interactions associated with disease are visualized and annotated in Cytoscape, which, in addition to standard programmatic interfaces, allows for downstream analysis.
Affiliation(s)
- Daniel E Carlin
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA.
| | - Samson H Fong
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
| | - Yue Qin
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
| | - Tongqiu Jia
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
| | - Justin K Huang
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
| | - Bokan Bao
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
| | - Chao Zhang
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
| | - Trey Ideker
- Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA; Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA 92093, USA
|
33
|
Gajera M, Desai N, Suzuki A, Li A, Zhang M, Jun G, Jia P, Zhao Z, Iwata J. MicroRNA-655-3p and microRNA-497-5p inhibit cell proliferation in cultured human lip cells through the regulation of genes related to human cleft lip. BMC Med Genomics 2019; 12:70. [PMID: 31122291 PMCID: PMC6533741 DOI: 10.1186/s12920-019-0535-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 09/26/2018] [Accepted: 05/16/2019] [Indexed: 12/12/2022] Open
Abstract
Background: The etiology of cleft lip with or without cleft palate (CL/P), a common congenital birth defect, is complex and involves the contribution of genetic and environmental factors. Although many candidate genes have been identified, the regulation and interaction of these genes in CL/P remain unclear. In addition, the contribution of microRNAs (miRNAs), non-coding RNAs that regulate the expression of multiple genes, to the etiology of CL/P is largely unknown. Methods: To identify the signatures of causative biological pathways for human CL/P, we conducted a systematic literature review for human CL/P candidate genes and subsequent bioinformatics analyses. Functional enrichment analyses of the candidate CL/P genes were conducted using the pathway databases GO and KEGG. The miRNA-mediated post-transcriptional regulation of the CL/P candidate genes was analyzed with miRanda, PITA, TargetScan, and miRTarBase. Genotype-phenotype association analysis was conducted using GWAS. The functional significance of the candidate miRNAs was evaluated experimentally in cell proliferation and target gene regulation assays in human lip fibroblasts. Results: Through an extensive search of the main biomedical databases, we mined 177 genes with mutations or association/linkage reported in individuals with CL/P, and considered them as candidate genes for human CL/P. The genotype-phenotype association study revealed that mutations in 12 genes (ABCA4, ADAM3A, FOXE1, IRF6, MSX2, MTHFR, NTN1, PAX7, TP63, TPM1, VAX1, and WNT9B) were significantly associated with CL/P. In addition, our bioinformatics analysis predicted 16 microRNAs (miRNAs) to be post-transcriptional regulators of CL/P genes. To validate the bioinformatics results, the top six candidate miRNAs (miR-124-3p, miR-369-3p, miR-374a-5p, miR-374b-5p, miR-497-5p, and miR-655-3p) were evaluated by cell proliferation/survival assays and miRNA-gene regulation assays in cultured human lip fibroblasts. We found that miR-497-5p and miR-655-3p significantly suppressed cell proliferation in these cells. Furthermore, the expression of the predicted miRNA-target genes was significantly downregulated by either miR-497-5p or miR-655-3p mimic. Conclusion: Expression of miR-497-5p and miR-655-3p suppresses cell proliferation through the regulation of human CL/P-candidate genes. This study provides insights into the role of miRNAs in the etiology of CL/P and suggests possible strategies for the diagnosis of CL/P.
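Functional enrichment of a candidate gene list, as mentioned in the Methods, is often assessed with a hypergeometric over-representation test. The sketch below shows that test in general form; the abstract does not say which statistic was used, so this is an assumption, and apart from the 177 candidate genes every number here (background size, pathway size, overlap) is hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, drawn, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the background.
    annotated:  background genes that belong to the pathway.
    drawn:      size of the candidate gene list.
    overlap:    candidate genes that belong to the pathway.
    """
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Tiny case that can be checked by hand: (36 + 4) / 120 = 1/3.
toy_p = enrichment_p_value(population=10, annotated=4, drawn=3, overlap=2)

# 177 candidate genes as in the abstract; the background of 20000 genes,
# the pathway of 200 genes and the overlap of 10 are hypothetical.
pathway_p = enrichment_p_value(population=20000, annotated=200, drawn=177, overlap=10)
print(toy_p, pathway_p)
```

In practice such p-values are computed for many pathways at once and would then need a multiple-testing correction.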
Affiliation(s)
- Mona Gajera
- Department of Diagnostic & Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Neha Desai
- Department of Diagnostic & Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Akiko Suzuki
- Department of Diagnostic & Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, USA; Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Aimin Li
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Musi Zhang
- Department of Diagnostic & Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, USA; Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Goo Jun
- Department of Epidemiology, Human Genetics & Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX, USA
| | - Peilin Jia
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA; Department of Epidemiology, Human Genetics & Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX, USA
| | - Junichi Iwata
- Department of Diagnostic & Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, USA; Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, USA; MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX, USA.
|
34
|
Tenenbaum JD, Bhuvaneshwar K, Gagliardi JP, Fultz Hollis K, Jia P, Ma L, Nagarajan R, Rakesh G, Subbian V, Visweswaran S, Zhao Z, Rozenblit L. Translational bioinformatics in mental health: open access data sources and computational biomarker discovery. Brief Bioinform 2019; 20:842-856. [PMID: 29186302 PMCID: PMC6585382 DOI: 10.1093/bib/bbx157] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/01/2017] [Revised: 10/24/2017] [Indexed: 12/12/2022] Open
Abstract
Mental illness is increasingly recognized as both a significant cost to society and a significant area of opportunity for biological breakthrough. As -omics and imaging technologies enable researchers to probe molecular and physiological underpinnings of multiple diseases, opportunities arise to explore the biological basis for behavioral health and disease. From individual investigators to large international consortia, researchers have generated rich data sets in the area of mental health, including genomic, transcriptomic, metabolomic, proteomic, clinical and imaging resources. General data repositories such as the Gene Expression Omnibus (GEO) and Database of Genotypes and Phenotypes (dbGaP) and mental health (MH)-specific initiatives, such as the Psychiatric Genomics Consortium, MH Research Network and PsychENCODE represent a wealth of information yet to be gleaned. At the same time, novel approaches to integrate and analyze data sets are enabling important discoveries in the area of mental and behavioral health. This review will discuss and catalog into an organizing framework the increasingly diverse set of MH data resources available, using schizophrenia as a focus area, and will describe novel and integrative approaches to molecular biomarker discovery that make use of mental health data.
Affiliation(s)
- Jessica D Tenenbaum
- Department of Biostatistics and Bioinformatics at the Duke University School of Medicine
| | | | | | - Kate Fultz Hollis
- Department of Biomedical Informatics and Clinical Epidemiology at Oregon Health and Science University
| | - Peilin Jia
- University of Texas Health Science Center at Houston
| | - Liang Ma
- Bioinformatics and Systems Medicine Laboratory (BSML), Center for Precision Health, School of Biomedical Informatics, the University of Texas Health Science Center at Houston
| | | | | | - Vignesh Subbian
- Department of Biomedical Engineering and the Department of Systems and Industrial Engineering at the University of Arizona
|
35
|
Xiao W, Wu Y, Wang J, Luo Z, Long L, Deng N, Ning S, Zeng Y, Long H, Xiao B. Network and Pathway-Based Analysis of Single-Nucleotide Polymorphism of miRNA in Temporal Lobe Epilepsy. Mol Neurobiol 2019; 56:7022-7031. [PMID: 30968344 DOI: 10.1007/s12035-019-1584-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 09/18/2018] [Accepted: 03/21/2019] [Indexed: 12/13/2022]
Abstract
Temporal lobe epilepsy (TLE) is a complex disease whose pathogenetic mechanism remains unclear. Single-nucleotide polymorphisms (SNPs) of miRNA (miRSNPs) are SNPs located in miRNA genes or in target sites of miRNAs, and they have been shown to be associated with neuropsychiatric disease development by interfering with miRNA-mediated regulatory function. In this study, we integrated TLE-related risk genes and risk pathways multi-dimensionally based on public data resources. Furthermore, we systematically screened candidate functional miRSNPs for TLE and constructed a TLE-associated pathway-based miRSNP switching network, which included 92 miRNAs that target 12 TLE risk pathways. Moreover, we thoroughly dissected the correlation between 5 risk genes of 4 risk pathways and TLE development. Additionally, the biological function of several candidate miRSNPs was validated by luciferase reporter assay. This in silico approach facilitates the selection of potential "miRSNP-miRNA-risk gene-pathway" axes for experimental validation and provides new insights into the mechanism of miRSNPs as potential genetic risk factors of TLE.
Affiliation(s)
- Wenbiao Xiao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, China
| | - Yanhao Wu
- Department of Respiratory Medicine, Xiangya Hospital, Central South University, Changsha, 410008, China
| | - Jianjian Wang
- Department of Neurology, the Second Affiliated Hospital, Harbin Medical University, Harbin, 150081, China
| | - Zhaohui Luo
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, China
| | - Lili Long
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, China
| | - Na Deng
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, China
| | - Shangwei Ning
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, China
| | - Yi Zeng
- Department of Geriatrics, Second Xiangya Hospital, Central South University, Changsha, 410011, China
| | - Hongyu Long
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, China.
| | - Bo Xiao
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, 410008, China.
|
36
|
Worthman CM, Dockray S, Marceau K. Puberty and the Evolution of Developmental Science. Journal of Research on Adolescence: The Official Journal of the Society for Research on Adolescence 2019; 29:9-31. [PMID: 30869841 PMCID: PMC6961839 DOI: 10.1111/jora.12411] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 05/14/2023]
Abstract
In recent decades, theoretical and methodological advances have operated synergistically to advance understanding of puberty and prompt increasingly comprehensive models that engage with the temporal, psychosocial, and biological dimensions of this maturational milepost. This integrative overview discusses these theoretical and methodological advances and their implications for research and intervention to promote human development in the context of changing maturational schedules and massive ongoing social transformations.
|
37
|
Sun R, Hui S, Bader GD, Lin X, Kraft P. Powerful gene set analysis in GWAS with the Generalized Berk-Jones statistic. PLoS Genet 2019; 15:e1007530. [PMID: 30875371 PMCID: PMC6436759 DOI: 10.1371/journal.pgen.1007530] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 07/01/2018] [Revised: 03/27/2019] [Accepted: 02/28/2019] [Indexed: 11/19/2022] Open
Abstract
A common complementary strategy in Genome-Wide Association Studies (GWAS) is to perform Gene Set Analysis (GSA), which tests for the association between one phenotype of interest and an entire set of Single Nucleotide Polymorphisms (SNPs) residing in selected genes. While there exist many tools for performing GSA, popular methods often include a number of ad-hoc steps that are difficult to justify statistically, provide complicated interpretations based on permutation inference, and demonstrate poor operating characteristics. Additionally, the lack of gold standard gene set lists can produce misleading results and create difficulties in comparing analyses even across the same phenotype. We introduce the Generalized Berk-Jones (GBJ) statistic for GSA, a permutation-free parametric framework that offers asymptotic power guarantees in certain set-based testing settings. To adjust for confounding introduced by different gene set lists, we further develop a GBJ step-down inference technique that can discriminate between gene sets driven to significance by single genes and those demonstrating group-level effects. We compare GBJ to popular alternatives through simulation and re-analysis of summary statistics from a large breast cancer GWAS, and we show how GBJ can increase power by incorporating information from multiple signals in the same gene. In addition, we illustrate how breast cancer pathway analysis can be confounded by the frequency of FGFR2 in pathway lists. Our approach is further validated on two other datasets of summary statistics generated from GWAS of height and schizophrenia.
Affiliation(s)
- Ryan Sun
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Shirley Hui
- The Donnelly Center, University of Toronto, Toronto, Ontario, Canada
| | - Gary D. Bader
- The Donnelly Center, University of Toronto, Toronto, Ontario, Canada
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Peter Kraft
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
|
38
|
Grozeva D, Saad S, Menzies GE, Sims R. Benefits and Challenges of Rare Genetic Variation in Alzheimer's Disease. Current Genetic Medicine Reports 2019; 7:53-62. [PMID: 39649954 PMCID: PMC7617023 DOI: 10.1007/s40142-019-0161-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/20/2023]
Abstract
Purpose of Review: It is well established that sporadic Alzheimer's disease (AD) is polygenic, with common and rare genetic variation alongside environmental factors contributing to disease. Here, we review our current understanding of the genetic architecture of disease, paying specific attention to rare susceptibility variants, and explore some of the limitations in rare variant detection and analysis. Recent Findings: Rare variation has been shown to robustly associate with disease. These variants include potentially damaging and loss of function mutations that are easily modelled in silico, in vitro and in vivo, and represent potentially druggable targets. A number of risk genes, including TREM2, SORL1 and ABCA7, show multiple independent associations, suggesting that they may influence disease via multiple mechanisms, with transcriptional regulation, inflammatory response and modification of protein production suggested to be of primary importance. Summary: We are at the beginning of our journey of rare variant detection in AD. Whole exome sequencing has been the predominant technology of choice. While fruitful, this has introduced a number of challenges with regard to data integration. Ultimately the future of disease-associated rare variant identification lies in whole genome sequencing projects that will allow the testing of the full range of genomic variation.
Affiliation(s)
- Detelina Grozeva
- Division of Psychological Medicine and Clinical Neuroscience, MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, UK
| | - Salha Saad
- Division of Psychological Medicine and Clinical Neuroscience, MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, UK
| | - Georgina E. Menzies
- UK Dementia Research Institute at Cardiff, School of Medicine, Cardiff University, Cardiff, UK
| | - Rebecca Sims
- Division of Psychological Medicine and Clinical Neuroscience, MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, UK
- UK Dementia Research Institute at Cardiff, School of Medicine, Cardiff University, Cardiff, UK
|
39
|
|
40
|
Maynard RD, Ackert-Bicknell CL. Mouse Models and Online Resources for Functional Analysis of Osteoporosis Genome-Wide Association Studies. Front Endocrinol (Lausanne) 2019; 10:277. [PMID: 31133984 PMCID: PMC6515928 DOI: 10.3389/fendo.2019.00277] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/24/2019] [Accepted: 04/16/2019] [Indexed: 12/13/2022] Open
Abstract
Osteoporosis is a complex genetic disease in which the number of loci associated with bone mineral density, a clinical risk factor for fracture, has increased at an exponential rate in the last decade. The identification of the causative variants and candidate genes underlying these loci has not been able to keep pace with the rate of locus discovery. A large number of tools and data resources have been built around the use of the mouse as a model of human genetic disease. Herein, we describe resources available for functional validation of human Genome Wide Association Study (GWAS) loci using mouse models. We specifically focus on large-scale phenotyping efforts centered on bone-relevant phenotypes and on repositories of genotype-phenotype data that exist for transgenic and mutant mice, which can be readily mined as a first step toward more targeted efforts designed to deeply characterize the role of a gene in bone biology.
Affiliation(s)
- Robert D. Maynard
- Center for Musculoskeletal Research, University of Rochester, Rochester, NY, United States
| | - Cheryl L. Ackert-Bicknell
- Center for Musculoskeletal Research, University of Rochester, Rochester, NY, United States
- Department of Orthopaedics and Rehabilitation, University of Rochester, Rochester, NY, United States
- *Correspondence: Cheryl L. Ackert-Bicknell
|
41
|
Kwok MK, Lin SL, Schooling CM. Re-thinking Alzheimer's disease therapeutic targets using gene-based tests. EBioMedicine 2018; 37:461-470. [PMID: 30314892 PMCID: PMC6446018 DOI: 10.1016/j.ebiom.2018.10.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 06/06/2018] [Revised: 09/11/2018] [Accepted: 10/01/2018] [Indexed: 12/12/2022] Open
Abstract
Background: Alzheimer's disease (AD) is a devastating condition with no known effective drug treatments. Existing drugs only alleviate symptoms. Given repeated expensive drug failures, we assessed systematically whether approved and investigational AD drugs are targeting products of genes strongly associated with AD and whether these genes are targeted by existing drugs for other indications which could be re-purposed. Methods: We identified genes strongly associated with late-onset AD from the loci of genetic variants associated with AD at genome-wide-significance and from a gene-based test applied to the most extensively genotyped late-onset AD case (n = 17,008)-control (n = 37,154) study, the International Genomics of Alzheimer's Project. We used three gene-to-drug cross-references, Kyoto Encyclopedia of Genes and Genomes, Drugbank and Drug Repurposing Hub, to identify genetically validated targets of AD drugs and any existing drugs or nutraceuticals targeting products of the genes strongly associated with late-onset AD. Findings: A total of 67 autosomal genes (forming 9 gene clusters) were identified as strongly associated with late-onset AD, 28 from the loci of single genetic variants, 51 from the gene-based test and 12 by both methods. Existing approved or investigational AD drugs did not target products of any of these 67 genes. Drugs for other indications targeted 11 of these genes, including immunosuppressive disease-modifying anti-rheumatic drugs targeting PTK2B gene products. Interpretation: Approved and investigational AD drugs are not targeting products of genes strongly associated with late-onset AD. However, other drugs targeting products of these genes exist and could perhaps be re-purposed to combat late-onset AD after further scrutiny.
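The gene counts in the Findings fit together by inclusion-exclusion: genes found by both methods are counted once. The short check below uses only the numbers stated in the abstract; the variable names are illustrative.

```python
# Counts reported in the abstract.
from_variant_loci = 28   # genes from the loci of single genetic variants
from_gene_test = 51      # genes from the gene-based test
from_both = 12           # genes identified by both methods

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|.
total_genes = from_variant_loci + from_gene_test - from_both

# Genes found by exactly one of the two methods.
only_variant_loci = from_variant_loci - from_both
only_gene_test = from_gene_test - from_both

print(total_genes, only_variant_loci, only_gene_test)
```

The total matches the 67 autosomal genes reported, with 16 genes coming only from variant loci and 39 only from the gene-based test.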
Affiliation(s)
- Man Ki Kwok
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, 1/F, Patrick Manson Building (North Wing), 7 Sassoon Road, Hong Kong, China
| | - Shi Lin Lin
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, 1/F, Patrick Manson Building (North Wing), 7 Sassoon Road, Hong Kong, China
| | - C Mary Schooling
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, 1/F, Patrick Manson Building (North Wing), 7 Sassoon Road, Hong Kong, China; City University of New York, Graduate School of Public Health and Health Policy, New York, United States.
| |
|
42
|
Integrating RNA-seq and GWAS reveals novel genetic mutations for buffalo reproductive traits. Anim Reprod Sci 2018; 197:290-295. [PMID: 30190187 DOI: 10.1016/j.anireprosci.2018.08.041] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/27/2018] [Revised: 08/14/2018] [Accepted: 08/28/2018] [Indexed: 01/08/2023]
Abstract
Genome-wide association studies (GWAS) have been applied in buffalo breeding programs and used to identify a number of candidate genes associated with buffalo reproductive traits. The genetic code of specific genes underlying buffalo reproductive traits remains unclear. Association studies that measure both genetic and transcriptional variation have been applied to the investigation of complex traits. To investigate genes involved in buffalo reproductive traits, RNA-seq results from buffalo granulosa cells were integrated with candidate genes reported to be associated with buffalo reproductive traits in a previous GWAS. A large number of variants were detected by RNA-seq, and 214 variants were located within the buffalo reproductive candidate genes identified by GWAS. A further association study in 462 Italian Mediterranean buffalo indicated that 25 SNPs distributed in 13 genes were associated with reproductive traits. Of the 13 genes, 11 were expressed in granulosa cells at all antral follicle development stages, and a significant difference was found in the expression of NDUFS2 between follicles of diameter < 8 mm and > 8 mm. These findings extend the results of GWAS by expanding the knowledge about new and potentially functional single-nucleotide polymorphisms and provide useful information about regulatory genes affecting buffalo reproductive traits.
|
43
|
Kim CY, Lee M, Lee K, Yoon SS, Lee I. Network-based genetic investigation of virulence-associated phenotypes in methicillin-resistant Staphylococcus aureus. Sci Rep 2018; 8:10796. [PMID: 30018396 PMCID: PMC6050336 DOI: 10.1038/s41598-018-29120-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/07/2017] [Accepted: 07/02/2018] [Indexed: 12/16/2022]
Abstract
Staphylococcus aureus is a gram-positive bacterium that causes a wide range of infections. Recently, the spread of methicillin-resistant S. aureus (MRSA) strains has seriously reduced antibiotic treatment options. Anti-virulence strategies, the objective of which is to target the virulence instead of the viability of the pathogen, have become widely accepted as a means of avoiding the emergence of new antibiotic-resistant strains. To increase the number of anti-virulence therapeutic options, it is necessary to identify as many novel virulence-associated genes as possible in MRSA. Co-functional networks have proved useful for mapping gene-to-phenotype associations in various organisms. Herein, we present StaphNet (www.inetbio.org/staphnet), a genome-scale co-functional network for an MRSA strain, S. aureus subsp. USA300_FPR3757. StaphNet, which was constructed by the integration of seven distinct types of genomics data within a Bayesian statistics framework, covers approximately 94% of the coding genome with a high degree of accuracy. We implemented a companion web server for network-based gene prioritization of the phenotypes of 31 different S. aureus strains. We demonstrated that StaphNet can effectively identify genes for virulence-associated phenotypes in MRSA. These results suggest that StaphNet can facilitate target discovery for the development of anti-virulence drugs to treat MRSA infection.
Affiliation(s)
- Chan Yeong Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Korea
| | - Muyoung Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Korea
| | - Keehoon Lee
- Department of Microbiology and Immunology, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, Korea
| | - Sang Sun Yoon
- Department of Microbiology and Immunology, Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Korea.
| |
|
44
|
Dharanishanthi V, Ghosh Dasgupta M. Co-expression network of transcription factors reveal ethylene-responsive element-binding factor as key regulator of wood phenotype in Eucalyptus tereticornis. 3 Biotech 2018; 8:315. [PMID: 30023147 DOI: 10.1007/s13205-018-1344-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/05/2018] [Accepted: 07/09/2018] [Indexed: 12/28/2022]
Abstract
Suitability of wood biomass for pulp production is dependent on the cellular architecture and composition of the secondary cell wall. Presently, a systems genetics approach is being employed to understand the molecular basis of trait variation, and co-expression network analysis has enabled a holistic understanding of complex traits such as secondary development. Transcription factors (TFs) are reported as key regulators of meristematic growth and wood formation. The hierarchical TF network is a multi-layered system which interacts with downstream structural genes involved in the biosynthesis of cellulose, hemicelluloses and lignin. Several TFs have been associated with wood formation in tree species such as Populus, Eucalyptus, Picea and Pinus. However, TF-specific co-expression networks to understand the interaction between these regulators have not been reported. In the present study, a co-expression network was developed for TFs expressed during wood formation in Eucalyptus tereticornis, and the ethylene-responsive element-binding factor EtERF2 was identified as the major hub transcript, which co-expressed with other secondary cell wall biogenesis-specific TFs such as EtSND2, EtVND1, EtVND4, EtVND6, EtMYB70, EtGRAS and EtSCL8. This study reveals a probable role of ethylene in determining natural variation in wood properties in Eucalyptus species. Understanding this transcriptional regulation underpinning the complex bio-processing trait of wood biomass will complement the Eucalyptus breeding program through selection of industrially suitable phenotypes by marker-assisted selection.
|
45
|
Lancour D, Naj A, Mayeux R, Haines JL, Pericak-Vance MA, Schellenberg GD, Crovella M, Farrer LA, Kasif S. One for all and all for One: Improving replication of genetic studies through network diffusion. PLoS Genet 2018; 14:e1007306. [PMID: 29684019 PMCID: PMC5933817 DOI: 10.1371/journal.pgen.1007306] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 10/20/2017] [Revised: 05/03/2018] [Accepted: 03/11/2018] [Indexed: 12/31/2022]
Abstract
Improving accuracy in genetic studies would greatly accelerate understanding the genetic basis of complex diseases. One approach to achieve such an improvement for risk variants identified by the genome wide association study (GWAS) approach is to incorporate previously known biology when screening variants across the genome. We developed a simple approach for improving the prioritization of candidate disease genes that incorporates a network diffusion of scores from known disease genes using a protein network and a novel integration with GWAS risk scores, and tested this approach on a large Alzheimer disease (AD) GWAS dataset. Using a statistical bootstrap approach, we cross-validated the method and for the first time showed that a network approach improves the expected replication rates in GWAS studies. Several novel AD genes were predicted including CR2, SHARPIN, and PTPN2. Our re-prioritized results are enriched for established known AD-associated biological pathways including inflammation, immune response, and metabolism, whereas standard non-prioritized results were not. Our findings support a strategy of considering network information when investigating genetic risk factors.
Integrating multiple types of -omics data is a rapidly growing research area due in part to the increasing amount of diverse and publicly accessible data. In this study, we demonstrated that integration of genetic association and protein interaction data using a network diffusion approach measurably improves reproducibility of top candidate genes. Application of this approach to Alzheimer disease (AD) using a large dataset assembled by the Alzheimer’s Disease Genetics Consortium identified several novel candidate AD genes that are supported by pre-existing knowledge of AD pathobiology. Finally, we developed a transparent and easy-to-use R package that can facilitate the extension of our methodology to other phenotypes for which genetic data are available.
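The idea of diffusing scores from known disease genes over a protein network can be illustrated with a small random-walk-with-restart sketch. This is only a generic illustration of network diffusion, not the authors' method or their R package: the function name `diffuse_scores`, the restart probability, and the toy genes `GENE_A` to `GENE_D` are all assumptions made for the example, and the integration with GWAS risk scores described in the abstract is not reproduced here.

```python
def diffuse_scores(adjacency, seed_scores, restart=0.5, iterations=100, tol=1e-9):
    """Random walk with restart on an undirected graph (illustrative only).

    adjacency   : dict mapping each node to the set of its neighbours
    seed_scores : dict mapping known "disease" nodes to a positive prior score
    restart     : probability of jumping back to the seed distribution
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    # Normalise the seed scores so they form a probability distribution.
    prior = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    scores = dict(prior)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbour m passes on an equal share of its current score.
            incoming = sum(scores[m] / len(adjacency[m]) for m in adjacency[n])
            updated[n] = (1 - restart) * incoming + restart * prior[n]
        change = max(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical toy network: a chain GENE_A - GENE_B - GENE_C - GENE_D,
# with GENE_A as the only known disease gene.
toy_network = {
    "GENE_A": {"GENE_B"},
    "GENE_B": {"GENE_A", "GENE_C"},
    "GENE_C": {"GENE_B", "GENE_D"},
    "GENE_D": {"GENE_C"},
}
diffused = diffuse_scores(toy_network, {"GENE_A": 1.0})
ranking = sorted(diffused, key=diffused.get, reverse=True)
```

In this toy chain the diffused score falls off with distance from the seed gene, which is the property a network-based prioritization would combine with association evidence.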
Affiliation(s)
- Daniel Lancour
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts, United States of America
| | - Adam Naj
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Richard Mayeux
- Department of Neurology and Sergievsky Center, Columbia University, New York, New York, United States of America
| | - Jonathan L. Haines
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Margaret A. Pericak-Vance
- Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America
| | - Gerard D. Schellenberg
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Mark Crovella
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Computer Science, Boston University, Boston, Massachusetts, United States of America
| | - Lindsay A. Farrer
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts, United States of America
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Department of Ophthalmology, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
- Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts, United States of America
- * E-mail:
| | - Simon Kasif
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
| |
|
46
|
Nadeem MA, Nawaz MA, Shahid MQ, Doğan Y, Comertpay G, Yıldız M, Hatipoğlu R, Ahmad F, Alsaleh A, Labhane N, Özkan H, Chung G, Baloch FS. DNA molecular markers in plant breeding: current status and recent advancements in genomic selection and genome editing. BIOTECHNOL BIOTEC EQ 2017. [DOI: 10.1080/13102818.2017.1400401] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Indexed: 12/22/2022]
Affiliation(s)
- Muhammad Azhar Nadeem
- Department of Field Crops, Faculty of Agricultural and Natural Sciences, Abant İzzet Baysal University, Bolu, Turkey
| | - Muhammad Amjad Nawaz
- Department of Biotechnology, School of Engineering, Chonnam National University, Yeosu, Korea
| | - Muhammad Qasim Shahid
- State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Agriculture, South China Agricultural University, Guangzhou, P. R. China
| | - Yıldız Doğan
- Department of Field Crops, Eastern Mediterranean Agricultural Research Institute, Agricultural Ministry, Adana, Turkey
| | - Gonul Comertpay
- Department of Field Crops, Eastern Mediterranean Agricultural Research Institute, Agricultural Ministry, Adana, Turkey
| | - Mehtap Yıldız
- Department of Agricultural Biotechnology, Faculty of Agriculture, Yuzuncu Yıl University, Van, Turkey
| | - Rüştü Hatipoğlu
- Department of Field Crops, Faculty of Agriculture, University of Çukurova, Adana, Turkey
| | - Fiaz Ahmad
- Botany Division, Institute of Pure and Applied Biology, Bahauddin Zakariya University, Punjab, Pakistan
| | - Ahmad Alsaleh
- Molecular Genetics Laboratory, Science and Technology Application and Research Center, Bozok University, Yozgat, Turkey
| | - Nitin Labhane
- Department of Botany, Bhavan's College, University of Mumbai, Mumbai, India
| | - Hakan Özkan
- Department of Field Crops, Faculty of Agriculture, University of Çukurova, Adana, Turkey
| | - Gyuhwa Chung
- Department of Biotechnology, School of Engineering, Chonnam National University, Yeosu, Korea
| | - Faheem Shehzad Baloch
- Department of Field Crops, Faculty of Agricultural and Natural Sciences, Abant İzzet Baysal University, Bolu, Turkey
| |
|
47
|
Convergence between biological, behavioural and genetic determinants of obesity. Nat Rev Genet 2017; 18:731-748. [PMID: 28989171 DOI: 10.1038/nrg.2017.72] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Indexed: 12/13/2022]
Abstract
Multiple biological, behavioural and genetic determinants or correlates of obesity have been identified to date. Genome-wide association studies (GWAS) have contributed to the identification of more than 100 obesity-associated genetic variants, but their roles in causal processes leading to obesity remain largely unknown. Most variants are likely to have tissue-specific regulatory roles through joint contributions to biological pathways and networks, through changes in gene expression that influence quantitative traits, or through the regulation of the epigenome. The recent availability of large-scale functional genomics resources provides an opportunity to re-examine obesity GWAS data to begin elucidating the function of genetic variants. Interrogation of knockout mouse phenotype resources provides a further avenue to test for evidence of convergence between genetic variation and biological or behavioural determinants of obesity.
|
48
|
Shim JE, Bang C, Yang S, Lee T, Hwang S, Kim CY, Singh-Blom UM, Marcotte EM, Lee I. GWAB: a web server for the network-based boosting of human genome-wide association data. Nucleic Acids Res 2017; 45:W154-W161. [PMID: 28449091 PMCID: PMC5793838 DOI: 10.1093/nar/gkx284] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 02/11/2017] [Revised: 04/01/2017] [Accepted: 04/17/2017] [Indexed: 12/29/2022]
Abstract
During the last decade, genome-wide association studies (GWAS) have represented a major approach to dissect complex human genetic diseases. Due in part to limited statistical power, most studies identify only small numbers of candidate genes that pass the conventional significance thresholds (e.g. P ≤ 5 × 10⁻⁸). This limitation can be partly overcome by increasing the sample size, but this comes at a higher cost. Alternatively, weak association signals can be boosted by incorporating independent data. Previously, we demonstrated the feasibility of boosting GWAS disease associations using gene networks. Here, we present a web server, GWAB (www.inetbio.org/gwab), for the network-based boosting of human GWAS data. Using GWAS summary statistics (P-values) for SNPs along with reference genes for a disease of interest, GWAB reprioritizes candidate disease genes by integrating the GWAS and network data. We found that GWAB could more effectively retrieve disease-associated reference genes than GWAS could alone. As an example, we describe GWAB-boosted candidate genes for coronary artery disease and supporting data in the literature. These results highlight the inherent value in sub-threshold GWAS associations, which are often not publicly released. GWAB offers a feasible general approach to boost such associations for human disease genetics.
Affiliation(s)
- Jung Eun Shim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 120-749, Korea
| | - Changbae Bang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 120-749, Korea
| | - Sunmo Yang
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 120-749, Korea
| | - Tak Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 120-749, Korea
| | - Sohyun Hwang
- Department of Biomedical Science, College of Life Science, CHA University, Seongnam-si 13496, Korea
| | - Chan Yeong Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 120-749, Korea
| | - U Martin Singh-Blom
- Cognition Group, Schibsted Products & Technologies, Västra Järnvägsgatan 21, 111 64 Stockholm, Sweden
| | - Edward M Marcotte
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas, Austin, TX 78712, USA
- Department of Molecular Biosciences, University of Texas at Austin, TX 78712, USA
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 120-749, Korea
| |
|
49
|
Kelly NJ, Radder JE, Baust JJ, Burton CL, Lai YC, Potoka KC, Agostini BA, Wood JP, Bachman TN, Vanderpool RR, Dandachi N, Leme AS, Gregory AD, Morris A, Mora AL, Gladwin MT, Shapiro SD. Mouse Genome-Wide Association Study of Preclinical Group II Pulmonary Hypertension Identifies Epidermal Growth Factor Receptor. Am J Respir Cell Mol Biol 2017; 56:488-496. [PMID: 28085498 DOI: 10.1165/rcmb.2016-0176oc] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022]
Abstract
Pulmonary hypertension (PH) is associated with features of obesity and metabolic syndrome that translate to the induction of PH by chronic high-fat diet (HFD) in some inbred mouse strains. We conducted a genome-wide association study (GWAS) to identify candidate genes associated with susceptibility to HFD-induced PH. Mice from 36 inbred and wild-derived strains were fed with regular diet or HFD for 20 weeks beginning at 6-12 weeks of age, after which right ventricular (RV) and left ventricular (LV) end-systolic pressure (ESP) and maximum pressure (MaxP) were measured by cardiac catheterization. We tested for association of RV MaxP and RV ESP and identified genomic regions enriched with nominal associations to both of these phenotypes. We excluded genomic regions if they were also associated with LV MaxP, LV ESP, or body weight. Genes within significant regions were scored based on the shortest-path betweenness centrality, a measure of network connectivity, of their human orthologs in a gene interaction network of human PH-related genes. WSB/EiJ, NON/ShiLtJ, and AKR/J mice had the largest increases in RV MaxP after high-fat feeding. Network-based scoring of GWAS candidates identified epidermal growth factor receptor (Egfr) as having the highest shortest-path betweenness centrality of GWAS candidates. Expression studies of lung homogenate showed that EGFR expression is increased in the AKR/J strain, which developed a significant increase in RV MaxP after high-fat feeding as compared with C57BL/6J, which did not. Our combined GWAS and network-based approach adds evidence for a role for Egfr in murine PH.
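Shortest-path betweenness centrality, the network measure used above to score candidate genes, counts how often a node lies on the shortest paths between other pairs of nodes. The following is a minimal sketch of Brandes' algorithm for an unweighted, undirected graph. It is a generic illustration, not the authors' pipeline, and the node labels (`hub`, `g1` to `g4`) are hypothetical placeholders rather than real genes.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Unnormalised shortest-path betweenness (Brandes' algorithm).

    adjacency: dict mapping each node to the set of its neighbours
               (undirected, unweighted graph).
    """
    centrality = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search from `source`, counting shortest paths.
        visit_order = []
        predecessors = {v: [] for v in adjacency}
        path_counts = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        path_counts[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v is on a shortest path to w
                    path_counts[w] += path_counts[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in adjacency}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                dependency[v] += path_counts[v] / path_counts[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Every unordered pair was counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in centrality.items()}


# Hypothetical toy network: `hub` links g1, g2 and g3; g4 hangs off g3.
toy_network = {
    "hub": {"g1", "g2", "g3"},
    "g1": {"hub"},
    "g2": {"hub"},
    "g3": {"hub", "g4"},
    "g4": {"g3"},
}
scores = betweenness_centrality(toy_network)
top_candidate = max(scores, key=scores.get)
```

Here `hub` receives the highest score because most shortest paths between the other nodes pass through it, mirroring how a highly connected candidate would be ranked first under this kind of scoring.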
Affiliation(s)
| | | | | | | | - Yen-Chun Lai
- 1 Department of Medicine.,2 Vascular Medicine Institute, and
| | - Karin C Potoka
- 1 Department of Medicine.,3 Department of Pediatrics, University of Pittsburgh and University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
| | | | | | | | | | | | | | | | | | - Ana L Mora
- 1 Department of Medicine.,2 Vascular Medicine Institute, and
| | - Mark T Gladwin
- 1 Department of Medicine.,2 Vascular Medicine Institute, and
| | | |
|
50
|
Abstract
The rapid increase in loci discovered in genome-wide association studies has created a need to understand the biological implications of these results. Gene-set analysis provides a means of gaining such understanding, but the statistical properties of gene-set analysis are not well understood, which compromises our ability to interpret its results. In this Analysis article, we provide an extensive statistical evaluation of the core structure that is inherent to all gene-set analyses and we examine current implementations in available tools. We show which factors affect valid and successful detection of gene sets and which provide a solid foundation for performing and interpreting gene-set analysis.
|