1
Ziętek MM, Jaszczyk A, Stankiewicz AM, Sampino S. Prenatal gene-environment interactions mediate the impact of advanced maternal age on mouse offspring behavior. Sci Rep 2024; 14:31733. [PMID: 39738558 DOI: 10.1038/s41598-024-82070-x] [Received: 05/24/2024] [Accepted: 12/02/2024] [Indexed: 01/02/2025]
Abstract
Autism spectrum disorders encompass diverse neurodevelopmental conditions marked by alterations in social communication and repetitive behaviors. Advanced maternal age is associated with an increased risk of bearing children affected by autism, but the etiological factors underlying this association are not well known. Here, we investigated the effects of advanced maternal age on offspring health and behavior in two genetically divergent mouse strains: the BTBR T+ Itpr3tf/J (BTBR) mouse model of idiopathic autism, and the C57BL/6J (B6) control strain, as a model of genetic variability. In both strains, advanced maternal age negatively affected female reproductive and pregnancy outcomes, and perturbed placental and fetal growth and the expression of genes in fetal brain tissues. Postnatally, advanced maternal age had strain-dependent effects on offspring sociability, learning skills, and the occurrence of perseverative behaviors, varying between male and female offspring. These findings disentangle the relationship between genetic determinants and maternal age-related factors in shaping the emergence of autism-like behaviors in mice, highlighting the interplay between maternal age, genetic variability, and prenatal programming in the occurrence of neurodevelopmental disorders.
Affiliation(s)
- Marta Marlena Ziętek
  - Department of Experimental Embryology, Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, Jastrzębiec, Poland
- Aneta Jaszczyk
  - Department of Animal Behavior and Welfare, Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, Jastrzębiec, Poland
- Adrian Mateusz Stankiewicz
  - Department of Animal Behavior and Welfare, Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, Jastrzębiec, Poland
- Silvestre Sampino
  - Department of Experimental Embryology, Institute of Genetics and Animal Biotechnology of the Polish Academy of Sciences, Jastrzębiec, Poland
2
Hagenauer MH, Sannah Y, Hebda-Bauer EK, Rhoads C, O'Connor AM, Flandreau E, Watson SJ, Akil H. Resource: A curated database of brain-related functional gene sets (Brain.GMT). MethodsX 2024; 13:102788. [PMID: 39049932 PMCID: PMC11267058 DOI: 10.1016/j.mex.2024.102788] [Received: 04/17/2024] [Accepted: 05/31/2024] [Indexed: 07/27/2024]
Abstract
Transcriptional profiling has become a common tool for investigating the nervous system. During analysis, differential expression results are often compared to functional ontology databases, which contain curated gene sets representing well-studied pathways. This dependence can cause neuroscience studies to be interpreted in terms of functional pathways documented in better studied tissues (e.g., liver) and topics (e.g., cancer), and systematically emphasizes well-studied genes, leaving other findings in the obscurity of the brain "ignorome". To address this issue, we compiled a curated database of 918 gene sets related to nervous system function, tissue, and cell types ("Brain.GMT") that can be used within common analysis pipelines (GSEA, limma, edgeR) to interpret results from three species (rat, mouse, human). Brain.GMT includes brain-related gene sets curated from the Molecular Signatures Database (MSigDB) and extracted from public databases (GeneWeaver, Gemma, DropViz, BrainInABlender, HippoSeq) and published studies containing differential expression results. Although Brain.GMT is still undergoing development and currently only represents a fraction of available brain gene sets, "brain ignorome" genes are already better represented than in traditional Gene Ontology databases. Moreover, Brain.GMT substantially improves the quantity and quality of gene sets identified as enriched with differential expression in neuroscience studies, enhancing interpretation.
- We compiled a curated database of 918 gene sets related to nervous system function, tissue, and cell types ("Brain.GMT").
- Brain.GMT can be used within common analysis pipelines (GSEA, limma, edgeR) to interpret neuroscience transcriptional profiling results from three species (rat, mouse, human).
- Although Brain.GMT is still undergoing development, it substantially improved the interpretation of differential expression results within our initial use cases.
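For readers unfamiliar with gene set files: the GMT format used by GSEA stores one gene set per tab-delimited line (set name, a description field, then the member genes), and the database name suggests Brain.GMT is distributed in that layout. The following is a minimal sketch of reading such a file, assuming that standard layout. The function name `parse_gmt` and the two example sets are hypothetical placeholders, not entries from Brain.GMT.

```python
from typing import Dict, Iterable, Set


def parse_gmt(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Parse GMT-formatted lines into {gene_set_name: set_of_genes}."""
    gene_sets: Dict[str, Set[str]] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # a record needs a name, a description, and at least one gene
        name, _description, *genes = fields
        gene_sets[name] = {gene for gene in genes if gene}
    return gene_sets


# Hypothetical input: placeholder set names and memberships, for illustration only.
example_lines = [
    "EXAMPLE_MYELIN_SET\tplaceholder description\tMag\tCldn11\tMbp\n",
    "EXAMPLE_STRESS_SET\tplaceholder description\tFkbp5\tSgk1\n",
]
example_sets = parse_gmt(example_lines)
```

A dictionary of sets keeps membership lookups cheap, which is the operation enrichment tools repeat most often. To read from disk, pass an open file handle instead of a list.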
Affiliation(s)
- Megan H. Hagenauer
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Yusra Sannah
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Cosette Rhoads
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
  - National Institutes of Health, Bethesda, MD 20892, USA
- Angela M. O'Connor
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Stanley J. Watson
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Huda Akil
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
3
Duan TQ, Hagenauer MH, Flandreau EI, Bader A, Nguyen DM, Maras PM, Merscher S De Lima R, Gyles T, Mclain C, Meaney MJ, Nestler EJ, Watson SJ, Akil H. A meta-analysis of the effects of early life stress on the prefrontal cortex transcriptome suggests long-term effects on myelin. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.11.22.624315. [PMID: 39605735 PMCID: PMC11601536 DOI: 10.1101/2024.11.22.624315] [Indexed: 11/29/2024]
Abstract
Background: Early life stress (ELS) refers to exposure to negative childhood experiences, such as neglect, disaster, and physical, mental, or emotional abuse. ELS can permanently alter the brain, leading to cognitive impairment, increased sensitivity to future stressors, and mental health risks. The prefrontal cortex (PFC) is a key brain region implicated in the effects of ELS.
Methods: To better understand the effects of ELS on the PFC, we ran a meta-analysis of publicly available transcriptional profiling datasets. We identified five datasets (GSE89692, GSE116416, GSE14720, GSE153043, GSE124387) that characterized the long-term effects of multi-day postnatal ELS paradigms (maternal separation, limited nesting/bedding) in male and female laboratory rodents (rats, mice). The outcome variable was gene expression in the PFC later in adulthood as measured by microarray or RNA-Seq. To conduct the meta-analysis, preprocessed gene expression data were extracted from the Gemma database. Following quality control, the final sample size was n=89: n=42 controls & n=47 ELS: GSE116416 n=23 (no outliers); GSE116416 n=44 (2 outliers); GSE14720 n=7 (no outliers); GSE153043 n=9 (1 outlier), and GSE124387 n=6 (no outliers). Differential expression was calculated using the limma pipeline followed by an empirical Bayes correction. For each gene, a random effects meta-analysis model was then fit to the ELS vs. Control effect sizes (Log2 Fold Changes) from each study.
Results: Our meta-analysis yielded stable estimates for 11,885 genes, identifying five genes with differential expression following ELS (false discovery rate < 0.05): transforming growth factor alpha (Tgfa), IQ motif containing GTPase activating protein 3 (Iqgap3), collagen, type XI, alpha 1 (Col11a1), claudin 11 (Cldn11) and myelin associated glycoprotein (Mag), all of which were downregulated. Broadly, gene sets associated with oligodendrocyte differentiation, myelination, and brain development were downregulated following ELS. In contrast, genes previously shown to be upregulated in Major Depressive Disorder patients were upregulated following ELS.
Conclusion: These findings suggest that ELS during critical periods of development may produce long-term effects on the efficiency of transmission in the PFC and drive changes in gene expression similar to those underlying depression.
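The per-gene random effects model mentioned in the Methods can be illustrated with a small sketch. The abstract does not state which between-study variance estimator was used, so the DerSimonian-Laird estimator below is an assumption chosen for brevity. The function name and the input numbers are hypothetical and are not values from the study.

```python
from math import sqrt
from typing import Sequence, Tuple


def random_effects_meta(
    effects: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float, float]:
    """Pool per-study effect sizes (e.g. log2 fold changes) for one gene.

    Uses the DerSimonian-Laird estimator of between-study variance (an
    assumption; the source does not name its estimator).
    Returns (pooled_effect, standard_error, tau_squared).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching effects and variances from at least two studies")
    k = len(effects)
    weights = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study's variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, sqrt(1.0 / sum_re), tau2


# Hypothetical log2 fold changes and sampling variances from five studies.
pooled, se, tau2 = random_effects_meta(
    [-0.30, -0.12, -0.45, -0.20, -0.05],
    [0.010, 0.006, 0.040, 0.030, 0.050],
)
```

Dividing the pooled effect by its standard error gives a z-statistic for one gene. Repeating this across genes and correcting the resulting p-values for multiple testing yields a false discovery rate of the kind reported above.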
4
Yuan H, Hicks P, Ahmadian M, Johnson KA, Valtadoros L, Krishnan A. Annotating publicly-available samples and studies using interpretable modeling of unstructured metadata. Brief Bioinform 2024; 26:bbae652. [PMID: 39710433 DOI: 10.1093/bib/bbae652] [Received: 07/16/2024] [Revised: 10/31/2024] [Accepted: 12/13/2024] [Indexed: 12/24/2024]
Abstract
Reusing massive collections of publicly available biomedical data can significantly impact knowledge discovery. However, these public samples and studies are typically described using unstructured plain text, hindering the findability and further reuse of the data. To combat this problem, we propose txt2onto 2.0, a general-purpose method based on natural language processing and machine learning for annotating biomedical unstructured metadata to controlled vocabularies of diseases and tissues. Compared to the previous version (txt2onto 1.0), which uses numerical embeddings as features, this new version uses words as features, resulting in improved interpretability and performance, especially when few positive training instances are available. Txt2onto 2.0 uses embeddings from a large language model during prediction to deal with unseen-yet-relevant words related to each disease and tissue term being predicted from the input text, thereby explaining the basis of every annotation. We demonstrate the generalizability of txt2onto 2.0 by accurately predicting disease annotations for studies from independent datasets, using proteomics and clinical trials as examples. Overall, our approach can annotate biomedical text regardless of experimental types or sources. Code, data, and trained models are available at https://github.com/krishnanlab/txt2onto2.0.
Affiliation(s)
- Hao Yuan
  - Genetics and Genome Sciences Program, Michigan State University, East Lansing, MI 48823, United States
  - Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI 48823, United States
- Parker Hicks
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- Mansooreh Ahmadian
  - Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- Kayla A Johnson
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- Lydia Valtadoros
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
- Arjun Krishnan
  - Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, United States
5
Cavalcante BRR, Freitas RD, Siquara da Rocha LO, Santos RSB, Souza BSDF, Ramos PIP, Rocha GV, Gurgel Rocha CA. In silico approaches for drug repurposing in oncology: a scoping review. Front Pharmacol 2024; 15:1400029. [PMID: 38919258 PMCID: PMC11196849 DOI: 10.3389/fphar.2024.1400029] [Received: 03/12/2024] [Accepted: 05/14/2024] [Indexed: 06/27/2024]
Abstract
Introduction: Cancer refers to a group of diseases characterized by the uncontrolled growth and spread of abnormal cells in the body. Due to its complexity, it has been hard to find an ideal medicine to treat all cancer types, although there is an urgent need for it. However, developing a new drug is costly and time-consuming. In this sense, drug repurposing (DR) can hasten drug discovery by giving existing drugs new disease indications. Many computational methods have been applied to achieve DR, but just a few have succeeded. Therefore, this review aims to show in silico DR approaches and the gap between these strategies and their ultimate application in oncology.
Methods: The scoping review was conducted according to the Arksey and O'Malley framework and the Joanna Briggs Institute recommendations. Relevant studies were identified through electronic searching of PubMed/MEDLINE, Embase, Scopus, and Web of Science databases, as well as the grey literature. We included peer-reviewed research articles involving in silico strategies applied to drug repurposing in oncology, published between 1 January 2003 and 31 December 2021.
Results: We identified 238 studies for inclusion in the review. The United States, India, China, South Korea, and Italy were the top publishing countries. Regarding cancer types, breast cancer, lymphomas and leukemias, lung, colorectal, and prostate cancer were the most investigated. Additionally, most studies solely used computational methods, and just a few assessed more complex scientific models. Lastly, molecular modeling, which includes molecular docking and molecular dynamics simulations, was the most frequently used method, followed by signature-, Machine Learning-, and network-based strategies.
Discussion: DR is a trending opportunity but still demands extensive testing to ensure its safety and efficacy for the new indications. Finally, implementing DR can be challenging due to various factors, including lack of quality data, patient populations, cost, intellectual property issues, market considerations, and regulatory requirements. Despite all the hurdles, DR remains an exciting strategy for identifying new treatments for numerous diseases, including cancer types, and giving patients faster access to new medications.
Affiliation(s)
- Bruno Raphael Ribeiro Cavalcante
  - Gonçalo Moniz Institute, Oswaldo Cruz Foundation (IGM-FIOCRUZ/BA), Salvador, Brazil
  - Department of Pathology and Forensic Medicine of the School of Medicine, Federal University of Bahia, Salvador, Brazil
- Raíza Dias Freitas
  - Gonçalo Moniz Institute, Oswaldo Cruz Foundation (IGM-FIOCRUZ/BA), Salvador, Brazil
  - Department of Social and Pediatric Dentistry of the School of Dentistry, Federal University of Bahia, Salvador, Brazil
- Leonardo de Oliveira Siquara da Rocha
  - Gonçalo Moniz Institute, Oswaldo Cruz Foundation (IGM-FIOCRUZ/BA), Salvador, Brazil
  - Department of Pathology and Forensic Medicine of the School of Medicine, Federal University of Bahia, Salvador, Brazil
- Bruno Solano de Freitas Souza
  - Gonçalo Moniz Institute, Oswaldo Cruz Foundation (IGM-FIOCRUZ/BA), Salvador, Brazil
  - D’Or Institute for Research and Education (IDOR), Salvador, Brazil
- Pablo Ivan Pereira Ramos
  - Gonçalo Moniz Institute, Oswaldo Cruz Foundation (IGM-FIOCRUZ/BA), Salvador, Brazil
  - Center of Data and Knowledge Integration for Health (CIDACS), Salvador, Brazil
- Gisele Vieira Rocha
  - Gonçalo Moniz Institute, Oswaldo Cruz Foundation (IGM-FIOCRUZ/BA), Salvador, Brazil
  - D’Or Institute for Research and Education (IDOR), Salvador, Brazil
- Clarissa Araújo Gurgel Rocha
  - Gonçalo Moniz Institute, Oswaldo Cruz Foundation (IGM-FIOCRUZ/BA), Salvador, Brazil
  - Department of Pathology and Forensic Medicine of the School of Medicine, Federal University of Bahia, Salvador, Brazil
  - D’Or Institute for Research and Education (IDOR), Salvador, Brazil
  - Department of Propaedeutics, School of Dentistry of the Federal University of Bahia, Salvador, Brazil
6
Piquer-Esteban S, Arnau V, Diaz W, Moya A. OMD Curation Toolkit: a workflow for in-house curation of public omics datasets. BMC Bioinformatics 2024; 25:184. [PMID: 38724907 PMCID: PMC11084137 DOI: 10.1186/s12859-024-05803-9] [Received: 09/13/2023] [Accepted: 05/07/2024] [Indexed: 05/12/2024]
Abstract
BACKGROUND: Major advances in sequencing technologies and the sharing of data and metadata in science have resulted in a wealth of publicly available datasets. However, working with and especially curating public omics datasets remains challenging despite these efforts. While a growing number of initiatives aim to re-use previous results, these present limitations that often lead to the need for further in-house curation and processing.
RESULTS: Here, we present the Omics Dataset Curation Toolkit (OMD Curation Toolkit), a Python 3 package designed to accompany and guide the researcher during the curation process of metadata and fastq files of public omics datasets. This workflow provides a standardized framework with multiple capabilities (collection, control check, treatment and integration) to facilitate the arduous task of curating public sequencing data projects. While centered on the European Nucleotide Archive (ENA), the majority of the provided tools are generic and can be used to curate datasets from different sources.
CONCLUSIONS: Thus, it offers valuable tools for the in-house curation previously needed to re-use public omics data. Due to its workflow structure and capabilities, it can be easily used and can benefit investigators in developing novel omics meta-analyses based on sequencing data.
Affiliation(s)
- Samuel Piquer-Esteban
  - Institute for Integrative Systems Biology (I2SysBio), University of Valencia and Spanish National Research Council, Valencia, Spain
  - Area of Genomics and Health, Foundation for the Promotion of Sanitary and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain
- Vicente Arnau
  - Institute for Integrative Systems Biology (I2SysBio), University of Valencia and Spanish National Research Council, Valencia, Spain
  - Area of Genomics and Health, Foundation for the Promotion of Sanitary and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain
  - Biomedical Research Networking Centre for Epidemiology and Public Health (CIBEResp), Madrid, Spain
- Wladimiro Diaz
  - Institute for Integrative Systems Biology (I2SysBio), University of Valencia and Spanish National Research Council, Valencia, Spain
  - Area of Genomics and Health, Foundation for the Promotion of Sanitary and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain
  - Biomedical Research Networking Centre for Epidemiology and Public Health (CIBEResp), Madrid, Spain
- Andrés Moya
  - Institute for Integrative Systems Biology (I2SysBio), University of Valencia and Spanish National Research Council, Valencia, Spain
  - Area of Genomics and Health, Foundation for the Promotion of Sanitary and Biomedical Research of Valencia Region (FISABIO-Public Health), Valencia, Spain
  - Biomedical Research Networking Centre for Epidemiology and Public Health (CIBEResp), Madrid, Spain
7
Hagenauer MH, Sannah Y, Hebda-Bauer EK, Rhoads C, O'Connor AM, Watson SJ, Akil H. Resource: A Curated Database of Brain-Related Functional Gene Sets (Brain.GMT). BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.05.588301. [PMID: 38645214 PMCID: PMC11030436 DOI: 10.1101/2024.04.05.588301] [Indexed: 04/23/2024]
Abstract
Transcriptional profiling has become a common tool for investigating the nervous system. During analysis, differential expression results are often compared to functional ontology databases, which contain curated gene sets representing well-studied pathways. This dependence can cause neuroscience studies to be interpreted in terms of functional pathways documented in better studied tissues (e.g., liver) and topics (e.g., cancer), and systematically emphasizes well-studied genes, leaving other findings in the obscurity of the brain "ignorome". To address this issue, we compiled a curated database of 918 gene sets related to nervous system function, tissue, and cell types ("Brain.GMT") that can be used within common analysis pipelines (GSEA, limma, edgeR) to interpret results from three species (rat, mouse, human). Brain.GMT includes brain-related gene sets curated from the Molecular Signatures Database (MSigDB) and extracted from public databases (GeneWeaver, Gemma, DropViz, BrainInABlender, HippoSeq) and published studies containing differential expression results. Although Brain.GMT is still undergoing development and currently only represents a fraction of available brain gene sets, "brain ignorome" genes are already better represented than in traditional Gene Ontology databases. Moreover, Brain.GMT substantially improves the quantity and quality of gene sets identified as enriched with differential expression in neuroscience studies, enhancing interpretation.
Affiliation(s)
- Megan H Hagenauer
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Yusra Sannah
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Elaine K Hebda-Bauer
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Cosette Rhoads
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
  - National Institutes of Health, Bethesda, MD 20892, USA
- Angela M O'Connor
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Stanley J Watson
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
- Huda Akil
  - Michigan Neuroscience Institute, University of Michigan, Ann Arbor, MI 48109, USA
8
Mecham A, Stephenson A, Quinteros BI, Brown GS, Piccolo SR. TidyGEO: preparing analysis-ready datasets from Gene Expression Omnibus. J Integr Bioinform 2024; 21:jib-2023-0021. [PMID: 38047898 PMCID: PMC11294518 DOI: 10.1515/jib-2023-0021] [Received: 06/13/2023] [Accepted: 11/20/2023] [Indexed: 12/05/2023]
Abstract
TidyGEO is a Web-based tool for downloading, tidying, and reformatting data series from Gene Expression Omnibus (GEO). As a freely accessible repository with data from over 6 million biological samples across more than 4000 organisms, GEO provides diverse opportunities for secondary research. Although scientists may find assay data relevant to a given research question, most analyses require sample-level annotations. In GEO, such annotations are stored alongside assay data in delimited, text-based files. However, the structure and semantics of the annotations vary widely from one series to another, and many annotations are not useful for analysis purposes. Thus, every GEO series must be tidied before it is analyzed. Manual approaches may be used, but these are error prone and take time away from other research tasks. Custom computer scripts can be written, but many scientists lack the computational expertise to create such scripts. To address these challenges, we created TidyGEO, which supports essential data-cleaning tasks for sample-level annotations, such as selecting informative columns, renaming columns, splitting or merging columns, standardizing data values, and filtering samples. Additionally, users can integrate annotations with assay data, restructure assay data, and generate code that enables others to reproduce these steps.
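To make the tidying tasks concrete, here is a minimal sketch of two of them (splitting annotation strings into columns and filtering samples), written in plain Python. It is not TidyGEO's code. It assumes sample annotations arrive as "key: value" strings, a common shape for GEO sample characteristics. The function names, sample identifiers, and values are hypothetical.

```python
from typing import Dict, Iterable, List


def split_characteristics(samples: Dict[str, List[str]]) -> Dict[str, Dict[str, str]]:
    """Turn per-sample 'key: value' strings into one tidy row (dict) per sample."""
    tidy: Dict[str, Dict[str, str]] = {}
    for sample_id, entries in samples.items():
        row: Dict[str, str] = {}
        for entry in entries:
            key, separator, value = entry.partition(":")
            if not separator:
                continue  # entry has no 'key: value' structure, so skip it
            # Standardize column names: lower case, underscores instead of spaces.
            column = key.strip().lower().replace(" ", "_")
            row[column] = value.strip()
        tidy[sample_id] = row
    return tidy


def filter_samples(
    rows: Dict[str, Dict[str, str]], column: str, allowed: Iterable[str]
) -> Dict[str, Dict[str, str]]:
    """Keep only samples whose value in `column` is one of `allowed`."""
    allowed_set = set(allowed)
    return {sid: row for sid, row in rows.items() if row.get(column) in allowed_set}


# Hypothetical raw annotations (placeholder sample IDs and values).
raw_annotations = {
    "SAMPLE_A": ["Tissue: hippocampus", "Sex: female", "free text note"],
    "SAMPLE_B": ["Tissue: liver", "Sex: male"],
}
tidy_rows = split_characteristics(raw_annotations)
brain_rows = filter_samples(tidy_rows, "tissue", ["hippocampus"])
```

Real series vary far more than this example: keys may be missing, repeated, or spelled inconsistently across samples. That variability is the reason a dedicated tidying tool is useful.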
Affiliation(s)
- Avery Mecham
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Ashlie Stephenson
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Badi I. Quinteros
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Grace S. Brown
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
9
Buzzao D, Castresana-Aguirre M, Guala D, Sonnhammer ELL. Benchmarking enrichment analysis methods with the disease pathway network. Brief Bioinform 2024; 25:bbae069. [PMID: 38436561 PMCID: PMC10939300 DOI: 10.1093/bib/bbae069] [Received: 09/29/2023] [Revised: 01/10/2024] [Accepted: 02/03/2024] [Indexed: 03/05/2024]
Abstract
Enrichment analysis (EA) is a common approach to gain functional insights from genome-scale experiments. As a consequence, a large number of EA methods have been developed, yet it is unclear from previous studies which method is the best for a given dataset. The main issues with previous benchmarks include the complexity of correctly assigning true pathways to a test dataset, and lack of generality of the evaluation metrics, for which the rank of a single target pathway is commonly used. We here provide a generalized EA benchmark and apply it to the most widely used EA methods, representing all four categories of current approaches. The benchmark employs a new set of 82 curated gene expression datasets from DNA microarray and RNA-Seq experiments for 26 diseases, of which only 13 are cancers. In order to address the shortcomings of the single target pathway approach and to enhance the sensitivity evaluation, we present the Disease Pathway Network, in which related Kyoto Encyclopedia of Genes and Genomes pathways are linked. We introduce a novel approach to evaluate pathway EA by combining sensitivity and specificity to provide a balanced evaluation of EA methods. This approach identifies Network Enrichment Analysis methods as the overall top performers compared with overlap-based methods. By using randomized gene expression datasets, we explore the null hypothesis bias of each method, revealing that most of them produce skewed P-values.
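As a point of reference for the overlap-based category mentioned above, the simplest such method is an over-representation test based on the hypergeometric distribution. The sketch below shows that textbook test only. It is not the benchmark's code and not any of the network-based methods. The function name and the example numbers are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(
    n_universe: int, n_pathway: int, n_selected: int, n_overlap: int
) -> float:
    """One-sided hypergeometric p-value for pathway over-representation.

    Returns the probability of seeing at least `n_overlap` pathway genes when
    `n_selected` genes are drawn without replacement from a universe of
    `n_universe` genes that contains `n_pathway` pathway members.
    """
    if not (0 <= n_pathway <= n_universe and 0 <= n_selected <= n_universe):
        raise ValueError("pathway and selection sizes must fit inside the universe")
    lowest = max(n_overlap, 0)
    highest = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the upper tail; comb() returns 0 for impossible draws, so no extra bounds are needed.
    tail = sum(
        comb(n_pathway, i) * comb(n_universe - n_pathway, n_selected - i)
        for i in range(lowest, highest + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 genes measured, a 100-gene pathway,
# 300 differentially expressed genes, 8 of them in the pathway.
p_value = overrepresentation_pvalue(20000, 100, 300, 8)
```

This test treats genes as independent and ignores pathway topology. Network-based approaches are designed to avoid those simplifications.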
Affiliation(s)
- Davide Buzzao
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 171 21 Solna, Sweden
- Dimitri Guala
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 171 21 Solna, Sweden
- Erik L L Sonnhammer
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 171 21 Solna, Sweden
10
Ratajczak F, Joblin M, Hildebrandt M, Ringsquandl M, Falter-Braun P, Heinig M. Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases. Nat Commun 2023; 14:7206. [PMID: 37938585 PMCID: PMC10632370 DOI: 10.1038/s41467-023-42975-z] [Received: 03/27/2023] [Accepted: 10/27/2023] [Indexed: 11/09/2023]
Abstract
Understanding phenotype-to-genotype relationships is a grand challenge of 21st century biology with translational implications. The recently proposed "omnigenic" model postulates that effects of genetic variation on traits are mediated by core-genes and -proteins whose activities mechanistically influence the phenotype, whereas peripheral genes encode a regulatory network that indirectly affects phenotypes via core gene products. Here, we develop a positive-unlabeled graph representation-learning ensemble-approach based on a nested cross-validation to predict core-like genes for diverse diseases using Mendelian disorder genes for training. Employing mouse knockout phenotypes for external validations, we demonstrate that core-like genes display several key properties of core genes: Mouse knockouts of genes corresponding to our most confident predictions give rise to relevant mouse phenotypes at rates on par with the Mendelian disorder genes, and all candidates exhibit core gene properties like transcriptional deregulation in disease and loss-of-function intolerance. Moreover, as predicted for core genes, our candidates are enriched for drug targets and druggable proteins. In contrast to Mendelian disorder genes the new core-like genes are enriched for druggable yet untargeted gene products, which are therefore attractive targets for drug development. Interpretation of the underlying deep learning model suggests plausible explanations for our core gene predictions in form of molecular mechanisms and physical interactions. Our results demonstrate the potential of graph representation learning for the interpretation of biological complexity and pave the way for studying core gene properties and future drug development.
Affiliation(s)
- Florin Ratajczak
  - Institute of Network Biology (INET), Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich, Neuherberg, Germany
- Pascal Falter-Braun
  - Institute of Network Biology (INET), Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich, Neuherberg, Germany
  - Microbe-Host Interactions, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Matthias Heinig
  - Institute of Computational Biology (ICB), Helmholtz Munich, Neuherberg, Germany
  - Department of Computer Science, TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
  - German Centre for Cardiovascular Research (DZHK), Munich Heart Association, Partner Site Munich, Berlin, Germany
11
Morin A, Chu ECP, Sharma A, Adrian-Hamazaki A, Pavlidis P. Characterizing the targets of transcription regulators by aggregating ChIP-seq and perturbation expression data sets. Genome Res 2023; 33:763-778. [PMID: 37308292 PMCID: PMC10317128 DOI: 10.1101/gr.277273.122] [Received: 08/31/2022] [Accepted: 04/26/2023] [Indexed: 06/14/2023]
Abstract
Mapping the gene targets of chromatin-associated transcription regulators (TRs) is a major goal of genomics research. ChIP-seq of TRs and experiments that perturb a TR and measure the differential abundance of gene transcripts are a primary means by which direct relationships are tested on a genomic scale. It has been reported that there is a poor overlap in the evidence across gene regulation strategies, emphasizing the need for integrating results from multiple experiments. Although research consortia interested in gene regulation have produced a valuable trove of high-quality data, there is an even greater volume of TR-specific data throughout the literature. In this study, we show a workflow for the identification, uniform processing, and aggregation of ChIP-seq and TR perturbation experiments for the ultimate purpose of ranking human and mouse TR-target interactions. Focusing on an initial set of eight regulators (ASCL1, HES1, MECP2, MEF2C, NEUROD1, PAX6, RUNX1, and TCF4), we identified 497 experiments suitable for analysis. We used this corpus to examine data concordance, to identify systematic patterns of the two data types, and to identify putative orthologous interactions between human and mouse. We build upon commonly used strategies to forward a procedure for aggregating and combining these two genomic methodologies, assessing these rankings against independent literature-curated evidence. Beyond a framework extensible to other TRs, our work also provides empirically ranked TR-target listings, as well as transparent experiment-level gene summaries for community use.
Affiliation(s)
- Alexander Morin
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Eric Ching-Pan Chu
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Aman Sharma
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Alex Adrian-Hamazaki
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Paul Pavlidis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
|
12
|
Overnight Corticosterone and Gene Expression in Mouse Hippocampus: Time Course during Resting Period. Int J Mol Sci 2023; 24:ijms24032828. [PMID: 36769150 PMCID: PMC9917930 DOI: 10.3390/ijms24032828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 01/17/2023] [Accepted: 01/18/2023] [Indexed: 02/05/2023] Open
Abstract
The aim of the experiment was to test the effect of an elevated level of glucocorticoids on the mouse hippocampal transcriptome after 12 h of treatment with corticosterone administered during the active phase of the circadian cycle. We also tested the circadian changes in gene expression and the decay time of the transcriptomic response to corticosterone. Gene expression was analyzed using microarrays. The results show that transcriptomic responses to glucocorticoids are heterogeneous in decay time, with some genes displaying persistent changes in expression during 9 h of rest. We also found a considerable overlap between genes regulated by corticosterone and genes previously implicated in the stress response. Examples of such genes are Acer2, Agt, Apod, Aqp4, Etnppl, Fabp7, Fam107a, Fjx1, Fmo2, Galnt15, Gjc2, Heph, Hes5, Htra1, Jdp2, Kif5a, Lfng, Lrg1, Mgp, Mt1, Pglyrp1, Pla2g3, Plin4, Pllp, Ptgds, Ptn, Slc2a1, Slco1c1, Sult1a1, Thbd and Txnip. This indicates that the applied model is a useful tool for investigating the mechanisms underlying the stress response.
|
13
|
Ament SA, Adkins RS, Carter R, Chrysostomou E, Colantuoni C, Crabtree J, Creasy HH, Degatano K, Felix V, Gandt P, Garden G, Giglio M, Herb BR, Khajouei F, Kiernan E, McCracken C, McDaniel K, Nadendla S, Nickel L, Olley D, Orvis J, Receveur J, Schor M, Sonthalia S, Tickle T, Way J, Hertzano R, Mahurkar A, White O. The Neuroscience Multi-Omic Archive: a BRAIN Initiative resource for single-cell transcriptomic and epigenomic data from the mammalian brain. Nucleic Acids Res 2023; 51:D1075-D1085. [PMID: 36318260 PMCID: PMC9825473 DOI: 10.1093/nar/gkac962] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/22/2022] [Revised: 09/30/2022] [Accepted: 10/27/2022] [Indexed: 11/06/2022] Open
Abstract
Scalable technologies to sequence the transcriptomes and epigenomes of single cells are transforming our understanding of cell types and cell states. The Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative Cell Census Network (BICCN) is applying these technologies at unprecedented scale to map the cell types in the mammalian brain. In an effort to increase data FAIRness (Findable, Accessible, Interoperable, Reusable), the NIH has established repositories to make data generated by the BICCN and related BRAIN Initiative projects accessible to the broader research community. Here, we describe the Neuroscience Multi-Omic Archive (NeMO Archive; nemoarchive.org), which serves as the primary repository for genomics data from the BRAIN Initiative. Working closely with other BRAIN Initiative researchers, we have organized these data into a continually expanding, curated repository, which contains transcriptomic and epigenomic data from over 50 million brain cells, including single-cell genomic data from all of the major regions of the adult and prenatal human and mouse brains, as well as substantial single-cell genomic data from non-human primates. We make available several tools for accessing these data, including a searchable web portal, a cloud-computing interface for large-scale data processing (implemented on Terra, terra.bio), and a visualization and analysis platform, NeMO Analytics (nemoanalytics.org).
Affiliation(s)
- Seth A Ament
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD, USA
- Ricky S Adkins
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Robert Carter
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Elena Chrysostomou
  - Department of Otorhinolaryngology Head and Neck Surgery, University of Maryland School of Medicine, Baltimore, MD, USA
- Carlo Colantuoni
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Departments of Neurology and Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Jonathan Crabtree
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Heather H Creasy
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Kylee Degatano
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Victor Felix
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Peter Gandt
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Gwenn A Garden
  - Department of Neurology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Michelle Giglio
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Brian R Herb
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Farzaneh Khajouei
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Elizabeth Kiernan
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Carrie McCracken
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Kennedy McDaniel
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Suvarna Nadendla
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Lance Nickel
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Dustin Olley
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Joshua Orvis
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Joseph P Receveur
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Mike Schor
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Shreyash Sonthalia
  - Department of Biomedical Engineering, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Timothy L Tickle
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Jessica Way
  - Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ronna Hertzano
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Otorhinolaryngology Head and Neck Surgery, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, MD, USA
- Anup A Mahurkar
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
- Owen R White
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
  - Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, MD, USA
|
14
|
Stankiewicz AM, Jaszczyk A, Goscik J, Juszczak GR. Stress and the brain transcriptome: Identifying commonalities and clusters in standardized data from published experiments. Prog Neuropsychopharmacol Biol Psychiatry 2022; 119:110558. [PMID: 35405299 DOI: 10.1016/j.pnpbp.2022.110558] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/06/2021] [Revised: 03/17/2022] [Accepted: 04/04/2022] [Indexed: 12/28/2022]
Abstract
Interpretation of transcriptomic experiments is hindered by many problems, including false positives/negatives inherent to big-data methods and changes in gene nomenclature. To find the most consistent effect of stress on the brain transcriptome, we retrieved data from 79 studies applying animal models and 3 human studies investigating post-traumatic stress disorder (PTSD). The analyzed data were obtained either with microarrays or RNA sequencing applied to samples collected from more than 1887 laboratory animals and from 121 human subjects. Based on the initial database containing a quarter million differential expression effect sizes representing transcripts in three species, we identified the most frequently reported genes in 223 stress-control comparisons. Additionally, the analysis considers sex, individual vulnerability and the contribution of glucocorticoids. We also found an overlap between gene expression in PTSD patients and animals, which indicates the relevance of laboratory models for the human stress response. Our analysis points to genes that, as far as we know, were not specifically tested for their role in stress response (Pllp, Arrdc2, Midn, Mfsd2a, Ccn1, Htra1, Csrnp1, Tenm4, Tnfrsf25, Sema3b, Fmo2, Adamts4, Gjb1, Errfi1, Fgf18, Galnt6, Slc25a42, Ifi30, Slc4a1, Cemip, Klf10, Tom1, Dcdc2c, Fancd2, Luzp2, Trpm1, Abcc12, Osbpl1a, Ptp4a2). The provided transcriptomic resource will be useful for guiding new research.
Affiliation(s)
- Adrian M Stankiewicz
  - Department of Molecular Biology, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Jastrzebiec, Poland
- Aneta Jaszczyk
  - Department of Animal Behavior and Welfare, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Jastrzebiec, Poland
- Joanna Goscik
  - Faculty of Computer Science, Bialystok University of Technology, Bialystok, Poland
- Grzegorz R Juszczak
  - Department of Animal Behavior and Welfare, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Jastrzebiec, Poland
|
15
|
Figueiredo RQ, Del Ser SD, Raschka T, Hofmann-Apitius M, Kodamullil AT, Mubeen S, Domingo-Fernández D. Elucidating gene expression patterns across multiple biological contexts through a large-scale investigation of transcriptomic datasets. BMC Bioinformatics 2022; 23:231. [PMID: 35705903 PMCID: PMC9202106 DOI: 10.1186/s12859-022-04765-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Accepted: 06/03/2022] [Indexed: 11/10/2022] Open
Abstract
Distinct gene expression patterns within cells are foundational for the diversity of functions and unique characteristics observed in specific contexts, such as human tissues and cell types. Though some biological processes commonly occur across contexts, by harnessing the vast amounts of available gene expression data, we can decipher the processes that are unique to a specific context. Therefore, with the goal of developing a portrait of context-specific patterns to better elucidate how they govern distinct biological processes, this work presents a large-scale exploration of transcriptomic signatures across three different contexts (i.e., tissues, cell types, and cell lines) by leveraging over 600 gene expression datasets categorized into 98 subcontexts. The strongest pairwise correlations between genes from these subcontexts are used for the construction of co-expression networks. Using a network-based approach, we then pinpoint patterns that are unique and common across these subcontexts. First, we focused on patterns at the level of individual nodes and evaluated their functional roles using a human protein-protein interactome as a referential network. Next, within each context, we systematically overlaid the co-expression networks to identify specific and shared correlations as well as relations already described in scientific literature. Additionally, in a pathway-level analysis, we overlaid node and edge sets from co-expression networks against pathway knowledge to identify biological processes that are related to specific subcontexts or groups of them. Finally, we have released our data and scripts at https://zenodo.org/record/5831786 and https://github.com/ContNeXt/, respectively, and developed ContNeXt (https://contnext.scai.fraunhofer.de/), a web application to explore the networks generated in this work.
Affiliation(s)
- Rebeca Queiroz Figueiredo
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
- Sara Díaz Del Ser
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
- Tamara Raschka
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
  - Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
- Alpha Tom Kodamullil
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
- Sarah Mubeen
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
  - Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
  - Enveda Biosciences, Boulder, CO, 80301, USA
|
16
|
Luca BA, Steen CB, Matusiak M, Azizi A, Varma S, Zhu C, Przybyl J, Espín-Pérez A, Diehn M, Alizadeh AA, van de Rijn M, Gentles AJ, Newman AM. Atlas of clinically distinct cell states and ecosystems across human solid tumors. Cell 2021; 184:5482-5496.e28. [PMID: 34597583 DOI: 10.1016/j.cell.2021.09.014] [Citation(s) in RCA: 147] [Impact Index Per Article: 36.8] [Received: 11/19/2020] [Revised: 06/21/2021] [Accepted: 09/08/2021] [Indexed: 12/31/2022]
Abstract
Determining how cells vary with their local signaling environment and organize into distinct cellular communities is critical for understanding processes as diverse as development, aging, and cancer. Here we introduce EcoTyper, a machine learning framework for large-scale identification and validation of cell states and multicellular communities from bulk, single-cell, and spatially resolved gene expression data. When applied to 12 major cell lineages across 16 types of human carcinoma, EcoTyper identified 69 transcriptionally defined cell states. Most states were specific to neoplastic tissue, ubiquitous across tumor types, and significantly prognostic. By analyzing cell-state co-occurrence patterns, we discovered ten clinically distinct multicellular communities with unexpectedly strong conservation, including three with myeloid and stromal elements linked to adverse survival, one enriched in normal tissue, and two associated with early cancer development. This study elucidates fundamental units of cellular organization in human carcinoma and provides a framework for large-scale profiling of cellular ecosystems in any tissue.
Affiliation(s)
- Bogdan A Luca
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Chloé B Steen
  - Division of Oncology, Department of Medicine, Stanford University, Stanford, CA 94305, USA
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
- Armon Azizi
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA 94305, USA
- Sushama Varma
  - Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Chunfang Zhu
  - Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Joanna Przybyl
  - Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Almudena Espín-Pérez
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA 94305, USA
- Maximilian Diehn
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA
  - Department of Radiation Oncology, Stanford University, Stanford, CA 94305, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA 94305, USA
- Ash A Alizadeh
  - Division of Oncology, Department of Medicine, Stanford University, Stanford, CA 94305, USA
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA 94305, USA
  - Division of Hematology, Department of Medicine, Stanford University, Stanford, CA 94305, USA
- Matt van de Rijn
  - Department of Pathology, Stanford University, Stanford, CA 94305, USA
- Andrew J Gentles
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA 94305, USA
- Aaron M Newman
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA 94305, USA
|
17
|
Construction of a Potentially Functional circRNA-miRNA-mRNA Network in Intervertebral Disc Degeneration by Bioinformatics Analysis. BioMed Research International 2021; 2021:8352683. [PMID: 34395625 PMCID: PMC8357516 DOI: 10.1155/2021/8352683] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/07/2021] [Accepted: 07/21/2021] [Indexed: 12/19/2022]
Abstract
Background: The competing endogenous RNA (ceRNA)-mediated regulatory mechanisms are known to play a pivotal role in intervertebral disc degeneration (IDD). Our research intended to establish a ceRNA regulatory network related to IDD through bioinformatics analyses. Methods: The expression profiles of circRNA, miRNA, and mRNA were obtained from the public Gene Expression Omnibus (GEO) datasets. Then, we used sequence-based bioinformatics methods to select differentially expressed mRNAs (DEmRNAs), microRNAs (DEmiRNAs), or circRNAs (DEcircRNAs) related to IDD. We used ChEA3 to verify the targets of transcription factors (TFs). Then, we used DAVID to annotate the DEmRNAs. Finally, we constructed a potential circRNA-miRNA-mRNA network related to IDD by prediction using databases (ENCORI, TargetScan, miRecords, miRmap, and circBank). Results: We identified 31 common DEmRNAs by Venn analysis, of which MMP2 was regarded as the key hub gene. Simultaneously, miR-423-5p and miR-185-5p were predicted as the upstream molecules of MMP2. Furthermore, a total of six DEcircRNAs were predicted as the upstream circRNAs of miR-423-5p and miR-185-5p. Then, a potential circRNA-miRNA-mRNA network related to IDD was constructed by bioinformatics analysis. Conclusion: A comprehensive ceRNA regulatory network was constructed, which was found to be significant in IDD progression.
|
18
|
Patel S, Howard D, French L. A pH-eQTL Interaction at the RIT2-SYT4 Parkinson's Disease Risk Locus in the Substantia Nigra. Front Aging Neurosci 2021; 13:690632. [PMID: 34305570 PMCID: PMC8299340 DOI: 10.3389/fnagi.2021.690632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2021] [Accepted: 06/14/2021] [Indexed: 11/13/2022] Open
Abstract
Parkinson's disease causes severe motor and cognitive disabilities that result from the progressive loss of dopamine neurons in the substantia nigra. The rs12456492 variant in the RIT2 gene has been repeatedly associated with increased risk for Parkinson's disease. From a transcriptomic perspective, a meta-analysis found that RIT2 gene expression is correlated with pH in the human brain. To assess these pH associations in relation to Parkinson's disease risk, we examined the two datasets that assayed rs12456492, gene expression, and pH in the postmortem human brain. Using the BrainEAC dataset, we replicate the positive correlation between RIT2 gene expression and pH in the human brain (n = 100). Furthermore, we found that the relationship between expression and pH is influenced by rs12456492. When tested across ten brain regions, this interaction is specifically found in the substantia nigra. A similar association was found for the co-localized SYT4 gene. In addition, SYT4 associations are stronger in a combined model with both genes, and the SYT4 interaction appears to be specific to males. In the Genotype-Tissue Expression (GTEx) dataset, the pH associations involving rs12456492 and expression of either SYT4 or RIT2 were not seen. This null finding may be due to the short postmortem intervals of the GTEx tissue samples. In the BrainEAC data, we tested the effect of postmortem interval and only observed the interactions in samples with the longer intervals. These previously unknown associations suggest novel roles for rs12456492, RIT2, and SYT4 in the regulation of and response to pH in the substantia nigra.
Affiliation(s)
- Sejal Patel
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Derek Howard
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Leon French
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
  - Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
  - Institute for Medical Science, University of Toronto, Toronto, ON, Canada
|
19
|
Patel S, Howard D, Chowdhury N, Derieux C, Wellslager B, Yilmaz Ö, French L. Characterization of Human Genes Modulated by Porphyromonas gingivalis Highlights the Ribosome, Hypothalamus, and Cholinergic Neurons. Front Immunol 2021; 12:646259. [PMID: 34194426 PMCID: PMC8236716 DOI: 10.3389/fimmu.2021.646259] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/25/2020] [Accepted: 05/20/2021] [Indexed: 12/12/2022] Open
Abstract
Porphyromonas gingivalis, a bacterium associated with periodontal disease, is a suspected cause of Alzheimer's disease. This bacterium is reliant on gingipain proteases, which cleave host proteins after arginine and lysine residues. To characterize gingipain susceptibility, we performed enrichment analyses of arginine and lysine proportion proteome-wide. Genes differentially expressed in brain samples with detected P. gingivalis reads were also examined. Genes from these analyses were tested for functional enrichment and specific neuroanatomical expression patterns. Proteins in the SRP-dependent cotranslational protein targeting to membrane pathway were enriched for these residues and previously associated with periodontal and Alzheimer's disease. These ribosomal genes are up-regulated in prefrontal cortex samples with detected P. gingivalis sequences. Other differentially expressed genes have been previously associated with dementia (ITM2B, MAPT, ZNF267, and DHX37). For an anatomical perspective, we characterized the expression of the P. gingivalis associated genes in the mouse and human brain. This analysis highlighted the hypothalamus, cholinergic neurons, and the basal forebrain. Our results suggest markers of neural P. gingivalis infection and link the cholinergic and gingipain hypotheses of Alzheimer's disease.
Affiliation(s)
- Sejal Patel
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Derek Howard
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Nityananda Chowdhury
  - Department of Oral Health Sciences, Medical University of South Carolina, Charleston, SC, United States
- Casey Derieux
  - Department of Oral Health Sciences, Medical University of South Carolina, Charleston, SC, United States
- Bridgette Wellslager
  - Department of Oral Health Sciences, Medical University of South Carolina, Charleston, SC, United States
- Özlem Yilmaz
  - Department of Oral Health Sciences, Medical University of South Carolina, Charleston, SC, United States
  - Department of Microbiology and Immunology, Medical University of South Carolina, Charleston, SC, United States
- Leon French
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
  - Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
  - Institute for Medical Science, University of Toronto, Toronto, ON, Canada
|
20
|
Sicherman J, Newton DF, Pavlidis P, Sibille E, Tripathy SJ. Estimating and Correcting for Off-Target Cellular Contamination in Brain Cell Type Specific RNA-Seq Data. Front Mol Neurosci 2021; 14:637143. [PMID: 33746712 PMCID: PMC7966716 DOI: 10.3389/fnmol.2021.637143] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2020] [Accepted: 02/02/2021] [Indexed: 11/13/2022] Open
Abstract
Transcriptionally profiling minor cellular populations remains an ongoing challenge in molecular genomics. Single-cell RNA sequencing has provided valuable insights into a number of hypotheses, but practical and analytical challenges have limited its widespread adoption. A similar approach, which we term single-cell type RNA sequencing (sctRNA-seq), involves the enrichment and sequencing of a pool of cells, yielding cell type-level resolution transcriptomes. While this approach offers benefits in terms of mRNA sampling from targeted cell types, it is potentially affected by off-target contamination from surrounding cell types. Here, we leveraged single-cell sequencing datasets to apply a computational approach for estimating and controlling the amount of off-target cell type contamination in sctRNA-seq datasets. In datasets obtained using a number of technologies for cell purification, we found that most sctRNA-seq datasets tended to show some amount of off-target mRNA contamination from surrounding cells. However, using covariates for cellular contamination in downstream differential expression analyses increased the quality of our models for differential expression analysis in case/control comparisons and typically resulted in the discovery of more differentially expressed genes. In general, our method provides a flexible approach for detecting and controlling off-target cell type contamination in sctRNA-seq datasets.
Affiliation(s)
- Jordan Sicherman
  - Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Dwight F. Newton
  - Department of Pharmacology and Toxicology, University of Toronto, Toronto, ON, Canada
  - Centre for Addiction and Mental Health, Campbell Family Mental Health Research Institute, Toronto, ON, Canada
- Paul Pavlidis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
  - Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, BC, Canada
- Etienne Sibille
  - Department of Pharmacology and Toxicology, University of Toronto, Toronto, ON, Canada
  - Centre for Addiction and Mental Health, Campbell Family Mental Health Research Institute, Toronto, ON, Canada
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Shreejoy J. Tripathy
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
  - Krembil Centre for Neuroinformatics, Centre for Addiction and Mental Health, Toronto, ON, Canada
|