1
Cortes-Guzman MA, Treviño V. CoGTEx: Unscaled system-level coexpression estimation from GTEx data forecast novel functional gene partners. PLoS One 2024; 19:e0309961. [PMID: 39365797 PMCID: PMC11451983 DOI: 10.1371/journal.pone.0309961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Accepted: 08/21/2024] [Indexed: 10/06/2024] Open
Abstract
MOTIVATION Coexpression estimations are helpful for the analysis of pathways, cofactors, regulators, targets, and human health and disease. Ideally, coexpression estimations should consider as many diverse cell types as possible and account for the fact that available data are not uniform across tissues. Importantly, the coexpression estimations accessible today are performed at the "tissue level", which is based on formulations standardized by cell type. Little or no attention is paid to overall gene expression levels. The tissue-level estimation assumes that the variance of expression levels is more important than the mean expression levels. Here, we challenge this assumption by estimating coexpression at the "system level", that is, without standardization by tissue, and show that it provides valuable information. We made available a resource to view, download, and analyze both tissue- and system-level coexpression estimations from GTEx human data. METHODS GTEx v8 expression data were globally normalized, batch-processed, and filtered. Then, stringent PCA, clustering, and tSNE procedures were applied to generate 42 distinct and curated tissue clusters. Coexpression was estimated from these 42 tissue clusters by computing the correlation of 33,445 genes, sampling 70 samples per tissue cluster to avoid tissue overrepresentation. This process was repeated 20 times, and the minimum value was taken as a robust estimation. Three metrics were calculated (Pearson, Spearman, and G-statistic) in two data processing modes: at the system level (TPM scale) and at the tissue level (z-score scale). RESULTS We first validate our tissue-level estimations by comparison with other databases. Then, through specific analyses of several examples and literature validation of predictions, we show that system-level coexpression estimation differs from tissue-level estimation and that both contain valuable information reflected in biological pathways. We also show that coexpression estimations are associated with transcriptional regulation. Finally, we present CoGTEx, a web resource to list, view, and explore coexpressed genes in human adult tissues from GTEx v8 data. CONCLUSION We conclude that system-level coexpression is a novel and interesting coexpression metric capable of generating plausible predictions and biological hypotheses, and that CoGTEx is a valuable resource to view, compare, and download system- and tissue-level coexpression estimations from GTEx data. AVAILABILITY The web resource is available at http://bioinformatics.mx/cogtex.
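The resampling scheme in this abstract (a fixed number of samples drawn per tissue cluster, repeated several times, keeping the minimum correlation) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the dictionary layout of `data`, the toy values, and the choice to take the minimum of the signed Pearson coefficient are all assumptions made for illustration. The "system" mode correlates raw values pooled across clusters, while the "tissue" mode z-scores each gene within each cluster first.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def zscore(values):
    """Standardize a list to mean 0 and unit (population) standard deviation."""
    n = len(values)
    m = sum(values) / n
    sd = sqrt(sum((v - m) ** 2 for v in values) / n)
    return [0.0 if sd == 0 else (v - m) / sd for v in values]


def coexpression(data, gene_a, gene_b, per_cluster=70, repeats=20,
                 mode="system", seed=0):
    """Hypothetical helper: data maps cluster name -> list of {gene: value} samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        xs, ys = [], []
        for samples in data.values():
            # Draw the same number of samples from every cluster so that
            # large clusters do not dominate the pooled correlation.
            chosen = rng.sample(samples, min(per_cluster, len(samples)))
            a = [s[gene_a] for s in chosen]
            b = [s[gene_b] for s in chosen]
            if mode == "tissue":
                # Tissue level: remove each cluster's mean and scale.
                a, b = zscore(a), zscore(b)
            xs.extend(a)
            ys.extend(b)
        estimates.append(pearson(xs, ys))
    # Assumption: "minimum value" is read as the minimum signed coefficient.
    return min(estimates)


# Toy data: both genes are high in "liver" and low in "brain",
# but within each cluster they move in opposite directions.
toy = {
    "liver": [{"A": 100 + i, "B": 100 - i} for i in range(5)],
    "brain": [{"A": 1 + i, "B": 5 - i} for i in range(5)],
}
system_r = coexpression(toy, "A", "B", per_cluster=5, repeats=3, mode="system")
tissue_r = coexpression(toy, "A", "B", per_cluster=5, repeats=3, mode="tissue")
```

In this toy case the system-level value is strongly positive because the two genes share the same between-cluster pattern, whereas the tissue-level value is strongly negative because only within-cluster variation remains after z-scoring.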
Affiliation(s)
- Víctor Treviño
- Tecnologico de Monterrey, Escuela de Medicina, Bioinformática, Monterrey, Nuevo León, México
- Tecnologico de Monterrey, OriGen Project, Monterrey, Nuevo León, México
2
Cuevas-Diaz Duran R, Wei H, Wu J. Data normalization for addressing the challenges in the analysis of single-cell transcriptomic datasets. BMC Genomics 2024; 25:444. [PMID: 38711017 PMCID: PMC11073985 DOI: 10.1186/s12864-024-10364-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 04/29/2024] [Indexed: 05/08/2024] Open
Abstract
BACKGROUND Normalization is a critical step in the analysis of single-cell RNA-sequencing (scRNA-seq) datasets. Its main goal is to make gene counts comparable within and between cells. To do so, normalization methods must account for technical and biological variability. Numerous normalization methods have been developed that address different sources of dispersion and make specific assumptions about the count data. MAIN BODY The selection of a normalization method has a direct impact on downstream analyses such as differential gene expression and cluster identification. Thus, the objective of this review is to guide the reader in making an informed decision on the most appropriate normalization method to use. To this end, we first give an overview of the commonly used single-cell sequencing platforms and methods, including isolation and library preparation protocols. Next, we discuss the inherent sources of variability of scRNA-seq datasets. We describe the categories of normalization methods and include examples of each. We also delineate imputation and batch-effect correction methods. Furthermore, we describe data-driven metrics commonly used to evaluate the performance of normalization methods. We also discuss common scRNA-seq methods and toolkits used for integrated data analysis. CONCLUSIONS According to the correction performed, normalization methods can be broadly classified as within-sample and between-sample algorithms. Moreover, with respect to the mathematical model used, normalization methods can be further classified into global scaling methods, generalized linear models, mixed methods, and machine learning-based methods. Each of these methods has pros and cons and makes different statistical assumptions. However, no single normalization method performs best. Instead, metrics such as silhouette width, K-nearest neighbor batch-effect test, or Highly Variable Genes are recommended to assess the performance of normalization methods.
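As an illustration of the "global scaling" category named in this abstract, the sketch below rescales each cell's counts by its library size to counts per million (CPM). This is one common example of global scaling, chosen here as an assumption; the review itself is not quoted as recommending it, and the function name `cpm` and the list-of-lists layout are hypothetical.

```python
def cpm(counts):
    """Scale each cell's raw counts so that they sum to one million.

    counts: list of cells, each a list of raw gene counts (same gene order).
    Cells with zero total counts are returned as all zeros.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            normalized.append([0.0] * len(cell))
        else:
            normalized.append([c * 1e6 / total for c in cell])
    return normalized


# Two cells with the same relative composition but a tenfold
# difference in sequencing depth end up with the same profile.
raw = [[10, 30, 60], [1, 3, 6]]
scaled = cpm(raw)
```

After scaling, differences in sequencing depth between the two cells no longer affect the comparison, which is the purpose of a within-sample global scaling step.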
Affiliation(s)
- Raquel Cuevas-Diaz Duran
- Tecnologico de Monterrey, Escuela de Medicina y Ciencias de la Salud, Monterrey, Nuevo Leon, 64710, Mexico.
- Haichao Wei
- The Vivian L. Smith Department of Neurosurgery, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Center for Stem Cell and Regenerative Medicine, UT Brown Foundation Institute of Molecular Medicine, Houston, TX, 77030, USA
- Jiaqian Wu
- The Vivian L. Smith Department of Neurosurgery, McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
- Center for Stem Cell and Regenerative Medicine, UT Brown Foundation Institute of Molecular Medicine, Houston, TX, 77030, USA.
- MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX, 77030, USA.
3
Fisher JL, Clark AD, Jones EF, Lasseigne BN. Sex-biased gene expression and gene-regulatory networks of sex-biased adverse event drug targets and drug metabolism genes. BMC Pharmacol Toxicol 2024; 25:5. [PMID: 38167211 PMCID: PMC10763002 DOI: 10.1186/s40360-023-00727-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 12/18/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND Previous pharmacovigilance studies and a retroactive review of cancer clinical trial studies identified that women were more likely to experience drug adverse events (i.e., any unintended effects of medication), and men were more likely to experience adverse events that resulted in hospitalization or death. These sex-biased adverse events (SBAEs) are due to many factors not entirely understood, including differences in body mass, hormones, pharmacokinetics, and liver drug metabolism enzymes and transporters. METHODS We first identified drugs associated with SBAEs from the FDA Adverse Event Reporting System (FAERS) database. Next, we evaluated sex-specific gene expression of the known drug targets and metabolism enzymes for those SBAE-associated drugs. We also constructed sex-specific tissue gene-regulatory networks to determine if these known drug targets and metabolism enzymes from the SBAE-associated drugs had sex-specific gene-regulatory network properties and predicted regulatory relationships. RESULTS We identified liver-specific gene-regulatory differences for drug metabolism genes between males and females, which could explain observed sex differences in pharmacokinetics and pharmacodynamics. In addition, we found that ~ 85% of SBAE-associated drug targets had sex-biased gene expression or were core genes of sex- and tissue-specific network communities, significantly higher than randomly selected drug targets. Lastly, we provide the sex-biased drug-adverse event pairs, drug targets, and drug metabolism enzymes as a resource for the research community. CONCLUSIONS Overall, we provide evidence that many SBAEs are associated with drug targets and drug metabolism genes that are differentially expressed and regulated between males and females. These SBAE-associated drug metabolism enzymes and drug targets may be useful for future studies seeking to explain or predict SBAEs.
Affiliation(s)
- Jennifer L Fisher
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
- Amanda D Clark
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
- Emma F Jones
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
- Brittany N Lasseigne
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA.
4
Fisher JL, Clark AD, Jones EF, Lasseigne BN. Sex-biased gene expression and gene-regulatory networks of sex-biased adverse event drug targets and drug metabolism genes. bioRxiv 2023:2023.05.23.541950. [PMID: 37362157 PMCID: PMC10290285 DOI: 10.1101/2023.05.23.541950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2023]
Abstract
Background Previous pharmacovigilance studies and a retroactive review of cancer clinical trial studies identified that women were more likely to experience drug adverse events (i.e., any unintended effects of medication), and men were more likely to experience adverse events that resulted in hospitalization or death. These sex-biased adverse events (SBAEs) are due to many factors not entirely understood, including differences in body mass, hormones, pharmacokinetics, and liver drug metabolism enzymes and transporters. Methods We first identified drugs associated with SBAEs from the FDA Adverse Event Reporting System (FAERS) database. Next, we evaluated sex-specific gene expression of the known drug targets and metabolism enzymes for those SBAE-associated drugs. We also constructed sex-specific tissue gene-regulatory networks to determine if these known drug targets and metabolism enzymes from the SBAE-associated drugs had sex-specific gene-regulatory network properties and predicted regulatory relationships. Results We identified liver-specific gene-regulatory differences for drug metabolism genes between males and females, which could explain observed sex differences in pharmacokinetics and pharmacodynamics. In addition, we found that ~85% of SBAE-associated drug targets had sex-biased gene expression or were core genes of sex- and tissue-specific network communities, significantly higher than randomly selected drug targets. Lastly, we provide the sex-biased drug-adverse event pairs, drug targets, and drug metabolism enzymes as a resource for the research community. Conclusions Overall, we provide evidence that many SBAEs are associated with drug targets and drug metabolism genes that are differentially expressed and regulated between males and females. These SBAE-associated drug metabolism enzymes and drug targets may be useful for future studies seeking to explain or predict SBAEs.
Affiliation(s)
- Jennifer L. Fisher
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, 35294, USA
- Amanda D. Clark
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, 35294, USA
- Emma F. Jones
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, 35294, USA
- Brittany N. Lasseigne
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, 35294, USA
5
Hsieh PH, Lopes-Ramos CM, Zucknick M, Sandve GK, Glass K, Kuijjer ML. Adjustment of spurious correlations in co-expression measurements from RNA-Sequencing data. Bioinformatics 2023; 39:btad610. [PMID: 37802917 PMCID: PMC10598588 DOI: 10.1093/bioinformatics/btad610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 08/05/2023] [Accepted: 10/05/2023] [Indexed: 10/08/2023] Open
Abstract
MOTIVATION Gene co-expression measurements are widely used in computational biology to identify coordinated expression patterns across a group of samples. Coordinated expression of genes may indicate that they are controlled by the same transcriptional regulatory program, or involved in common biological processes. Gene co-expression is generally estimated from RNA-Sequencing data, which are commonly normalized to remove technical variability. Here, we demonstrate that certain normalization methods, in particular quantile-based methods, can introduce false-positive associations between genes. These false-positive associations can consequently hamper downstream co-expression network analysis. Quantile-based normalization can, however, be extremely powerful. In particular, when preprocessing large-scale heterogeneous data, quantile-based normalization methods such as smooth quantile normalization can be applied to remove technical variability while maintaining global differences in expression for samples with different biological attributes. RESULTS We developed SNAIL (Smooth-quantile Normalization Adaptation for the Inference of co-expression Links), a normalization method based on smooth quantile normalization specifically designed for modeling of co-expression measurements. We show that SNAIL avoids formation of false-positive associations in co-expression as well as in downstream network analyses. Using SNAIL, one can avoid arbitrary gene filtering and retain associations to genes that only express in small subgroups of samples. This highlights the method's potential future impact on network modeling and other association-based approaches in large-scale heterogeneous data. AVAILABILITY AND IMPLEMENTATION The implementation of the SNAIL algorithm and code to reproduce the analyses described in this work can be found in the GitHub repository https://github.com/kuijjerlab/PySNAIL.
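For readers unfamiliar with the quantile-based normalization this abstract discusses, the following is a minimal sketch of plain quantile normalization: every sample is forced onto the same reference distribution, built from the mean value at each rank. It is not SNAIL and not smooth quantile normalization; the function name, the list-of-lists layout, and the tie handling (ties keep their original order and therefore receive different reference values) are simplifying assumptions.

```python
def quantile_normalize(samples):
    """Plain quantile normalization.

    samples: list of samples, each a list of expression values with the
    same gene order. Returns a new list with the same shape in which
    every sample has an identical set of values.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean across samples at each rank.
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_genes)
    ]
    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest expression.
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = reference[rank]
        normalized.append(out)
    return normalized


example = [[5, 2, 3], [4, 1, 6]]
qn = quantile_normalize(example)
```

Each gene keeps its rank within its own sample, but the values themselves are replaced, so both samples end up with the same three values in different gene positions.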
Affiliation(s)
- Ping-Han Hsieh
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo 0318, Norway
- Department of Informatics, University of Oslo, Oslo 0316, Norway
- Camila Miranda Lopes-Ramos
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States
- Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, United States
- Manuela Zucknick
- Oslo Centre for Biostatistics and Epidemiology, Institute of Basic Medical Sciences, University of Oslo, Oslo 0317, Norway
- Kimberly Glass
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, United States
- Marieke Lydia Kuijjer
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo 0318, Norway
- Department of Pathology, Leiden University Medical Center, Leiden 2300RC, The Netherlands
- Leiden Center of Computational Oncology, Leiden University Medical Center, Leiden 2300RC, The Netherlands
6
Herrera-Uribe J, Lim KS, Byrne KA, Daharsh L, Liu H, Corbett RJ, Marco G, Schroyen M, Koltes JE, Loving CL, Tuggle CK. Integrative profiling of gene expression and chromatin accessibility elucidates specific transcriptional networks in porcine neutrophils. Front Genet 2023; 14:1107462. [PMID: 37287538 PMCID: PMC10242145 DOI: 10.3389/fgene.2023.1107462] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/24/2022] [Accepted: 04/27/2023] [Indexed: 06/09/2023] Open
Abstract
Neutrophils are vital components of the immune system for limiting the invasion and proliferation of pathogens in the body. Surprisingly, the functional annotation of porcine neutrophils is still limited. The transcriptomic and epigenetic assessment of porcine neutrophils from healthy pigs was performed by bulk RNA sequencing and transposase accessible chromatin sequencing (ATAC-seq). First, we sequenced and compared the transcriptome of porcine neutrophils with eight other immune cell transcriptomes to identify a neutrophil-enriched gene list within a detected neutrophil co-expression module. Second, we used ATAC-seq analysis to report for the first time the genome-wide chromatin accessible regions of porcine neutrophils. A combined analysis using both transcriptomic and chromatin accessibility data further defined the neutrophil co-expression network controlled by transcription factors likely important for neutrophil lineage commitment and function. We identified chromatin accessible regions around promoters of neutrophil-specific genes that were predicted to be bound by neutrophil-specific transcription factors. Additionally, published DNA methylation data from porcine immune cells including neutrophils were used to link low DNA methylation patterns to accessible chromatin regions and genes with highly enriched expression in porcine neutrophils. In summary, our data provides the first integrative analysis of the accessible chromatin regions and transcriptional status of porcine neutrophils, contributing to the Functional Annotation of Animal Genomes (FAANG) project, and demonstrates the utility of chromatin accessible regions to identify and enrich our understanding of transcriptional networks in a cell type such as neutrophils.
Affiliation(s)
- Juber Herrera-Uribe
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Kyu-Sang Lim
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Department of Animal Resource Science, Kongju National University, Yesan, Republic of Korea
- Kristen A. Byrne
- USDA-Agriculture Research Service, National Animal Disease Center, Food Safety and Enteric Pathogens Research Unit, Ames, IA, United States
- Lance Daharsh
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Haibo Liu
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Ryan J. Corbett
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Gianna Marco
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Martine Schroyen
- Department of Animal Science, Iowa State University, Ames, IA, United States
- James E. Koltes
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Crystal L. Loving
- USDA-Agriculture Research Service, National Animal Disease Center, Food Safety and Enteric Pathogens Research Unit, Ames, IA, United States
7
Ben Guebila M, Wang T, Lopes-Ramos CM, Fanfani V, Weighill D, Burkholz R, Schlauch D, Paulson JN, Altenbuchinger M, Shutta KH, Sonawane AR, Lim J, Calderer G, van IJzendoorn DGP, Morgan D, Marin A, Chen CY, Song Q, Saha E, DeMeo DL, Padi M, Platig J, Kuijjer ML, Glass K, Quackenbush J. The Network Zoo: a multilingual package for the inference and analysis of gene regulatory networks. Genome Biol 2023; 24:45. [PMID: 36894939 PMCID: PMC9999668 DOI: 10.1186/s13059-023-02877-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/16/2022] [Accepted: 02/15/2023] [Indexed: 03/11/2023] Open
Abstract
Inference and analysis of gene regulatory networks (GRNs) require software that integrates multi-omic data from various sources. The Network Zoo (netZoo; netzoo.github.io) is a collection of open-source methods to infer GRNs, conduct differential network analyses, estimate community structure, and explore the transitions between biological states. The netZoo builds on our ongoing development of network methods, harmonizing the implementations in various computing languages and between methods to allow better integration of these tools into analytical pipelines. We demonstrate the utility using multi-omic data from the Cancer Cell Line Encyclopedia. We will continue to expand the netZoo to incorporate additional methods.
Affiliation(s)
- Marouen Ben Guebila
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Tian Wang
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Biology Department, Boston College, Chestnut Hill, MA, USA
- Camila M Lopes-Ramos
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Viola Fanfani
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Des Weighill
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Rebekka Burkholz
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
- Daniel Schlauch
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Genospace, LLC, Boston, MA, USA
- Joseph N Paulson
- Department of Biochemistry and Molecular Biology, Pennsylvania State University College of Medicine, Hershey, PA, USA
- Michael Altenbuchinger
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Present Address: Department of Medical Bioinformatics, University Medical Center Göttingen, Göttingen, Germany
- Katherine H Shutta
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Abhijeet R Sonawane
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Present Address: Center for Interdisciplinary Cardiovascular Sciences, Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- James Lim
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- Present Address: Monoceros Biosystems, LLC, San Diego, CA, USA
- Genis Calderer
- Center for Molecular Medicine Norway, Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- David G P van IJzendoorn
- Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Present Address: Department of Pathology, Stanford University School of Medicine, Palo Alto, CA, USA
- Daniel Morgan
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Present Address: School of Biomedical Sciences, Hong Kong University, Pokfulam, Hong Kong
- Cho-Yi Chen
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Dana-Farber Cancer Institute, Boston, MA, USA
- Present Address: Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, 112, Taiwan
- Qi Song
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Present Address: Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
- Enakshi Saha
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Dawn L DeMeo
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Megha Padi
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
- John Platig
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Marieke L Kuijjer
- Center for Molecular Medicine Norway, Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Center for Computational Oncology, Leiden University, Leiden, The Netherlands
- Kimberly Glass
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Dana-Farber Cancer Institute, Boston, MA, USA.
8
Venkat V, Abdelhalim H, DeGroat W, Zeeshan S, Ahmed Z. Investigating genes associated with heart failure, atrial fibrillation, and other cardiovascular diseases, and predicting disease using machine learning techniques for translational research and precision medicine. Genomics 2023; 115:110584. [PMID: 36813091 DOI: 10.1016/j.ygeno.2023.110584] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/19/2022] [Revised: 02/06/2023] [Accepted: 02/11/2023] [Indexed: 02/22/2023]
Abstract
Cardiovascular disease (CVD) is the leading cause of mortality and loss of disability-adjusted life years (DALYs) globally. CVDs like Heart Failure (HF) and Atrial Fibrillation (AF) are associated with physical effects on the heart muscles. As a result of the complex nature, progression, inherent genetic makeup, and heterogeneity of CVDs, personalized treatments are believed to be critical. Appropriate application of artificial intelligence (AI) and machine learning (ML) approaches can lead to new insights into CVDs for providing better personalized treatments with predictive analysis and deep phenotyping. In this study we focused on implementing AI/ML techniques on RNA-seq driven gene-expression data to investigate genes associated with HF, AF, and other CVDs, and to predict disease with high accuracy. The study involved generating RNA-seq data derived from the serum of consented CVD patients. Next, we processed the sequenced data using our RNA-seq pipeline and applied GVViZ for gene-disease data annotation and expression analysis. To achieve our research objectives, we developed a new Findable, Accessible, Intelligent, and Reproducible (FAIR) approach that includes a five-level biostatistical evaluation, primarily based on the Random Forest (RF) algorithm. During our AI/ML analysis, we fitted, trained, and implemented our model to classify and distinguish high-risk CVD patients based on their age, gender, and race. With the successful execution of our model, we predicted the association of highly significant HF, AF, and other CVD genes with demographic variables.
Affiliation(s)
- Vignesh Venkat
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Habiba Abdelhalim
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- William DeGroat
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Saman Zeeshan
- Rutgers Cancer Institute of New Jersey, Rutgers University, 195 Little Albany St, New Brunswick, NJ, USA
- Zeeshan Ahmed
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA; Department of Medicine, Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson St, New Brunswick, NJ, USA.
9
Zogopoulos VL, Malatras A, Kyriakidis K, Charalampous C, Makrygianni EA, Duguez S, Koutsi MA, Pouliou M, Vasileiou C, Duddy WJ, Agelopoulos M, Chrousos GP, Iconomidou VA, Michalopoulos I. HGCA2.0: An RNA-Seq Based Webtool for Gene Coexpression Analysis in Homo sapiens. Cells 2023; 12:388. [PMID: 36766730 PMCID: PMC9913097 DOI: 10.3390/cells12030388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Revised: 01/09/2023] [Accepted: 01/19/2023] [Indexed: 01/24/2023] Open
Abstract
Genes with similar expression patterns in a set of diverse samples may be considered coexpressed. Human Gene Coexpression Analysis 2.0 (HGCA2.0) is a webtool which studies the global coexpression landscape of human genes. The website is based on the hierarchical clustering of 55,431 Homo sapiens genes based on a large-scale coexpression analysis of 3500 GTEx bulk RNA-Seq samples of healthy individuals, which were selected as the best representative samples of each tissue type. HGCA2.0 presents subclades of coexpressed genes to a gene of interest, and performs various built-in gene term enrichment analyses on the coexpressed genes, including gene ontologies, biological pathways, protein families, and diseases, while also being unique in revealing enriched transcription factors driving coexpression. HGCA2.0 has been successful in identifying not only genes with ubiquitous expression patterns, but also tissue-specific genes. Benchmarking showed that HGCA2.0 belongs to the top performing coexpression webtools, as shown by STRING analysis. HGCA2.0 creates working hypotheses for the discovery of gene partners or common biological processes that can be experimentally validated. It offers a simple and intuitive website design and user interface, as well as an API endpoint.
Affiliation(s)
- Vasileios L. Zogopoulos
- Centre of Systems Biology, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Section of Cell Biology and Biophysics, Department of Biology, National and Kapodistrian University of Athens, 15701 Athens, Greece
| | - Apostolos Malatras
- Biobank.cy Center of Excellence in Biobanking and Biomedical Research, University of Cyprus, 2029 Nicosia, Cyprus
| | - Konstantinos Kyriakidis
- Centre of Systems Biology, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- School of Pharmacy, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
| | - Chrysanthi Charalampous
- Centre of Basic Research, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
| | - Evanthia A. Makrygianni
- University Research Institute of Maternal and Child Health and Precision Medicine, National and Kapodistrian University of Athens, 11527 Athens, Greece
| | - Stéphanie Duguez
- Personalised Medicine Centre, School of Medicine, Ulster University, Derry-Londonderry BT47 6SB, UK
| | - Marianna A. Koutsi
- Centre of Basic Research, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
| | - Marialena Pouliou
- Centre of Basic Research, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
| | - Christos Vasileiou
- Centre of Systems Biology, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Engineering Design and Computing Laboratory, ETH Zurich, 8092 Zurich, Switzerland
| | - William J. Duddy
- Personalised Medicine Centre, School of Medicine, Ulster University, Derry-Londonderry BT47 6SB, UK
| | - Marios Agelopoulos
- Centre of Basic Research, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
| | - George P. Chrousos
- University Research Institute of Maternal and Child Health and Precision Medicine, National and Kapodistrian University of Athens, 11527 Athens, Greece
| | - Vassiliki A. Iconomidou
- Section of Cell Biology and Biophysics, Department of Biology, National and Kapodistrian University of Athens, 15701 Athens, Greece
| | - Ioannis Michalopoulos
- Centre of Systems Biology, Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Correspondence:
| |
Collapse
|
10
|
Sharon M, Vinogradov E, Argov CM, Lazarescu O, Zoabi Y, Hekselman I, Yeger-Lotem E. The differential activity of biological processes in tissues and cell subsets can illuminate disease-related processes and cell-type identities. Bioinformatics 2022; 38:1584-1592. [PMID: 35015838 DOI: 10.1093/bioinformatics/btab883] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/25/2021] [Revised: 12/09/2021] [Accepted: 01/02/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION The distinct functionalities of human tissues and cell types underlie complex phenotype-genotype relationships, yet often remain elusive. Harnessing the multitude of bulk and single-cell human transcriptomes while focusing on processes can help reveal these distinct functionalities. RESULTS The Tissue-Process Activity (TiPA) method aims to identify processes that are preferentially active or under-expressed in specific contexts, by comparing the expression levels of process genes between contexts. We tested TiPA on 1579 tissue-specific processes and bulk tissue transcriptomes, finding that it performed better than another method. Next, we used TiPA to ask whether the activity of certain processes could underlie the tissue-specific manifestation of 1233 hereditary diseases. We found that 21% of the disease-causing genes indeed participated in such processes, thereby illuminating their genotype-phenotype relationships. Lastly, we applied TiPA to single-cell transcriptomes of 108 human cell types, revealing that process activities often match cell-type identities and can thus aid annotation efforts. Hence, differential activity of processes can highlight the distinct functionality of tissues and cells in a robust and meaningful manner. AVAILABILITY AND IMPLEMENTATION TiPA code is available in GitHub (https://github.com/moranshar/TiPA). In addition, all data are available as part of the Supplementary Material. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Moran Sharon
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ekaterina Vinogradov
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Chanan M Argov
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Or Lazarescu
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Yazeed Zoabi
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Idan Hekselman
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- The National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
11
|
Buphamalai P, Kokotovic T, Nagy V, Menche J. Network analysis reveals rare disease signatures across multiple levels of biological organization. Nat Commun 2021; 12:6306. [PMID: 34753928 PMCID: PMC8578255 DOI: 10.1038/s41467-021-26674-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 04/30/2021] [Accepted: 10/19/2021] [Indexed: 01/26/2023]
Abstract
Rare genetic diseases are typically caused by a single gene defect. Despite this clear causal relationship between genotype and phenotype, identifying the pathobiological mechanisms at various levels of biological organization remains a practical and conceptual challenge. Here, we introduce a network approach for evaluating the impact of rare gene defects across biological scales. We construct a multiplex network consisting of over 20 million gene relationships that are organized into 46 network layers spanning six major biological scales between genotype and phenotype. A comprehensive analysis of 3,771 rare diseases reveals distinct phenotypic modules within individual layers. These modules can be exploited to mechanistically dissect the impact of gene defects and accurately predict rare disease gene candidates. Our results show that the disease module formalism can be applied to rare diseases and generalized beyond physical interaction networks. These findings open up new venues to apply network-based tools for cross-scale data integration.
Affiliation(s)
- Pisanu Buphamalai
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna BioCenter 5, 1030, Vienna, Austria
- Tomislav Kokotovic
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Neurology, Medical University of Vienna, Währinger Gürtel 18-20, 1090, Vienna, Austria
- Vanja Nagy
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Neurology, Medical University of Vienna, Währinger Gürtel 18-20, 1090, Vienna, Austria
- Jörg Menche
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Lazarettgasse 14, AKH BT 25.3, 1090, Vienna, Austria
- Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna BioCenter 5, 1030, Vienna, Austria
- Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090, Vienna, Austria
12
|
Poon CL, Chen CY. Exploring the Impact of Cerebrovascular Disease and Major Depression on Non-diseased Human Tissue Transcriptomes. Front Genet 2021; 12:696836. [PMID: 34349785 PMCID: PMC8327210 DOI: 10.3389/fgene.2021.696836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2021] [Accepted: 06/21/2021] [Indexed: 11/13/2022]
Abstract
Background Multiple factors, and the complicated interactions between them, contribute to the development of complex diseases. Inflammation has recently been associated with many complex diseases and may cause long-term damage to the human body. In this study, we examined whether two types of complex disease, cerebrovascular disease (CVD) and major depression (MD), systematically altered the transcriptomes of non-diseased human tissues and whether inflammation is linked to identifiable molecular signatures, using post-mortem samples from the Genotype-Tissue Expression (GTEx) project. Results Following a series of differential expression analyses, dozens to hundreds of differentially expressed genes (DEGs) were identified in multiple tissues between subjects with and without a history of CVD or MD. DEGs from these disease-associated tissues (the visceral adipose, tibial artery, caudate, and spinal cord for CVD; the hypothalamus, putamen, and spinal cord for MD) were further analyzed for functional enrichment. Many pathways associated with immunological events were enriched in the upregulated DEGs of the CVD-associated tissues, as were the neurological and metabolic pathways in DEGs of the MD-associated tissues. Eight gene-tissue pairs were found to overlap with those prioritized by our transcriptome-wide association studies, indicating a potential genetic effect on gene expression for circulating cytokine phenotypes. Conclusion Cerebrovascular disease and major depression cause detectable changes in the gene expression of non-diseased tissues, suggesting that the long-term impact of diseases, lifestyles, and environmental factors may together contribute to the appearance of "transcriptomic scars" on the human body. Furthermore, inflammation is probably one of the systemic and long-lasting effects of cerebrovascular events.
Affiliation(s)
- Chi-Lam Poon
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, United States
- Cho-Yi Chen
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
- Brain Research Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
13
|
Herrera-Uribe J, Wiarda JE, Sivasankaran SK, Daharsh L, Liu H, Byrne KA, Smith TPL, Lunney JK, Loving CL, Tuggle CK. Reference Transcriptomes of Porcine Peripheral Immune Cells Created Through Bulk and Single-Cell RNA Sequencing. Front Genet 2021; 12:689406. [PMID: 34249103 PMCID: PMC8261551 DOI: 10.3389/fgene.2021.689406] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 03/31/2021] [Accepted: 05/18/2021] [Indexed: 01/03/2023]
Abstract
Pigs are a valuable human biomedical model and an important protein source supporting global food security. The transcriptomes of peripheral blood immune cells in pigs were defined at the bulk cell-type and single-cell levels. First, eight cell types were isolated in bulk from peripheral blood mononuclear cells (PBMCs) by cell sorting, representing myeloid cells, NK cells, and specific populations of T and B-cells. Transcriptomes for each bulk population of cells were generated by RNA-seq, with 10,974 expressed genes detected. Pairwise comparisons between cell types revealed specific expression, while enrichment analysis identified 1,885 to 3,591 significantly enriched genes across all 8 cell types. Gene Ontology analysis for the top 25% of significantly enriched genes (SEG) showed high enrichment of biological processes related to the nature of each cell type. Comparison of gene expression indicated highly significant correlations between pig cells and corresponding human PBMC bulk RNA-seq data available in Haemopedia. Second, higher resolution of distinct cell populations was obtained by single-cell RNA-sequencing (scRNA-seq) of PBMCs. Seven PBMC samples were partitioned and sequenced, producing 28,810 single-cell transcriptomes distributed across 36 clusters and classified into 13 general cell types, including plasmacytoid dendritic cells (DC), conventional DCs, monocytes, B-cells, conventional CD4 and CD8 αβ T-cells, NK cells, and γδ T-cells. Signature gene sets from the human Haemopedia data were assessed for relative enrichment in genes expressed in pig cells, and integration of pig scRNA-seq with a public human scRNA-seq dataset provided further validation for similarity between human and pig data. The sorted porcine bulk RNA-seq dataset informed classification of scRNA-seq PBMC populations; specifically, an integration of the datasets showed that the pig bulk RNA-seq data helped define the CD4CD8 double-positive T-cell populations in the scRNA-seq data.
Overall, the data provide deep and well-validated transcriptomic data from sorted PBMC populations and the first single-cell transcriptomic data for porcine PBMCs. This resource will be invaluable for annotation of pig genes controlling immunogenetic traits as part of the porcine Functional Annotation of Animal Genomes (FAANG) project, as well as further study of, and development of new reagents for, porcine immunology.
Affiliation(s)
- Juber Herrera-Uribe
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Jayne E. Wiarda
- Food Safety and Enteric Pathogens Research Unit, National Animal Disease Center, Agricultural Research Service, United States Department of Agriculture, Ames, IA, United States
- Immunobiology Graduate Program, Iowa State University, Ames, IA, United States
- Oak Ridge Institute for Science and Education, Agricultural Research Service Participation Program, Oak Ridge, TN, United States
- Sathesh K. Sivasankaran
- Food Safety and Enteric Pathogens Research Unit, National Animal Disease Center, Agricultural Research Service, United States Department of Agriculture, Ames, IA, United States
- Genome Informatics Facility, Iowa State University, Ames, IA, United States
- Lance Daharsh
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Haibo Liu
- Department of Animal Science, Iowa State University, Ames, IA, United States
- Kristen A. Byrne
- Food Safety and Enteric Pathogens Research Unit, National Animal Disease Center, Agricultural Research Service, United States Department of Agriculture, Ames, IA, United States
- Joan K. Lunney
- USDA-ARS, Beltsville Agricultural Research Center, Animal Parasitic Diseases Laboratory, Beltsville, MD, United States
- Crystal L. Loving
- Food Safety and Enteric Pathogens Research Unit, National Animal Disease Center, Agricultural Research Service, United States Department of Agriculture, Ames, IA, United States
14
|
Shemesh N, Jubran J, Dror S, Simonovsky E, Basha O, Argov C, Hekselman I, Abu-Qarn M, Vinogradov E, Mauer O, Tiago T, Carra S, Ben-Zvi A, Yeger-Lotem E. The landscape of molecular chaperones across human tissues reveals a layered architecture of core and variable chaperones. Nat Commun 2021; 12:2180. [PMID: 33846299 PMCID: PMC8042005 DOI: 10.1038/s41467-021-22369-9] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 08/06/2020] [Accepted: 02/23/2021] [Indexed: 12/13/2022]
Abstract
The sensitivity of the protein-folding environment to chaperone disruption can be highly tissue-specific. Yet, the organization of the chaperone system across physiological human tissues has received little attention. Through computational analyses of large-scale tissue transcriptomes, we unveil that the chaperone system is composed of core elements that are uniformly expressed across tissues, and variable elements that are differentially expressed to fit with tissue-specific requirements. We demonstrate via a proteomic analysis that the muscle-specific signature is functional and conserved. Core chaperones are significantly more abundant across tissues and more important for cell survival than variable chaperones. Together with variable chaperones, they form tissue-specific functional networks. Analysis of human organ development and aging brain transcriptomes reveals that these functional networks are established in development and decline with age. In this work, we expand the known functional organization of de novo versus stress-inducible eukaryotic chaperones into a layered core-variable architecture in multi-cellular organisms.
Affiliation(s)
- Netta Shemesh
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Juman Jubran
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Shiran Dror
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Eyal Simonovsky
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Omer Basha
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Chanan Argov
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Idan Hekselman
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Mehtap Abu-Qarn
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ekaterina Vinogradov
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Omry Mauer
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Tatiana Tiago
- Centre for Neuroscience and Nanotechnology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Serena Carra
- Centre for Neuroscience and Nanotechnology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
- Anat Ben-Zvi
- Department of Life Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology and the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer Sheva, Israel
15
|
Isaacs AM, Morton SU, Movassagh M, Zhang Q, Hehnly C, Zhang L, Morales DM, Sinnar SA, Ericson JE, Mbabazi-Kabachelor E, Ssenyonga P, Onen J, Mulondo R, Hornig M, Warf BC, Broach JR, Townsend RR, Limbrick DD, Paulson JN, Schiff SJ. Immune activation during Paenibacillus brain infection in African infants with frequent cytomegalovirus co-infection. iScience 2021; 24:102351. [PMID: 33912816 PMCID: PMC8065213 DOI: 10.1016/j.isci.2021.102351] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/10/2020] [Revised: 02/24/2021] [Accepted: 03/19/2021] [Indexed: 12/16/2022]
Abstract
Inflammation during neonatal brain infections leads to significant secondary sequelae such as hydrocephalus, which often follows neonatal sepsis in the developing world. In 100 African hydrocephalic infants we identified the biological pathways that account for this response. The dominant bacterial pathogen was a Paenibacillus species, with frequent cytomegalovirus co-infection. A proteogenomic strategy was employed to confirm host immune response to Paenibacillus and to define the interplay within the host immune response network. Immune activation emphasized neuroinflammation, oxidative stress reaction, and extracellular matrix organization. The innate immune system response included neutrophil activity, signaling via IL-4, IL-12, IL-13, interferon, and Jak/STAT pathways. Platelet-activating factors and factors involved with microbe recognition such as Class I MHC antigen-presenting complex were also increased. Evidence suggests that dysregulated neuroinflammation propagates inflammatory hydrocephalus, and these pathways are potential targets for adjunctive treatments to reduce the hazards of neuroinflammation and risk of hydrocephalus following neonatal sepsis.
Highlights
- There is a characteristic immune response to Paenibacillus brain infection
- There is a characteristic immune response to CMV brain infection
- The matching immune response validates pathogen genomic presence
- The combined results support molecular infection causality
Affiliation(s)
- Albert M Isaacs
- Department of Neuroscience, Washington University School of Medicine, St. Louis, MO 63110, USA
- Department of Clinical Neurosciences, University of Calgary, Calgary, AB T2N 1N4, Canada
- Sarah U Morton
- Division of Newborn Medicine, Boston Children's Hospital, Boston, MA 02115, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA
- Mercedeh Movassagh
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Qiang Zhang
- Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Christine Hehnly
- Institute for Personalized Medicine, Pennsylvania State University, Hershey, PA 17033, USA
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, State College, PA 16801, USA
- Lijun Zhang
- Institute for Personalized Medicine, Pennsylvania State University, Hershey, PA 17033, USA
- Diego M Morales
- Department of Neurosurgery, Washington University School of Medicine, St. Louis, MO 63110, USA
- Shamim A Sinnar
- Center for Neural Engineering, Pennsylvania State University, State College, PA 16801, USA
- Department of Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Jessica E Ericson
- Department of Pediatrics, Pennsylvania State College of Medicine, Hershey, PA 17033, USA
- Justin Onen
- CURE Children's Hospital of Uganda, Mbale, Uganda
- Mady Hornig
- Department of Epidemiology, Columbia University Mailman School of Public Health, New York, NY 10032, USA
- Benjamin C Warf
- Department of Neurosurgery, Harvard Medical School, Boston, MA 02115, USA
- James R Broach
- Institute for Personalized Medicine, Pennsylvania State University, Hershey, PA 17033, USA
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, State College, PA 16801, USA
- R Reid Townsend
- Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- David D Limbrick
- Department of Neurosurgery, Washington University School of Medicine, St. Louis, MO 63110, USA
- Joseph N Paulson
- Department of Biostatistics, Product Development, Genentech Inc., South San Francisco, CA 94080, USA
- Steven J Schiff
- Center for Neural Engineering, Pennsylvania State University, State College, PA 16801, USA
- Center for Infectious Disease Dynamics, Departments of Neurosurgery, Engineering Science and Mechanics, and Physics, The Pennsylvania State University, University Park, PA 16802, USA
16
|
Kuijjer ML, Fagny M, Marin A, Quackenbush J, Glass K. PUMA: PANDA Using MicroRNA Associations. Bioinformatics 2021; 36:4765-4773. [PMID: 32860050 PMCID: PMC7750953 DOI: 10.1093/bioinformatics/btaa571] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/03/2020] [Revised: 05/19/2020] [Accepted: 06/10/2020] [Indexed: 12/27/2022]
Abstract
Motivation Conventional methods to analyze genomic data do not make use of the interplay between multiple factors, such as between microRNAs (miRNAs) and the messenger RNA (mRNA) transcripts they regulate, and thereby often fail to identify the cellular processes that are unique to specific tissues. We developed PUMA (PANDA Using MicroRNA Associations), a computational tool that uses message passing to integrate a prior network of miRNA target predictions with target gene co-expression information to model genome-wide gene regulation by miRNAs. We applied PUMA to 38 tissues from the Genotype-Tissue Expression project, integrating RNA-Seq data with two different miRNA target prediction priors, built on predictions from TargetScan and miRanda, respectively. We found that while target predictions obtained from these two resources are considerably different, PUMA captures similar tissue-specific miRNA–target regulatory interactions in the different network models. Furthermore, the tissue-specific functions of miRNAs we identified based on regulatory profiles (available at: https://kuijjer.shinyapps.io/puma_gtex/) are highly similar between networks modeled on the two target prediction resources. This indicates that PUMA consistently captures important tissue-specific miRNA regulatory processes. In addition, using PUMA we identified miRNAs regulating important tissue-specific processes that, when mutated, may result in disease development in the same tissue. Availability and implementation PUMA is available in C++, MATLAB and Python on GitHub (https://github.com/kuijjerlab and https://netzoo.github.io/). Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Marieke L Kuijjer
- Centre for Molecular Medicine Norway, University of Oslo, Oslo 0318, Norway
- Maud Fagny
- UMR7206 Eco-Anthropologie, Muséum National d'Histoire Naturelle, Centre National de la Recherche Scientifique, Université de Paris, Paris 75016, France
- Alessandro Marin
- Centre for Computing in Science Education, Department of Physics, University of Oslo, Oslo 0316, Norway
- John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Channing Division of Network Medicine, Harvard Medical School, Boston, MA 02115, USA
- Kimberly Glass
- Channing Division of Network Medicine, Harvard Medical School, Boston, MA 02115, USA
17
|
Fagny M, Kuijjer ML, Stam M, Joets J, Turc O, Rozière J, Pateyron S, Venon A, Vitte C. Identification of Key Tissue-Specific, Biological Processes by Integrating Enhancer Information in Maize Gene Regulatory Networks. Front Genet 2021; 11:606285. [PMID: 33505431 PMCID: PMC7834273 DOI: 10.3389/fgene.2020.606285] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/17/2020] [Accepted: 12/03/2020] [Indexed: 12/27/2022]
Abstract
Enhancers are key players in the spatio-temporal coordination of gene expression during numerous crucial processes, including tissue differentiation across development. Characterizing the transcription factors (TFs) and genes they connect, and the molecular functions they underpin, is important to better characterize developmental processes. In plants, the recent molecular characterization of enhancers revealed their capacity to activate the expression of several target genes. Nevertheless, identifying these target genes at a genome-wide level is challenging, particularly for large-genome species, where enhancers and target genes can be hundreds of kilobases apart. Therefore, the contribution of enhancers to plant regulatory networks remains poorly understood. Here, we investigate the enhancer-driven regulatory network of two maize tissues at different stages: leaves at seedling stage (V2-IST) and husks (bracts) at flowering. Using systems biology, we integrate genomic, epigenomic, and transcriptomic data to model the regulatory relationships between TFs and their potential target genes, and identify regulatory modules specific to husk and V2-IST. We show that leaves at the V2-IST stage are characterized by the response to hormones and macromolecules biogenesis and assembly, which are regulated by the BBR/BPC and AP2/ERF TF families, respectively. In contrast, husks are characterized by cell wall modification and response to abiotic stresses, which are, respectively, orchestrated by the C2C2/DOF and AP2/EREB families. Analysis of the corresponding enhancer sequences reveals that two different transposable element families (TIR transposon Mutator and MITE Pif/Harbinger) have shaped part of the regulatory network in each tissue, and that MITEs have provided potential new TF binding sites involved in husk tissue-specificity.
Collapse
Affiliation(s)
- Maud Fagny
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
| | - Marieke Lydia Kuijjer
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- Department of Pathology, Leiden University Medical Center, Leiden, Netherlands
| | - Maike Stam
- Plant Development and (Epi) Genetics, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
| | - Johann Joets
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
| | - Olivier Turc
- LEPSE, Univ Montpellier, INRAE, Institut Agro, Montpellier, France
| | - Julien Rozière
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Université Paris-Saclay, CNRS, INRAE, Univ Evry, Institute of Plant Sciences Paris-Saclay (IPS2), Orsay, France
- Université de Paris, CNRS, INRAE, Institute of Plant Sciences Paris-Saclay (IPS2), Orsay, France
| | - Stéphanie Pateyron
- Université Paris-Saclay, CNRS, INRAE, Univ Evry, Institute of Plant Sciences Paris-Saclay (IPS2), Orsay, France
- Université de Paris, CNRS, INRAE, Institute of Plant Sciences Paris-Saclay (IPS2), Orsay, France
| | - Anthony Venon
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
| | - Clémentine Vitte
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
|
18
|
Lopes-Ramos CM, Quackenbush J, DeMeo DL. Genome-Wide Sex and Gender Differences in Cancer. Front Oncol 2020; 10:597788. [PMID: 33330090 PMCID: PMC7719817 DOI: 10.3389/fonc.2020.597788] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 08/22/2020] [Accepted: 10/19/2020] [Indexed: 12/12/2022] Open
Abstract
Despite their known importance in clinical medicine, differences based on sex and gender are among the least studied factors affecting cancer susceptibility, progression, survival, and therapeutic response. In particular, the molecular mechanisms driving sex differences are poorly understood and so most approaches to precision medicine use mutational or other genomic data to assign therapy without considering how the sex of the individual might influence therapeutic efficacy. The mandate by the National Institutes of Health that research studies include sex as a biological variable has begun to expand our understanding of its importance. Sex differences in cancer may arise due to a combination of environmental, genetic, and epigenetic factors, as well as differences in gene regulation and expression. Extensive sex differences occur genome-wide, and ultimately influence cancer biology and outcomes. In this review, we summarize the current state of knowledge about sex-specific genetic and genome-wide influences in cancer, describe how differences in response to environmental exposures and genetic and epigenetic alterations alter the trajectory of the disease, and provide insights into the importance of integrative analyses in understanding the interplay of sex and genomics in cancer. In particular, we will explore some of the emerging analytical approaches, such as the use of network methods, that are providing a deeper understanding of the drivers of differences based on sex and gender. Better understanding these complex factors and their interactions will improve cancer prevention, treatment, and outcomes for all individuals.
Affiliation(s)
- Camila M Lopes-Ramos
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States
| | - John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States; Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, United States; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States
| | - Dawn L DeMeo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, United States
|
19
|
Gustafsson J, Held F, Robinson JL, Björnson E, Jörnsten R, Nielsen J. Sources of variation in cell-type RNA-Seq profiles. PLoS One 2020; 15:e0239495. [PMID: 32956417 PMCID: PMC7505444 DOI: 10.1371/journal.pone.0239495] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/18/2020] [Accepted: 09/07/2020] [Indexed: 12/21/2022] Open
Abstract
Cell-type specific gene expression profiles are needed for many computational methods operating on bulk RNA-Seq samples, such as deconvolution of cell-type fractions and digital cytometry. However, the gene expression profile of a cell type can vary substantially due to both technical factors and biological differences in cell state and surroundings, reducing the efficacy of such methods. Here, we investigated which factors contribute most to this variation. We evaluated different normalization methods, quantified the variance explained by different factors, evaluated the effect on deconvolution of cell type fractions, and examined the differences between UMI-based single-cell RNA-Seq and bulk RNA-Seq. We investigated a collection of publicly available bulk and single-cell RNA-Seq datasets containing B and T cells, and found that the technical variation across laboratories is substantial, even for genes specifically selected for deconvolution, and this variation has a confounding effect on deconvolution. Tissue of origin is also a substantial factor, highlighting the challenge of using cell type profiles derived from blood with mixtures from other tissues. We also show that much of the differences between UMI-based single-cell and bulk RNA-Seq methods can be explained by the number of read duplicates per mRNA molecule in the single-cell sample. Our work shows the importance of either matching or correcting for technical factors when creating cell-type specific gene expression profiles that are to be used together with bulk samples.
Affiliation(s)
- Johan Gustafsson
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Wallenberg Center for Protein Research, Chalmers University of Technology, Gothenburg, Sweden
| | - Felix Held
- Mathematical Sciences, University of Gothenburg and Chalmers University of Technology, Gothenburg, Sweden
| | - Jonathan L. Robinson
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Wallenberg Center for Protein Research, Chalmers University of Technology, Gothenburg, Sweden
| | - Elias Björnson
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Department of Molecular and Clinical Medicine/Wallenberg Laboratory for Cardiovascular and Metabolic Research, University of Gothenburg, Gothenburg, Sweden
| | - Rebecka Jörnsten
- Mathematical Sciences, University of Gothenburg and Chalmers University of Technology, Gothenburg, Sweden
| | - Jens Nielsen
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Wallenberg Center for Protein Research, Chalmers University of Technology, Gothenburg, Sweden
- BioInnovation Institute, Copenhagen, Denmark
|
20
|
Schaum N, Lehallier B, Hahn O, Pálovics R, Hosseinzadeh S, Lee SE, Sit R, Lee DP, Losada PM, Zardeneta ME, Fehlmann T, Webber J, McGeever A, Calcuttawala K, Zhang H, Berdnik D, Mathur V, Tan W, Zee A, Tan M, Pisco A, Karkanias J, Neff NF, Keller A, Darmanis S, Quake SR, Wyss-Coray T. Ageing hallmarks exhibit organ-specific temporal signatures. Nature 2020; 583:596-602. [PMID: 32669715 PMCID: PMC7757734 DOI: 10.1038/s41586-020-2499-y] [Citation(s) in RCA: 287] [Impact Index Per Article: 71.8] [Received: 06/05/2019] [Accepted: 05/07/2020] [Indexed: 12/21/2022]
Abstract
Ageing is the single greatest cause of disease and death worldwide, and understanding the associated processes could vastly improve quality of life. Although major categories of ageing damage have been identified (such as altered intercellular communication, loss of proteostasis and eroded mitochondrial function [1]), these deleterious processes interact with extraordinary complexity within and between organs, and a comprehensive, whole-organism analysis of ageing dynamics has been lacking. Here we performed bulk RNA sequencing of 17 organs and plasma proteomics at 10 ages across the lifespan of Mus musculus, and integrated these findings with data from the accompanying Tabula Muris Senis [2] (or 'Mouse Ageing Cell Atlas'), which follows on from the original Tabula Muris [3]. We reveal linear and nonlinear shifts in gene expression during ageing, with the associated genes clustered in consistent trajectory groups with coherent biological functions, including extracellular matrix regulation, unfolded protein binding, mitochondrial function, and inflammatory and immune response. Notably, these gene sets show similar expression across tissues, differing only in the amplitude and the age of onset of expression. Widespread activation of immune cells is especially pronounced, and is first detectable in white adipose depots during middle age. Single-cell RNA sequencing confirms the accumulation of T cells and B cells in adipose tissue (including plasma cells that express immunoglobulin J), which also accrue concurrently across diverse organs. Finally, we show how gene expression shifts in distinct tissues are highly correlated with corresponding protein levels in plasma, thus potentially contributing to the ageing of the systemic circulation. Together, these data demonstrate a similar yet asynchronous inter- and intra-organ progression of ageing, providing a foundation from which to track systemic sources of declining health at old age.
Affiliation(s)
- Nicholas Schaum
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, California, USA
| | - Benoit Lehallier
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
| | - Oliver Hahn
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
| | - Róbert Pálovics
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
| | | | - Song E. Lee
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
| | - Rene Sit
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | - Davis P. Lee
- Veterans Administration Palo Alto Healthcare System, Palo Alto, California, USA
| | - Patricia Morán Losada
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
| | - Macy E. Zardeneta
- Veterans Administration Palo Alto Healthcare System, Palo Alto, California, USA
| | - Tobias Fehlmann
- Clinical Bioinformatics, Saarland University, Saarbrücken, Germany
| | - James Webber
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | | | - Kruti Calcuttawala
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
| | - Hui Zhang
- Veterans Administration Palo Alto Healthcare System, Palo Alto, California, USA
| | - Daniela Berdnik
- Veterans Administration Palo Alto Healthcare System, Palo Alto, California, USA
| | - Vidhu Mathur
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
| | - Weilun Tan
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | - Alexander Zee
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | - Michelle Tan
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | | | - Angela Pisco
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | - Jim Karkanias
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | - Norma F. Neff
- Chan Zuckerberg Biohub, San Francisco, California, USA
| | - Andreas Keller
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
- Clinical Bioinformatics, Saarland University, Saarbrücken, Germany
| | | | - Stephen R. Quake
- Chan Zuckerberg Biohub, San Francisco, California, USA
- Department of Bioengineering, Stanford University, Stanford, California, USA
| | - Tony Wyss-Coray
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Stanford, California, USA
- Veterans Administration Palo Alto Healthcare System, Palo Alto, California, USA
- Paul F. Glenn Center for the Biology of Aging, Stanford University School of Medicine, Stanford, California, USA
- Wu Tsai Neurosciences Institute, Stanford University School of Medicine, Stanford, California, USA
|
21
|
Lopes-Ramos CM, Chen CY, Kuijjer ML, Paulson JN, Sonawane AR, Fagny M, Platig J, Glass K, Quackenbush J, DeMeo DL. Sex Differences in Gene Expression and Regulatory Networks across 29 Human Tissues. Cell Rep 2020; 31:107795. [PMID: 32579922 PMCID: PMC7898458 DOI: 10.1016/j.celrep.2020.107795] [Citation(s) in RCA: 172] [Impact Index Per Article: 43.0] [Received: 11/14/2016] [Revised: 04/01/2020] [Accepted: 05/29/2020] [Indexed: 11/25/2022] Open
Abstract
Sex differences manifest in many diseases and may drive sex-specific therapeutic responses. To understand the molecular basis of sex differences, we evaluated sex-biased gene regulation by constructing sample-specific gene regulatory networks in 29 human healthy tissues using 8,279 whole-genome expression profiles from the Genotype-Tissue Expression (GTEx) project. We find sex-biased regulatory network structures in each tissue. Even though most transcription factors (TFs) are not differentially expressed between males and females, many have sex-biased regulatory targeting patterns. In each tissue, genes that are differentially targeted by TFs between the sexes are enriched for tissue-related functions and diseases. In brain tissue, for example, genes associated with Parkinson's disease and Alzheimer's disease are targeted by different sets of TFs in each sex. Our systems-based analysis identifies a repertoire of TFs that play important roles in sex-specific architecture of gene regulatory networks, and it underlines sex-specific regulatory processes in both health and disease.
Affiliation(s)
| | - Cho-Yi Chen
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
| | - Marieke L Kuijjer
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway
| | - Joseph N Paulson
- Department of Biostatistics, Product Development, Genentech Inc., San Francisco, CA, USA
| | - Abhijeet R Sonawane
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Maud Fagny
- Genetique Quantitative et Evolution-Le Moulon, Universite Paris-Saclay, Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Centre National de la Recherche Scientifique, AgroParisTech, Gif-sur-Yvette, France
| | - John Platig
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Kimberly Glass
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - John Quackenbush
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
| | - Dawn L DeMeo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA.
|
22
|
Data-dependent normalization strategies for untargeted metabolomics—a case study. Anal Bioanal Chem 2020; 412:6391-6405. [DOI: 10.1007/s00216-020-02594-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/15/2019] [Revised: 03/04/2020] [Accepted: 03/10/2020] [Indexed: 12/25/2022]
|
23
|
Luck K, Kim DK, Lambourne L, Spirohn K, Begg BE, Bian W, Brignall R, Cafarelli T, Campos-Laborie FJ, Charloteaux B, Choi D, Coté AG, Daley M, Deimling S, Desbuleux A, Dricot A, Gebbia M, Hardy MF, Kishore N, Knapp JJ, Kovács IA, Lemmens I, Mee MW, Mellor JC, Pollis C, Pons C, Richardson AD, Schlabach S, Teeking B, Yadav A, Babor M, Balcha D, Basha O, Bowman-Colin C, Chin SF, Choi SG, Colabella C, Coppin G, D'Amata C, De Ridder D, De Rouck S, Duran-Frigola M, Ennajdaoui H, Goebels F, Goehring L, Gopal A, Haddad G, Hatchi E, Helmy M, Jacob Y, Kassa Y, Landini S, Li R, van Lieshout N, MacWilliams A, Markey D, Paulson JN, Rangarajan S, Rasla J, Rayhan A, Rolland T, San-Miguel A, Shen Y, Sheykhkarimli D, Sheynkman GM, Simonovsky E, Taşan M, Tejeda A, Tropepe V, Twizere JC, Wang Y, Weatheritt RJ, Weile J, Xia Y, Yang X, Yeger-Lotem E, Zhong Q, Aloy P, Bader GD, De Las Rivas J, Gaudet S, Hao T, Rak J, Tavernier J, Hill DE, Vidal M, Roth FP, Calderwood MA. A reference map of the human binary protein interactome. Nature 2020; 580:402-408. [PMID: 32296183 PMCID: PMC7169983 DOI: 10.1038/s41586-020-2188-x] [Citation(s) in RCA: 628] [Impact Index Per Article: 157.0] [Received: 04/19/2019] [Accepted: 02/14/2020] [Indexed: 12/14/2022]
Abstract
Global insights into cellular organization and genome function require comprehensive understanding of the interactome networks that mediate genotype-phenotype relationships [1,2]. Here, we present a human “all-by-all” reference interactome map of human binary protein interactions, or “HuRI”. With ~53,000 high-quality protein-protein interactions (PPIs), HuRI has approximately four times more such interactions than high-quality curated interactions from small-scale studies. Integrating HuRI with genome [3], transcriptome [4], and proteome [5] data enables the study of cellular function within most physiological or pathological cellular contexts. We demonstrate the utility of HuRI in identifying specific subcellular roles of PPIs. Inferred tissue-specific networks reveal general principles for the formation of cellular context-specific functions and elucidate potential molecular mechanisms underlying tissue-specific phenotypes of Mendelian diseases. HuRI represents a systematic proteome-wide reference linking genomic variation to phenotypic outcomes.
Affiliation(s)
- Katja Luck
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Dae-Kyum Kim
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Luke Lambourne
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Kerstin Spirohn
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Bridget E Begg
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Wenting Bian
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ruth Brignall
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tiziana Cafarelli
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Francisco J Campos-Laborie
- Cancer Research Center (CiC-IBMCC, CSIC/USAL), Consejo Superior de Investigaciones Científicas (CSIC) and University of Salamanca (USAL), Salamanca, Spain.,Institute for Biomedical Research of Salamanca (IBSAL), Salamanca, Spain
| | - Benoit Charloteaux
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Dongsic Choi
- The Research Institute of the McGill University Health Centre (RI-MUHC), Montreal, Quebec, Canada
| | - Atina G Coté
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Meaghan Daley
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Steven Deimling
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
| | - Alice Desbuleux
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.,Molecular Biology of Diseases, Groupe Interdisciplinaire de Génomique Appliquée (GIGA) and Laboratory of Viral Interactomes, University of Liège, Liège, Belgium
| | - Amélie Dricot
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Marinella Gebbia
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Madeleine F Hardy
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Nishka Kishore
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Jennifer J Knapp
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - István A Kovács
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Network Science Institute, Northeastern University, Boston, MA, USA.,Wigner Research Centre for Physics, Institute for Solid State Physics and Optics, Budapest, Hungary
| | - Irma Lemmens
- Center for Medical Biotechnology, Vlaams Instituut voor Biotechnologie (VIB), Ghent, Belgium.,Cytokine Receptor Laboratory (CRL), Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Miles W Mee
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Joseph C Mellor
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada.,seqWell, Beverly, MA, USA
| | - Carl Pollis
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Carles Pons
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
| | - Aaron D Richardson
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Sadie Schlabach
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Bridget Teeking
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Anupama Yadav
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Mariana Babor
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Dawit Balcha
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Omer Basha
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.,National Institute for Biotechnology in the Negev (NIBN), Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Christian Bowman-Colin
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Suet-Feung Chin
- Cancer Research UK (CRUK) Cambridge Institute, University of Cambridge, Cambridge, UK
| | - Soon Gang Choi
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Claudia Colabella
- Department of Pharmaceutical Sciences, University of Perugia, Perugia, Italy.,Istituto Zooprofilattico Sperimentale dell'Umbria e delle Marche "Togo Rosati" (IZSUM), Perugia, Italy
| | - Georges Coppin
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.,Molecular Biology of Diseases, Groupe Interdisciplinaire de Génomique Appliquée (GIGA) and Laboratory of Viral Interactomes, University of Liège, Liège, Belgium
| | - Cassandra D'Amata
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
| | - David De Ridder
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Steffi De Rouck
- Center for Medical Biotechnology, Vlaams Instituut voor Biotechnologie (VIB), Ghent, Belgium.,Cytokine Receptor Laboratory (CRL), Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - Miquel Duran-Frigola
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
| | - Hanane Ennajdaoui
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Florian Goebels
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Liana Goehring
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Anjali Gopal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Ghazal Haddad
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Elodie Hatchi
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Mohamed Helmy
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Yves Jacob
- Département de Virologie, Unité de Génétique Moléculaire des Virus à ARN (GMVR), Institut Pasteur, UMR3569, Centre National de la Recherche Scientifique (CNRS), Paris, France.,Université Paris Diderot, Paris, France
| | - Yoseph Kassa
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Serena Landini
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Roujia Li
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Natascha van Lieshout
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Andrew MacWilliams
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Dylan Markey
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Joseph N Paulson
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA.,Department of Biostatistics, Product Development, Genentech Inc., South San Francisco, CA, USA
| | - Sudharshan Rangarajan
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - John Rasla
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ashyad Rayhan
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Thomas Rolland
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Adriana San-Miguel
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Yun Shen
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Dayag Sheykhkarimli
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada
| | - Gloria M Sheynkman
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Eyal Simonovsky
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.,National Institute for Biotechnology in the Negev (NIBN), Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Murat Taşan
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Alexander Tejeda
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Vincent Tropepe
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
| | - Jean-Claude Twizere
- Molecular Biology of Diseases, Groupe Interdisciplinaire de Génomique Appliquée (GIGA) and Laboratory of Viral Interactomes, University of Liège, Liège, Belgium
| | - Yang Wang
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Jochen Weile
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Yu Xia
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Bioengineering, McGill University, Montreal, Quebec, Canada
| | - Xinping Yang
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer-Sheva, Israel.,National Institute for Biotechnology in the Negev (NIBN), Ben-Gurion University of the Negev, Beer-Sheva, Israel
| | - Quan Zhong
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Biological Sciences, Wright State University, Dayton, OH, USA
| | - Patrick Aloy
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain.,Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
| | - Gary D Bader
- The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Javier De Las Rivas
- Cancer Research Center (CiC-IBMCC, CSIC/USAL), Consejo Superior de Investigaciones Científicas (CSIC) and University of Salamanca (USAL), Salamanca, Spain.,Institute for Biomedical Research of Salamanca (IBSAL), Salamanca, Spain
| | - Suzanne Gaudet
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tong Hao
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Janusz Rak
- The Research Institute of the McGill University Health Centre (RI-MUHC), Montreal, Quebec, Canada
| | - Jan Tavernier
- Center for Medical Biotechnology, Vlaams Instituut voor Biotechnologie (VIB), Ghent, Belgium.,Cytokine Receptor Laboratory (CRL), Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
| | - David E Hill
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
| | - Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.
| | - Frederick P Roth
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.,Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.,Lunenfeld-Tanenbaum Research Institute (LTRI), Sinai Health System, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.,Canadian Institute for Advanced Research (CIFAR), Toronto, Ontario, Canada.
| | - Michael A Calderwood
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA.,Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
|
24
|
Singh U, Hur M, Dorman K, Wurtele ES. MetaOmGraph: a workbench for interactive exploratory data analysis of large expression datasets. Nucleic Acids Res 2020; 48:e23. [PMID: 31956905 PMCID: PMC7039010 DOI: 10.1093/nar/gkz1209] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/28/2019] [Revised: 12/05/2019] [Accepted: 12/17/2019] [Indexed: 12/17/2022] Open
Abstract
The diverse and growing omics data in public domains provide researchers with tremendous opportunity to extract hidden, yet undiscovered, knowledge. However, the vast majority of archived data remain unused. Here, we present MetaOmGraph (MOG), a free, open-source, standalone software for exploratory analysis of massive datasets. Researchers, without coding, can interactively visualize and evaluate data in the context of its metadata, honing-in on groups of samples or genes based on attributes such as expression values, statistical associations, metadata terms and ontology annotations. Interaction with data is easy via interactive visualizations such as line charts, box plots, scatter plots, histograms and volcano plots. Statistical analyses include co-expression analysis, differential expression analysis and differential correlation analysis, with significance tests. Researchers can send data subsets to R for additional analyses. Multithreading and indexing enable efficient big data analysis. A researcher can create new MOG projects from any numerical data; or explore an existing MOG project. MOG projects, with history of explorations, can be saved and shared. We illustrate MOG by case studies of large curated datasets from human cancer RNA-Seq, where we identify novel putative biomarker genes in different tumors, and microarray and metabolomics data from Arabidopsis thaliana. MOG executable and code: http://metnetweb.gdcb.iastate.edu/ and https://github.com/urmi-21/MetaOmGraph/.
Affiliation(s)
- Urminder Singh
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50011, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50011, USA
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA 50011, USA
| | - Manhoi Hur
- Center for Metabolic Biology, Iowa State University, Ames, IA 50011, USA
| | - Karin Dorman
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50011, USA
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA 50011, USA
- Department of Statistics, Iowa State University, Ames, IA 50011, USA
| | - Eve Syrkin Wurtele
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50011, USA
- Center for Metabolic Biology, Iowa State University, Ames, IA 50011, USA
- Department of Genetics Development and Cell Biology, Iowa State University, Ames, IA 50011, USA
|
25
|
Fagny M, Platig J, Kuijjer ML, Lin X, Quackenbush J. Nongenic cancer-risk SNPs affect oncogenes, tumour-suppressor genes, and immune function. Br J Cancer 2020; 122:569-577. [PMID: 31806877 PMCID: PMC7028992 DOI: 10.1038/s41416-019-0614-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/04/2019] [Revised: 09/23/2019] [Accepted: 10/07/2019] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Genome-wide association studies (GWASes) have identified many noncoding germline single-nucleotide polymorphisms (SNPs) that are associated with an increased risk of developing cancer. However, how these SNPs affect cancer risk is still largely unknown. METHODS We used a systems biology approach to analyse the regulatory role of cancer-risk SNPs in thirteen tissues. By using data from the Genotype-Tissue Expression (GTEx) project, we performed an expression quantitative trait locus (eQTL) analysis. We represented both significant cis- and trans-eQTLs as edges in tissue-specific eQTL bipartite networks. RESULTS Each tissue-specific eQTL network is organised into communities that group sets of SNPs and functionally related genes. When mapping cancer-risk SNPs to these networks, we find that in each tissue, these SNPs are significantly overrepresented in communities enriched for immune response processes, as well as tissue-specific functions. Moreover, cancer-risk SNPs are more likely to be 'cores' of their communities, influencing the expression of many genes within the same biological processes. Finally, cancer-risk SNPs preferentially target oncogenes and tumour-suppressor genes, suggesting that they may alter the expression of these key cancer genes. CONCLUSIONS This approach provides a new way of understanding genetic effects on cancer risk and provides a biological context for interpreting the results of GWAS cancer studies.
Affiliation(s)
- Maud Fagny
- Genetique Quantitative et Evolution-Le Moulon, Institut National de la Recherche agronomique, Université Paris-Sud, Centre National de la Recherche Scientifique, AgroParisTech, Université Paris-Saclay, Paris, France
| | - John Platig
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Marieke Lydia Kuijjer
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - John Quackenbush
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA.
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA, USA.
|
26
|
Basha O, Argov CM, Artzy R, Zoabi Y, Hekselman I, Alfandari L, Chalifa-Caspi V, Yeger-Lotem E. Differential network analysis of multiple human tissue interactomes highlights tissue-selective processes and genetic disorder genes. Bioinformatics 2020; 36:2821-2828. [DOI: 10.1093/bioinformatics/btaa034] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/18/2019] [Revised: 01/07/2020] [Accepted: 01/16/2020] [Indexed: 01/19/2023] Open
Abstract
Motivation
Differential network analysis, designed to highlight network changes between conditions, is an important paradigm in network biology. However, differential network analysis methods have been typically designed to compare between two conditions and were rarely applied to multiple protein interaction networks (interactomes). Importantly, large-scale benchmarks for their evaluation have been lacking.
Results
Here, we present a framework for assessing the ability of differential network analysis of multiple human tissue interactomes to highlight tissue-selective processes and disorders. For this, we created a benchmark of 6499 curated tissue-specific Gene Ontology biological processes. We applied five methods, including four differential network analysis methods, to construct weighted interactomes for 34 tissues. Rigorous assessment of this benchmark revealed that differential analysis methods perform well in revealing tissue-selective processes (AUCs of 0.82–0.9). Next, we applied differential network analysis to illuminate the genes underlying tissue-selective hereditary disorders. For this, we curated a dataset of 1305 tissue-specific hereditary disorders and their manifesting tissues. Focusing on subnetworks containing the top 1% differential interactions in disease-relevant tissue interactomes revealed significant enrichment for disorder-causing genes in 18.6% of the cases, with a significantly high success rate for blood, nerve, muscle and heart diseases.
Summary
Altogether, we offer a framework that includes expansive manually curated datasets of tissue-selective processes and disorders to be used as benchmarks or to illuminate tissue-selective processes and genes. Our results demonstrate that differential analysis of multiple human tissue interactomes is a powerful tool for highlighting processes and genes with tissue-selective functionality and clinical impact.
Availability and implementation
Datasets are available as part of the Supplementary data.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Omer Basha
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Chanan M Argov
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Raviv Artzy
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Yazeed Zoabi
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Idan Hekselman
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Liad Alfandari
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
| | - Vered Chalifa-Caspi
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
| | - Esti Yeger-Lotem
- Department of Clinical Biochemistry and Pharmacology, Faculty of Health Sciences
- National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
|
27
|
Anderson D, Baynam G, Blackwell JM, Lassmann T. Personalised analytics for rare disease diagnostics. Nat Commun 2019; 10:5274. [PMID: 31754101 PMCID: PMC6872807 DOI: 10.1038/s41467-019-13345-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/11/2019] [Accepted: 11/01/2019] [Indexed: 12/21/2022] Open
Abstract
Whole genome and exome sequencing is a standard tool for the diagnosis of patients suffering from rare and other genetic disorders. The interpretation of the tens of thousands of variants returned from such tests remains a major challenge. Here we focus on the problem of prioritising variants with respect to the observed disease phenotype. We hypothesise that linking patterns of gene expression across multiple tissues to the phenotypes will aid in discovering disease causing variants. To test this, we construct classifiers that learn associations between tissue-specific gene expression and disease phenotypes. We find that using Genotype-Tissue Expression project (GTEx) expression data in conjunction with disease agnostic variant prioritisation methods (CADD or MetaSVM) results in consistent improvements in classification accuracy. Our method represents a previously overlooked avenue of utilising existing expression data for clinical diagnostics, and also opens the door to use of other functional genomic data sets in the same manner.
Affiliation(s)
- Denise Anderson
- Telethon Kids Institute, The University of Western Australia, PO Box 855, West Perth, WA, 6872, Australia.
| | - Gareth Baynam
- Office of Population Health Genomics, Department of Health, PO Box 8172, Perth Business Centre, Perth, WA, 6849, Australia
- Genetic Services of Western Australia, King Edward Memorial Hospital, PO Box 134, Subiaco, WA, 6904, Australia
- Western Australian Register of Developmental Anomalies (WARDA), King Edward Memorial Hospital, PO Box 134, Subiaco, WA, 6904, Australia
| | - Jenefer M Blackwell
- Telethon Kids Institute, The University of Western Australia, PO Box 855, West Perth, WA, 6872, Australia
| | - Timo Lassmann
- Telethon Kids Institute, The University of Western Australia, PO Box 855, West Perth, WA, 6872, Australia.
|
28
|
Hicks SC, Okrah K, Paulson JN, Quackenbush J, Irizarry RA, Bravo HC. Smooth quantile normalization. Biostatistics 2019; 19:185-198. [PMID: 29036413 DOI: 10.1093/biostatistics/kxx028] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 11/01/2016] [Accepted: 05/07/2017] [Indexed: 11/14/2022] Open
Abstract
Between-sample normalization is a critical step in genomic data analysis to remove systematic bias and unwanted technical variation in high-throughput data. Global normalization methods are based on the assumption that observed variability in global properties is due to technical reasons and are unrelated to the biology of interest. For example, some methods correct for differences in sequencing read counts by scaling features to have similar median values across samples, but these fail to reduce other forms of unwanted technical variation. Methods such as quantile normalization transform the statistical distributions across samples to be the same and assume global differences in the distribution are induced by only technical variation. However, it remains unclear how to proceed with normalization if these assumptions are violated, for example, if there are global differences in the statistical distributions between biological conditions or groups, and external information, such as negative or control features, is not available. Here, we introduce a generalization of quantile normalization, referred to as smooth quantile normalization (qsmooth), which is based on the assumption that the statistical distribution of each sample should be the same (or have the same distributional shape) within biological groups or conditions, but allowing that they may differ between groups. We illustrate the advantages of our method on several high-throughput datasets with global differences in distributions corresponding to different biological conditions. We also perform a Monte Carlo simulation study to illustrate the bias-variance tradeoff and root mean squared error of qsmooth compared to other global normalization methods. A software implementation is available from https://github.com/stephaniehicks/qsmooth.
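For orientation, the classic quantile normalization that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard method only, not the qsmooth procedure from the cited paper; the function name `quantile_normalize`, the toy matrix, and the stable tie-breaking rule are assumptions made for the example.

```python
import numpy as np

def quantile_normalize(x: np.ndarray) -> np.ndarray:
    """Standard quantile normalization of a features x samples matrix.

    Each column (sample) is mapped onto one shared reference distribution:
    the row-wise mean of the column-sorted matrix. Ties are broken by order
    of appearance (an assumption of this sketch).
    """
    order = np.argsort(x, axis=0, kind="stable")      # per-sample sort order
    ranks = np.argsort(order, axis=0, kind="stable")  # rank of each value within its sample
    reference = np.sort(x, axis=0).mean(axis=1)       # mean of the k-th smallest values
    return reference[ranks]                           # replace each value by the reference value at its rank

# Hypothetical toy data: 4 features (rows) measured in 3 samples (columns).
toy = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 4.0, 6.0],
                [4.0, 2.0, 8.0]])
normalized = quantile_normalize(toy)
```

After this transformation every sample has exactly the same set of values, which is the assumption the abstract describes as problematic when distributions genuinely differ between biological groups.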
Affiliation(s)
- Stephanie C Hicks
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA and Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
| | - Kwame Okrah
- Genentech, Product Development Biostatistics, 1 DNA Way, South San Francisco, CA 94080, USA
| | - Joseph N Paulson
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA and Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
| | - John Quackenbush
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA and Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
| | - Rafael A Irizarry
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA and Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
| | - Héctor Corrada Bravo
- Department of Computer Science, University of Maryland, College Park, USA and Center for Bioinformatics and Computational Biology, Institute of Advanced Computer Studies, University of Maryland, 8314 Paint Branch Dr., College Park, MD 20742, USA
|
29
|
A test metric for assessing single-cell RNA-seq batch correction. Nat Methods 2018; 16:43-49. [PMID: 30573817 DOI: 10.1038/s41592-018-0254-1] [Citation(s) in RCA: 222] [Impact Index Per Article: 37.0] [Received: 10/06/2017] [Accepted: 10/31/2018] [Indexed: 12/20/2022]
Abstract
Single-cell transcriptomics is a versatile tool for exploring heterogeneous cell populations, but as with all genomics experiments, batch effects can hamper data integration and interpretation. The success of batch-effect correction is often evaluated by visual inspection of low-dimensional embeddings, which are inherently imprecise. Here we present a user-friendly, robust and sensitive k-nearest-neighbor batch-effect test (kBET; https://github.com/theislab/kBET ) for quantification of batch effects. We used kBET to assess commonly used batch-regression and normalization approaches, and to quantify the extent to which they remove batch effects while preserving biological variability. We also demonstrate the application of kBET to data from peripheral blood mononuclear cells (PBMCs) from healthy donors to distinguish cell-type-specific inter-individual variability from changes in relative proportions of cell populations. This has important implications for future data-integration efforts, central to projects such as the Human Cell Atlas.
|
30
|
Savova V, Vinogradova S, Pruss D, Gimelbrant AA, Weiss LA. Risk alleles of genes with monoallelic expression are enriched in gain-of-function variants and depleted in loss-of-function variants for neurodevelopmental disorders. Mol Psychiatry 2017; 22:1785-1794. [PMID: 28265118 PMCID: PMC5589474 DOI: 10.1038/mp.2017.13] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/08/2016] [Revised: 12/01/2016] [Accepted: 01/09/2017] [Indexed: 02/06/2023]
Abstract
Over 3000 human genes can be expressed from a single allele in one cell, and from the other allele (or both) in neighboring cells. Little is known about the consequences of this epigenetic phenomenon, monoallelic expression (MAE). We hypothesized that MAE increases expression variability, with a potential impact on human disease. Here, we use a chromatin signature to infer MAE for genes in lymphoblastoid cell lines and human fetal brain tissue. We confirm that across clones MAE status correlates with expression level, and that in human tissue data sets, MAE genes show increased expression variability. We then compare mono- and biallelic genes at three distinct scales. In the human population, we observe that genes with polymorphisms influencing expression variance are more likely to be MAE (P < 1.1 × 10⁻⁶). At the trans-species level, we find gene expression differences and directional selection between humans and chimpanzees more common among MAE genes (P < 0.05). Extending to human disease, we show that MAE genes are under-represented in neurodevelopmental copy number variants (CNVs) (P < 2.2 × 10⁻¹⁰), suggesting that pathogenic variants acting via expression level are less likely to involve MAE genes. Using neuropsychiatric single-nucleotide polymorphism (SNP) and single-nucleotide variant (SNV) data, we see that genes with pathogenic expression-altering or loss-of-function variants are less likely MAE (P < 7.5 × 10⁻¹¹) and genes with only missense or gain-of-function variants are more likely MAE (P < 1.4 × 10⁻⁶). Together, our results suggest that MAE genes tolerate a greater range of expression level than biallelic expression (BAE) genes, and this information may be useful in prediction of pathogenicity.
Affiliation(s)
- V Savova
- Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - S Vinogradova
- Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - D Pruss
- Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - A A Gimelbrant
- Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - L A Weiss
- Department of Psychiatry and Institute for Human Genetics, University of California San Francisco, Langley Porter Psychiatric Institute, Nina Ireland Lab, San Francisco, CA, USA
|
31
|
Sonawane AR, Platig J, Fagny M, Chen CY, Paulson JN, Lopes-Ramos CM, DeMeo DL, Quackenbush J, Glass K, Kuijjer ML. Understanding Tissue-Specific Gene Regulation. Cell Rep 2017; 21:1077-1088. [PMID: 29069589 PMCID: PMC5828531 DOI: 10.1016/j.celrep.2017.10.001] [Citation(s) in RCA: 225] [Impact Index Per Article: 32.1] [Received: 05/08/2017] [Revised: 08/09/2017] [Accepted: 09/28/2017] [Indexed: 12/20/2022] Open
Abstract
Although all human tissues carry out common processes, tissues are distinguished by gene expression patterns, implying that distinct regulatory programs control tissue specificity. In this study, we investigate gene expression and regulation across 38 tissues profiled in the Genotype-Tissue Expression project. We find that network edges (transcription factor to target gene connections) have higher tissue specificity than network nodes (genes) and that regulating nodes (transcription factors) are less likely to be expressed in a tissue-specific manner as compared to their targets (genes). Gene set enrichment analysis of network targeting also indicates that the regulation of tissue-specific function is largely independent of transcription factor expression. In addition, tissue-specific genes are not highly targeted in their corresponding tissue network. However, they do assume bottleneck positions due to variability in transcription factor targeting and the influence of non-canonical regulatory interactions. These results suggest that tissue specificity is driven by context-dependent regulatory paths, providing transcriptional control of tissue-specific processes.
Affiliation(s)
- Abhijeet Rajendra Sonawane
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- John Platig
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Maud Fagny
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Cho-Yi Chen
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Joseph Nathaniel Paulson
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Camila Miranda Lopes-Ramos
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Dawn Lisa DeMeo
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA
- John Quackenbush
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Kimberly Glass
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
- Marieke Lydia Kuijjer
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02215, USA
|
32
|
Huang J, Vendramin S, Shi L, McGinnis KM. Construction and Optimization of a Large Gene Coexpression Network in Maize Using RNA-Seq Data. Plant Physiol 2017; 175:568-583. [PMID: 28768814 PMCID: PMC5580776 DOI: 10.1104/pp.17.00825] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 06/19/2017] [Accepted: 07/31/2017] [Indexed: 05/22/2023]
Abstract
With the emergence of massively parallel sequencing, genomewide expression data production has reached an unprecedented level. This abundance of data has greatly facilitated maize research, but may not be amenable to traditional analysis techniques that were optimized for other data types. Using publicly available data, a gene coexpression network (GCN) can be constructed and used for gene function prediction, candidate gene selection, and improving understanding of regulatory pathways. Several GCN studies have been done in maize (Zea mays), mostly using microarray datasets. To build an optimal GCN from plant materials RNA-Seq data, parameters for expression data normalization and network inference were evaluated. A comprehensive evaluation of these two parameters and a ranked aggregation strategy on network performance, using libraries from 1266 maize samples, were conducted. Three normalization methods and 10 inference methods, including six correlation and four mutual information methods, were tested. The three normalization methods had very similar performance. For network inference, correlation methods performed better than mutual information methods at some genes. Increasing sample size also had a positive effect on GCN. Aggregating single networks together resulted in improved performance compared to single networks.
Affiliation(s)
- Ji Huang
- Department of Biological Science, Florida State University, Tallahassee, Florida 32306
- Stefania Vendramin
- Department of Biological Science, Florida State University, Tallahassee, Florida 32306
- Lizhen Shi
- Department of Computer Science, Florida State University, Tallahassee, Florida 32306
- Karen M McGinnis
- Department of Biological Science, Florida State University, Tallahassee, Florida 32306
|