1
Nieuwenhuis TO, Giles HH, Arking JVA, Patil AH, Shi W, McCall MN, Halushka MK. Patterns of Unwanted Biological and Technical Expression Variation Among 49 Human Tissues. Lab Invest 2024; 104:102069. [PMID: 38670317 DOI: 10.1016/j.labinv.2024.102069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 03/21/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024] Open
Abstract
Tissue gene expression studies are impacted by biological and technical sources of variation, which can be broadly classified into wanted and unwanted variation. The latter, if not addressed, results in misleading biological conclusions. Methods have been proposed to reduce unwanted variation, such as normalization and batch correction. A more accurate understanding of all causes of variation could significantly improve the ability of these methods to remove unwanted variation while retaining variation corresponding to the biological question of interest. We used 17,282 samples from 49 human tissues in the Genotype-Tissue Expression data set (v8) to investigate patterns and causes of expression variation. Transcript expression was transformed to z-scores, and only the most variable 2% of transcripts were evaluated and clustered based on coexpression patterns. Clustered gene sets were assigned to different biological or technical causes based on histologic appearances and metadata elements. We identified 522 variable transcript clusters (median: 11 per tissue) among the samples. Of these, 63% were confidently explained, 16% were likely explained, 7% were low confidence explanations, and 14% had no clear cause. Histologic analysis annotated 46 clusters. Other common causes of variability included sex, sequencing contamination, immunoglobulin diversity, and compositional tissue differences. Less common biological causes included death interval (Hardy score), disease status, and age. Technical causes included blood draw timing and harvesting differences. Many of the causes of variation in bulk tissue expression were identifiable in the Tabula Sapiens data set of single-cell expression. This is among the largest explorations of the underlying sources of tissue expression variation. It uncovered expected and unexpected causes of variable gene expression and demonstrated the utility of matched histologic specimens. 
It further demonstrated the value of acquiring meaningful tissue harvesting metadata elements to use for improved normalization, batch correction, and analysis of both bulk and single-cell RNA-seq data.
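The preprocessing summarized in this abstract (keeping only the most variable transcripts, scaling to z-scores, and grouping by coexpression) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the toy values, the choice to rank by variance before scaling, and the use of Pearson correlation as the coexpression measure are all hypothetical.

```python
from statistics import mean, pstdev, pvariance


def zscore(values):
    """Scale one transcript's values to mean 0 and (population) SD 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def most_variable(expr, fraction=0.02):
    """Return the names of the top `fraction` of transcripts by variance."""
    ranked = sorted(expr, key=lambda name: pvariance(expr[name]), reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]


def coexpression(x, y):
    """Pearson correlation, computed as the mean product of z-scores."""
    return mean([a * b for a, b in zip(zscore(x), zscore(y))])


# Toy values (made up): three hypothetical transcripts in four samples.
expr = {
    "tx_A": [0.1, 9.8, 0.2, 10.1],
    "tx_B": [8.9, 0.1, 9.2, 0.0],
    "tx_C": [5.0, 5.1, 4.9, 5.0],
}

# A large fraction is used only because the toy table is tiny;
# the abstract reports keeping the top 2%.
selected = most_variable(expr, fraction=0.67)
r = coexpression(expr["tx_A"], expr["tx_B"])
```

Transcripts whose pairwise correlations are strongly positive or negative would then be candidates for the same cluster; a constant transcript has zero SD and would need to be filtered out before scaling.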
Affiliation(s)
- Tim O Nieuwenhuis
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland; McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Hunter H Giles
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Jeremy V A Arking
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Arun H Patil
- Lieber Institute for Brain Development, Baltimore, Maryland
- Wen Shi
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Matthew N McCall
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York; Department of Biomedical Genetics, University of Rochester Medical Center, Rochester, New York
- Marc K Halushka
- Pathology and Laboratory Medicine Institute, Cleveland Clinic, Cleveland, Ohio.
2
Wang L, Babushkin N, Liu Z, Liu X. Trans-eQTL mapping in gene sets identifies network effects of genetic variants. Cell Genom 2024; 4:100538. [PMID: 38565144 PMCID: PMC11019359 DOI: 10.1016/j.xgen.2024.100538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 12/08/2023] [Accepted: 03/13/2024] [Indexed: 04/04/2024]
Abstract
Nearly all trait-associated variants identified in genome-wide association studies (GWASs) are noncoding. The cis regulatory effects of these variants have been extensively characterized, but how they affect gene regulation in trans has been the subject of fewer studies because of the difficulty in detecting trans-expression quantitative trait loci (trans-eQTLs). We developed trans-PCO for detecting trans effects of genetic variants on gene networks. Our simulations demonstrate that trans-PCO substantially outperforms existing trans-eQTL mapping methods. We applied trans-PCO to two gene expression datasets from whole blood, DGN (N = 913) and eQTLGen (N = 31,684), and identified 14,985 high-quality trans-eSNP-module pairs associated with 197 co-expression gene modules and biological processes. We performed colocalization analyses between GWAS loci of 46 complex traits and the trans-eQTLs. We demonstrated that the identified trans effects can help us understand how trait-associated variants affect gene regulatory networks and biological pathways.
Affiliation(s)
- Lili Wang
- The Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA; Department of Medicine, Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
- Nikita Babushkin
- Department of Medicine, Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
- Zhonghua Liu
- Department of Biostatistics, Columbia University, New York, NY 10032, USA
- Xuanyao Liu
- The Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA; Department of Medicine, Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA; Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
3
Ravichandran P, Parsana P, Keener R, Hansen KD, Battle A. Aggregation of recount3 RNA-seq data improves inference of consensus and tissue-specific gene co-expression networks. bioRxiv 2024:2024.01.20.576447. [PMID: 38328080 PMCID: PMC10849507 DOI: 10.1101/2024.01.20.576447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Background: Gene co-expression networks (GCNs) describe relationships among expressed genes key to maintaining cellular identity and homeostasis. However, the sample size of typical RNA-seq experiments, which is several orders of magnitude smaller than the number of genes, is too low to infer GCNs reliably. recount3, a publicly available dataset comprising 316,443 uniformly processed human RNA-seq samples, provides an opportunity to improve power for accurate network reconstruction and obtain biological insight from the resulting networks. Results: We compared alternative aggregation strategies to identify an optimal workflow for GCN inference by data aggregation and inferred three consensus networks (a universal network, a non-cancer network, and a cancer network) in addition to 27 tissue context-specific networks. Central network genes from our consensus networks were enriched for evolutionarily constrained genes and ubiquitous biological pathways, whereas central context-specific network genes included tissue-specific transcription factors, and factorization based on the hubs led to clustering of related tissue contexts. We discovered that annotations corresponding to context-specific networks inferred from aggregated data were enriched for trait heritability beyond known functional genomic annotations and were significantly more enriched when we aggregated over a larger number of samples. Conclusion: This study outlines best practices for GCN inference and evaluation by data aggregation. We recommend estimating and regressing confounders in each data set before aggregation and prioritizing large sample size studies for GCN reconstruction. Increased statistical power in inferring context-specific networks enabled the derivation of variant annotations that were enriched for concordant trait heritability independent of functional genomic annotations that are context-agnostic. While we observed strictly increasing held-out log-likelihood with data aggregation, we noted diminishing marginal improvements. Future directions aimed at alternative methods for estimating confounders and integrating orthogonal information from modalities such as Hi-C and ChIP-seq can further improve GCN inference.
Affiliation(s)
- Princy Parsana
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Rebecca Keener
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Kaspar D Hansen
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Biostatistics, Johns Hopkins School of Public Health, Baltimore, MD, USA
- Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Alexis Battle
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Department of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA
- Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, USA
- Data Science and AI Institute, Johns Hopkins University, Baltimore, MD, USA
4
Mapel XM, Kadri NK, Leonard AS, He Q, Lloret-Villas A, Bhati M, Hiltpold M, Pausch H. Molecular quantitative trait loci in reproductive tissues impact male fertility in cattle. Nat Commun 2024; 15:674. [PMID: 38253538 PMCID: PMC10803364 DOI: 10.1038/s41467-024-44935-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 01/08/2024] [Indexed: 01/24/2024] Open
Abstract
Breeding bulls are well suited to investigate inherited variation in male fertility because they are genotyped and their reproductive success is monitored through semen analyses and thousands of artificial inseminations. However, functional data from relevant tissues are lacking in cattle, which prevents fine-mapping fertility-associated genomic regions. Here, we characterize gene expression and splicing variation in testis, epididymis, and vas deferens transcriptomes of 118 mature bulls and conduct association tests between 414,667 molecular phenotypes and 21,501,032 genome-wide variants to identify 41,156 regulatory loci. We show broad consensus in tissue-specific and tissue-enriched gene expression between the three bovine tissues and their human and murine counterparts. Expression- and splicing-mediating variants are more than three times as frequent in testis than epididymis and vas deferens, highlighting the transcriptional complexity of testis. Finally, we identify genes (WDR19, SPATA16, KCTD19, ZDHHC1) and molecular phenotypes that are associated with quantitative variation in male fertility through transcriptome-wide association and colocalization analyses.
Affiliation(s)
- Xena Marie Mapel
- Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Naveen Kumar Kadri
- Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Alexander S Leonard
- Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Qiongyu He
- Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Meenu Bhati
- Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- Roslin Institute, The University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
- Maya Hiltpold
- Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland
- GenPhySE, Université de Toulouse, INRAE, ENVT, 31326, Castanet Tolosan, France
- Hubert Pausch
- Animal Genomics, ETH Zurich, Universitatstrasse 2, 8092, Zurich, Switzerland.
5
Aygün N, Liang D, Crouse WL, Keele GR, Love MI, Stein JL. Inferring cell-type-specific causal gene regulatory networks during human neurogenesis. Genome Biol 2023; 24:130. [PMID: 37254169 PMCID: PMC10230710 DOI: 10.1186/s13059-023-02959-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/26/2022] [Accepted: 05/05/2023] [Indexed: 06/01/2023] Open
Abstract
BACKGROUND Genetic variation influences both chromatin accessibility, assessed in chromatin accessibility quantitative trait loci (caQTL) studies, and gene expression, assessed in expression QTL (eQTL) studies. Genetic variants can impact either nearby genes (cis-eQTLs) or distal genes (trans-eQTLs). Colocalization between caQTL and eQTL, or cis- and trans-eQTLs suggests that they share causal variants. However, pairwise colocalization between these molecular QTLs does not guarantee a causal relationship. Mediation analysis can be applied to assess the evidence supporting causality versus independence between molecular QTLs. Given that the function of QTLs can be cell-type-specific, we performed mediation analyses to find epigenetic and distal regulatory causal pathways for genes within two major cell types of the developing human cortex, progenitors and neurons. RESULTS We find that the expression of 168 and 38 genes is mediated by chromatin accessibility in progenitors and neurons, respectively. We also find that the expression of 11 and 12 downstream genes is mediated by upstream genes in progenitors and neurons. Moreover, we discover that a genetic locus associated with inter-individual differences in brain structure shows evidence for mediation of SLC26A7 through chromatin accessibility, identifying molecular mechanisms of a common variant association to a brain trait. CONCLUSIONS In this study, we identify cell-type-specific causal gene regulatory networks whereby the impacts of variants on gene expression were mediated by chromatin accessibility or distal gene expression. Identification of these causal paths will enable identifying and prioritizing actionable regulatory targets perturbing these key processes during neurodevelopment.
Affiliation(s)
- Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Dan Liang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Wesley L Crouse
- Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA
- Gregory R Keele
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
- Michael I Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
6
Nieuwenhuis TO, Giles HH, McCall MN, Halushka MK. Patterns of unwanted biological and technical expression variation across 49 human tissues. bioRxiv 2023:2023.03.09.531935. [PMID: 36945408 PMCID: PMC10028996 DOI: 10.1101/2023.03.09.531935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2023]
Abstract
All tissue-based gene expression studies are impacted by biological and technical sources of variation. Numerous methods are used to normalize and batch correct these datasets. A more accurate understanding of all causes of variation could further optimize these approaches. We used 17,282 samples from 49 tissues in the Genotype Tissue Expression (GTEx) dataset (v8) to investigate patterns and causes of expression variation. Transcript expression was normalized to Z-scores and only the most variable 2% of transcripts were evaluated and clustered based on co-expression patterns. Clustered gene sets were solved to different biological or technical causes related to metadata elements and histologic images. We identified 522 variable transcript clusters (median 11 per tissue) across the samples. Of these, 64% were confidently explained, 15% were likely explained, 7% were low confidence explanations and 14% had no clear cause. Common causes included sex, sequencing contamination, immunoglobulin diversity, and compositional tissue differences. Less common biological causes included death interval (Hardy score), muscle atrophy, diabetes status, and menopause. Technical causes included brain pH and harvesting differences. Many of the causes of variation in bulk tissue expression were identifiable in the Tabula Sapiens dataset of single cell expression. This is the largest exploration of the underlying sources of tissue expression variation. It uncovered expected and unexpected causes of variable gene expression. These identified sources of variation will inform which metadata to acquire with tissue harvesting and can be used to improve normalization, batch correction, and analysis of both bulk and single cell RNA-seq data.
Affiliation(s)
- Tim O Nieuwenhuis
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Hunter H Giles
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Matthew N McCall
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
- Department of Biomedical Genetics, University of Rochester Medical Center, Rochester, NY, USA
- Marc K Halushka
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
7
Park S, Lee J, Kim J, Kim D, Lee JH, Pack SP, Seo M. Benchmark study for evaluating the quality of reference genomes and gene annotations in 114 species. Front Vet Sci 2023; 10:1128570. [PMID: 36896291 PMCID: PMC9988948 DOI: 10.3389/fvets.2023.1128570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 02/02/2023] [Indexed: 02/23/2023] Open
Abstract
Introduction: Reference genomes and gene annotations are key materials that can determine the limits of molecular biology research on a species; however, systematic research on their quality assessment remains insufficient. Methods: We collected reference assemblies, gene annotations, and 3,420 RNA-sequencing (RNA-seq) datasets from 114 species and selected effective indicators to simultaneously evaluate the reference genome quality of various species, including statistics that can be obtained empirically during the mapping of short reads. Furthermore, we introduced and applied transcript diversity and quantification success rates as relative measures of gene annotation quality across species. Finally, we proposed a next-generation sequencing (NGS) applicability index that integrates a total of 10 effective indicators for evaluating the genome and gene annotation of a specific species. Results and discussion: Based on these evaluation indicators, we evaluated and demonstrated the relative accessibility of NGS applications in all species, which will directly contribute to determining the technological boundaries in each species. We also expect the index to be a key indicator for examining the direction of future development through relative quality evaluation of genomes and gene annotations in each species, including the many organisms whose genomes and gene annotations will be constructed in the future.
Affiliation(s)
- Sinwoo Park
- Department of Computer and Information Science, Korea University, Sejong City, Republic of Korea
- Jinbaek Lee
- Department of Computer Convergence Software, Korea University, Sejong City, Republic of Korea
- Jaeryeong Kim
- Department of Computer and Information Science, Korea University, Sejong City, Republic of Korea
- Dohyeon Kim
- Department of Computer and Information Science, Korea University, Sejong City, Republic of Korea
- Jin Hyup Lee
- Department of Food and Biotechnology, Korea University, Sejong City, Republic of Korea
- Seung Pil Pack
- Department of Biotechnology and Bioinformatics, Korea University, Sejong City, Republic of Korea
- Minseok Seo
- Department of Computer and Information Science, Korea University, Sejong City, Republic of Korea; Department of Computer Convergence Software, Korea University, Sejong City, Republic of Korea
8
Chaar DL, Nguyen K, Wang YZ, Ratliff SM, Mosley TH, Kardia SLR, Smith JA, Zhao W. SNP-by-CpG Site Interactions in ABCA7 Are Associated with Cognition in Older African Americans. Genes (Basel) 2022; 13:2150. [PMID: 36421824 PMCID: PMC9691156 DOI: 10.3390/genes13112150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 10/21/2022] [Accepted: 11/10/2022] [Indexed: 06/28/2024] Open
Abstract
SNPs in ABCA7 confer the largest genetic risk for Alzheimer's Disease (AD) in African Americans (AA) after APOE ε4. However, the relationship between ABCA7 and cognitive function has not been thoroughly examined. We investigated the effects of five known AD risk SNPs and 72 CpGs in ABCA7, as well as their interactions, on general cognitive function (cognition) in 634 older AA without dementia from Genetic Epidemiology Network of Arteriopathy (GENOA). Using linear mixed models, no SNP or CpG was associated with cognition after multiple testing correction, but five CpGs were nominally associated (p < 0.05). Four SNP-by-CpG interactions were associated with cognition (FDR q < 0.1). Contrast tests show that methylation is associated with cognition in some genotype groups (p < 0.05): a 1% increase at cg00135882 and cg22271697 is associated with a 0.68 SD decrease and 0.14 SD increase in cognition for those with the rs3764647 GG/AG (p = 0.004) and AA (p = 2 × 10⁻⁴) genotypes, respectively. In addition, a 1% increase at cg06169110 and cg17316918 is associated with a 0.37 SD decrease (p = 2 × 10⁻⁴) and 0.33 SD increase (p = 0.004), respectively, in cognition for those with the rs115550680 GG/AG genotype. While AD risk SNPs in ABCA7 were not associated with cognition in this sample, some have interactions with proximal methylation on cognition.
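The genotype-specific methylation effects reported in this abstract can be read off an interaction model. The following is a simplified sketch, assuming a fixed-effects form that omits the covariates and random effects of the linear mixed models; the symbols are illustrative and not the authors' notation. Here G is a genotype indicator (1 for GG/AG, 0 for AA) and M is percent methylation at one CpG site.

```latex
\begin{align*}
\text{Cognition} &= \beta_0 + \beta_G\, G + \beta_M\, M + \beta_{GM}\, G M + \varepsilon \\
\left.\frac{\partial\, \text{Cognition}}{\partial M}\right|_{G=0} &= \beta_M
  && \text{(methylation slope in AA carriers)} \\
\left.\frac{\partial\, \text{Cognition}}{\partial M}\right|_{G=1} &= \beta_M + \beta_{GM}
  && \text{(methylation slope in GG/AG carriers)}
\end{align*}
```

Under this reading, a contrast test asks whether the slope within one genotype group differs from zero, which is how a CpG can be associated with cognition in one group but not the other.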
Affiliation(s)
- Dima L. Chaar
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Kim Nguyen
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Yi-Zhe Wang
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Scott M. Ratliff
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Thomas H. Mosley
- Memory Impairment and Neurodegenerative Dementia (MIND) Center, University of Mississippi Medical Center, Jackson, MS 39216, USA
- Sharon L. R. Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Jennifer A. Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI 48104, USA
- Wei Zhao
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI 48104, USA
9
Munro D, Wang T, Chitre AS, Polesskaya O, Ehsan N, Gao J, Gusev A, Woods LS, Saba L, Chen H, Palmer A, Mohammadi P. The regulatory landscape of multiple brain regions in outbred heterogeneous stock rats. Nucleic Acids Res 2022; 50:10882-10895. [PMID: 36263809 PMCID: PMC9638908 DOI: 10.1093/nar/gkac912] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/18/2022] [Revised: 08/17/2022] [Accepted: 10/05/2022] [Indexed: 11/14/2022] Open
Abstract
Heterogeneous Stock (HS) rats are a genetically diverse outbred rat population that is widely used for studying genetics of behavioral and physiological traits. Mapping Quantitative Trait Loci (QTL) associated with transcriptional changes would help to identify mechanisms underlying these traits. We generated genotype and transcriptome data for five brain regions from 88 HS rats. We identified 21 392 cis-QTLs associated with expression and splicing changes across all five brain regions and validated their effects using allele specific expression data. We identified 80 cases where eQTLs were colocalized with genome-wide association study (GWAS) results from nine physiological traits. Comparing our dataset to human data from the Genotype-Tissue Expression (GTEx) project, we found that the HS rat data yields twice as many significant eQTLs as a similarly sized human dataset. We also identified a modest but highly significant correlation between genetic regulatory variation among orthologous genes. Surprisingly, we found less genetic variation in gene regulation in HS rats relative to humans, though we still found eQTLs for the orthologs of many human genes for which eQTLs had not been found. These data are available from the RatGTEx data portal (RatGTEx.org) and will enable new discoveries of the genetic influences of complex traits.
Affiliation(s)
- Daniel Munro
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
- Tengfei Wang
- Department of Pharmacology, Addiction Science and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
- Apurva S Chitre
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Oksana Polesskaya
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Nava Ehsan
- Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA, USA
- Jianjun Gao
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Alexander Gusev
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Leah C Solberg Woods
- Section of Molecular Medicine, Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Laura M Saba
- Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Hao Chen
- Department of Pharmacology, Addiction Science and Toxicology, University of Tennessee Health Science Center, Memphis, TN, USA
- Abraham A Palmer
- Correspondence may also be addressed to Abraham A. Palmer. Tel: +1 858 534 2093;
- Pejman Mohammadi
- To whom correspondence should be addressed. Tel: +1 858 784 8746;
10
Liu S, Gao Y, Canela-Xandri O, Wang S, Yu Y, Cai W, Li B, Xiang R, Chamberlain AJ, Pairo-Castineira E, D’Mellow K, Rawlik K, Xia C, Yao Y, Navarro P, Rocha D, Li X, Yan Z, Li C, Rosen BD, Van Tassell CP, Vanraden PM, Zhang S, Ma L, Cole JB, Liu GE, Tenesa A, Fang L. A multi-tissue atlas of regulatory variants in cattle. Nat Genet 2022; 54:1438-1447. [PMID: 35953587 PMCID: PMC7613894 DOI: 10.1038/s41588-022-01153-5] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 11/23/2020] [Accepted: 07/07/2022] [Indexed: 12/12/2022]
Abstract
Characterization of genetic regulatory variants acting on livestock gene expression is essential for interpreting the molecular mechanisms underlying traits of economic value and for increasing the rate of genetic gain through artificial selection. Here we build a Cattle Genotype-Tissue Expression atlas (CattleGTEx) as part of the pilot phase of the Farm animal GTEx (FarmGTEx) project for the research community based on 7,180 publicly available RNA-sequencing (RNA-seq) samples. We describe the transcriptomic landscape of more than 100 tissues/cell types and report hundreds of thousands of genetic associations with gene expression and alternative splicing for 23 distinct tissues. We evaluate the tissue-sharing patterns of these genetic regulatory effects, and functionally annotate them using multiomics data. Finally, we link gene expression in different tissues to 43 economically important traits using both transcriptome-wide association and colocalization analyses to decipher the molecular regulatory mechanisms underpinning such agronomic traits in cattle.
Affiliation(s)
- Shuli Liu
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Yahui Gao
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA
- Oriol Canela-Xandri
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
- Sheng Wang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
- Ying Yu
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Wentao Cai
- Institute of Animal Science, Chinese Academy of Agricultural Science, Beijing 100193, China
- Bingjie Li
- Scotland’s Rural College (SRUC), Roslin Institute Building, Midlothian EH25 9RG, UK
- Ruidong Xiang
- Faculty of Veterinary & Agricultural Science, The University of Melbourne, Parkville 3052, Victoria, Australia
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, Victoria 3083, Australia
- Amanda J. Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBiosciences, Bundoora, Victoria 3083, Australia
- Erola Pairo-Castineira
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Kenton D’Mellow
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Konrad Rawlik
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
| | - Charley Xia
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
| | - Yuelin Yao
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Pau Navarro
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Dominique Rocha
- INRAE, AgroParisTech, GABI, Université Paris-Saclay, Jouy-en-Josas, F-78350, France
| | - Xiujin Li
- Guangdong Provincial Key Laboratory of Waterfowl Healthy Breeding, College of Animal Science & Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, Guangdong 510225, China
| | - Ze Yan
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
| | - Congjun Li
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
| | - Benjamin D. Rosen
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
| | - Curtis P. Van Tassell
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
| | - Paul M. Vanraden
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
| | - Shengli Zhang
- College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
| | - Li Ma
- Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA
| | - John B. Cole
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
| | - George E. Liu
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
| | - Albert Tenesa
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian EH25 9RG, UK
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Lingzhao Fang
- Animal Genomics and Improvement Laboratory, Henry A. Wallace Beltsville Agricultural Research Center, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- MRC Human Genetics Unit at the Institute of Genetics and Cancer, The University of Edinburgh, Edinburgh EH4 2XU, UK
| |
|
11
|
Shared regulation and functional relevance of local gene co-expression revealed by single cell analysis. Commun Biol 2022; 5:876. [PMID: 36028576 PMCID: PMC9418141 DOI: 10.1038/s42003-022-03831-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/09/2022] [Accepted: 08/10/2022] [Indexed: 02/01/2023] Open
Abstract
Most human genes are co-expressed with a nearby gene. Previous studies have revealed this local gene co-expression to be widespread across chromosomes and across dozens of tissues. Yet these studies have so far used bulk RNA-seq, which averages gene expression measurements across millions of cells, so it remains unclear whether this co-expression stems from transcription events in single cells. Here, we leverage single cell datasets in >85 individuals to identify gene co-expression across cells, unbiased by cell-type heterogeneity and benefiting from the co-occurrence of transcription events in single cells. We discover >3800 co-expressed gene pairs in two human cell types, induced pluripotent stem cells (iPSCs) and lymphoblastoid cell lines (LCLs), and (i) compare single cell to bulk RNA-seq in identifying local gene co-expression, (ii) show that many co-expressed gene pairs, but not the majority, are composed of functionally related genes and (iii) using proteomics data, provide evidence that their co-expression is maintained up to the protein level. Finally, using single cell RNA-sequencing (scRNA-seq) and single cell ATAC-sequencing (scATAC-seq) data for the same single cells, we identify gene-enhancer associations and reveal that >95% of co-expressed gene pairs share regulatory elements. These results elucidate the potential reasons for co-expression in single cell gene regulatory networks and warrant a deeper study of shared regulatory elements, with a view to explaining disease comorbidity that arises from effects on several genes. Our in-depth view of local gene co-expression and regulatory element co-activity advances our understanding of the shared regulatory architecture between genes. Using single-cell data from cell lines, the co-expression of genes and co-activity of regulatory elements are analyzed, providing insight into shared architecture and regulation between genes.
|
12
|
Genetic control of RNA splicing and its distinct role in complex trait variation. Nat Genet 2022; 54:1355-1363. [PMID: 35982161 PMCID: PMC9470536 DOI: 10.1038/s41588-022-01154-4] [Citation(s) in RCA: 55] [Impact Index Per Article: 27.5] [Received: 01/25/2021] [Accepted: 07/08/2022] [Indexed: 12/11/2022]
Abstract
Most genetic variants identified from genome-wide association studies (GWAS) in humans are noncoding, indicating their role in gene regulation. Previous studies have shown considerable links of GWAS signals to expression quantitative trait loci (eQTLs), but the links to other genetic regulatory mechanisms, such as splicing QTLs (sQTLs), are underexplored. Here, we introduce an sQTL mapping method, testing for heterogeneity between isoform-eQTL effects (THISTLE), with improved power over competing methods. Applying THISTLE together with a complementary sQTL mapping strategy to brain transcriptomic (n = 2,865) and genotype data, we identified 12,794 genes with cis-sQTLs at P < 5 × 10⁻⁸, approximately 61% of which were distinct from eQTLs. Integrating the sQTL data into GWAS for 12 brain-related complex traits (including diseases), we identified 244 genes associated with the traits through cis-sQTLs, approximately 61% of which could not be discovered using the corresponding eQTL data. Our study demonstrates the distinct role of most sQTLs in the genetic regulation of transcription and complex trait variation. A powerful method for splicing quantitative trait loci (sQTL) mapping, THISTLE, is presented and applied to a collection of 2,865 brain samples. Integration with GWAS identifies 244 genes associated via cis-sQTLs, of which 61% were not identified using expression QTLs.
|
13
|
Aggregative trans-eQTL analysis detects trait-specific target gene sets in whole blood. Nat Commun 2022; 13:4323. [PMID: 35882830 PMCID: PMC9325868 DOI: 10.1038/s41467-022-31845-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/13/2021] [Accepted: 07/06/2022] [Indexed: 01/13/2023] Open
Abstract
Large scale genetic association studies have identified many trait-associated variants and understanding the role of these variants in the downstream regulation of gene-expressions can uncover important mediating biological mechanisms. Here we propose ARCHIE, a summary statistic based sparse canonical correlation analysis method to identify sets of gene-expressions trans-regulated by sets of known trait-related genetic variants. Simulation studies show that compared to standard methods, ARCHIE is better suited to identify "core"-like genes through which effects of many other genes may be mediated and can capture disease-specific patterns of genetic associations. By applying ARCHIE to publicly available summary statistics from the eQTLGen consortium, we identify gene sets which have significant evidence of trans-association with groups of known genetic variants across 29 complex traits. Around half (50.7%) of the selected genes do not have any strong trans-associations and are not detected by standard methods. We provide further evidence for causal basis of the target genes through a series of follow-up analyses. These results show ARCHIE is a powerful tool for identifying sets of genes whose trans-regulation may be related to specific complex traits.
|
14
|
Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, Balliu B, Koslicki D, Skums P, Zelikovsky A, Alkan C, Mutlu O, Mangul S. Technology dictates algorithms: recent developments in read alignment. Genome Biol 2021; 22:249. [PMID: 34446078 PMCID: PMC8390189 DOI: 10.1186/s13059-021-02443-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 10/15/2020] [Accepted: 07/28/2021] [Indexed: 01/08/2023] Open
Abstract
Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.
Affiliation(s)
- Mohammed Alser
- Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
| | - Jeremy Rotman
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Dhrithi Deshpande
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
| | - Kodi Taraszka
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Huwenbo Shi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Pelin Icer Baykal
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
| | - Harry Taegyun Yang
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Bioinformatics Interdepartmental Ph.D. Program, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Victor Xue
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - Sergey Knyazev
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
| | - Benjamin D Singer
- Division of Pulmonary and Critical Care Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Department of Biochemistry & Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, USA
- Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
| | - Brunilda Balliu
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
| | - David Koslicki
- Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16801, USA
- Biology Department, Pennsylvania State University, University Park, PA, 16801, USA
- The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16801, USA
| | - Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
| | - Can Alkan
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Bilkent-Hacettepe Health Sciences and Technologies Program, Ankara, Turkey
| | - Onur Mutlu
- Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
| | - Serghei Mangul
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA.
| |
|
15
|
Mao W, Rahimikollu J, Hausler R, Chikina M. DataRemix: a universal data transformation for optimal inference from gene expression datasets. Bioinformatics 2021; 37:984-991. [PMID: 32821903 DOI: 10.1093/bioinformatics/btaa745] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/30/2020] [Revised: 08/01/2020] [Accepted: 08/17/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: RNA-seq technology provides unprecedented power in the assessment of transcript abundance and can be used to perform a variety of downstream tasks such as inference of gene-correlation networks and eQTL discovery. However, raw gene expression values have to be normalized for nuisance biological variation and technical covariates, and different normalization strategies can lead to dramatically different results in the downstream study. RESULTS: We describe a generalization of singular value decomposition-based reconstruction for which the common techniques of whitening, rank-k approximation and removing the top k principal components are special cases. Our simple three-parameter transformation, DataRemix, can be tuned to reweigh the contribution of hidden factors and reveal otherwise hidden biological signals. In particular, we demonstrate that the method can effectively prioritize biological signals over noise without leveraging external dataset-specific knowledge, and can outperform normalization methods that make explicit use of known technical factors. We also show that DataRemix can be efficiently optimized via a Thompson sampling approach, which makes it feasible for computationally expensive objectives such as eQTL analysis. Finally, we apply our method to the Religious Orders Study and Memory and Aging Project dataset, and we report what to our knowledge is the first replicable trans-eQTL effect in human brain. AVAILABILITY AND IMPLEMENTATION: DataRemix is an R package which is freely available at GitHub (https://github.com/wgmao/DataRemix). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
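The idea of a three-parameter, SVD-based reweighting can be illustrated with a minimal sketch. The parameterization below (keep the top k components, raise their singular values to a power p, and add back the residual scaled by mu) is an assumption chosen to be consistent with the special cases named in the abstract; it is not taken from the DataRemix R package, and the function name `remix` is hypothetical. The exact form should be checked against the paper.

```python
import numpy as np

def remix(X, k, p, mu):
    """Illustrative SVD reweighting (assumed form, not the package API).

    X  : data matrix (e.g. genes x samples)
    k  : number of leading singular components to reweigh
    p  : exponent applied to the top-k singular values
    mu : weight on the residual left after removing the top-k components
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(s))
    # Plain rank-k approximation of X.
    top = (U[:, :k] * s[:k]) @ Vt[:k]
    # Same components with singular values raised to the power p.
    reweighted = (U[:, :k] * s[:k] ** p) @ Vt[:k]
    # Add back the residual, scaled by mu.
    return reweighted + mu * (X - top)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))

identity = remix(X, k=2, p=1.0, mu=1.0)   # leaves X unchanged
rank_k = remix(X, k=2, p=1.0, mu=0.0)     # rank-2 approximation
whitened = remix(X, k=4, p=0.0, mu=0.0)   # all singular values set to 1
```

Under this assumed form, p = 1 with mu = 0 gives the rank-k approximation and p = 0 with full k gives whitening; intermediate settings reweigh the hidden factors rather than keeping or discarding them outright.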
Affiliation(s)
- Weiguang Mao
- Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15260, USA; Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Javad Rahimikollu
- Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15260, USA; Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Ryan Hausler
- Department of Medicine, Division of Hematology/Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
| | - Maria Chikina
- Joint Carnegie Mellon-University of Pittsburgh Ph.D. Program in Computational Biology, Pittsburgh, PA 15260, USA; Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| |
|
16
|
Banerjee S, Simonetti FL, Detrois KE, Kaphle A, Mitra R, Nagial R, Söding J. Tejaas: reverse regression increases power for detecting trans-eQTLs. Genome Biol 2021; 22:142. [PMID: 33957961 PMCID: PMC8101255 DOI: 10.1186/s13059-021-02361-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/09/2020] [Accepted: 04/22/2021] [Indexed: 12/18/2022] Open
Abstract
Trans-acting expression quantitative trait loci (trans-eQTLs) account for ≥70% of expression heritability and could therefore facilitate uncovering mechanisms underlying the origin of complex diseases. Identifying trans-eQTLs is challenging because of small effect sizes, tissue specificity, and a severe multiple-testing burden. Tejaas predicts trans-eQTLs by performing L2-regularized “reverse” multiple regression of each SNP on all genes, aggregating evidence from many small trans-effects while being unaffected by the strong expression correlations. Combined with a novel unsupervised k-nearest neighbor method to remove confounders, Tejaas predicts 18,851 unique trans-eQTLs across 49 tissues from GTEx. They are enriched in open chromatin, enhancers, and other regulatory regions. Many overlap with disease-associated SNPs, pointing to tissue-specific transcriptional regulation mechanisms.
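The "reverse" direction of the regression, in which a single SNP's genotype is the response and all gene expression levels are the predictors, can be sketched with ordinary ridge regression. This is only an illustration of L2-regularized reverse regression under assumed inputs; it does not reproduce the Tejaas test statistic, its null model, or its confounder correction, and the function name `reverse_ridge` and the penalty `lam` are hypothetical.

```python
import numpy as np

def reverse_ridge(expr, genotype, lam):
    """Ridge-regress one SNP's genotype on all genes (illustrative only).

    expr     : samples x genes expression matrix
    genotype : per-sample genotype dosage for one SNP (length = samples)
    lam      : L2 penalty strength (must be > 0 when genes > samples)
    """
    # Center predictors and response so no intercept term is needed.
    Xc = expr - expr.mean(axis=0)
    yc = genotype - genotype.mean()
    n_genes = Xc.shape[1]
    # Closed-form ridge solution: (X^T X + lam * I)^{-1} X^T y.
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_genes), Xc.T @ yc)

rng = np.random.default_rng(1)
expr = rng.normal(size=(20, 50))                      # 20 samples, 50 genes
geno = rng.integers(0, 3, size=20).astype(float)      # dosages 0, 1, 2
beta = reverse_ridge(expr, geno, lam=10.0)            # one coefficient per gene
```

Because the penalty keeps the system well conditioned even with more genes than samples, all genes can enter one model per SNP, which is how many small trans-effects can be pooled rather than tested gene by gene.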
Affiliation(s)
- Saikat Banerjee
- Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany.
| | - Franco L Simonetti
- Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany
| | - Kira E Detrois
- Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany; Georg-August University, Göttingen, 37075, Germany
| | - Anubhav Kaphle
- Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany; Georg-August University, Göttingen, 37075, Germany
| | | | | | - Johannes Söding
- Quantitative and Computational Biology, Max-Planck Institute for Biophysical Chemistry, Göttingen, 37077, Germany; Campus-Institut Data Science (CIDAS), University of Göttingen, Göttingen, 37073, Germany; Cluster of Excellence "Multiscale Bioimaging" (MBExC), University of Göttingen, Göttingen, 37075, Germany.
| |
|
17
|
Ota M, Nagafuchi Y, Hatano H, Ishigaki K, Terao C, Takeshima Y, Yanaoka H, Kobayashi S, Okubo M, Shirai H, Sugimori Y, Maeda J, Nakano M, Yamada S, Yoshida R, Tsuchiya H, Tsuchida Y, Akizuki S, Yoshifuji H, Ohmura K, Mimori T, Yoshida K, Kurosaka D, Okada M, Setoguchi K, Kaneko H, Ban N, Yabuki N, Matsuki K, Mutoh H, Oyama S, Okazaki M, Tsunoda H, Iwasaki Y, Sumitomo S, Shoda H, Kochi Y, Okada Y, Yamamoto K, Okamura T, Fujio K. Dynamic landscape of immune cell-specific gene regulation in immune-mediated diseases. Cell 2021; 184:3006-3021.e17. [PMID: 33930287 DOI: 10.1016/j.cell.2021.03.056] [Citation(s) in RCA: 141] [Impact Index Per Article: 47.0] [Received: 10/30/2020] [Revised: 01/25/2021] [Accepted: 03/28/2021] [Indexed: 02/07/2023]
Abstract
Genetic studies have revealed many variant loci that are associated with immune-mediated diseases. To elucidate the disease pathogenesis, it is essential to understand the function of these variants, especially under disease-associated conditions. Here, we performed a large-scale immune cell gene-expression analysis, together with whole-genome sequence analysis. Our dataset consists of 28 distinct immune cell subsets from 337 patients diagnosed with 10 categories of immune-mediated diseases and 79 healthy volunteers. Our dataset captured distinctive gene-expression profiles across immune cell types and diseases. Expression quantitative trait loci (eQTL) analysis revealed dynamic variations of eQTL effects in the context of immunological conditions, as well as cell types. These cell-type-specific and context-dependent eQTLs showed significant enrichment in immune disease-associated genetic variants, and they implicated the disease-relevant cell types, genes, and environment. This atlas deepens our understanding of the immunogenetic functions of disease-associated variants under in vivo disease conditions.
Affiliation(s)
- Mineto Ota
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan; Department of Functional Genomics and Immunological Diseases, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan.
| | - Yasuo Nagafuchi
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan; Department of Functional Genomics and Immunological Diseases, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Hiroaki Hatano
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Kazuyoshi Ishigaki
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Yusuke Takeshima
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan; Department of Functional Genomics and Immunological Diseases, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Haruyuki Yanaoka
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Satomi Kobayashi
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Mai Okubo
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Harumi Shirai
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Yusuke Sugimori
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Junko Maeda
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Masahiro Nakano
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Saeko Yamada
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Ryochi Yoshida
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Haruka Tsuchiya
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Yumi Tsuchida
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Shuji Akizuki
- Department of Rheumatology and Clinical Immunology, Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan
| | - Hajime Yoshifuji
- Department of Rheumatology and Clinical Immunology, Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan
| | - Koichiro Ohmura
- Department of Rheumatology and Clinical Immunology, Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan
| | - Tsuneyo Mimori
- Department of Rheumatology and Clinical Immunology, Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan
| | - Ken Yoshida
- Division of Rheumatology, Department of Internal Medicine, The Jikei University School of Medicine, Tokyo 105-8461, Japan
| | - Daitaro Kurosaka
- Division of Rheumatology, Department of Internal Medicine, The Jikei University School of Medicine, Tokyo 105-8461, Japan
| | - Masato Okada
- Immuno-Rheumatology Center, St. Luke's International Hospital, Tokyo 104-8560, Japan
| | - Keigo Setoguchi
- Division of Collagen Disease, Department of Medicine, Tokyo Metropolitan Komagome Hospital, Tokyo 113-0021, Japan
| | - Hiroshi Kaneko
- Division of Rheumatic Diseases, National Center for Global Health and Medicine, Tokyo 162-8655, Japan
| | - Nobuhiro Ban
- Research Division, Chugai Pharmaceutical Co., Ltd., Kamakura, Kanagawa 247-8530, Japan
| | - Nami Yabuki
- Research Division, Chugai Pharmaceutical Co., Ltd., Kamakura, Kanagawa 247-8530, Japan
| | - Kosuke Matsuki
- Research Division, Chugai Pharmaceutical Co., Ltd., Kamakura, Kanagawa 247-8530, Japan
| | - Hironori Mutoh
- Research Division, Chugai Pharmaceutical Co., Ltd., Kamakura, Kanagawa 247-8530, Japan
| | - Sohei Oyama
- Research Division, Chugai Pharmaceutical Co., Ltd., Kamakura, Kanagawa 247-8530, Japan
| | - Makoto Okazaki
- Research Division, Chugai Pharmaceutical Co., Ltd., Kamakura, Kanagawa 247-8530, Japan
| | - Hiroyuki Tsunoda
- Research Division, Chugai Pharmaceutical Co., Ltd., Kamakura, Kanagawa 247-8530, Japan
| | - Yukiko Iwasaki
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Shuji Sumitomo
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Hirofumi Shoda
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Yuta Kochi
- Department of Genomic Function and Diversity, Medical Research Institute, Tokyo Medical and Dental University, Tokyo 113-8510, Japan; Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
| | - Kazuhiko Yamamoto
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan; Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa 230-0045, Japan
| | - Tomohisa Okamura
- Department of Functional Genomics and Immunological Diseases, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan
| | - Keishi Fujio
- Department of Allergy and Rheumatology, Graduate School of Medicine, The University of Tokyo, Tokyo 113-0033, Japan.
| |
|
18
|
Fan Y, Zhu H, Song Y, Peng Q, Zhou X. Efficient and effective control of confounding in eQTL mapping studies through joint differential expression and Mendelian randomization analyses. Bioinformatics 2021; 37:296-302. [PMID: 32790868 PMCID: PMC8058772 DOI: 10.1093/bioinformatics/btaa715] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/26/2020] [Revised: 07/09/2020] [Accepted: 08/06/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Identifying cis-acting genetic variants associated with gene expression levels, an analysis commonly referred to as expression quantitative trait loci (eQTL) mapping, is an important first step toward understanding the genetic determinants of gene expression variation. Successful eQTL mapping requires effective control of confounding factors. A common method for controlling confounding effects in eQTL mapping studies is the probabilistic estimation of expression residual (PEER) analysis. PEER analysis extracts PEER factors to serve as surrogates for confounding factors, which are then included in the subsequent eQTL mapping analysis. However, it is computationally challenging to determine the optimal number of PEER factors used for eQTL mapping. In particular, the standard approach to determine the optimal number of PEER factors examines one number at a time and chooses a number that optimizes eQTL discovery. Unfortunately, this standard approach involves multiple repetitive eQTL mapping procedures that are computationally expensive, restricting its use in the large-scale eQTL mapping studies that are being collected today. RESULTS: Here, we present a simple and computationally scalable alternative, Effect size Correlation for COnfounding determination (ECCO), to determine the optimal number of PEER factors used for eQTL mapping studies. Instead of performing repetitive eQTL mapping, ECCO jointly applies differential expression analysis and Mendelian randomization analysis, leading to substantial computational savings. In simulations and real data applications, we show that ECCO identifies a similar number of PEER factors required for eQTL mapping analysis as the standard approach but is two orders of magnitude faster. The computational scalability of ECCO allows for optimized eQTL discovery across 48 GTEx tissues for the first time, yielding an overall 5.89% power gain on the number of eQTL-harboring genes (eGenes) discovered as compared to the previous GTEx recommendation that does not attempt to determine the tissue-specific optimal number of PEER factors. AVAILABILITY AND IMPLEMENTATION: Our method is implemented in the ECCO software, which, along with its GTEx mapping results, is freely available at www.xzlab.org/software.html. All R scripts used in this study are also available at this site. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yue Fan
- Key Laboratory of Trace Elements and Endemic Diseases of National Health and Family Planning Commission, School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, China; Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Huanhuan Zhu
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yanyi Song
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Qinke Peng
- Systems Engineering Institute, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA; Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
19
|
Bonder MJ, Smail C, Gloudemans MJ, Frésard L, Jakubosky D, D'Antonio M, Li X, Ferraro NM, Carcamo-Orive I, Mirauta B, Seaton DD, Cai N, Vakili D, Horta D, Zhao C, Zastrow DB, Bonner DE, Wheeler MT, Kilpinen H, Knowles JW, Smith EN, Frazer KA, Montgomery SB, Stegle O. Identification of rare and common regulatory variants in pluripotent cells using population-scale transcriptomics. Nat Genet 2021; 53:313-321. [PMID: 33664507 PMCID: PMC7944648 DOI: 10.1038/s41588-021-00800-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/04/2019] [Accepted: 01/25/2021] [Indexed: 12/18/2022]
Abstract
Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of novel colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.
Affiliation(s)
- Marc Jan Bonder
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Craig Smail
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA; Genomic Medicine Center, Children's Mercy Research Institute and Children's Mercy Kansas City, Kansas City, MO, USA
- Michael J Gloudemans
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Laure Frésard
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- David Jakubosky
  - Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, USA
- Matteo D'Antonio
  - Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, USA
- Xin Li
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Nicole M Ferraro
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Ivan Carcamo-Orive
  - Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Bogdan Mirauta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Daniel D Seaton
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Na Cai
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK; Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK; Helmholtz Pioneer Campus, Helmholtz Zentrum München, Neuherberg, Germany
- Dara Vakili
  - UCL Great Ormond Street Institute of Child Health, University College London, London, UK; Faculty of Medicine, Imperial College London, London, UK
- Danilo Horta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK
- Chunli Zhao
  - Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Diane B Zastrow
  - Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Devon E Bonner
  - Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Matthew T Wheeler
  - Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA; Stanford Center for Undiagnosed Diseases, Stanford University, Stanford, CA, USA
- Helena Kilpinen
  - Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK; UCL Great Ormond Street Institute of Child Health, University College London, London, UK; Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland; Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland
- Joshua W Knowles
  - Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Erin N Smith
  - Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Kelly A Frazer
  - Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA, USA; Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA
- Stephen B Montgomery
  - Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA; Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Oliver Stegle
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, UK; European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany; Wellcome Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK
|
20
|
Yao Y, Yang J, Qin Q, Tang C, Li Z, Chen L, Li K, Ren C, Chen L, Rao S. Functional annotation of genetic associations by transcriptome-wide association analysis provides insights into neutrophil development regulation. Commun Biol 2020; 3:790. [PMID: 33340029 PMCID: PMC7749173 DOI: 10.1038/s42003-020-01527-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/25/2020] [Accepted: 11/22/2020] [Indexed: 12/26/2022] Open
Abstract
Genome-wide association studies (GWAS) have identified multiple genomic loci linked to blood cell traits; however, understanding the biological relevance of these genetic loci has proven to be challenging. Here, we performed a transcriptome-wide association study (TWAS) integrating gene expression and splice junction usage in neutrophils (N = 196) with a neutrophil count GWAS (N = 173,480 individuals). We identified a total of 174 TWAS-significant genes enriched in target genes of master transcription factors governing neutrophil specification. Knockout of a TWAS candidate at chromosome 5q13.2, TAF9, in CD34+ hematopoietic stem and progenitor cells (HSPCs) using CRISPR/Cas9 technology showed a significant effect on neutrophil production in vitro. In addition, we identified 89 unique genes significant only for splice junction usage, thus emphasizing the importance of alternative splicing beyond gene expression underlying granulopoiesis. Our results highlight the advantages of TWAS, followed by gene editing, to determine the functions of GWAS loci implicated in hematopoiesis.
Affiliation(s)
- Yao Yao
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Medicine, West China Second Hospital, State Key Laboratory of Biotherapy and Collaborative Innovation Center for Biotherapy, Sichuan University, Chengdu, China; School of Basic Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Jia Yang
  - Department of Dermatology, University of California San Francisco, San Francisco, CA, 94110, USA
- Qian Qin
  - Molecular Pathology Unit, Center for Cancer Research, Center for Computational and Integrative Biology, Massachusetts General Hospital, Department of Pathology, Harvard Medical School, Boston, MA, 02115, USA
- Chao Tang
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Medicine, West China Second Hospital, State Key Laboratory of Biotherapy and Collaborative Innovation Center for Biotherapy, Sichuan University, Chengdu, China
- Zhidan Li
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Medicine, West China Second Hospital, State Key Laboratory of Biotherapy and Collaborative Innovation Center for Biotherapy, Sichuan University, Chengdu, China
- Li Chen
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Medicine, West China Second Hospital, State Key Laboratory of Biotherapy and Collaborative Innovation Center for Biotherapy, Sichuan University, Chengdu, China
- Kailong Li
  - Children's Medical Center Research Institute, Department of Pediatrics, Harold C. Simmons Comprehensive Cancer Center, Hamon Center for Regenerative Science and Medicine, The University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
- Chunyan Ren
  - Division of Hematology/Oncology, Boston Children's Hospital, Department of Pediatric Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, 02115, USA
- Lu Chen
  - Key Laboratory of Birth Defects and Related Diseases of Women and Children, Department of Medicine, West China Second Hospital, State Key Laboratory of Biotherapy and Collaborative Innovation Center for Biotherapy, Sichuan University, Chengdu, China
- Shuquan Rao
  - Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & School of Basic Medicine, Peking Union Medical College, Beijing, 100005, China
|
21
|
Oliva M, Muñoz-Aguirre M, Kim-Hellmuth S, Wucher V, Gewirtz ADH, Cotter DJ, Parsana P, Kasela S, Balliu B, Viñuela A, Castel SE, Mohammadi P, Aguet F, Zou Y, Khramtsova EA, Skol AD, Garrido-Martín D, Reverter F, Brown A, Evans P, Gamazon ER, Payne A, Bonazzola R, Barbeira AN, Hamel AR, Martinez-Perez A, Soria JM, Pierce BL, Stephens M, Eskin E, Dermitzakis ET, Segrè AV, Im HK, Engelhardt BE, Ardlie KG, Montgomery SB, Battle AJ, Lappalainen T, Guigó R, Stranger BE. The impact of sex on gene expression across human tissues. Science 2020; 369:eaba3066. [PMID: 32913072 PMCID: PMC8136152 DOI: 10.1126/science.aba3066] [Citation(s) in RCA: 326] [Impact Index Per Article: 81.5] [Received: 11/25/2019] [Accepted: 08/03/2020] [Indexed: 12/12/2022]
Abstract
Many complex human phenotypes exhibit sex-differentiated characteristics. However, the molecular mechanisms underlying these differences remain largely unknown. We generated a catalog of sex differences in gene expression and in the genetic regulation of gene expression across 44 human tissue sources surveyed by the Genotype-Tissue Expression project (GTEx, v8 release). We demonstrate that sex influences gene expression levels and cellular composition of tissue samples across the human body. A total of 37% of all genes exhibit sex-biased expression in at least one tissue. We identify cis expression quantitative trait loci (eQTLs) with sex-differentiated effects and characterize their cellular origin. By integrating sex-biased eQTLs with genome-wide association study data, we identify 58 gene-trait associations that are driven by genetic regulation of gene expression in a single sex. These findings provide an extensive characterization of sex differences in the human transcriptome and its genetic regulation.
Affiliation(s)
- Meritxell Oliva
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA
  - Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
- Manuel Muñoz-Aguirre
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
  - Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona, Catalonia, Spain
- Sarah Kim-Hellmuth
  - Statistical Genetics, Max Planck Institute of Psychiatry, Munich, Germany
  - New York Genome Center, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Valentin Wucher
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
- Ariel D H Gewirtz
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Daniel J Cotter
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Princy Parsana
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Silva Kasela
  - New York Genome Center, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Brunilda Balliu
  - Department of Computational Medicine, University of California, Los Angeles, CA, USA
- Ana Viñuela
  - Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Stephane E Castel
  - New York Genome Center, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Pejman Mohammadi
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, Scripps Research Translational Institute, La Jolla, CA, USA
- Yuxin Zou
  - Department of Statistics, University of Chicago, Chicago, IL, USA
- Ekaterina A Khramtsova
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
  - Computational Sciences, Janssen Pharmaceuticals, Spring House, PA, USA
- Andrew D Skol
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA
  - Center for Translational Data Science, University of Chicago, Chicago, IL, USA
  - Department of Pathology and Laboratory Medicine, Ann and Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Diego Garrido-Martín
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
- Ferran Reverter
  - Department of Genetics, Microbiology and Statistics, Faculty of Biology, University of Barcelona, Barcelona, Spain
- Patrick Evans
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Eric R Gamazon
  - Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Clare Hall, University of Cambridge, Cambridge, UK
- Anthony Payne
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Rodrigo Bonazzola
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Alvaro N Barbeira
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Andrew R Hamel
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Angel Martinez-Perez
  - Genomics of Complex Diseases Group, Research Institute Hospital de la Sant Creu i Sant Pau, IIB Sant Pau, Barcelona, Spain
- José Manuel Soria
  - Genomics of Complex Diseases Group, Research Institute Hospital de la Sant Creu i Sant Pau, IIB Sant Pau, Barcelona, Spain
- Brandon L Pierce
  - Department of Public Health Sciences, University of Chicago, Chicago, IL, USA
- Matthew Stephens
  - Department of Statistics, University of Chicago, Chicago, IL, USA
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Eleazar Eskin
  - Departments of Computational Medicine, Computer Science, and Human Genetics, University of California, Los Angeles, CA, USA
- Emmanouil T Dermitzakis
  - Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Ayellet V Segrè
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Hae Kyung Im
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Barbara E Engelhardt
  - Department of Computer Science, Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA
  - Genomics plc, Oxford, UK
- Stephen B Montgomery
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Pathology, Stanford University, Stanford, CA, USA
- Alexis J Battle
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Tuuli Lappalainen
  - New York Genome Center, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Roderic Guigó
  - Centre for Genomic Regulation, Barcelona Institute for Science and Technology, Barcelona, Catalonia, Spain
  - Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Barbara E Stranger
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, IL, USA
  - Center for Translational Data Science, University of Chicago, Chicago, IL, USA
  - Center for Genetic Medicine, Department of Pharmacology, Northwestern University, Chicago, IL, USA
|
22
|
The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 2020; 369:1318-1330. [PMID: 32913098 PMCID: PMC7737656 DOI: 10.1126/science.aaz1776] [Citation(s) in RCA: 2188] [Impact Index Per Article: 547.0] [Received: 09/30/2019] [Accepted: 07/30/2020] [Indexed: 02/06/2023]
Abstract
The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the version 8 data, examining 15,201 RNA-sequencing samples from 49 tissues of 838 postmortem donors. We comprehensively characterize genetic associations for gene expression and splicing in cis and trans, showing that regulatory associations are found for almost all genes, and describe the underlying molecular mechanisms and their contribution to allelic heterogeneity and pleiotropy of complex traits. Leveraging the large diversity of tissues, we provide insights into the tissue specificity of genetic effects and show that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues.
|
23
|
Srivastava A, Malik L, Sarkar H, Zakeri M, Almodaresi F, Soneson C, Love MI, Kingsford C, Patro R. Alignment and mapping methodology influence transcript abundance estimation. Genome Biol 2020; 21:239. [PMID: 32894187 PMCID: PMC7487471 DOI: 10.1186/s13059-020-02151-8] [Citation(s) in RCA: 74] [Impact Index Per Article: 18.5] [Received: 10/31/2019] [Accepted: 08/19/2020] [Indexed: 01/23/2023] Open
Abstract
Background: The accuracy of transcript quantification using RNA-seq data depends on many factors, such as the choice of alignment or mapping method and the quantification model being adopted. While the choice of quantification model has been shown to be important, considerably less attention has been given to comparing the effect of various read alignment approaches on quantification accuracy. Results: We investigate the influence of mapping and alignment on the accuracy of transcript quantification in both simulated and experimental data, as well as the effect on subsequent differential expression analysis. We observe that, even when the quantification model itself is held fixed, the effect of choosing a different alignment methodology, or aligning reads using different parameters, on quantification estimates can sometimes be large and can affect downstream differential expression analyses as well. These effects can go unnoticed when assessment is focused too heavily on simulated data, where the alignment task is often simpler than in experimentally acquired samples. We also introduce a new alignment methodology, called selective alignment, to overcome the shortcomings of lightweight approaches without incurring the computational cost of traditional alignment. Conclusion: We observe that, on experimental datasets, the performance of lightweight mapping and alignment-based approaches varies significantly, and highlight some of the underlying factors. We show this variation both in terms of quantification and downstream differential expression analysis. In all comparisons, we also show the improved performance of our proposed selective alignment method and suggest best practices for performing RNA-seq quantification.
Affiliation(s)
- Avi Srivastava
  - Department of Computer Science, Stony Brook University, Stony Brook, USA
- Laraib Malik
  - Department of Computer Science, Stony Brook University, Stony Brook, USA
- Hirak Sarkar
  - Department of Computer Science, University of Maryland, College Park, USA
- Mohsen Zakeri
  - Department of Computer Science, University of Maryland, College Park, USA
- Fatemeh Almodaresi
  - Department of Computer Science, University of Maryland, College Park, USA
- Charlotte Soneson
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Michael I Love
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, USA; Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Carl Kingsford
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
- Rob Patro
  - Department of Computer Science, University of Maryland, College Park, USA
|
24
|
Kolberg L, Kerimov N, Peterson H, Alasoo K. Co-expression analysis reveals interpretable gene modules controlled by trans-acting genetic variants. eLife 2020; 9:e58705. [PMID: 32880574 PMCID: PMC7470823 DOI: 10.7554/elife.58705] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/08/2020] [Accepted: 08/20/2020] [Indexed: 12/16/2022] Open
Abstract
Understanding the causal processes that contribute to disease onset and progression is essential for developing novel therapies. Although trans-acting expression quantitative trait loci (trans-eQTLs) can directly reveal cellular processes modulated by disease variants, detecting trans-eQTLs remains challenging due to their small effect sizes. Here, we analysed gene expression and genotype data from six blood cell types from 226 to 710 individuals. We used co-expression modules inferred from gene expression data with five methods as traits in trans-eQTL analysis to limit multiple testing and improve interpretability. In addition to replicating three established associations, we discovered a novel trans-eQTL near SLC39A8 regulating a module of metallothionein genes in LPS-stimulated monocytes. Interestingly, this effect was mediated by a transient cis-eQTL present only in early LPS response and lost before the trans effect appeared. Our analyses highlight how co-expression combined with functional enrichment analysis improves the identification and prioritisation of trans-eQTLs when applied to emerging cell-type-specific datasets.
Affiliation(s)
- Liis Kolberg
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Nurlan Kerimov
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Hedi Peterson
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
- Kaur Alasoo
  - Institute of Computer Science, University of Tartu, Tartu, Estonia
|
25
|
Yang HS, White CC, Klein HU, Yu L, Gaiteri C, Ma Y, Felsky D, Mostafavi S, Petyuk VA, Sperling RA, Ertekin-Taner N, Schneider JA, Bennett DA, De Jager PL. Genetics of Gene Expression in the Aging Human Brain Reveal TDP-43 Proteinopathy Pathophysiology. Neuron 2020; 107:496-508.e6. [PMID: 32526197 PMCID: PMC7416464 DOI: 10.1016/j.neuron.2020.05.010] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 11/01/2019] [Revised: 03/20/2020] [Accepted: 05/07/2020] [Indexed: 12/14/2022]
Abstract
Here, we perform a genome-wide screen for variants that regulate the expression of gene co-expression modules in the aging human brain; we discover and replicate such variants in the TMEM106B and RBFOX1 loci. The TMEM106B haplotype is known to influence the accumulation of TAR DNA-binding protein 43 kDa (TDP-43) proteinopathy, and the haplotype's large-scale transcriptomic effects include the dysregulation of lysosomal genes and alterations in synaptic gene splicing that are also seen in the pathophysiology of TDP-43 proteinopathy. Further, a variant near GRN, another TDP-43 proteinopathy susceptibility gene, shows concordant effects with the TMEM106B haplotype. Leveraging neuropathology data from the same participants, we also show that TMEM106B and APOE-amyloid-β effects converge to alter myelination and lysosomal gene expression, which then contributes to TDP-43 accumulation. These results advance our mechanistic understanding of the TMEM106B TDP-43 risk haplotype and uncover a transcriptional program that mediates the converging effects of APOE-amyloid-β and TMEM106B on TDP-43 aggregation in older adults.
Affiliation(s)
- Hyun-Sik Yang
  - Center for Alzheimer Research and Treatment, Department of Neurology, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Harvard Medical School, Boston, MA 02115, USA; The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Charles C White
  - The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Translational and Computational Neuroimmunology, Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA; Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY 10032, USA
- Hans-Ulrich Klein
  - The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Translational and Computational Neuroimmunology, Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA; Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY 10032, USA
- Lei Yu
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA; Department of Neurological Sciences, Rush University Medical Center, Chicago, IL 60612, USA
- Christopher Gaiteri
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA; Department of Neurological Sciences, Rush University Medical Center, Chicago, IL 60612, USA
- Yiyi Ma
  - The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Translational and Computational Neuroimmunology, Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA; Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY 10032, USA
- Daniel Felsky
  - The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Translational and Computational Neuroimmunology, Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA; Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY 10032, USA
- Sara Mostafavi
  - Department of Statistics, Department of Medical Genetics, University of British Columbia, Vancouver, BC V6H 3N1, Canada; Canadian Institute for Advanced Research, Toronto, ON M5G 1M1, Canada
- Reisa A Sperling
  - Center for Alzheimer Research and Treatment, Department of Neurology, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Harvard Medical School, Boston, MA 02115, USA
- Nilüfer Ertekin-Taner
  - Department of Neurology, Mayo Clinic, Jacksonville, FL 32224, USA; Department of Neuroscience, Mayo Clinic, Jacksonville, FL 32224, USA
- Julie A Schneider
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA; Department of Neurological Sciences, Rush University Medical Center, Chicago, IL 60612, USA
- David A Bennett
  - Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL 60612, USA; Department of Neurological Sciences, Rush University Medical Center, Chicago, IL 60612, USA
- Philip L De Jager
  - The Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Center for Translational and Computational Neuroimmunology, Department of Neurology, Columbia University Irving Medical Center, New York, NY 10032, USA; Taub Institute for Research on Alzheimer's Disease and the Aging Brain, Columbia University Irving Medical Center, New York, NY 10032, USA
|
26
|
Shang L, Smith JA, Zhao W, Kho M, Turner ST, Mosley TH, Kardia SLR, Zhou X. Genetic Architecture of Gene Expression in European and African Americans: An eQTL Mapping Study in GENOA. Am J Hum Genet 2020; 106:496-512. [PMID: 32220292 PMCID: PMC7118581 DOI: 10.1016/j.ajhg.2020.03.002] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 10/06/2019] [Accepted: 03/06/2020] [Indexed: 12/20/2022] Open
Abstract
Most existing expression quantitative trait locus (eQTL) mapping studies have focused on individuals of European ancestry, and other populations, including populations with African ancestry, are underrepresented. The lack of large-scale, well-powered eQTL mapping studies in populations with African ancestry can both impede the dissemination of eQTL mapping results that would otherwise benefit individuals with African ancestry and hinder comparative analysis for understanding how gene regulation is shaped through evolution. We fill this critical knowledge gap by performing a large-scale in-depth eQTL mapping study on 1,032 African Americans (AA) and 801 European Americans (EA) in the GENOA cohort. We identified a total of 354,931 eSNPs in AA and 371,309 eSNPs in EA, with 112,316 eSNPs overlapping between the two. We found that eQTL-harboring genes (eGenes) are enriched in metabolic pathways and tend to have higher SNP heritability compared to non-eGenes. We found that eGenes that are common in the two populations tend to be less conserved than eGenes that are unique to one population, which are less conserved than non-eGenes. Through conditional analysis, we found that eGenes in AA tend to harbor more independent eQTLs than eGenes in EA, suggesting potentially diverse genetic architecture underlying expression variation in the two populations. Finally, the large sample sizes in GENOA allow us to construct accurate expression prediction models in both AA and EA, facilitating powerful transcriptome-wide association studies. Overall, our results represent an important step toward revealing the genetic architecture underlying expression variation in African Americans.
Collapse
Affiliation(s)
- Lulu Shang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jennifer A Smith
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
| | - Wei Zhao
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
| | - Minjung Kho
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
| | - Stephen T Turner
- Division of Nephrology and Hypertension, Mayo Clinic, Rochester, MN 55905, USA
| | - Thomas H Mosley
- Memory Impairment and Neurodegenerative Dementia (MIND) Center, University of Mississippi Medical Center, Jackson, MS 39126, USA
| | - Sharon L R Kardia
- Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
27
|
Wheeler HE, Ploch S, Barbeira AN, Bonazzola R, Andaleon A, Fotuhi Siahpirani A, Saha A, Battle A, Roy S, Im HK. Imputed gene associations identify replicable trans-acting genes enriched in transcription pathways and complex traits. Genet Epidemiol 2019; 43:596-608. [PMID: 30950127 PMCID: PMC6687523 DOI: 10.1002/gepi.22205] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/13/2018] [Revised: 02/15/2019] [Accepted: 03/18/2019] [Indexed: 11/17/2022]
Abstract
Regulation of gene expression is an important mechanism through which genetic variation can affect complex traits. A substantial portion of gene expression variation can be explained by both local (cis) and distal (trans) genetic variation. Much progress has been made in uncovering cis-acting expression quantitative trait loci (cis-eQTL), but trans-eQTL have been more difficult to identify and replicate. Here we take advantage of our ability to predict the cis component of gene expression coupled with gene mapping methods such as PrediXcan to identify high confidence candidate trans-acting genes and their targets. That is, we correlate the cis component of gene expression with observed expression of genes in different chromosomes. Leveraging the shared cis-acting regulation across tissues, we combine the evidence of association across all available Genotype-Tissue Expression Project tissues and find 2,356 trans-acting/target gene pairs with high mappability scores. Reassuringly, trans-acting genes are enriched in transcription and nucleic acid binding pathways and target genes are enriched in known transcription factor binding sites. Interestingly, trans-acting genes are more significantly associated with selected complex traits and diseases than target or background genes, consistent with percolating trans effects. Our scripts and summary statistics are publicly available for future studies of trans-acting gene regulation.
Affiliation(s)
- Heather E. Wheeler
- Department of Biology, Loyola University Chicago, Chicago, Illinois
- Department of Computer Science, Loyola University Chicago, Chicago, Illinois
- Department of Public Health Sciences, Stritch School of Medicine, Loyola University Chicago, Maywood, Illinois
| | - Sally Ploch
- Department of Biology, Loyola University Chicago, Chicago, Illinois
| | - Alvaro N. Barbeira
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois
| | - Rodrigo Bonazzola
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois
| | - Angela Andaleon
- Department of Biology, Loyola University Chicago, Chicago, Illinois
| | | | - Ashis Saha
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland
| | - Alexis Battle
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland
| | - Sushmita Roy
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin
| | - Hae Kyung Im
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois
| |
|
28
|
Rotival M. Characterising the genetic basis of immune response variation to identify causal mechanisms underlying disease susceptibility. HLA 2019; 94:275-284. [PMID: 31115186 DOI: 10.1111/tan.13598] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/17/2019] [Accepted: 05/15/2019] [Indexed: 12/12/2022]
Abstract
Over the last 10 years, genome-wide association studies (GWAS) have identified hundreds of susceptibility loci for autoimmune diseases. However, despite increasing power for the detection of both common and rare coding variants affecting disease susceptibility, a large fraction of disease heritability has remained unexplained. In addition, a majority of the identified loci are located in noncoding regions, and translation of disease-associated loci into new biological insights on the etiology of immune disorders has been lagging. This highlights the need for a better understanding of noncoding variation and new strategies to identify causal genes at disease loci. In this review, I will first detail the molecular basis of gene expression and review the various mechanisms that contribute to alter gene activity at the transcriptional and post-transcriptional level. I will then review the findings from 10 years of functional genomics studies regarding the genetics on gene expression, in particular in the context of infection. Finally, I will discuss the extent to which genetic variants that modulate gene expression at transcriptional and post-transcriptional level contribute to disease susceptibility and present strategies to leverage this information for the identification of causal mechanisms at disease loci in the era of whole genome sequencing.
Affiliation(s)
- Maxime Rotival
- Unit of Human Evolutionary Genetics, CNRS UMR2000, Institut Pasteur, Paris, France
| |
|
29
|
Dooley CM, Wali N, Sealy IM, White RJ, Stemple DL, Collins JE, Busch-Nentwich EM. The gene regulatory basis of genetic compensation during neural crest induction. PLoS Genet 2019; 15:e1008213. [PMID: 31199790 PMCID: PMC6594659 DOI: 10.1371/journal.pgen.1008213] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 03/25/2019] [Revised: 06/26/2019] [Accepted: 05/27/2019] [Indexed: 12/15/2022] Open
Abstract
The neural crest (NC) is a vertebrate-specific cell type that contributes to a wide range of different tissues across all three germ layers. The gene regulatory network (GRN) responsible for the formation of neural crest is conserved across vertebrates. Central to the induction of the NC GRN are AP-2 and SoxE transcription factors. NC induction robustness is ensured through the ability of some of these transcription factors to compensate loss of function of gene family members. However the gene regulatory events underlying compensation are poorly understood. We have used gene knockout and RNA sequencing strategies to dissect NC induction and compensation in zebrafish. We genetically ablate the NC using double mutants of tfap2a;tfap2c or remove specific subsets of the NC with sox10 and mitfa knockouts and characterise genome-wide gene expression levels across multiple time points. We find that compensation through a single wild-type allele of tfap2c is capable of maintaining early NC induction and differentiation in the absence of tfap2a function, but many target genes have abnormal expression levels and therefore show sensitivity to the reduced tfap2 dosage. This separation of morphological and molecular phenotypes identifies a core set of genes required for early NC development. We also identify the 15 somites stage as the peak of the molecular phenotype which strongly diminishes at 24 hpf even as the morphological phenotype becomes more apparent. Using gene knockouts, we associate previously uncharacterised genes with pigment cell development and establish a role for maternal Hippo signalling in melanocyte differentiation. This work extends and refines the NC GRN while also uncovering the transcriptional basis of genetic compensation via paralogues.
Embryonic development is an intricate process that requires genes to be active at the right time and place. Organisms have evolved mechanisms that ensure faithful execution of developmental programmes even if genes fail to function. For example, in a process called genetic compensation, one or more genes become activated in response to loss of function of another. In this work we use the zebrafish model to investigate how two related genes, tfap2a and tfap2c, interact to ensure establishment of the neural crest, a vertebrate-specific cell type that contributes to many different tissues. Losing tfap2a activity causes mild morphological defects and losing tfap2c has no visible effect. Yet when both are inactive, embryos are severely abnormal due to lack of neural crest-derived tissues. Here we show that loss of tfap2a triggers upregulation of tfap2c which prevents the loss of neural crest tissue. However, the genes normally regulated by tfap2a respond differently to tfap2c allowing us to identify the first tier of the Ap2 network and new players in neural crest biology. Our work demonstrates that the expression signature of partial, but morphologically sufficient, genetic compensation provides an opportunity to dissect gene regulatory networks.
Affiliation(s)
| | - Neha Wali
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| | - Ian M. Sealy
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Richard J. White
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
| | - Derek L. Stemple
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| | - John E. Collins
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
| | - Elisabeth M. Busch-Nentwich
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Department of Medicine, University of Cambridge, Cambridge, United Kingdom
| |
|
30
|
Ali AT, Boehme L, Carbajosa G, Seitan VC, Small KS, Hodgkinson A. Nuclear genetic regulation of the human mitochondrial transcriptome. eLife 2019; 8:e41927. [PMID: 30775970 PMCID: PMC6420317 DOI: 10.7554/elife.41927] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 09/11/2018] [Accepted: 02/14/2019] [Indexed: 12/21/2022] Open
Abstract
Mitochondria play important roles in cellular processes and disease, yet little is known about how the transcriptional regime of the mitochondrial genome varies across individuals and tissues. By analyzing >11,000 RNA-sequencing libraries across 36 tissue/cell types, we find considerable variation in mitochondrial-encoded gene expression along the mitochondrial transcriptome, across tissues and between individuals, highlighting the importance of cell-type specific and post-transcriptional processes in shaping mitochondrial-encoded RNA levels. Using whole-genome genetic data we identify 64 nuclear loci associated with expression levels of 14 genes encoded in the mitochondrial genome, including missense variants within genes involved in mitochondrial function (TBRG4, MTPAP and LONP1), implicating genetic mechanisms that act in trans across the two genomes. We replicate ~21% of associations with independent tissue-matched datasets and find genetic variants linked to these nuclear loci that are associated with cardio-metabolic phenotypes and Vitiligo, supporting a potential role for variable mitochondrial-encoded gene expression in complex disease.
Affiliation(s)
- Aminah T Ali
- Department of Medical and Molecular Genetics, School of Basic and Medical Biosciences, King’s College London, London, United Kingdom
| | - Lena Boehme
- Department of Medical and Molecular Genetics, School of Basic and Medical Biosciences, King’s College London, London, United Kingdom
| | - Guillermo Carbajosa
- Department of Medical and Molecular Genetics, School of Basic and Medical Biosciences, King’s College London, London, United Kingdom
| | - Vlad C Seitan
- Department of Medical and Molecular Genetics, School of Basic and Medical Biosciences, King’s College London, London, United Kingdom
| | - Kerrin S Small
- Department of Twin Research and Genetic Epidemiology, School of Life Course Sciences, King’s College London, London, United Kingdom
| | - Alan Hodgkinson
- Department of Medical and Molecular Genetics, School of Basic and Medical Biosciences, King’s College London, London, United Kingdom
| |
|