1
|
Matoba N, Le BD, Valone JM, Wolter JM, Mory JT, Liang D, Aygün N, Broadaway KA, Bond ML, Mohlke KL, Zylka MJ, Love MI, Stein JL. Stimulating Wnt signaling reveals context-dependent genetic effects on gene regulation in primary human neural progenitors. Nat Neurosci 2024:10.1038/s41593-024-01773-6. [PMID: 39349663 DOI: 10.1038/s41593-024-01773-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 08/28/2024] [Indexed: 10/09/2024]
Abstract
Gene regulatory effects have been difficult to detect at many non-coding loci associated with brain-related traits, likely because some genetic variants have distinct functions in specific contexts. To explore context-dependent gene regulation, we measured chromatin accessibility and gene expression after activation of the canonical Wnt pathway in primary human neural progenitors (n = 82 donors). We found that TCF/LEF motifs and brain-structure-associated and neuropsychiatric-disorder-associated variants were enriched within Wnt-responsive regulatory elements. Genetically influenced regulatory elements were enriched in genomic regions under positive selection along the human lineage. Wnt pathway stimulation increased detection of genetically influenced regulatory elements/genes by 66%/53% and enabled identification of 397 regulatory elements primed to regulate gene expression. Stimulation also increased identification of shared genetic effects on molecular and complex brain traits by up to 70%, suggesting that genetic variant function during neurodevelopmental patterning can lead to differences in adult brain and behavioral traits.
Grants
- R01MH118349, R01MH120125, R01MH121433 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
Affiliation(s)
- Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Brandon D Le
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jordan M Valone
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Justin M Wolter
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Carolina Institute for Developmental Disabilities, Carrboro, NC, USA
| | - Jessica T Mory
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Dan Liang
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - K Alaine Broadaway
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Marielle L Bond
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Karen L Mohlke
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Mark J Zylka
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Carolina Institute for Developmental Disabilities, Carrboro, NC, USA
| | - Michael I Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
- Carolina Institute for Developmental Disabilities, Carrboro, NC, USA.
| |
|
2
|
Qi T, Song L, Guo Y, Chen C, Yang J. From genetic associations to genes: methods, applications, and challenges. Trends Genet 2024; 40:642-667. [PMID: 38734482 DOI: 10.1016/j.tig.2024.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 04/15/2024] [Accepted: 04/16/2024] [Indexed: 05/13/2024]
Abstract
Genome-wide association studies (GWASs) have identified numerous genetic loci associated with human traits and diseases. However, pinpointing the causal genes remains a challenge, which impedes the translation of GWAS findings into biological insights and medical applications. In this review, we provide an in-depth overview of the methods and technologies used for prioritizing genes from GWAS loci, including gene-based association tests, integrative analysis of GWAS and molecular quantitative trait loci (xQTL) data, linking GWAS variants to target genes through enhancer-gene connection maps, and network-based prioritization. We also outline strategies for generating context-dependent xQTL data and their applications in gene prioritization. We further highlight the potential of gene prioritization in drug repurposing. Lastly, we discuss future challenges and opportunities in this field.
Affiliation(s)
- Ting Qi
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
| | - Liyang Song
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
| | - Yazhou Guo
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
| | - Chang Chen
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China
| | - Jian Yang
- Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China; School of Life Sciences, Westlake University, Hangzhou 310024, China.
| |
|
3
|
Wang Q, Antone J, Alsop E, Reiman R, Funk C, Bendl J, Dudley JT, Liang WS, Karr TL, Roussos P, Bennett DA, De Jager PL, Serrano GE, Beach TG, Van Keuren-Jensen K, Mastroeni D, Reiman EM, Readhead BP. Single cell transcriptomes and multiscale networks from persons with and without Alzheimer's disease. Nat Commun 2024; 15:5815. [PMID: 38987616 PMCID: PMC11237088 DOI: 10.1038/s41467-024-49790-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 06/13/2024] [Indexed: 07/12/2024] Open
Abstract
The emergence of single nucleus RNA sequencing (snRNA-seq) offers to revolutionize the study of Alzheimer's disease (AD). Integration with complementary multiomics data such as genetics, proteomics and clinical data provides powerful opportunities to link cell subpopulations and molecular networks with a broader disease-relevant context. We report snRNA-seq profiles from superior frontal gyrus samples from 101 well characterized subjects from the Banner Brain and Body Donation Program in combination with whole genome sequences. We report findings that link common AD risk variants with CR1 expression in oligodendrocytes as well as alterations in hematological parameters. We observed an AD-associated CD83(+) microglial subtype with unique molecular networks and which is associated with immunoglobulin IgG4 production in the transverse colon. Our major observations were replicated in two additional, independent snRNA-seq data sets. These findings illustrate the power of multi-tissue molecular profiling to contextualize snRNA-seq brain transcriptomics and reveal disease biology.
Affiliation(s)
- Qi Wang
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ, 85281, USA
| | - Jerry Antone
- Division of Neurogenomics, The Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
| | - Eric Alsop
- Division of Neurogenomics, The Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
| | - Rebecca Reiman
- Division of Neurogenomics, The Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
| | - Cory Funk
- Institute for Systems Biology, Seattle, WA, 98109, USA
| | - Jaroslav Bendl
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Joel T Dudley
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ, 85281, USA
| | - Winnie S Liang
- Division of Neurogenomics, The Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
| | - Timothy L Karr
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ, 85281, USA
| | - Panos Roussos
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - David A Bennett
- Rush Alzheimer's Disease Center, Rush University Medical Center, Chicago, IL, 60612, USA
| | - Philip L De Jager
- Department of Neurology, Center for Translational and Computational Neuroimmunology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Geidy E Serrano
- Civin Laboratory for Neuropathology, Banner Sun Health Research Institute, Sun City, AZ, 85351, USA
| | - Thomas G Beach
- Civin Laboratory for Neuropathology, Banner Sun Health Research Institute, Sun City, AZ, 85351, USA
| | | | - Diego Mastroeni
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ, 85281, USA
| | - Eric M Reiman
- Banner Alzheimer's Institute, Phoenix, AZ, 85006, USA
| | - Benjamin P Readhead
- ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ, 85281, USA.
| |
|
4
|
Curion F, Theis FJ. Machine learning integrative approaches to advance computational immunology. Genome Med 2024; 16:80. [PMID: 38862979 PMCID: PMC11165829 DOI: 10.1186/s13073-024-01350-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 05/23/2024] [Indexed: 06/13/2024] Open
Abstract
The study of immunology, traditionally reliant on proteomics to evaluate individual immune cells, has been revolutionized by single-cell RNA sequencing. Computational immunologists play a crucial role in analysing these datasets, moving beyond traditional protein marker identification to encompass a more detailed view of cellular phenotypes and their functional roles. Recent technological advancements allow the simultaneous measurements of multiple cellular components (transcriptome, proteome, chromatin, epigenetic modifications and metabolites) within single cells, including in spatial contexts within tissues. This has led to the generation of complex multiscale datasets that can include multimodal measurements from the same cells or a mix of paired and unpaired modalities. Modern machine learning (ML) techniques allow for the integration of multiple "omics" data without the need for extensive independent modelling of each modality. This review focuses on recent advancements in ML integrative approaches applied to immunological studies. We highlight the importance of these methods in creating a unified representation of multiscale data collections, particularly for single-cell and spatial profiling technologies. Finally, we discuss the challenges of these holistic approaches and how they will be instrumental in the development of a common coordinate framework for multiscale studies, thereby accelerating research and enabling discoveries in the computational immunology field.
Affiliation(s)
- Fabiola Curion
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Munich, Germany.
- School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
| |
|
5
|
Zhou W, Cuomo ASE, Xue A, Kanai M, Chau G, Krishna C, Xavier RJ, MacArthur DG, Powell JE, Daly MJ, Neale BM. Efficient and accurate mixed model association tool for single-cell eQTL analysis. medRxiv 2024:2024.05.15.24307317. [PMID: 38798318 PMCID: PMC11118640 DOI: 10.1101/2024.05.15.24307317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Understanding the genetic basis of gene expression can help us understand the molecular underpinnings of human traits and disease. Expression quantitative trait locus (eQTL) mapping can help in studying this relationship, but eQTLs have been shown to be very cell-type specific, motivating the use of single-cell RNA sequencing and single-cell eQTLs to obtain a more granular view of genetic regulation. Current methods for single-cell eQTL mapping either rely on the "pseudobulk" approach and traditional pipelines for bulk transcriptomics or do not scale well to large datasets. Here, we propose SAIGE-QTL, a robust and scalable tool that can directly map eQTLs using single-cell profiles without needing aggregation at the pseudobulk level. Additionally, SAIGE-QTL allows for testing the effects of less frequent/rare genetic variation through set-based tests, which is traditionally excluded from eQTL mapping studies. We evaluate the performance of SAIGE-QTL on both real and simulated data and demonstrate the improved power for eQTL mapping over existing pipelines.
|
6
|
Popp JM, Rhodes K, Jangi R, Li M, Barr K, Tayeb K, Battle A, Gilad Y. Cell-type and dynamic state govern genetic regulation of gene expression in heterogeneous differentiating cultures. bioRxiv 2024:2024.05.02.592174. [PMID: 38746382 PMCID: PMC11092595 DOI: 10.1101/2024.05.02.592174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Identifying the molecular effects of human genetic variation across cellular contexts is crucial for understanding the mechanisms underlying disease-associated loci, yet many cell-types and developmental stages remain underexplored. Here we harnessed the potential of heterogeneous differentiating cultures (HDCs), an in vitro system in which pluripotent cells asynchronously differentiate into a broad spectrum of cell-types. We generated HDCs for 53 human donors and collected single-cell RNA-sequencing data from over 900,000 cells. We identified expression quantitative trait loci in 29 cell-types and characterized regulatory dynamics across diverse differentiation trajectories. This revealed novel regulatory variants for genes involved in key developmental and disease-related processes while replicating known effects from primary tissues, and dynamic regulatory effects associated with a range of complex traits.
|
7
|
Natri HM, Del Azodi CB, Peter L, Taylor CJ, Chugh S, Kendle R, Chung MI, Flaherty DK, Matlock BK, Calvi CL, Blackwell TS, Ware LB, Bacchetta M, Walia R, Shaver CM, Kropski JA, McCarthy DJ, Banovich NE. Cell-type-specific and disease-associated expression quantitative trait loci in the human lung. Nat Genet 2024; 56:595-604. [PMID: 38548990 PMCID: PMC11018522 DOI: 10.1038/s41588-024-01702-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 02/28/2024] [Indexed: 04/04/2024]
Abstract
Common genetic variants confer substantial risk for chronic lung diseases, including pulmonary fibrosis. Defining the genetic control of gene expression in a cell-type-specific and context-dependent manner is critical for understanding the mechanisms through which genetic variation influences complex traits and disease pathobiology. To this end, we performed single-cell RNA sequencing of lung tissue from 66 individuals with pulmonary fibrosis and 48 unaffected donors. Using a pseudobulk approach, we mapped expression quantitative trait loci (eQTLs) across 38 cell types, observing both shared and cell-type-specific regulatory effects. Furthermore, we identified disease interaction eQTLs and demonstrated that this class of associations is more likely to be cell-type-specific and linked to cellular dysregulation in pulmonary fibrosis. Finally, we connected lung disease risk variants to their regulatory targets in disease-relevant cell types. These results indicate that cellular context determines the impact of genetic variation on gene expression and implicate context-specific eQTLs as key regulators of lung homeostasis and disease.
Affiliation(s)
- Heini M Natri
- Translational Genomics Research Institute, Phoenix, AZ, USA
| | - Christina B Del Azodi
- St. Vincent's Institute of Medical Research, Melbourne, Victoria, Australia
- Melbourne Integrative Genomics, University of Melbourne, Melbourne, Victoria, Australia
| | - Lance Peter
- Translational Genomics Research Institute, Phoenix, AZ, USA
| | - Chase J Taylor
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Sagrika Chugh
- St. Vincent's Institute of Medical Research, Melbourne, Victoria, Australia
- Melbourne Integrative Genomics, University of Melbourne, Melbourne, Victoria, Australia
- School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, Victoria, Australia
| | - Robert Kendle
- Translational Genomics Research Institute, Phoenix, AZ, USA
| | - Mei-I Chung
- Translational Genomics Research Institute, Phoenix, AZ, USA
| | - David K Flaherty
- Flow Cytometry Shared Resource, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Brittany K Matlock
- Flow Cytometry Shared Resource, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Carla L Calvi
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Timothy S Blackwell
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Department of Veterans Affairs Medical Center, Nashville, TN, USA
| | - Lorraine B Ware
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Matthew Bacchetta
- Department of Cardiac Surgery, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Rajat Walia
- Department of Thoracic Disease and Transplantation, Norton Thoracic Institute, Phoenix, AZ, USA
| | - Ciara M Shaver
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jonathan A Kropski
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Cell and Developmental Biology, Vanderbilt University, Nashville, TN, USA
- Department of Veterans Affairs Medical Center, Nashville, TN, USA
| | - Davis J McCarthy
- St. Vincent's Institute of Medical Research, Melbourne, Victoria, Australia
- Melbourne Integrative Genomics, University of Melbourne, Melbourne, Victoria, Australia
- School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Melbourne, Victoria, Australia
| | | |
|
8
|
Ban M, Bredikhin D, Huang Y, Bonder MJ, Katarzyna K, Oliver AJ, Wilson NK, Coupland P, Hadfield J, Göttgens B, Madissoon E, Stegle O, Sawcer S. Expression profiling of cerebrospinal fluid identifies dysregulated antiviral mechanisms in multiple sclerosis. Brain 2024; 147:554-565. [PMID: 38038362 PMCID: PMC10834244 DOI: 10.1093/brain/awad404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Revised: 11/06/2023] [Accepted: 11/18/2023] [Indexed: 12/02/2023] Open
Abstract
Despite the overwhelming evidence that multiple sclerosis is an autoimmune disease, relatively little is known about the precise nature of the immune dysregulation underlying the development of the disease. Reasoning that the CSF from patients might be enriched for cells relevant in pathogenesis, we have completed a high-resolution single-cell analysis of 96 732 CSF cells collected from 33 patients with multiple sclerosis (n = 48 675) and 48 patients with other neurological diseases (n = 48 057). Completing comprehensive cell type annotation, we identified a rare population of CD8+ T cells, characterized by the upregulation of inhibitory receptors, increased in patients with multiple sclerosis. Applying a Multi-Omics Factor Analysis to these single-cell data further revealed that activity in pathways responsible for controlling inflammatory and type 1 interferon responses is altered in multiple sclerosis in both T cells and myeloid cells. We also undertook a systematic search for expression quantitative trait loci in the CSF cells. Of particular interest were two expression quantitative trait loci in CD8+ T cells that were fine mapped to multiple sclerosis susceptibility variants in the viral control genes ZC3HAV1 (rs10271373) and IFITM2 (rs1059091). Further analysis suggests that these associations likely reflect genetic effects on RNA splicing and cell-type specific gene expression respectively. Collectively, our study suggests that alterations in viral control mechanisms might be important in the development of multiple sclerosis.
Affiliation(s)
- Maria Ban
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0QQ, UK
| | - Danila Bredikhin
- European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
| | - Yuanhua Huang
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0QQ, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, UK
| | - Marc Jan Bonder
- European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
| | - Kania Katarzyna
- University of Cambridge, CRUK Cambridge Institute, Cambridge CB2 0RE, UK
| | - Amanda J Oliver
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK
| | - Nicola K Wilson
- Department of Haematology, University of Cambridge, Cambridge CB2 0AW, UK
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
| | - Paul Coupland
- University of Cambridge, CRUK Cambridge Institute, Cambridge CB2 0RE, UK
| | - James Hadfield
- University of Cambridge, CRUK Cambridge Institute, Cambridge CB2 0RE, UK
| | - Berthold Göttgens
- Department of Haematology, University of Cambridge, Cambridge CB2 0AW, UK
- Wellcome-Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
| | - Elo Madissoon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK
| | - Oliver Stegle
- European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, UK
| | - Stephen Sawcer
- Department of Clinical Neurosciences, University of Cambridge, Cambridge CB2 0QQ, UK
| |
|
9
|
Abe H, Lin P, Zhou D, Ruderfer DM, Gamazon ER. Mapping the landscape of lineage-specific dynamic regulation of gene expression using single-cell transcriptomics and application to genetics of complex disease. medRxiv 2023:2023.10.24.23297476. [PMID: 37961453 PMCID: PMC10635195 DOI: 10.1101/2023.10.24.23297476] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
Single-cell transcriptome data can provide insights into how genetic variation influences biological processes involved in human biology and disease. However, the identification of gene-level associations in distinct cell types faces several challenges, including the limited reference resource from population scale studies, data sparsity in single-cell RNA sequencing, and the complex cell-state pattern of expression within individual cell types. Here we develop genetic models of cell type specific and cell state adjusted gene expression in mid-brain neurons in the process of specializing from induced pluripotent stem cells. The resulting framework quantifies the dynamics of the genetic regulation of gene expression and estimates its cell type specificity. As an application, we show that the approach detects known and new genes associated with schizophrenia and enables insights into context-dependent disease mechanisms. We provide a genomic resource from a phenome-wide application of our models to more than 1500 phenotypes from the UK Biobank. Using longitudinal genetically determined expression, we implement a predictive causality framework, evaluating the prediction of future values of a target gene expression using prior values of a putative regulatory gene. Collectively, this work demonstrates the insights that can be gained into the molecular underpinnings of diseases by quantifying the genetic control of gene expression at single-cell resolution.
Affiliation(s)
- Hanna Abe
- Vanderbilt University, Nashville, TN
| | - Phillip Lin
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
| | - Dan Zhou
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
| | - Douglas M Ruderfer
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Department of Biomedical Informatics and Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN
| | - Eric R Gamazon
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN
- Clare Hall, University of Cambridge, Cambridge, England
| |
|
10
|
Zhong J, Pan R, Gao M, Mo Y, Peng X, Liang G, Chen Z, Du J, Huang Z. Identification and validation of a T cell marker gene-based signature to predict prognosis and immunotherapy response in gastric cancer. Sci Rep 2023; 13:21357. [PMID: 38049463 PMCID: PMC10696024 DOI: 10.1038/s41598-023-48930-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 12/01/2023] [Indexed: 12/06/2023] Open
Abstract
Although the role of T cells in tumor immunity and modulation of the tumor microenvironment (TME) has been extensively studied, their precise involvement in gastric adenocarcinoma remains inadequately explored. In this work, we analyzed the single-cell RNA sequencing data set in GSE183904 and identified 322 T cell marker genes using the "FindAllMarkers" method of the R package "Seurat". STAD patients in the TCGA database were divided into high-risk and low-risk categories based on risk scores. The five-gene prediction signature based on T cell marker genes can predict the prognosis of gastric cancer patients with high accuracy. In the training cohort, the areas under the receiver operating characteristic (ROC) curve were 0.667, 0.73, and 0.818 at 1, 3, and 5 years. External validation of the predictive signature was also performed using multiple clinical subgroups and GEO cohorts. To help with practical application, a diagnostic model was created that shows values of 0.732, 0.752, and 0.816 for the relevant areas under the ROC curve at 1, 3, and 5 years. The T cell marker genes identified in this study may serve as potential therapeutic targets, and the developed predictive signatures and nomograms may aid in the clinical management of gastric cancer.
Affiliation(s)
- Jinlin Zhong
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Rongling Pan
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Miao Gao
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Yuqian Mo
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Xin Peng
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Guoxiao Liang
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Zixuan Chen
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Jinlin Du
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China
- Zhigang Huang
- Department of Epidemiology and Health Statistics, School of Public Health, Guangdong Medical University, Dongguan, Guangdong, People's Republic of China.
- Key Laboratory of Noncommunicable Diseases Control and Health Data Statistics of Guangdong Medical University, Dongguan, Guangdong, People's Republic of China.
|
11
|
Kang JB, Shen AZ, Gurajala S, Nathan A, Rumker L, Aguiar VRC, Valencia C, Lagattuta KA, Zhang F, Jonsson AH, Yazar S, Alquicira-Hernandez J, Khalili H, Ananthakrishnan AN, Jagadeesh K, Dey K, Daly MJ, Xavier RJ, Donlin LT, Anolik JH, Powell JE, Rao DA, Brenner MB, Gutierrez-Arcelus M, Luo Y, Sakaue S, Raychaudhuri S. Mapping the dynamic genetic regulatory architecture of HLA genes at single-cell resolution. Nat Genet 2023; 55:2255-2268. [PMID: 38036787 PMCID: PMC10787945 DOI: 10.1038/s41588-023-01586-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/31/2023] [Accepted: 10/19/2023] [Indexed: 12/02/2023]
Abstract
The human leukocyte antigen (HLA) locus plays a critical role in complex traits spanning autoimmune and infectious diseases, transplantation and cancer. While coding variation in HLA genes has been extensively documented, regulatory genetic variation modulating HLA expression levels has not been comprehensively investigated. Here we mapped expression quantitative trait loci (eQTLs) for classical HLA genes across 1,073 individuals and 1,131,414 single cells from three tissues. To mitigate technical confounding, we developed scHLApers, a pipeline to accurately quantify single-cell HLA expression using personalized reference genomes. We identified cell-type-specific cis-eQTLs for every classical HLA gene. Modeling eQTLs at single-cell resolution revealed that many eQTL effects are dynamic across cell states even within a cell type. HLA-DQ genes exhibit particularly cell-state-dependent effects within myeloid, B and T cells. For example, a T cell HLA-DQA1 eQTL (rs3104371) is strongest in cytotoxic cells. Dynamic HLA regulation may underlie important interindividual variability in immune responses.
Affiliation(s)
- Joyce B Kang
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Amber Z Shen
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Saisriram Gurajala
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Aparna Nathan
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Laurie Rumker
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Vitor R C Aguiar
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Immunology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Cristian Valencia
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Kaitlyn A Lagattuta
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Fan Zhang
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology and the Center for Health Artificial Intelligence, University of Colorado School of Medicine, Aurora, CO, USA
- Anna Helena Jonsson
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Seyhan Yazar
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Hamed Khalili
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Ashwin N Ananthakrishnan
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Kushal Dey
- Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Physiology, Biophysics and Systems Biology Program, Weill Cornell Medicine, New York, NY, USA
- Mark J Daly
- Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- The Stanley Center for Psychiatric Research, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Ramnik J Xavier
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Molecular Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Laura T Donlin
- Hospital for Special Surgery, New York, NY, USA
- Weill Cornell Medicine, New York, NY, USA
- Jennifer H Anolik
- Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
- Joseph E Powell
- Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Deepak A Rao
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Michael B Brenner
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Maria Gutierrez-Arcelus
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Immunology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- Yang Luo
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Kennedy Institute of Rheumatology, University of Oxford, Oxford, UK
- Saori Sakaue
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA, USA.
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
|
12
|
Wang Q, Antone J, Alsop E, Reiman R, Funk C, Bendl J, Dudley JT, Liang WS, Karr TL, Roussos P, Bennett DA, De Jager PL, Serrano GE, Beach TG, Keuren-Jensen KV, Mastroeni D, Reiman EM, Readhead BP. A public resource of single cell transcriptomes and multiscale networks from persons with and without Alzheimer's disease. bioRxiv 2023:2023.10.20.563319. [PMID: 37961404 PMCID: PMC10634692 DOI: 10.1101/2023.10.20.563319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
The emergence of technologies that can support high-throughput profiling of single cell transcriptomes offers to revolutionize the study of brain tissue from persons with and without Alzheimer's disease (AD). Integration of these data with additional complementary multiomics data such as genetics, proteomics and clinical data provides powerful opportunities to link observed cell subpopulations and molecular network features within a broader disease-relevant context. We report here single nucleus RNA sequencing (snRNA-seq) profiles generated from superior frontal gyrus cortical tissue samples from 101 exceptionally well characterized, aged subjects from the Banner Brain and Body Donation Program in combination with whole genome sequences. We report findings that link common AD risk variants with CR1 expression in oligodendrocytes as well as alterations in peripheral hematological lab parameters, with these observations replicated in an independent, prospective cohort study of ageing and dementia. We also observed an AD-associated CD83(+) microglial subtype with unique molecular networks that encompass many known regulators of AD-relevant microglial biology, and which are associated with immunoglobulin IgG4 production in the transverse colon. These findings illustrate the power of multi-tissue molecular profiling to contextualize snRNA-seq brain transcriptomics and reveal novel disease biology. The transcriptomic, genetic, phenotypic, and network data resources described within this study are available for access and utilization by the scientific community.
|
13
|
Hallast P, Ebert P, Loftus M, Yilmaz F, Audano PA, Logsdon GA, Bonder MJ, Zhou W, Höps W, Kim K, Li C, Hoyt SJ, Dishuck PC, Porubsky D, Tsetsos F, Kwon JY, Zhu Q, Munson KM, Hasenfeld P, Harvey WT, Lewis AP, Kordosky J, Hoekzema K, O'Neill RJ, Korbel JO, Tyler-Smith C, Eichler EE, Shi X, Beck CR, Marschall T, Konkel MK, Lee C. Assembly of 43 human Y chromosomes reveals extensive complexity and variation. Nature 2023; 621:355-364. [PMID: 37612510 PMCID: PMC10726138 DOI: 10.1038/s41586-023-06425-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/30/2022] [Accepted: 07/11/2023] [Indexed: 08/25/2023]
Abstract
The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date [1] and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes [2]. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established [1] boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.
Affiliation(s)
- Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Peter Ebert
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Core Unit Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Mark Loftus
- Department of Genetics & Biochemistry, Clemson University, Clemson, SC, USA
- Center for Human Genetics, Clemson University, Greenwood, SC, USA
- Feyza Yilmaz
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Peter A Audano
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marc Jan Bonder
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Weichen Zhou
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
- Wolfram Höps
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Kwondo Kim
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Chong Li
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Savannah J Hoyt
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Fotios Tsetsos
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Jee Young Kwon
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Qihui Zhu
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Patrick Hasenfeld
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Jennifer Kordosky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Rachel J O'Neill
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- The University of Connecticut Health Center, Farmington, CT, USA
- Jan O Korbel
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Xinghua Shi
- Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Christine R Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- The University of Connecticut Health Center, Farmington, CT, USA
- Tobias Marschall
- Institute for Medical Biometry and Bioinformatics, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Center for Digital Medicine, Heinrich Heine University, Düsseldorf, Germany
- Miriam K Konkel
- Department of Genetics & Biochemistry, Clemson University, Clemson, SC, USA
- Center for Human Genetics, Clemson University, Greenwood, SC, USA
- Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
|
14
|
Kang JB, Raveane A, Nathan A, Soranzo N, Raychaudhuri S. Methods and Insights from Single-Cell Expression Quantitative Trait Loci. Annu Rev Genomics Hum Genet 2023; 24:277-303. [PMID: 37196361 PMCID: PMC10784788 DOI: 10.1146/annurev-genom-101422-100437] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 05/19/2023]
Abstract
Recent advancements in single-cell technologies have enabled expression quantitative trait locus (eQTL) analysis across many individuals at single-cell resolution. Compared with bulk RNA sequencing, which averages gene expression across cell types and cell states, single-cell assays capture the transcriptional states of individual cells, including fine-grained, transient, and difficult-to-isolate populations at unprecedented scale and resolution. Single-cell eQTL (sc-eQTL) mapping can identify context-dependent eQTLs that vary with cell states, including some that colocalize with disease variants identified in genome-wide association studies. By uncovering the precise contexts in which these eQTLs act, single-cell approaches can unveil previously hidden regulatory effects and pinpoint important cell states underlying molecular mechanisms of disease. Here, we present an overview of recently deployed experimental designs in sc-eQTL studies. In the process, we consider the influence of study design choices such as cohort, cell states, and ex vivo perturbations. We then discuss current methodologies, modeling approaches, and technical challenges as well as future opportunities and applications.
Affiliation(s)
- Joyce B Kang
- Center for Data Sciences and Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Aparna Nathan
- Center for Data Sciences and Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Nicole Soranzo
- Human Technopole, Milan, Italy
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, United Kingdom
- British Heart Foundation Centre of Research Excellence and Department of Haematology, University of Cambridge, Cambridge, United Kingdom
- Soumya Raychaudhuri
- Center for Data Sciences and Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Centre for Genetics and Genomics Versus Arthritis, University of Manchester, Manchester, United Kingdom
|
15
|
Cuomo ASE, Nathan A, Raychaudhuri S, MacArthur DG, Powell JE. Single-cell genomics meets human genetics. Nat Rev Genet 2023; 24:535-549. [PMID: 37085594 PMCID: PMC10784789 DOI: 10.1038/s41576-023-00599-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Accepted: 03/29/2023] [Indexed: 04/23/2023]
Abstract
Single-cell genomic technologies are revealing the cellular composition, identities and states in tissues at unprecedented resolution. They have now scaled to the point that it is possible to query samples at the population level, across thousands of individuals. Combining single-cell information with genotype data at this scale provides opportunities to link genetic variation to the cellular processes underpinning key aspects of human biology and disease. This strategy has potential implications for disease diagnosis, risk prediction and development of therapeutic solutions. But, effectively integrating large-scale single-cell genomic data, genetic variation and additional phenotypic data will require advances in data generation and analysis methods. As single-cell genetics begins to emerge as a field in its own right, we review its current state and the challenges and opportunities ahead.
Affiliation(s)
- Anna S E Cuomo
- Garvan Institute of Medical Research, Darlinghurst, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia.
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia.
- Aparna Nathan
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Divisions of Rheumatology and Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Divisions of Rheumatology and Genetics, Department of Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Daniel G MacArthur
- Centre for Population Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia
- Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
- Joseph E Powell
- Garvan Institute of Medical Research, Darlinghurst, Sydney, New South Wales, Australia.
- UNSW Cellular Genomics Futures Institute, University of New South Wales, Sydney, New South Wales, Australia.
|
16
|
Natri HM, Del Azodi CB, Peter L, Taylor CJ, Chugh S, Kendle R, Chung MI, Flaherty DK, Matlock BK, Calvi CL, Blackwell TS, Ware LB, Bacchetta M, Walia R, Shaver CM, Kropski JA, McCarthy DJ, Banovich NE. Cell type-specific and disease-associated eQTL in the human lung. bioRxiv 2023:2023.03.17.533161. [PMID: 36993211 PMCID: PMC10055257 DOI: 10.1101/2023.03.17.533161] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 03/31/2023]
Abstract
Common genetic variants confer substantial risk for chronic lung diseases, including pulmonary fibrosis (PF). Defining the genetic control of gene expression in a cell-type-specific and context-dependent manner is critical for understanding the mechanisms through which genetic variation influences complex traits and disease pathobiology. To this end, we performed single-cell RNA-sequencing of lung tissue from 67 PF and 49 unaffected donors. Employing a pseudo-bulk approach, we mapped expression quantitative trait loci (eQTL) across 38 cell types, observing both shared and cell type-specific regulatory effects. Further, we identified disease-interaction eQTL and demonstrated that this class of associations is more likely to be cell-type specific and linked to cellular dysregulation in PF. Finally, we connected PF risk variants to their regulatory targets in disease-relevant cell types. These results indicate that cellular context determines the impact of genetic variation on gene expression, and implicates context-specific eQTL as key regulators of lung homeostasis and disease.
|
17
|
Van de Sande B, Lee JS, Mutasa-Gottgens E, Naughton B, Bacon W, Manning J, Wang Y, Pollard J, Mendez M, Hill J, Kumar N, Cao X, Chen X, Khaladkar M, Wen J, Leach A, Ferran E. Applications of single-cell RNA sequencing in drug discovery and development. Nat Rev Drug Discov 2023; 22:496-520. [PMID: 37117846 PMCID: PMC10141847 DOI: 10.1038/s41573-023-00688-4] [Citation(s) in RCA: 52] [Impact Index Per Article: 52.0] [Accepted: 03/10/2023] [Indexed: 04/30/2023]
Abstract
Single-cell technologies, particularly single-cell RNA sequencing (scRNA-seq) methods, together with associated computational tools and the growing availability of public data resources, are transforming drug discovery and development. New opportunities are emerging in target identification owing to improved disease understanding through cell subtyping, and highly multiplexed functional genomics screens incorporating scRNA-seq are enhancing target credentialling and prioritization. ScRNA-seq is also aiding the selection of relevant preclinical disease models and providing new insights into drug mechanisms of action. In clinical development, scRNA-seq can inform decision-making via improved biomarker identification for patient stratification and more precise monitoring of drug response and disease progression. Here, we illustrate how scRNA-seq methods are being applied in key steps in drug discovery and development, and discuss ongoing challenges for their implementation in the pharmaceutical industry.
Affiliation(s)
- Bart Naughton
- Computational Neurobiology, Eisai, Cambridge, MA, USA
- Wendi Bacon
- EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- The Open University, Milton Keynes, UK
- Yong Wang
- Precision Bioinformatics, Prometheus Biosciences, San Diego, CA, USA
- Melissa Mendez
- Genomic Sciences, GlaxoSmithKline, Collegeville, PA, USA
- Jon Hill
- Global Computational Biology and Digital Sciences, Boehringer Ingelheim Pharmaceuticals Inc., Ridgefield, CT, USA
- Namit Kumar
- Informatics & Predictive Sciences, Bristol Myers Squibb, San Diego, CA, USA
- Xiaohong Cao
- Genomic Research Center, AbbVie Inc., Cambridge, MA, USA
- Xiao Chen
- Magnet Biomedicine, Cambridge, MA, USA
- Mugdha Khaladkar
- Human Genetics and Computational Biology, GlaxoSmithKline, Collegeville, PA, USA
- Ji Wen
- Oncology Research and Development Unit, Pfizer, La Jolla, CA, USA
|
18
|
Luo J, Wu X, Cheng Y, Chen G, Wang J, Song X. Expression quantitative trait locus studies in the era of single-cell omics. Front Genet 2023; 14:1182579. [PMID: 37284065 PMCID: PMC10239882 DOI: 10.3389/fgene.2023.1182579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 04/26/2023] [Indexed: 06/08/2023] Open
Abstract
Genome-wide association studies have revealed that the regulation of gene expression bridges genetic variants and complex phenotypes. Profiling of the bulk transcriptome coupled with linkage analysis (expression quantitative trait locus (eQTL) mapping) has advanced our understanding of the relationship between genetic variants and gene regulation in the context of complex phenotypes. However, bulk transcriptomics has inherent limitations, as the regulation of gene expression tends to be cell-type-specific. The advent of single-cell RNA-seq technology now enables the identification of cell-type-specific regulation of gene expression through single-cell eQTL (sc-eQTL) mapping. In this review, we first provide an overview of sc-eQTL studies, including data processing and the sc-eQTL mapping procedure. We then discuss the benefits and limitations of sc-eQTL analyses. Finally, we present an overview of the current and future applications of sc-eQTL discoveries.
Affiliation(s)
- Jie Luo
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xinyi Wu
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Yuan Cheng
- Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Guang Chen
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Jian Wang
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xijiao Song
- State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
19
Kang JB, Shen AZ, Sakaue S, Luo Y, Gurajala S, Nathan A, Rumker L, Aguiar VRC, Valencia C, Lagattuta K, Zhang F, Jonsson AH, Yazar S, Alquicira-Hernandez J, Khalili H, Ananthakrishnan AN, Jagadeesh K, Dey K, Daly MJ, Xavier RJ, Donlin LT, Anolik JH, Powell JE, Rao DA, Brenner MB, Gutierrez-Arcelus M, Raychaudhuri S. Mapping the dynamic genetic regulatory architecture of HLA genes at single-cell resolution. medRxiv 2023:2023.03.14.23287257. [PMID: 36993194 PMCID: PMC10055604 DOI: 10.1101/2023.03.14.23287257] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 06/19/2023]
Abstract
The human leukocyte antigen (HLA) locus plays a critical role in complex traits spanning autoimmune and infectious diseases, transplantation, and cancer. While coding variation in HLA genes has been extensively documented, regulatory genetic variation modulating HLA expression levels has not been comprehensively investigated. Here, we mapped expression quantitative trait loci (eQTLs) for classical HLA genes across 1,073 individuals and 1,131,414 single cells from three tissues, using personalized reference genomes to mitigate technical confounding. We identified cell-type-specific cis-eQTLs for every classical HLA gene. Modeling eQTLs at single-cell resolution revealed that many eQTL effects are dynamic across cell states even within a cell type. HLA-DQ genes exhibit particularly cell-state-dependent effects within myeloid, B, and T cells. Dynamic HLA regulation may underlie important interindividual variability in immune responses.
Affiliation(s)
- Joyce B. Kang
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Amber Z. Shen
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Saori Sakaue
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Yang Luo
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Kennedy Institute of Rheumatology, University of Oxford, Oxford, UK
- Saisriram Gurajala
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Aparna Nathan
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Laurie Rumker
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Vitor R. C. Aguiar
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Immunology, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
- Cristian Valencia
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Kaitlyn Lagattuta
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Fan Zhang
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology and the Center for Health Artificial Intelligence, University of Colorado School of Medicine, Aurora, CO, USA
- Anna Helena Jonsson
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Seyhan Yazar
- Garvan Institute of Medical Research, Sydney, NSW, Australia
- Hamed Khalili
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Ashwin N. Ananthakrishnan
- Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Kushal Dey
- Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Mark J. Daly
- Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- The Stanley Center for Psychiatric Research, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
- Ramnik J. Xavier
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Molecular Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Laura T. Donlin
- Hospital for Special Surgery, New York, NY, USA
- Weill Cornell Medicine, New York, NY, USA
- Jennifer H. Anolik
- Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
- Deepak A. Rao
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Michael B. Brenner
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Maria Gutierrez-Arcelus
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Immunology, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA
- Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
20
Xue A, Yazar S, Neavin D, Powell JE. Pitfalls and opportunities for applying latent variables in single-cell eQTL analyses. Genome Biol 2023; 24:33. [PMID: 36823676 PMCID: PMC9948363 DOI: 10.1186/s13059-023-02873-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 02/13/2023] [Indexed: 02/25/2023] Open
Abstract
Using latent variables in gene expression data can help correct for unobserved confounders and increase statistical power for expression quantitative trait loci (eQTL) detection. The probabilistic estimation of expression residuals (PEER) and principal component analysis (PCA) are widely used methods that can remove unwanted variation and improve eQTL discovery power in bulk RNA-seq analysis. However, their performance has not been evaluated extensively in single-cell eQTL analysis, especially for different cell types. Potential challenges arise due to the structure of single-cell RNA-seq data, including sparsity, skewness, and the mean-variance relationship. Here, we show by a series of analyses that PEER and PCA require additional quality control and data transformation steps on the pseudo-bulk matrix to obtain valid latent variables; otherwise, the resulting factors can be highly correlated (Pearson's correlation r = 0.63 to 0.99). Incorporating valid PEER factors (PFs) or principal components (PCs) in the eQTL association model identifies 1.7% to 13.3% more eGenes. Sensitivity analysis showed that the relationship between the number of eGenes detected and the number of fitted PFs/PCs varied significantly across cell types. In addition, using highly variable genes to generate latent variables could achieve eGene discovery power similar to that of using all genes while saving considerable computational resources (about 6.2-fold faster).
Affiliation(s)
- Angli Xue
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW, 2010, Australia
- School of Biomedical Sciences, University of New South Wales, Sydney, NSW, 2052, Australia
- Seyhan Yazar
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW, 2010, Australia
- Drew Neavin
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW, 2010, Australia
- Joseph E Powell
- Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Sydney, NSW, 2010, Australia
- UNSW Cellular Genomics Futures Institute, University of New South Wales, Sydney, NSW, 2052, Australia
21
Zhou HJ, Li L, Li Y, Li W, Li JJ. PCA outperforms popular hidden variable inference methods for molecular QTL mapping. Genome Biol 2022; 23:210. [PMID: 36221136 PMCID: PMC9552461 DOI: 10.1186/s13059-022-02761-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 04/11/2022] [Accepted: 08/26/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Estimating and accounting for hidden variables is widely practiced as an important step in molecular quantitative trait locus (molecular QTL, henceforth "QTL") analysis for improving the power of QTL identification. However, few benchmark studies have been performed to evaluate the efficacy of the various methods developed for this purpose. RESULTS: Here we benchmark popular hidden variable inference methods including surrogate variable analysis (SVA), probabilistic estimation of expression residuals (PEER), and hidden covariates with prior (HCP) against principal component analysis (PCA), a well-established dimension reduction and factor discovery method, via 362 synthetic and 110 real data sets. We show that PCA not only underlies the statistical methodology behind the popular methods but is also orders of magnitude faster, better-performing, and much easier to interpret and use. CONCLUSIONS: To help researchers use PCA in their QTL analysis, we provide an R package PCAForQTL along with a detailed guide, both of which are freely available at https://github.com/heatherjzhou/PCAForQTL. We believe that using PCA rather than SVA, PEER, or HCP will substantially improve and simplify hidden variable inference in QTL mapping as well as increase the transparency and reproducibility of QTL research.
Affiliation(s)
- Heather J Zhou
- Department of Statistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Lei Li
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518055, China
- Yumei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
- Wei Li
- Division of Computational Biomedicine, Department of Biological Chemistry, School of Medicine, University of California, Irvine, Irvine, CA, 92697, USA
- Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, 90095, USA
- Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA, 90095, USA
22
Shared regulation and functional relevance of local gene co-expression revealed by single cell analysis. Commun Biol 2022; 5:876. [PMID: 36028576 PMCID: PMC9418141 DOI: 10.1038/s42003-022-03831-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/09/2022] [Accepted: 08/10/2022] [Indexed: 02/01/2023] Open
Abstract
Most human genes are co-expressed with a nearby gene. Previous studies have revealed this local gene co-expression to be widespread across chromosomes and across dozens of tissues. Yet, so far these studies used bulk RNA-seq, averaging gene expression measurements across millions of cells, which leaves it unclear whether this co-expression stems from transcription events in single cells. Here, we leverage single cell datasets in >85 individuals to identify gene co-expression across cells, unbiased by cell-type heterogeneity and benefiting from the co-occurrence of transcription events in single cells. We discover >3800 co-expressed gene pairs in two human cell types, induced pluripotent stem cells (iPSCs) and lymphoblastoid cell lines (LCLs), and (i) compare single cell to bulk RNA-seq in identifying local gene co-expression, (ii) show that many co-expressed gene pairs, but not the majority, are composed of functionally related genes and (iii) using proteomics data, provide evidence that their co-expression is maintained up to the protein level. Finally, using single cell RNA-sequencing (scRNA-seq) and single cell ATAC-sequencing (scATAC-seq) data for the same single cells, we identify gene-enhancer associations and reveal that >95% of co-expressed gene pairs share regulatory elements. These results elucidate the potential reasons for co-expression in single cell gene regulatory networks and warrant a deeper study of shared regulatory elements, with a view to explaining disease comorbidity that arises when several genes are affected. Our in-depth view of local gene co-expression and regulatory element co-activity advances our understanding of the shared regulatory architecture between genes. Using single-cell data from cell lines, the co-expression of genes and co-activity of regulatory elements is analyzed, providing insight into shared architecture and regulation between genes.
23
Bankier S, Michoel T. eQTLs as causal instruments for the reconstruction of hormone linked gene networks. Front Endocrinol (Lausanne) 2022; 13:949061. [PMID: 36060942 PMCID: PMC9428692 DOI: 10.3389/fendo.2022.949061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 07/25/2022] [Indexed: 11/17/2022] Open
Abstract
Hormones act within highly dynamic systems and much of the phenotypic response to variation in hormone levels is mediated by changes in gene expression. The increase in the number and power of large genetic association studies has led to the identification of hormone linked genetic variants. However, the biological mechanisms underpinning the majority of these loci are poorly understood. The advent of affordable, high throughput next generation sequencing and readily available transcriptomic databases has shown that many of these genetic variants also associate with variation in gene expression levels as expression Quantitative Trait Loci (eQTLs). In addition to further dissecting complex genetic variation, eQTLs have been applied as tools for causal inference. Many hormone networks are driven by transcription factors, and many of these genes can be linked to eQTLs. In this mini-review, we demonstrate how causal inference and gene networks can be used to describe the impact of hormone linked genetic variation upon the transcriptome within an endocrinology context.
Affiliation(s)
- Sean Bankier
- Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
24
Cuomo ASE, Heinen T, Vagiaki D, Horta D, Marioni JC, Stegle O. CellRegMap: a statistical framework for mapping context-specific regulatory variants using scRNA-seq. Mol Syst Biol 2022; 18:e10663. [PMID: 35972065 PMCID: PMC9380406 DOI: 10.15252/msb.202110663] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/31/2021] [Revised: 06/28/2022] [Accepted: 07/01/2022] [Indexed: 11/11/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) enables the characterization of cellular heterogeneity in human tissues. Recent technological advances have enabled the first population-scale scRNA-seq studies in hundreds of individuals, making it possible to assay genetic effects with single-cell resolution. However, existing strategies to analyze these data remain based on principles established for the genetic analysis of bulk RNA-seq. In particular, current methods depend on a priori definitions of discrete cell types, and hence cannot assess allelic effects across subtle cell types and cell states. To address this, we propose the Cell Regulatory Map (CellRegMap), a statistical framework to test for and quantify genetic effects on gene expression in individual cells. CellRegMap provides a principled approach to identify and characterize genotype-context interactions of known eQTL variants using scRNA-seq data. This model-based approach resolves allelic effects across cellular contexts of different granularity, including genetic effects specific to cell subtypes and continuous cell transitions. We validate CellRegMap using simulated data and apply it to previously identified eQTL from two recent studies of differentiating iPSCs, where we uncover hundreds of eQTL displaying heterogeneity of genetic effects across cellular contexts. Finally, we identify fine-grained genetic regulation in neuronal subtypes for eQTL that are colocalized with human disease variants.
Affiliation(s)
- Anna S E Cuomo
- European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
- Present address: Garvan Institute of Medical Science, Sydney, NSW, Australia
- Tobias Heinen
- Division of Computational Genomics and Systems Genetics, German Cancer Research Centre (DKFZ), Heidelberg, Germany
- European Molecular Biology Laboratory (EMBL), Genome Biology, Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
- Danai Vagiaki
- Division of Computational Genomics and Systems Genetics, German Cancer Research Centre (DKFZ), Heidelberg, Germany
- European Molecular Biology Laboratory (EMBL), Genome Biology, Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
- Danilo Horta
- European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- John C Marioni
- European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
- Cancer Research UK Cambridge Institute, Cambridge, UK
- Oliver Stegle
- European Bioinformatics Institute (EMBL‐EBI), Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
- Division of Computational Genomics and Systems Genetics, German Cancer Research Centre (DKFZ), Heidelberg, Germany
- European Molecular Biology Laboratory (EMBL), Genome Biology, Heidelberg, Germany
25
Nathan A, Asgari S, Ishigaki K, Valencia C, Amariuta T, Luo Y, Beynor JI, Baglaenko Y, Suliman S, Price AL, Lecca L, Murray MB, Moody DB, Raychaudhuri S. Single-cell eQTL models reveal dynamic T cell state dependence of disease loci. Nature 2022; 606:120-128. [PMID: 35545678 PMCID: PMC9842455 DOI: 10.1038/s41586-022-04713-1] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Received: 08/17/2021] [Accepted: 03/31/2022] [Indexed: 02/02/2023]
Abstract
Non-coding genetic variants may cause disease by modulating gene expression. However, identifying these expression quantitative trait loci (eQTLs) is complicated by differences in gene regulation across fluid functional cell states within cell types. These states (for example, neurotransmitter-driven programs in astrocytes or perivascular fibroblast differentiation) are obscured in eQTL studies that aggregate cells [1,2]. Here we modelled eQTLs at single-cell resolution in one complex cell type: memory T cells. Using more than 500,000 unstimulated memory T cells from 259 Peruvian individuals, we show that around one-third of 6,511 cis-eQTLs had effects that were mediated by continuous multimodally defined cell states, such as cytotoxicity and regulatory capacity. In some loci, independent eQTL variants had opposing cell-state relationships. Autoimmune variants were enriched in cell-state-dependent eQTLs, including risk variants for rheumatoid arthritis near ORMDL3 and CTLA4; this indicates that cell-state context is crucial to understanding potential eQTL pathogenicity. Moreover, continuous cell states explained more variation in eQTLs than did conventional discrete categories, such as CD4+ versus CD8+, suggesting that modelling eQTLs and cell states at single-cell resolution can expand insight into gene regulation in functionally heterogeneous cell types.
Affiliation(s)
- Aparna Nathan
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Samira Asgari
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Kazuyoshi Ishigaki
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cristian Valencia
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Tiffany Amariuta
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Yang Luo
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Jessica I Beynor
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Yuriy Baglaenko
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Sara Suliman
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Alkes L Price
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Leonid Lecca
- Department of Global Health and Social Medicine, Harvard Medical School, Boston, MA, USA
- Socios En Salud Sucursal Peru, Lima, Peru
| | - Megan B Murray
- Department of Global Health and Social Medicine, Harvard Medical School, Boston, MA, USA
- Division of Global Health Equity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - D Branch Moody
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Centre for Genetics and Genomics Versus Arthritis, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
| |
|
26
|
Mu W, Sarkar H, Srivastava A, Choi K, Patro R, Love MI. Airpart: interpretable statistical models for analyzing allelic imbalance in single-cell datasets. Bioinformatics 2022; 38:2773-2780. [PMID: 35561168 PMCID: PMC9113279 DOI: 10.1093/bioinformatics/btac212] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/26/2021] [Revised: 03/05/2022] [Accepted: 04/05/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Allelic expression analysis aids in detection of cis-regulatory mechanisms of genetic variation, which produce allelic imbalance (AI) in heterozygotes. Measuring AI in bulk data lacking time or spatial resolution has the limitation that cell-type-specific (CTS), spatial- or time-dependent AI signals may be dampened or not detected. RESULTS: We introduce a statistical method, airpart, for identifying differential CTS AI from single-cell RNA-sequencing data, or dynamic AI from other spatially or time-resolved datasets. airpart outputs discrete partitions of data, pointing to groups of genes and cells under common mechanisms of cis-genetic regulation. In order to account for low counts in single-cell data, our method uses a Generalized Fused Lasso with Binomial likelihood for partitioning groups of cells by AI signal, and a hierarchical Bayesian model for AI statistical inference. In simulation, airpart accurately detected partitions of cell types by their AI and had lower Root Mean Square Error (RMSE) of allelic ratio estimates than existing methods. In real data, airpart identified differential allelic imbalance patterns across cell states and could be used to define trends of AI signal over spatial or time axes. AVAILABILITY AND IMPLEMENTATION: The airpart package is available as an R/Bioconductor package at https://bioconductor.org/packages/airpart. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wancen Mu
- To whom correspondence should be addressed.
| | - Hirak Sarkar
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Rob Patro
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA
| | | |
|
27
|
Olayinka OA, O'Neill NK, Farrer LA, Wang G, Zhang X. Molecular Quantitative Trait Locus Mapping in Human Complex Diseases. Curr Protoc 2022; 2:e426. [PMID: 35587224 PMCID: PMC9186089 DOI: 10.1002/cpz1.426] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Mapping quantitative trait loci (QTLs) for molecular traits from chromatin to metabolites (i.e., xQTLs) provides insight into the locations and effect modes of genetic variants that influence these molecular phenotypes and the propagation of functional consequences of each variant. xQTL studies indirectly interrogate the functional landscape of the molecular basis of complex diseases, including the impact of non-coding regulatory variants, the tissue specificity of regulatory elements, and their contribution to disease by integrating with genome-wide association studies (GWAS). We summarize a variety of molecular xQTL studies in human tissues and cells. In addition, using the Alzheimer's Disease Sequencing Project (ADSP) as an example, we describe the ADSP xQTL project, a collaborative effort across the ADSP Functional Genomics Consortium (ADSP-FGC). The project's ultimate goal is a reference map of Alzheimer's-related QTLs using existing datasets from multiple omics layers to help us study the consequences of genetic variants identified in the ADSP. xQTL studies enable the identification of the causal genes and pathways in GWAS loci, which will likely aid in the discovery of novel biomarkers and therapeutic targets for complex diseases. © 2022 Wiley Periodicals LLC.
Affiliation(s)
- Oluwatosin A Olayinka
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
| | - Nicholas K O'Neill
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
| | - Lindsay A Farrer
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts
- Department of Ophthalmology, Boston University School of Medicine, Boston, Massachusetts
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
- Department of Epidemiology, Boston University School of Public Health, Boston, Massachusetts
| | - Gao Wang
- Department of Neurology, Columbia University, New York, New York
- Gertrude H. Sergievsky Center, Columbia University, New York, New York
| | - Xiaoling Zhang
- Bioinformatics Program, Boston University, Boston, Massachusetts
- Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, Massachusetts
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts
| |
|
28
|
Maria M, Pouyanfar N, Örd T, Kaikkonen MU. The Power of Single-Cell RNA Sequencing in eQTL Discovery. Genes (Basel) 2022; 13:502. [PMID: 35328055 PMCID: PMC8949403 DOI: 10.3390/genes13030502] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/09/2022] [Revised: 03/07/2022] [Accepted: 03/10/2022] [Indexed: 02/05/2023] Open
Abstract
Genome-wide association studies have successfully mapped thousands of loci associated with complex traits. During the last decade, functional genomics approaches combining genotype information with bulk RNA-sequencing data have identified genes regulated by GWAS loci through expression quantitative trait locus (eQTL) analysis. Single-cell RNA-Sequencing (scRNA-Seq) technologies have created new exciting opportunities for spatiotemporal assessment of changes in gene expression at the single-cell level in complex and inherited conditions. A growing number of studies have demonstrated the power of scRNA-Seq in eQTL mapping across different cell types, developmental stages and stimuli that could be obscured when using bulk RNA-Seq methods. In this review, we outline the methodological principles, advantages, limitations and the future experimental and analytical considerations of single-cell eQTL studies. We look forward to the explosion of single-cell eQTL studies applied to large-scale population genetics to take us one step closer to understanding the molecular mechanisms of disease.
Affiliation(s)
| | - Minna U. Kaikkonen
- A. I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, 70211 Kuopio, Finland; (M.M.); (N.P.); (T.Ö.)
| |
|
29
|
Heinen T, Secchia S, Reddington JP, Zhao B, Furlong EEM, Stegle O. scDALI: modeling allelic heterogeneity in single cells reveals context-specific genetic regulation. Genome Biol 2022; 23:8. [PMID: 34991671 PMCID: PMC8734213 DOI: 10.1186/s13059-021-02593-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/28/2021] [Accepted: 12/27/2021] [Indexed: 01/04/2023] Open
Abstract
While it is established that the functional impact of genetic variation can vary across cell types and states, capturing this diversity remains challenging. Current studies using bulk sequencing either ignore this heterogeneity or use sorted cell populations, reducing discovery and explanatory power. Here, we develop scDALI, a versatile computational framework that integrates information on cellular states with allelic quantifications of single-cell sequencing data to characterize cell-state-specific genetic effects. We apply scDALI to scATAC-seq profiles from developing F1 Drosophila embryos and scRNA-seq from differentiating human iPSCs, uncovering heterogeneous genetic effects in specific lineages, developmental stages, or cell types.
Affiliation(s)
- Tobias Heinen
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
| | - Stefano Secchia
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Faculty of Biosciences, Collaboration for Joint PhD Degree between EMBL and Heidelberg University, Heidelberg, Germany
| | - James P Reddington
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Bingqing Zhao
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Eileen E M Furlong
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Oliver Stegle
- Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| |
|
30
|
Elorbany R, Popp JM, Rhodes K, Strober BJ, Barr K, Qi G, Gilad Y, Battle A. Single-cell sequencing reveals lineage-specific dynamic genetic regulation of gene expression during human cardiomyocyte differentiation. PLoS Genet 2022; 18:e1009666. [PMID: 35061661 PMCID: PMC8809621 DOI: 10.1371/journal.pgen.1009666] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 06/30/2021] [Revised: 02/02/2022] [Accepted: 12/21/2021] [Indexed: 12/13/2022] Open
Abstract
Dynamic and temporally specific gene regulatory changes may underlie unexplained genetic associations with complex disease. During a dynamic process such as cellular differentiation, the overall cell type composition of a tissue (or an in vitro culture) and the gene regulatory profile of each cell can both experience significant changes over time. To identify these dynamic effects in high resolution, we collected single-cell RNA-sequencing data over a differentiation time course from induced pluripotent stem cells to cardiomyocytes, sampled at 7 unique time points in 19 human cell lines. We employed a flexible approach to map dynamic eQTLs whose effects vary significantly over the course of bifurcating differentiation trajectories, including many whose effects are specific to one of these two lineages. Our study design allowed us to distinguish true dynamic eQTLs affecting a specific cell lineage from expression changes driven by potentially non-genetic differences between cell lines such as cell composition. Additionally, we used the cell type profiles learned from single-cell data to deconvolve and re-analyze data from matched bulk RNA-seq samples. Using this approach, we were able to identify a large number of novel dynamic eQTLs in single cell data while also attributing dynamic effects in bulk to a particular lineage. Overall, we found that using single cell data to uncover dynamic eQTLs can provide new insight into the gene regulatory changes that occur among heterogeneous cell types during cardiomyocyte differentiation.
Affiliation(s)
- Reem Elorbany
- Interdisciplinary Scientist Training Program, University of Chicago, Chicago, Illinois, United States of America
| | - Joshua M. Popp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Katherine Rhodes
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Benjamin J. Strober
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Kenneth Barr
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
| | - Guanghao Qi
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Yoav Gilad
- Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Department of Medicine, University of Chicago, Chicago, Illinois, United States of America
| | - Alexis Battle
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
| |
|
31
|
Azodi CB, Zappia L, Oshlack A, McCarthy DJ. splatPop: simulating population scale single-cell RNA sequencing data. Genome Biol 2021; 22:341. [PMID: 34911537 DOI: 10.1186/s13059-021-02546-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/26/2021] [Accepted: 11/19/2021] [Indexed: 11/10/2022] Open
Abstract
Population-scale single-cell RNA sequencing (scRNA-seq) is now viable, enabling finer resolution functional genomics studies and leading to a rush to adapt bulk methods and develop new single-cell-specific methods to perform these studies. Simulations are useful for developing, testing, and benchmarking methods but current scRNA-seq simulation frameworks do not simulate population-scale data with genetic effects. Here, we present splatPop, a model for flexible, reproducible, and well-documented simulation of population-scale scRNA-seq data with known expression quantitative trait loci. splatPop can also simulate complex batch, cell group, and conditional effects between individuals from different cohorts as well as genetically-driven co-expression.
Affiliation(s)
- Christina B Azodi
- St. Vincent's Institute of Medical Research, 9 Princes Street, Fitzroy, 3065, VIC, Australia
- University of Melbourne, Royal Parade, Parkville, 3010, VIC, Australia
| | - Luke Zappia
- Department of Mathematics, Technical University of Munich, Boltzmannstraße 3, Garching bei München, 85748, Germany
- Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstraße 1, Neuherberg, 85764, Germany
| | - Alicia Oshlack
- University of Melbourne, Royal Parade, Parkville, 3010, VIC, Australia
- Peter MacCallum Cancer Centre, Grattan Street, Melbourne, 3000, VIC, Australia
| | - Davis J McCarthy
- St. Vincent's Institute of Medical Research, 9 Princes Street, Fitzroy, 3065, VIC, Australia
- University of Melbourne, Royal Parade, Parkville, 3010, VIC, Australia
| |
|
32
|
Kerimov N, Hayhurst JD, Peikova K, Manning JR, Walter P, Kolberg L, Samoviča M, Sakthivel MP, Kuzmin I, Trevanion SJ, Burdett T, Jupp S, Parkinson H, Papatheodorou I, Yates AD, Zerbino DR, Alasoo K. A compendium of uniformly processed human gene expression and splicing quantitative trait loci. Nat Genet 2021; 53:1290-1299. [PMID: 34493866 PMCID: PMC8423625 DOI: 10.1038/s41588-021-00924-w] [Citation(s) in RCA: 160] [Impact Index Per Article: 53.3] [Received: 01/08/2021] [Accepted: 07/26/2021] [Indexed: 12/15/2022]
Abstract
Many gene expression quantitative trait locus (eQTL) studies have published their summary statistics, which can be used to gain insight into complex human traits by downstream analyses, such as fine mapping and co-localization. However, technical differences between these datasets are a barrier to their widespread use. Consequently, target genes for most genome-wide association study (GWAS) signals have still not been identified. In the present study, we present the eQTL Catalogue ( https://www.ebi.ac.uk/eqtl ), a resource of quality-controlled, uniformly re-computed gene expression and splicing QTLs from 21 studies. We find that, for matching cell types and tissues, the eQTL effect sizes are highly reproducible between studies. Although most QTLs were shared between most bulk tissues, we identified a greater diversity of cell-type-specific QTLs from purified cell types, a subset of which also manifested as new disease co-localizations. Our summary statistics are freely available to enable the systematic interpretation of human GWAS associations across many cell types and tissues.
Affiliation(s)
- Nurlan Kerimov
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Open Targets, Wellcome Genome Campus, Cambridge, UK
| | - James D Hayhurst
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Kateryna Peikova
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Jonathan R Manning
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Peter Walter
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Liis Kolberg
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Marija Samoviča
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Manoj Pandian Sakthivel
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Ivan Kuzmin
- Institute of Computer Science, University of Tartu, Tartu, Estonia
| | - Stephen J Trevanion
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Tony Burdett
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Simon Jupp
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Helen Parkinson
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Irene Papatheodorou
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Andrew D Yates
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Daniel R Zerbino
- Open Targets, Wellcome Genome Campus, Cambridge, UK
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Kaur Alasoo
- Institute of Computer Science, University of Tartu, Tartu, Estonia
- Open Targets, Wellcome Genome Campus, Cambridge, UK
| |
|